conftrace_

Leo Gao

5 papers · 2022–2025 · 3 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (3) 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (15)

👥 Mega-Team (40)

Conferences

ICLR (2) ICML (2) ACL (1)

Top co-authors

Ilya Sutskever (2) Jeffrey Wu (2) Jan Leike (2) Stella Biderman (2) Andrea Santilli (1) Thomas Wang (1) Albert Webson (1) Jan Hendrik Kirchner (1) Arnaud Stiegler (1) Ryan Teehan (1)

Keywords

transformer architecture (1) reinforcement learning (1) reinforcement learning from human feedback (1) human feedback (1) synthetic data (1) reward model (1) scaling law (1) model weight (1) dense model (1) autoregressive language model (1) open-source model (1) large language model (1) reward model overoptimization (1) Goodhart's law (1)

Papers

Scaling and evaluating sparse autoencoders · ICLR 2025
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision · ICML 2024
Scaling Laws for Reward Model Overoptimization · ICML 2023
GPT-NeoX-20B: An Open-Source Autoregressive Language Model · ACL 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization · ICLR 2022