Yair Carmon

27 papers · 2017–2025 · 5 conferences · across top CS/AI conferences

Achievements

🏃 Academic Marathon (8) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (5) 🐝 Cross-Pollinator (8) 🌈 Renaissance Researcher (5) 👥 Mega-Team (60) 👑 Triple Crown 🔬 Deep Specialist (13) ⚡ Prolific Year (6) 📈 Trend Setter 💎 Century Club (27) 🗃️ Keyword Collector (98) 🔥 Unstoppable (9)

Conferences

NIPS (12) COLT (6) ICML (6) ICLR (2) EMNLP (1)

Top co-authors

Aaron Sidford (8) Ludwig Schmidt (7) Yujia Jin (6) Arun Jambulapati (5) Oliver Hinder (5) John C. Duchi (5) Mitchell Wortsman (5) Jenia Jitsev (4) Vaishaal Shankar (4) Gabriel Ilharco (4)

Keywords

convex optimization (7) stochastic convex optimization (4) gradient descent (4) accelerated gradient (3) oracle complexity (3) parameter-free optimization (3) neural network optimization (3) data filtering (2) second-order method (2) multilevel monte carlo (2) stochastic gradient descent (2) distributionally robust optimization (2) stochastic method (2) lipschitz function (2) loss landscape (2) language model (2) non-convex optimization (2) stochastic optimization (2) convex loss (2) scaling law (2)

Papers

Language models scale reliably with over-training and on downstream tasks ICLR 2025
The Price of Adaptivity in Stochastic Convex Optimization COLT 2024
Resolving Discrepancies in Compute-Optimal Scaling of Language Models NIPS 2024
DataComp-LM: In search of the next generation of training sets for language models NIPS 2024
Accelerated Parameter-Free Stochastic Optimization COLT 2024
Malign Overfitting: Interpolation and Invariance are Fundamentally at Odds ICLR 2023
Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond ICML 2023
DoG is SGD’s Best Friend: A Parameter-Free Dynamic Step Size Schedule ICML 2023
DataComp: In search of the next generation of multimodal datasets NIPS 2023
Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments EMNLP 2022
Optimal and Adaptive Monteiro-Svaiter Acceleration NIPS 2022
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time ICML 2022
RECAPP: Crafting a More Efficient Catalyst for Convex Optimization ICML 2022
Making SGD Parameter-Free COLT 2022
Distributionally Robust Optimization via Ball Oracle Acceleration NIPS 2022
Never Go Full Batch (in Stochastic Convex Optimization) NIPS 2021
Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization ICML 2021
Stochastic Bias-Reduced Gradient Methods NIPS 2021
Thinking Inside the Ball: Near-Optimal Minimization of the Maximal Loss COLT 2021
Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations COLT 2020
Large-Scale Methods for Distributionally Robust Optimization NIPS 2020
Acceleration with a Ball Optimization Oracle NIPS 2020
A Rank-1 Sketch for Matrix Multiplicative Weights COLT 2019
Variance Reduction for Matrix Games NIPS 2019
Unlabeled Data Improves Adversarial Robustness NIPS 2019
Analysis of Krylov Subspace Solutions of Regularized Non-Convex Quadratic Problems NIPS 2018
“Convex Until Proven Guilty”: Dimension-Free Acceleration of Gradient Descent on Non-Convex Functions ICML 2017