Papers
79 papers found
Scaling Laws for Associative Memories
Vivien Cabannes, Elvis Dohmatob, Alberto Bietti
Scaling Laws of RoPE-based Extrapolation
Xiaoran Liu, Hang Yan, Chenxin An et al.
Scaling Laws for Sparsely-Connected Foundation Models
Elias Frantar, Carlos Riquelme Ruiz, Neil Houlsby et al.
Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws
Zeyuan Allen-Zhu, Yuanzhi Li
Scaling Laws for Precision
Tanishq Kumar, Zachary Ankner, Benjamin Frederick Spector et al.
Analyzing Neural Scaling Laws in Two-Layer Networks with Power-Law Data Spectra
Roman Worschech, Bernd Rosenow
The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws
Tian Jin, Ahmed Imtiaz Humayun, Utku Evci et al.
Towards Neural Scaling Laws for Time Series Foundation Models
Qingren Yao, Chao-Han Huck Yang, Renhe Jiang et al.
High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws
Muhammed Emrullah Ildiz, Halil Alperen Gozeten, Ege Onur Taga et al.
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for LLM Problem-Solving
Yangzhen Wu, Zhiqing Sun, Shanda Li et al.
Data Scaling Laws in Imitation Learning for Robotic Manipulation
Fanqi Lin, Yingdong Hu, Pingyue Sheng et al.
A Solvable Attention for Neural Scaling Laws
Bochen Lyu, Di Wang, Zhanxing Zhu
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
Yiding Jiang, Allan Zhou, Zhili Feng et al.
Scaling Laws for Downstream Task Performance in Machine Translation
Berivan Isik, Natalia Ponomareva, Hussein Hazimeh et al.
How Much is a Noisy Image Worth? Data Scaling Laws for Ambient Diffusion
Giannis Daras, Yeshwanth Cherapanamjeri, Constantinos Costis Daskalakis
How Feature Learning Can Improve Neural Scaling Laws
Blake Bordelon, Alexander Atanasov, Cengiz Pehlevan
Breaking Neural Network Scaling Laws with Modularity
Akhilan Boopathy, Sunshine Jiang, William Yue et al.
Data Scaling Laws in NMT: The Effect of Noise and Architecture
Yamini Bansal, Behrooz Ghorbani, Ankush Garg et al.
Unified Scaling Laws for Routed Language Models
Aidan Clark, Diego De Las Casas, Aurelia Guy et al.
Scaling Laws for Generative Mixed-Modal Language Models
Armen Aghajanyan, Lili Yu, Alexis Conneau et al.
The case for 4-bit precision: k-bit Inference Scaling Laws
Tim Dettmers, Luke Zettlemoyer
Scaling Laws for Multilingual Neural Machine Translation
Patrick Fernandes, Behrooz Ghorbani, Xavier Garcia et al.
Scaling Laws for Reward Model Overoptimization
Leo Gao, John Schulman, Jacob Hilton
TAN Without a Burn: Scaling Laws of DP-SGD
Tom Sander, Pierre Stock, Alexandre Sablayrolles