Papers
The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains
Ezra Edelman, Nikolaos Tsilivis, Benjamin L. Edelman et al.
The Expressive Capacity of State Space Models: A Formal Language Perspective
Yash Sarrof, Yana Veitsman, Michael Hahn
The Factorization Curse: Which Tokens You Predict Underlie the Reversal Curse and More
Ouail Kitouni, Niklas Nolte, Diane Bouchacourt et al.
The Fairness-Quality Tradeoff in Clustering
Rashida Hakim, Ana-Andreea Stoica, Christos H. Papadimitriou et al.
The Feature Speed Formula: a flexible approach to scale hyper-parameters of deep neural networks
Lénaïc Chizat, Praneeth Netrapalli
The Fine-Grained Complexity of Gradient Computation for Training Large Language Models
Josh Alman, Zhao Song
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
Guilherme Penedo, Hynek Kydlíček, Loubna Ben Allal et al.
The Fragility of Fairness: Causal Sensitivity Analysis for Fair Machine Learning
Jake Fawkes, Nic Fishman, Mel Andrews et al.
The GAN is dead; long live the GAN! A Modern GAN Baseline
Yiwen Huang, Aaron Gokaslan, Volodymyr Kuleshov et al.
The Group Robustness is in the Details: Revisiting Finetuning under Spurious Correlations
Tyler LaBonte, John C. Hill, Xinchen Zhang et al.
The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms
Elizabeth Collins-Woodfin, Inbar Seroussi, Begoña García Malaxechebarría et al.
The Impact of Geometric Complexity on Neural Collapse in Transfer Learning
Michael Munn, Benoit Dherin, Javier Gonzalvo
The Impact of Initialization on LoRA Finetuning Dynamics
Soufiane Hayou, Nikhil Ghosh, Bin Yu
The Implicit Bias of Adam on Separable Data
Chenyang Zhang, Difan Zou, Yuan Cao
The Implicit Bias of Gradient Descent on Separable Multiclass Data
Hrithik Ravi, Clayton Scott, Daniel Soudry et al.
The Implicit Bias of Gradient Descent toward Collaboration between Layers: A Dynamic Analysis of Multilayer Perceptions
Zheng Wang, Geyong Min, Wenjie Ruan
The Implicit Bias of Heterogeneity towards Invariance: A Study of Multi-Environment Matrix Sensing
Yang Xu, Yihong Gu, Cong Fang
The Importance of Online Data: Understanding Preference Fine-tuning via Coverage
Yuda Song, Gokul Swamy, Aarti Singh et al.
The iNaturalist Sounds Dataset
Mustafa Chasmai, Alexander Shepard, Subhransu Maji et al.
The Intelligible and Effective Graph Neural Additive Network
Maya Bechler-Speicher, Amir Globerson, Ran Gilad-Bachrach
The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information
Diyuan Wu, Ionut-Vlad Modoranu, Mher Safaryan et al.
The Ladder in Chaos: Improving Policy Learning by Harnessing the Parameter Evolving Path in A Low-dimensional Space
Hongyao Tang, Min Zhang, Chen Chen et al.
The Limits of Differential Privacy in Online Learning
Bo Li, Wei Wang, Peng Ye
The Limits of Transfer Reinforcement Learning with Latent Low-rank Structure
Tyler Sam, Yudong Chen, Christina Lee Yu