Nathan Srebro

86 papers · 2008–2026 · 8 top CS/AI conferences

Achievements


🗺️ Taxonomy Completionist (28) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (6) 🐣 Hot Topic Early Bird

🏃 Academic Marathon (17) 🏠 Conference Loyalist (20) 🌟 Keyword Trendsetter Combo (3) 🏆 Keyword Champion (2) 👑 Triple Crown 🔬 Deep Specialist (11) 🤝 Dynamic Duo (15) 💎 Century Club (84) 🗃️ Keyword Collector (125) 📈 Trend Setter 🚀 Conference Pioneer 🔥 Unstoppable (11) ⚡ Prolific Year (10) ❓ The Questioner (4)

Conferences

COLT (20) ICML (20) NIPS (17) AISTATS (11) ICLR (7) ALT (5) JMLR (5) IJCAI (1)

Top co-authors

Daniel Soudry (16) Blake Woodworth (9) Ohad Shamir (9) Mor Shpigel Nacson (8) Gal Vardi (8) Suriya Gunasekar (7) Omar Montasser (5) Itay Evron (5) Karthik Sridharan (5) Zhiyuan Li (4)

Research topics

Statistics (1)

Keywords

convex optimization (13) gradient descent (11) neural network (8) sample complexity (8) stochastic gradient descent (8) implicit bias (6) stochastic optimization (6) representation learning (5) learning theory (5) distributed optimization (4) benign overfitting (4) separable data (4) empirical risk minimization (4) communication efficiency (4) linear classifier (4) adversarial robustness (3) distributed learning (3) pac learning (3) stochastic convex optimization (3) federated learning (3)

Papers

From Continual Learning to SGD and Back: Better Rates for Continual Linear Models ALT 2026
On the Hardness of Learning Regular Expressions ALT 2026
PENCIL: Long Thoughts with Short Memory ICML 2025
A Theory of Learning with Autoregressive Chain of Thought COLT 2025
Quantifying Overfitting along the Regularization Path for Two-Part-Code MDL in Supervised Classification COLT 2025
Weak-to-Strong Generalization Even in Random Feature Networks, Provably ICML 2025
Noisy Interpolation Learning with Shallow Univariate ReLU Networks ICLR 2024
Overfitting Behaviour of Gaussian Kernel Ridgeless Regression: Varying Bandwidth or Dimensionality NIPS 2024
Provable Tempered Overfitting of Minimal Nets and Typical Nets NIPS 2024
On the Complexity of Learning Sparse Functions with Statistical and Gradient Queries NIPS 2024
The Price of Implicit Bias in Adversarially Robust Generalization NIPS 2024
The Limits and Potentials of Local SGD for Distributed Heterogeneous Learning with Intermittent Communication COLT 2024
An Agnostic View on the Cost of Overfitting in (Kernel) Ridge Regression ICLR 2024
Depth Separation in Norm-Bounded Infinite-Width Neural Networks COLT 2024
Metalearning with Very Few Samples Per Task COLT 2024
How Uniform Random Weights Induce Non-uniform Bias: Typical Interpolating Neural Networks Generalize with Narrow Teachers ICML 2024
Shortest Program Interpolation Learning COLT 2023
Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data ICLR 2023
Continual Learning in Linear Classification on Separable Data ICML 2023
Federated Online and Bandit Convex Optimization ICML 2023
Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization COLT 2023
How catastrophic can catastrophic forgetting be in linear regression? COLT 2022
Transductive Robust Learning Guarantees AISTATS 2022
Implicit Bias of the Step Size in Linear Diagonal Neural Networks ICML 2022
The Min-Max Complexity of Distributed Stochastic Convex Optimization with Intermittent Communication (Extended Abstract) IJCAI 2022
Dropout: Explicit Forms and Capacity Control ICML 2021
On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent ICML 2021
Representation Costs of Linear Neural Networks: Analysis and Design NIPS 2021
Mirrorless Mirror Descent: A Natural Derivation of Mirror Descent AISTATS 2021
Does Invariant Risk Minimization Capture Invariance? AISTATS 2021
A Stochastic Newton Algorithm for Distributed Convex Optimization NIPS 2021
On the Power of Differentiable Learning versus PAC and SQ Learning NIPS 2021
Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds and Benign Overfitting NIPS 2021
An Even More Optimal Stochastic Optimization Algorithm: Minibatching and Interpolation Learning NIPS 2021
Adversarially Robust Learning with Unknown Perturbation Sets COLT 2021
The Min-Max Complexity of Distributed Stochastic Convex Optimization with Intermittent Communication COLT 2021
Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels ICML 2021
Fast margin maximization via dual acceleration ICML 2021
A Tight Convergence Analysis for Stochastic Gradient Descent with Delayed Updates ALT 2020
Is Local SGD Better than Minibatch SGD? ICML 2020
Fair Learning with Private Demographic Data ICML 2020
Efficiently Learning Adversarially Robust Halfspaces with Noise ICML 2020
A Function Space View of Bounded Norm Infinite Width ReLU Nets: The Multivariate Case ICLR 2020
Approximate is Good Enough: Probabilistic Variants of Dimensional and Margin Complexity COLT 2020
Kernel and Rich Regimes in Overparametrized Models COLT 2020
Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis AISTATS 2020
The role of over-parametrization in generalization of neural networks ICLR 2019
Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate AISTATS 2019
Convergence of Gradient Descent on Separable Data AISTATS 2019
Stochastic Nonconvex Optimization with Large Minibatches ALT 2019
The Complexity of Making the Gradient Small in Stochastic Convex Optimization COLT 2019
VC Classes are Adversarially Robustly Learnable, but Only Improperly COLT 2019
How do infinite width bounded norm networks look in function space? COLT 2019
Open Problem: The Oracle Complexity of Convex Optimization with Limited Memory COLT 2019
Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints ICML 2019
Semi-Cyclic Stochastic Gradient Descent ICML 2019
Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models ICML 2019
Stochastic Canonical Correlation Analysis JMLR 2019
Efficient coordinate-wise leading eigenvector computation ALT 2018
Characterizing Implicit Bias in Terms of Optimization Geometry ICML 2018
The Implicit Bias of Gradient Descent on Separable Data JMLR 2018
The Implicit Bias of Gradient Descent on Separable Data ICLR 2018
A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks ICLR 2018
Learning Non-Discriminatory Predictors COLT 2017
Efficient Distributed Learning with Sparsity ICML 2017
Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis ICML 2017
Memory and Communication Efficient Distributed Stochastic Optimization with Minibatch Prox COLT 2017
Fast and Scalable Structural SVM with Slack Rescaling AISTATS 2016
On Symmetric and Asymmetric LSHs for Inner Product Search ICML 2015
Norm-Based Capacity Control in Neural Networks COLT 2015
Efficient Training of Structured SVMs via Soft Constraints AISTATS 2015
Learning Sparse Low-Threshold Linear Classifiers JMLR 2015
Distribution-Dependent Sample Complexity of Large Margin Learning JMLR 2013
Approximate Inference by Intersecting Semidefinite Bound and Local Polytope AISTATS 2012
Matrix reconstruction with the local max norm NIPS 2012
Sparse Prediction with the $k$-Support Norm NIPS 2012
Error Analysis of Laplacian Eigenmaps for Semi-supervised Learning AISTATS 2011
Concentration-Based Guarantees for Low-Rank Matrix Reconstruction COLT 2011
Smoothness, Low Noise and Fast Rates NIPS 2010
Learnability, Stability and Uniform Convergence JMLR 2010
Practical Large-Scale Optimization for Max-norm Regularization NIPS 2010
Collaborative Filtering in a Non-Uniform World: Learning with the Weighted Trace Norm NIPS 2010
Tight Sample Complexity of Large-Margin Learning NIPS 2010
Reducing Label Complexity by Learning From Bags AISTATS 2010
Statistical Analysis of Semi-Supervised Learning: The Limit of Infinite Unlabelled Data NIPS 2009
Fast Rates for Regularized Objectives NIPS 2008