Jeffrey Pennington

35 papers · 2011–2025 · 5 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (5) 🏃 Academic Marathon (14) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (4)

🌈 Renaissance Researcher (6) 🗺️ Taxonomy Completionist (37) 🤝 Dynamic Duo (14) 👑 Triple Crown 🔬 Deep Specialist (12) 🏆 Keyword Champion (2) 🚀 Conference Pioneer 🗃️ Keyword Collector (117) 📈 Trend Setter 💎 Century Club (35) 🔥 Unstoppable (9) ❓ The Questioner ⚡ Prolific Year (6)

Conferences

NIPS (12) ICML (10) ICLR (8) AISTATS (3) EMNLP (2)

Top co-authors

Lechao Xiao (14) Jascha Sohl-Dickstein (10) Ben Adlam (9) Roman Novak (8) Jaehoon Lee (7) Samuel Schoenholz (7) Yasaman Bahri (6) Elliot Paquette (2) Jiri Hron (2) Courtney Paquette (2)

Keywords

neural tangent kernel (5) random matrix theory (5) neural network (4) neural network architecture (3) dynamical isometry (3) kernel methods (3) double descent (3) free probability theory (2) random feature regression (2) neural network gaussian process (2) hessian eigenvalue (2) convolutional neural network (2) stochastic gradient descent (2) spectral distribution (2) activation function (2) high-dimensional analysis (2) mean field theory (2) kernel regression (2) neural network optimization (2) generalization error (2)

Papers

Scaling Collapse Reveals Universal Dynamics in Compute-Optimally Trained Neural Networks · ICML 2025
4+3 Phases of Compute-Optimal Neural Scaling Laws · NIPS 2024
Small-scale proxies for large-scale Transformer training instabilities · ICLR 2024
Scaling Exponents Across Parameterizations and Optimizers · ICML 2024
Second-order regression models exhibit progressive sharpening to the edge of stability · ICML 2023
Anisotropic Random Feature Regression in High Dimensions · ICLR 2022
Precise Learning Curves and Higher-Order Scalings for Dot-product Kernel Regression · NIPS 2022
Implicit Regularization or Implicit Conditioning? Exact Risk Trajectories of SGD in High Dimensions · NIPS 2022
A Random Matrix Perspective on Mixtures of Nonlinearities in High Dimensions · AISTATS 2022
Synergy and Symmetry in Deep Learning: Interactions between the Data, Model, and Inference Algorithm · ICML 2022
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling · ICML 2022
Overparameterization Improves Robustness to Covariate Shift in High Dimensions · NIPS 2021
Exploring the Uncertainty Properties of Neural Networks’ Implicit Priors in the Infinite-Width Limit · ICLR 2021
The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization · ICML 2020
Understanding Double Descent Requires A Fine-Grained Bias-Variance Decomposition · NIPS 2020
Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks · ICLR 2020
Finite Versus Infinite Neural Networks: an Empirical Study · NIPS 2020
The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks · NIPS 2020
Disentangling Trainability and Generalization in Deep Neural Networks · ICML 2020
Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent · NIPS 2019
A Mean Field Theory of Batch Normalization · ICLR 2019
Bayesian Deep Convolutional Networks with Many Channels are Gaussian Processes · ICLR 2019
KAMA-NNs: Low-dimensional Rotation Based Neural Networks · AISTATS 2019
Deep Neural Networks as Gaussian Processes · ICLR 2018
The Spectrum of the Fisher Information Matrix of a Single-Hidden-Layer Neural Network · NIPS 2018
The emergence of spectral universality in deep networks · AISTATS 2018
Sensitivity and Generalization in Neural Networks: an Empirical Study · ICLR 2018
Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks · ICML 2018
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks · ICML 2018
Geometry of Neural Network Loss Surfaces via Random Matrix Theory · ICML 2017
Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice · NIPS 2017
Nonlinear random matrix theory for deep learning · NIPS 2017
Spherical Random Features for Polynomial Kernels · NIPS 2015
GloVe: Global Vectors for Word Representation · EMNLP 2014
Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions · EMNLP 2011