Theory

1072 directly classified papers

Papers

Whitening Convergence Rate of Coupling-based Normalizing Flows NIPS 2022

Principal Components Bias in Over-parameterized Linear Models, and its Manifestation in Deep Neural Networks JMLR 2022

Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training JMLR 2022

Softmax Bottleneck Makes Language Models Unable to Represent Multi-mode Word Distributions ACL 2022

ODE Transformer: An Ordinary Differential Equation-Inspired Model for Sequence Generation ACL 2022

Emergent Structures and Training Dynamics in Large Language Models ACL 2022

On Isotropy Calibration of Transformer Models ACL 2022

Transformers from an Optimization Perspective NIPS 2022

The price of ignorance: how much does it cost to forget noise structure in low-rank matrix estimation? NIPS 2022

Do Residual Neural Networks discretize Neural Ordinary Differential Equations? NIPS 2022

Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization) NIPS 2022

Implicit Regularization or Implicit Conditioning? Exact Risk Trajectories of SGD in High Dimensions NIPS 2022

On the Parameterization and Initialization of Diagonal State Space Models NIPS 2022

The Pitfalls of Regularization in Off-Policy TD Learning NIPS 2022

Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis NIPS 2022

Listen to Interpret: Post-hoc Interpretability for Audio Networks with NMF NIPS 2022

Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction NIPS 2022

Towards Understanding Grokking: An Effective Theory of Representation Learning NIPS 2022

Invariance-Aware Randomized Smoothing Certificates NIPS 2022

On the generalization of learning algorithms that do not converge NIPS 2022

A Quantitative Geometric Approach to Neural-Network Smoothness NIPS 2022

A Practical, Progressively-Expressive GNN NIPS 2022

Learning dynamics of deep linear networks with multiple pathways NIPS 2022

Real-Valued Backpropagation is Unsuitable for Complex-Valued Neural Networks NIPS 2022

Exponential Separations in Symmetric Neural Networks NIPS 2022