Theory

1072 directly classified papers

Papers

Synthetic data for model selection ICML 2023

Gradient Descent in Neural Networks as Sequential Learning in Reproducing Kernel Banach Space ICML 2023

Fundamental Limits of Two-layer Autoencoders, and Achieving Them with Gradient Methods ICML 2023

ModelDiff: A Framework for Comparing Learning Algorithms ICML 2023

Provably and Practically Efficient Neural Contextual Bandits ICML 2023

Neural networks trained with SGD learn distributions of increasing complexity ICML 2023

Feature learning in deep classifiers through Intermediate Neural Collapse ICML 2023

How much does Initialization Affect Generalization? ICML 2023

Spurious Valleys and Clustering Behavior of Neural Networks ICML 2023

Linear CNNs Discover the Statistical Structure of the Dataset Using Only the Most Dominant Frequencies ICML 2023

Certifying Ensembles: A General Certification Theory with S-Lipschitzness ICML 2023

Stochastic Gradient Descent-Induced Drift of Representation in a Two-Layer Neural Network ICML 2023

Diffusion Models are Minimax Optimal Distribution Estimators ICML 2023

Neural signature kernels as infinite-width-depth-limits of controlled ResNets ICML 2023

Optimal Sets and Solution Paths of ReLU Networks ICML 2023

On the Convergence of Gradient Flow on Multi-layer Linear Models ICML 2023

Neural Network Approximations of PDEs Beyond Linearity: A Representational Perspective ICML 2023

A Kernel-Based View of Language Model Fine-Tuning ICML 2023

Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models ICML 2023

Scalable Transformer for PDE Surrogate Modeling NIPS 2023

An Inductive Bias for Tabular Deep Learning NIPS 2023

Softmax Bottleneck Makes Language Models Unable to Represent Multi-mode Word Distributions ACL 2022

On the Effectiveness of Iterative Learning Control L4DC 2022

Size and depth of monotone neural networks: interpolation and approximation NIPS 2022

A PAC-Bayesian Generalization Bound for Equivariant Networks NIPS 2022