Neural Network Optimization

902 directly classified papers

Papers

On the Explicit Role of Initialization on the Convergence and Implicit Bias of Overparametrized Linear Networks ICML 2021

GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs OSDI 2021

Does Preprocessing Help Training Over-parameterized Neural Networks? NeurIPS 2021

HardCoRe-NAS: Hard Constrained diffeRentiable Neural Architecture Search ICML 2021

Enhancing Robustness of Neural Networks through Fourier Stabilization ICML 2021

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers ICML 2021

Nondeterminism and Instability in Neural Network Optimization ICML 2021

A Modular Analysis of Provable Acceleration via Polyak’s Momentum: Training a Wide ReLU Network and a Deep Linear Network ICML 2021

Learning Neural Network Subspaces ICML 2021

Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization ICML 2021

Tensor Programs IIb: Architectural Universality Of Neural Tangent Kernel Training Dynamics ICML 2021

Towards Trustworthy Predictions from Deep Neural Networks with Fast Adversarial Calibration AAAI 2021

AutoDropout: Learning Dropout Patterns to Regularize Deep Networks AAAI 2021

Elastic Consistency: A Practical Consistency Model for Distributed Stochastic Gradient Descent AAAI 2021

Adaptive Knowledge Driven Regularization for Deep Neural Networks AAAI 2021

TRQ: Ternary Neural Networks With Residual Quantization AAAI 2021

Learning Graph Neural Networks with Approximate Gradient Descent AAAI 2021

A Recipe for Global Convergence Guarantee in Deep Neural Networks AAAI 2021

Numerical influence of ReLU’(0) on backpropagation NeurIPS 2021

ARCH: Efficient Adversarial Regularized Training with Caching EMNLP 2021

Reconsidering the Past: Optimizing Hidden States in Language Models EMNLP 2021

Benign Overfitting of Constant-Stepsize SGD for Linear Regression COLT 2021

HR-NAS: Searching Efficient High-Resolution Neural Architectures With Lightweight Transformers CVPR 2021

Optimizing Millions of Hyperparameters by Implicit Differentiation AISTATS 2020

Understanding the Difficulty of Training Transformers EMNLP 2020