Neural Network Optimization

3,648 papers

Papers

How Many Layers and Why? An Analysis of the Model Depth in Transformers IJCNLP 2021

Improved, Deterministic Smoothing for L_1 Certified Robustness ICML 2021

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers ICML 2021

When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute EMNLP 2021

Relative Flatness and Generalization NIPS 2021

CLUZH at SIGMORPHON 2021 Shared Task on Multilingual Grapheme-to-Phoneme Conversion: Variations on a Baseline ACL 2021

On the Adequacy of Untuned Warmup for Adaptive Optimization AAAI 2021

On Linear Stability of SGD and Input-Smoothness of Neural Networks NIPS 2021

Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality NAACL 2021

Effective Sparsification of Neural Networks With Global Sparsity Constraint CVPR 2021

Gradient Methods Never Overfit On Separable Data JMLR 2021

A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix AISTATS 2021

Optimization with Momentum: Dynamical, Control-Theoretic, and Symplectic Perspectives JMLR 2021

Prefix-Tuning: Optimizing Continuous Prompts for Generation ACL 2021

GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training NIPS 2021

SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation AISTATS 2021

Prioritized Architecture Sampling With Monto-Carlo Tree Search CVPR 2021

Evaluating the Extrapolation Capabilities of Neural Vocoders to Extreme Pitch Values INTERSPEECH 2021

On the Periodic Behavior of Neural Network Training with Batch Normalization and Weight Decay NIPS 2021

Symplectic Adjoint Method for Exact Gradient of Neural ODE with Minimal Memory NIPS 2021

Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance NIPS 2021

Catformer: Designing Stable Transformers via Sensitivity Analysis ICML 2021

Latent Equilibrium: A unified learning theory for arbitrarily fast computation with arbitrarily slow neurons NIPS 2021

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning AAAI 2021

Benign Overfitting of Constant-Stepsize SGD for Linear Regression COLT 2021