Research Explorer
Neural Network Optimization
3648 directly classified papers
Papers per year
2001: 1
2003: 1
2005: 2
2006: 3
2007: 6
2008: 1
2009: 7
2010: 5
2011: 7
2012: 9
2013: 17
2014: 18
2015: 40
2016: 76
2017: 113
2018: 214
2019: 324
2020: 414
2021: 489
2022: 445
2023: 524
2024: 469
2025: 386
2026: 77
Papers
Top-KAST: Top-K Always Sparse Training
NIPS 2020
The Generalization-Stability Tradeoff In Neural Network Pruning
NIPS 2020
How does Weight Correlation Affect Generalisation Ability of Deep Neural Networks?
NIPS 2020
On the distance between two neural networks and the stability of learning
NIPS 2020
ExpandNets: Linear Over-parameterization to Train Compact Convolutional Networks
NIPS 2020
Ultra-Low Precision 4-bit Training of Deep Neural Networks
NIPS 2020
Self-Distillation Amplifies Regularization in Hilbert Space
NIPS 2020
On the training dynamics of deep networks with $L_2$ regularization
NIPS 2020
Training Linear Finite-State Machines
NIPS 2020
Analytic Characterization of the Hessian in Shallow ReLU Models: A Tale of Symmetry
NIPS 2020
A mathematical model for automatic differentiation in machine learning
NIPS 2020
Estimating Training Data Influence by Tracing Gradient Descent
NIPS 2020
Learning to solve TV regularised problems with unrolled algorithms
NIPS 2020
Robust Recovery via Implicit Bias of Discrepant Learning Rates for Double Over-parameterization
NIPS 2020
Optimal Lottery Tickets via Subset Sum: Logarithmic Over-Parameterization is Sufficient
NIPS 2020
Curriculum Learning by Dynamic Instance Hardness
NIPS 2020
O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers
NIPS 2020
STEER : Simple Temporal Regularization For Neural ODE
NIPS 2020
Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping
NIPS 2020
Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems
NIPS 2020
Limits to Depth Efficiencies of Self-Attention
NIPS 2020
Fast Transformers with Clustered Attention
NIPS 2020
Improving First-Order Optimization Algorithms (Student Abstract)
AAAI 2020
Do Subsampled Newton Methods Work for High-Dimensional Data?
AAAI 2020
Synchronous Double-channel Recurrent Network for Aspect-Opinion Pair Extraction
ACL 2020