Research Explorer
Neural Network Optimization
3648 directly classified papers
Papers per year
2001: 1
2003: 1
2005: 2
2006: 3
2007: 6
2008: 1
2009: 7
2010: 5
2011: 7
2012: 9
2013: 17
2014: 18
2015: 40
2016: 76
2017: 113
2018: 214
2019: 324
2020: 414
2021: 489
2022: 445
2023: 524
2024: 469
2025: 386
2026: 77
Papers
Super-efficiency of automatic differentiation for functions defined as a minimum
ICML 2020
Dynamical Systems as Temporal Feature Spaces
JMLR 2020
AdaGrad stepsizes: Sharp convergence over nonconvex landscapes
JMLR 2020
Asymptotic Analysis via Stochastic Differential Equations of Gradient Descent Algorithms in Statistical and Computational Paradigms
JMLR 2020
Estimate Sequences for Stochastic Composite Optimization: Variance Reduction, Acceleration, and Robustness to Noise
JMLR 2020
Chaining Meets Chain Rule: Multilevel Entropic Regularization and Training of Neural Networks
JMLR 2020
Convergence Rates for the Stochastic Gradient Descent Method for Non-Convex Objective Functions
JMLR 2020
A Unified Framework of Online Learning Algorithms for Training Recurrent Neural Networks
JMLR 2020
Directional convergence and alignment in deep learning
NIPS 2020
Noise Isn’t Always Negative: Countering Exposure Bias in Sequence-to-Sequence Inflection Models
COLING 2020
NITK NLP at FinCausal-2020 Task 1 Using BERT and Linear models.
COLING 2020
syrapropa at SemEval-2020 Task 11: BERT-based Models Design for Propagandistic Technique and Span Detection
COLING 2020
Enhancing Urban Flow Maps via Neural ODEs
IJCAI 2020
Optimization Learning: Perspective, Method, and Applications
IJCAI 2020
Rethinking Skip Connection with Layer Normalization
COLING 2020
Context-Aware Cross-Attention for Non-Autoregressive Translation
COLING 2020
Accelerating Stratified Sampling SGD by Reconstructing Strata
IJCAI 2020
Marthe: Scheduling the Learning Rate Via Online Hypergradients
IJCAI 2020
Proximal Gradient Algorithm with Momentum and Flexible Parameter Restart for Nonconvex Optimization
IJCAI 2020
Porous Lattice Transformer Encoder for Chinese NER
COLING 2020
The Transference Architecture for Automatic Post-Editing
COLING 2020
Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
IJCAI 2020
Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics
NIPS 2020
Towards Theoretically Understanding Why SGD Generalizes Better Than Adam in Deep Learning
NIPS 2020
Beyond Lazy Training for Over-parameterized Tensor Decomposition
NIPS 2020