Research Explorer
Neural Network Optimization
3648 directly classified papers
Papers per year
2001: 1
2003: 1
2005: 2
2006: 3
2007: 6
2008: 1
2009: 7
2010: 5
2011: 7
2012: 9
2013: 17
2014: 18
2015: 40
2016: 76
2017: 113
2018: 214
2019: 324
2020: 414
2021: 489
2022: 445
2023: 524
2024: 469
2025: 386
2026: 77
Papers
Edit Probability for Scene Text Recognition
CVPR 2018
Is it Time to Swish? Comparing Deep Learning Activation Functions Across NLP tasks
EMNLP 2018
Multi-Head Attention with Disagreement Regularization
EMNLP 2018
N-ary Relation Extraction using Graph-State LSTM
EMNLP 2018
Breaking the Activation Function Bottleneck through Adaptive Parameterization
NIPS 2018
Deep Convolutional Neural Networks with Merge-and-Run Mappings
IJCAI 2018
Building Sparse Deep Feedforward Networks using Tree Receptive Fields
IJCAI 2018
Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations
COLT 2018
Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent
COLT 2018
Decoupling Gradient-Like Learning Rules from Representations
ICML 2018
Accelerating Natural Gradient with Higher-Order Invariance
ICML 2018
Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam
ICML 2018
Understanding and Simplifying One-Shot Architecture Search
ICML 2018
Differentiable Dynamic Programming for Structured Prediction and Attention
ICML 2018
Towards Binary-Valued Gates for Robust LSTM Training
ICML 2018
Fast Gradient-Based Methods with Exponential Rate: A Hybrid Control Framework
ICML 2018
Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solution for Nonconvex Distributed Optimization Over Networks
ICML 2018
Characterizing Implicit Bias in Terms of Optimization Geometry
ICML 2018
Faster Derivative-Free Stochastic Algorithm for Shared Memory Machines
ICML 2018
Entropy-SGD optimizes the prior of a PAC-Bayes bound: Generalization properties of Entropy-SGD and data-dependent priors
ICML 2018
Randomized Block Cubic Newton Method
ICML 2018
On Acceleration with Noise-Corrupted Gradients
ICML 2018
Long-Term Human Motion Prediction by Modeling Motion Context and Enhancing Motion Dynamics
IJCAI 2018
SADAGRAD: Strongly Adaptive Stochastic Gradient Methods
ICML 2018
GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks
ICML 2018