Research Explorer
Neural Network Optimization
3648 directly classified papers
Papers per year
2001: 1
2003: 1
2005: 2
2006: 3
2007: 6
2008: 1
2009: 7
2010: 5
2011: 7
2012: 9
2013: 17
2014: 18
2015: 40
2016: 76
2017: 113
2018: 214
2019: 324
2020: 414
2021: 489
2022: 445
2023: 524
2024: 469
2025: 386
2026: 77
Papers
How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
JMLR 2024
QT-ViT: Improving Linear Attention in ViT with Quadratic Taylor Expansion
NIPS 2024
Deep linear networks for regression are implicitly regularized towards flat minima
NIPS 2024
The Closeness of In-Context Learning and Weight Shifting for Softmax Regression
NIPS 2024
Towards Geometric Normalization Techniques in SE(3) Equivariant Graph Neural Networks for Physical Dynamics Simulations
IJCAI 2024
Global Convergence in Training Large-Scale Transformers
NIPS 2024
Understanding the Generalization Benefits of Late Learning Rate Decay
AISTATS 2024
Neural Echos: Depthwise Convolutional Filters Replicate Biological Receptive Fields
WACV 2024
Physics-Informed Neural Networks: Minimizing Residual Loss with Wide Networks and Effective Activations
IJCAI 2024
Over-Reasoning and Redundant Calculation of Large Language Models
EACL 2024
Vision Mamba Mender
NIPS 2024
Provable Acceleration of Nesterov’s Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks
IJCAI 2024
Extreme Fine-tuning: A Novel and Fast Fine-tuning Approach for Text Classification
EACL 2024
Causality-enhanced Discreted Physics-informed Neural Networks for Predicting Evolutionary Equations
IJCAI 2024
The Implicit Bias of Adam on Separable Data
NIPS 2024
On the Sparsity of the Strong Lottery Ticket Hypothesis
NIPS 2024
Activating Self-Attention for Multi-Scene Absolute Pose Regression
NIPS 2024
Dynamic Masking Rate Schedules for MLM Pretraining
EACL 2024
Open Problem: Black-Box Reductions and Adaptive Gradient Methods for Nonconvex Optimization
COLT 2024
SynSP: Synergy of Smoothness and Precision in Pose Sequences Refinement
CVPR 2024
Local to Global: Learning Dynamics and Effect of Initialization for Transformers
NIPS 2024
Exact Mean Square Linear Stability Analysis for SGD
COLT 2024
InceptionNeXt: When Inception Meets ConvNeXt
CVPR 2024
Ordered Momentum for Asynchronous SGD
NIPS 2024
Deep model-free KKL observer: A switching approach
L4DC 2024