conftrace
Neural Network Optimization
3,648 papers
Papers per year
2001: 1
2003: 1
2005: 2
2006: 3
2007: 6
2008: 1
2009: 7
2010: 5
2011: 7
2012: 9
2013: 17
2014: 18
2015: 40
2016: 76
2017: 113
2018: 214
2019: 324
2020: 414
2021: 489
2022: 445
2023: 524
2024: 469
2025: 386
2026: 77
Papers
BiSLS/SPS: Auto-tune Step Sizes for Stable Bi-level Optimization
NIPS 2023
Minimum norm interpolation by perceptra: Explicit regularization and implicit bias
NIPS 2023
Linear Convergence of Gradient Descent For Finite Width Over-parametrized Linear Networks With General Initialization
AISTATS 2023
Modeling Stroke Mask for End-to-End Text Erasing
WACV 2023
A Guide Through the Zoo of Biased SGD
NIPS 2023
Are Straight-Through Gradients and Soft-Thresholding All You Need for Sparse Training?
WACV 2023
Practical Sharpness-Aware Minimization Cannot Converge All the Way to Optima
NIPS 2023
Strong Lottery Ticket Hypothesis with $\varepsilon$–perturbation
AISTATS 2023
Low-Variance Gradient Estimation in Unrolled Computation Graphs with ES-Single
ICML 2023
FIANCEE: Faster Inference of Adversarial Networks via Conditional Early Exits
CVPR 2023
Kronecker-Factored Approximate Curvature for Modern Neural Network Architectures
NIPS 2023
A Spectral Viewpoint on Continual Relation Extraction
EMNLP 2023
SOL: Sampling-based Optimal Linear bounding of arbitrary scalar functions
NIPS 2023
The Power of Preconditioning in Overparameterized Low-Rank Matrix Sensing
ICML 2023
Train Faster, Perform Better: Modular Adaptive Training in Over-Parameterized Models
NIPS 2023
Searching Efficient Neural Architecture With Multi-Resolution Fusion Transformer for Appearance-Based Gaze Estimation
WACV 2023
The Geometry of Neural Nets' Parameter Spaces Under Reparametrization
NIPS 2023
Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond
ICML 2023
Dynamic Neural Network for Multi-Task Learning Searching Across Diverse Network Topologies
CVPR 2023
HOTNAS: Hierarchical Optimal Transport for Neural Architecture Search
CVPR 2023
Generalized Polyak Step Size for First Order Optimization with Momentum
ICML 2023
Neural networks trained with SGD learn distributions of increasing complexity
ICML 2023
Lon-eå at SemEval-2023 Task 11: A Comparison of Activation Functions for Soft and Hard Label Prediction
SEMEVAL 2023
Challenging the “One Single Vector per Token” Assumption
CONLL 2023
Mitigating Over-smoothing in Transformers via Regularized Nonlocal Functionals
NIPS 2023