Research Explorer
Optimization
1638 directly classified papers
Papers per year
2006: 5
2007: 2
2008: 4
2009: 2
2010: 2
2011: 3
2012: 8
2013: 25
2014: 19
2015: 22
2016: 31
2017: 42
2018: 68
2019: 104
2020: 148
2021: 174
2022: 178
2023: 209
2024: 345
2025: 244
2026: 3
Papers
Global Convergence in Training Large-Scale Transformers
NIPS 2024
Pre-trained Large Language Models Use Fourier Features to Compute Addition
NIPS 2024
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
NIPS 2024
Robust and Faster Zeroth-Order Minimax Optimization: Complexity and Applications
NIPS 2024
Globally Q-linear Gauss-Newton Method for Overparameterized Non-convex Matrix Sensing
NIPS 2024
SOI: Scaling Down Computational Complexity by Estimating Partial States of the Model
NIPS 2024
Evaluating the design space of diffusion-based generative models
NIPS 2024
ESPACE: Dimensionality Reduction of Activations for Model Compression
NIPS 2024
ST$_k$: A Scalable Module for Solving Top-k Problems
NIPS 2024
Understanding Progressive Training Through the Framework of Randomized Coordinate Descent
AISTATS 2024
The Road Less Scheduled
NIPS 2024
Rethinking Fourier Transform from A Basis Functions Perspective for Long-term Time Series Forecasting
NIPS 2024
In-Context Learning State Vector with Inner and Momentum Optimization
NIPS 2024
Sketching for Distributed Deep Learning: A Sharper Analysis
NIPS 2024
Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation
OSDI 2024
GRAWA: Gradient-based Weighted Averaging for Distributed Training of Deep Learning Models
AISTATS 2024
Towards Scalable and Stable Parallelization of Nonlinear RNNs
NIPS 2024
Hardness of Learning Neural Networks under the Manifold Hypothesis
NIPS 2024
Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together
EMNLP 2024
Topological Generalization Bounds for Discrete-Time Stochastic Optimization Algorithms
NIPS 2024
Weight decay induces low-rank attention layers
NIPS 2024
MOSEL: Inference Serving Using Dynamic Modality Selection
EMNLP 2024
Memory-Efficient Gradient Unrolling for Large-Scale Bi-level Optimization
NIPS 2024
Explicit Eigenvalue Regularization Improves Sharpness-Aware Minimization
NIPS 2024
Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution
NIPS 2024