Research Explorer
Neural Network Optimization
3648 directly classified papers
Papers per year
2001: 1
2003: 1
2005: 2
2006: 3
2007: 6
2008: 1
2009: 7
2010: 5
2011: 7
2012: 9
2013: 17
2014: 18
2015: 40
2016: 76
2017: 113
2018: 214
2019: 324
2020: 414
2021: 489
2022: 445
2023: 524
2024: 469
2025: 386
2026: 77
Papers
Characterizing Dynamical Stability of Stochastic Gradient Descent in Overparameterized Learning
JMLR 2025
An Enhanced Levenberg–Marquardt Method via Gram Reduction
AAAI 2025
LoSiA: Efficient High-Rank Fine-Tuning via Subnet Localization and Optimization
EMNLP 2025
Adapting IndicTrans2 for Legal Domain MT via QLoRA Fine-Tuning at JUST-NLP 2025
IJCNLP 2025
Efficient Diffusion as Low Light Enhancer
CVPR 2025
When Long Helps Short: How Context Length in Supervised Fine-tuning Affects Behavior of Large Language Models
EMNLP 2025
On Local Overfitting and Forgetting in Deep Neural Networks
AAAI 2025
Mitigating Forgetting in Continual Learning with Selective Gradient Projection
IJCNLP 2025
depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers
JMLR 2025
On the Effects of Fine-tuning Language Models for Text-Based Reinforcement Learning
COLING 2025
PCAN: A Pandemic-Compatible Attentive Neural Network for Retail Sales Forecasting
IJCAI 2025
Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models?
ACL 2025
Investigating the Role of Weight Decay in Enhancing Nonconvex SGD
CVPR 2025
Extreme Fine-tuning: A Novel and Fast Fine-tuning Approach for Text Classification
EACL 2024
Deep Learning for Computing Convergence Rates of Markov Chains
NIPS 2024
Dynamic Masking Rate Schedules for MLM Pretraining
EACL 2024
Understanding and Minimising Outlier Features in Transformer Training
NIPS 2024
Deep Equilibrium Algorithmic Reasoning
NIPS 2024
Personalized Abstractive Summarization by Tri-agent Generation Pipeline
EACL 2024
Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models
IJCAI 2024
The Implicit Bias of Gradient Descent on Separable Multiclass Data
NIPS 2024
Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers
NIPS 2024
Separation and Bias of Deep Equilibrium Models on Expressivity and Learning Dynamics
NIPS 2024
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
CVPR 2024
Should I try multiple optimizers when fine-tuning a pre-trained Transformer for NLP tasks? Should I tune their hyperparameters?
EACL 2024