Research Explorer
Neural Network Optimization
3648 directly classified papers
Papers per year
2001: 1
2003: 1
2005: 2
2006: 3
2007: 6
2008: 1
2009: 7
2010: 5
2011: 7
2012: 9
2013: 17
2014: 18
2015: 40
2016: 76
2017: 113
2018: 214
2019: 324
2020: 414
2021: 489
2022: 445
2023: 524
2024: 469
2025: 386
2026: 77
Papers
Cold Starts and Hard Cases: A Two-Stage SFT-RLVR Approach for Legal Machine Translation (Just-NLP L-MT shared task)
IJCNLP 2025
Early Alignment in Two-Layer Networks Training is a Two-Edged Sword
JMLR 2025
Efficiently Escaping Saddle Points in Bilevel Optimization
JMLR 2025
Revisiting Gradient Normalization and Clipping for Nonconvex SGD under Heavy-Tailed Noise: Necessity, Sufficiency, and Acceleration
JMLR 2025
Equivariant Manifold Neural ODEs and Differential Invariants
JMLR 2025
Neural Operators Can Play Dynamic Stackelberg Games
JMLR 2025
depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers
JMLR 2025
Delta-NAS: Difference of Architecture Encoding for Predictor-Based Evolutionary Neural Architecture Search
WACV 2025
YuLan-Mini: Pushing the Limits of Open Data-efficient Language Model
ACL 2025
ThoughtProbe: Classifier-Guided LLM Thought Space Exploration via Probing Representations
EMNLP 2025
MT2ST: Adaptive Multi-Task to Single-Task Learning
ACL 2025
VersaTune: An Efficient Data Composition Framework for Training Multi-Capability LLMs
EMNLP 2025
LoSiA: Efficient High-Rank Fine-Tuning via Subnet Localization and Optimization
EMNLP 2025
DMPT: Decoupled Modality-Aware Prompt Tuning for Multi-Modal Object Re-Identification
WACV 2025
Variance Sensitivity Induces Attention Entropy Collapse and Instability in Transformers
EMNLP 2025
A Training-Free Length Extrapolation Approach for LLMs: Greedy Attention Logit Interpolation
EMNLP 2025
Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence
EMNLP 2025
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
EMNLP 2025
CLaSp: In-Context Layer Skip for Self-Speculative Decoding
ACL 2025
CoMMIT: Coordinated Multimodal Instruction Tuning
EMNLP 2025
Exploring the Limitations of Mamba in COPY and CoT Reasoning
EMNLP 2025
E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning
EMNLP 2025
Answer Convergence as a Signal for Early Stopping in Reasoning
EMNLP 2025
s1: Simple test-time scaling
EMNLP 2025
SADDLe: Sharpness-Aware Decentralized Deep Learning with Heterogeneous Data
WACV 2025