conftrace
Neural Network Optimization
3,648 papers
Papers per year
2001: 1
2003: 1
2005: 2
2006: 3
2007: 6
2008: 1
2009: 7
2010: 5
2011: 7
2012: 9
2013: 17
2014: 18
2015: 40
2016: 76
2017: 113
2018: 214
2019: 324
2020: 414
2021: 489
2022: 445
2023: 524
2024: 469
2025: 386
2026: 77
Papers
Sharpened Lazy Incremental Quasi-Newton Method
AISTATS 2024
Data Driven Threshold and Potential Initialization for Spiking Neural Networks
AISTATS 2024
Revisiting the Noise Model of Stochastic Gradient Descent
AISTATS 2024
Parameter-Agnostic Optimization under Relaxed Smoothness
AISTATS 2024
Provable Accelerated Convergence of Nesterov’s Momentum for Deep ReLU Neural Networks
ALT 2024
Alternating minimization for generalized rank one matrix sensing: Sharp predictions from a random initialization
ALT 2024
Improving Adaptive Online Learning Using Refined Discretization
ALT 2024
CEPT: A Contrast-Enhanced Prompt-Tuning Framework for Emotion Recognition in Conversation
COLING 2024
Context-Aware Non-Autoregressive Document-Level Translation with Sentence-Aligned Connectionist Temporal Classification
COLING 2024
Depth-Wise Attention (DWAtt): A Layer Fusion Method for Data-Efficient Classification
COLING 2024
DET: A Dual-Encoding Transformer for Relational Graph Embedding
COLING 2024
EFTNAS: Searching for Efficient Language Models in First-Order Weight-Reordered Super-Networks
COLING 2024
Enhancing Distantly Supervised Named Entity Recognition with Strong Label Guided Lottery Training
COLING 2024
Enhancing Parameter-efficient Fine-tuning with Simple Calibration Based on Stable Rank
COLING 2024
Information Extraction with Differentiable Beam Search on Graph RNNs
COLING 2024
Jump to Conclusions: Short-Cutting Transformers with Linear Transformations
COLING 2024
On the Relationship between Skill Neurons and Robustness in Prompt Tuning
COLING 2024
The Open-World Lottery Ticket Hypothesis for OOD Intent Classification
COLING 2024
Linguistic Fingerprint in Transformer Models: How Language Variation Influences Parameter Selection in Irony Detection
COLING 2024
Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials
COLT 2024
Exact Mean Square Linear Stability Analysis for SGD
COLT 2024
Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality (extended abstract)
COLT 2024
Open Problem: Black-Box Reductions and Adaptive Gradient Methods for Nonconvex Optimization
COLT 2024
Gradient-based Parameter Selection for Efficient Fine-Tuning
CVPR 2024
From Activation to Initialization: Scaling Insights for Optimizing Neural Fields
CVPR 2024