Neural Network Optimization

3648 directly classified papers

Papers

MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm ICCV 2025

Forecasting Continuous Non-Conservative Dynamical Systems in SO(3) ICCV 2025

Differential Mamba IJCNLP 2025

Boosting Adversarial Transferability via Residual Perturbation Attack ICCV 2025

Language Fusion for Parameter-Efficient Cross-lingual Transfer ACL 2025

DMPT: Decoupled Modality-Aware Prompt Tuning for Multi-Modal Object Re-Identification WACV 2025

Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning NAACL 2025

Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review ACL 2025

Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models ACL 2025

On Global and Local Convergence of Iterative Linear Quadratic Optimization Algorithms for Discrete Time Nonlinear Control JMLR 2025

Inner Thinking Transformer: Leveraging Dynamic Depth Scaling to Foster Adaptive Internal Thinking ACL 2025

Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models COLING 2025

Large Learning Rates Simultaneously Achieve Robustness to Spurious Correlations and Compressibility ICCV 2025

ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention AAAI 2025

Mitigating Selection Bias with Node Pruning and Auxiliary Options ACL 2025

BUINUS at IWSLT: Evaluating the Impact of Data Augmentation and QLoRA-based Fine-Tuning for Maltese to English Speech Translation ACL 2025

Mitigating Confounding in Speech-Based Dementia Detection through Weight Masking ACL 2025

Efficiently Escaping Saddle Points in Bilevel Optimization JMLR 2025

Forward Knows Efficient Backward Path: Saliency-Guided Memory-Efficient Fine-tuning of Large Language Models ACL 2025

SYSTRAN @ IWSLT 2025 Low-resource track ACL 2025

Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training ACL 2025

YuLan-Mini: Pushing the Limits of Open Data-efficient Language Model ACL 2025

A Priori Estimation of the Approximation, Optimization and Generalization Errors of Random Neural Networks for Solving Partial Differential Equations IJCAI 2025

Equivariant Manifold Neural ODEs and Differential Invariants JMLR 2025

ECHO-LLaMA: Efficient Caching for High-Performance LLaMA Training EMNLP 2025