Neural Network Optimization

3648 directly classified papers

Papers

Characterizing Dynamical Stability of Stochastic Gradient Descent in Overparameterized Learning JMLR 2025

UCSC at SemEval-2025 Task 8: Question Answering over Tabular Data SEMEVAL 2025

Exploring the Limitations of Mamba in COPY and CoT Reasoning EMNLP 2025

MossNet: Mixture of State-Space Experts is a Multi-Head Attention IJCNLP 2025

Dynamic Rank Adjustment in Diffusion Policies for Efficient and Flexible Training RSS 2025

Curved Worlds, Clear Boundaries: Generalizing Speech Deepfake Detection using Hyperbolic and Spherical Geometry Spaces AACL 2025

Debiasing 6-DOF IMU via Hierarchical Learning of Continuous Bias Dynamics RSS 2025

One-Pass to Reason: Token Duplication and Block-Sparse Mask for Efficient Fine-Tuning on Multi-Turn Reasoning IJCNLP 2025

ECHO-LLaMA: Efficient Caching for High-Performance LLaMA Training EMNLP 2025

PoseMamba: Monocular 3D Human Pose Estimation with Bidirectional Global-Local Spatio-Temporal State Space Model AAAI 2025

MossNet: Mixture of State-Space Experts is a Multi-Head Attention AACL 2025

SYSTRAN @ IWSLT 2025 Low-resource track ACL 2025

Sensitivity-LoRA : Low-Load Sensitivity-Based Fine-Tuning for Large Language Models EMNLP 2025

Learning from Streaming Video with Orthogonal Gradients CVPR 2025

Robust and Adaptive AI Models for Medication Usage Forecasting Using ICD-9/10 Code (Student Abstract) AAAI 2025

Layer Duplication in LLMs EMNLP 2025

Understanding the Language Model to Solve the Symbolic Multi-Step Reasoning Problem from the Perspective of Buffer Mechanism EMNLP 2025

SSE-SAM: Balancing Head and Tail Classes Gradually Through Stage-Wise SAM AAAI 2025

FastVLM: Self-Speculative Decoding for Fast Vision-Language Model Inference AACL 2025

Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review ACL 2025

AdamS: Momentum Itself Can Be A Normalizer for LLM Pretraining and Post-training EMNLP 2025

Cold Starts and Hard Cases: A Two-Stage SFT-RLVR Approach for Legal Machine Translation (Just-NLP L-MT shared task) IJCNLP 2025

depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers JMLR 2025

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering NAACL 2025

Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA ACL 2025