Neural Network Optimization

3648 directly classified papers

Papers

Mitigating Attention Localization in Small Scale: Self-Attention Refinement via One-step Belief Propagation EMNLP 2025

Accelerate Parallelizable Reasoning via Parallel Decoding within One Sequence EMNLP 2025

DoDS-IITPKD: Submissions to the WMT25 Low-Resource Indic Language Translation Task EMNLP 2025

A Training-Free Length Extrapolation Approach for LLMs: Greedy Attention Logit Interpolation EMNLP 2025

Exploring and Controlling Diversity in LLM-Agent Conversation EMNLP 2025

Rethinking the Role of Text Complexity in Language Model Pretraining EMNLP 2025

UniMath-CoT: A Unified Framework for Multimodal Mathematical Reasoning with Re-Inference Affirmation EMNLP 2025

Disentangling Mathematical Reasoning in LLMs: A Methodological Investigation of Internal Mechanisms EMNLP 2025

Think Clearly: Improving Reasoning via Redundant Token Pruning EMNLP 2025

SwiftPrune: Hessian-Free Weight Pruning for Large Language Models EMNLP 2025

Language Models Can Easily Learn to Reason from Demonstrations EMNLP 2025

Prejudge-Before-Think: Enhancing Large Language Models at Test-Time by Process Prejudge Reasoning EMNLP 2025

Training compute-optimal transformer encoder models EMNLP 2025

Fin-ExBERT: User Intent based Text Extraction in Financial Context using Graph-Augmented BERT and trainable Plugin EMNLP 2025

TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use EMNLP 2025

How Does DPO Reduce Toxicity? A Mechanistic Neuron-Level Analysis EMNLP 2025

ThinkTuning: Instilling Cognitive Reflections without Distillation EMNLP 2025

E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning EMNLP 2025

AdamS: Momentum Itself Can Be A Normalizer for LLM Pretraining and Post-training EMNLP 2025

Towards Infinite-Long Prefix in Transformer EMNLP 2025

MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm ICCV 2025

iTool: Reinforced Fine-Tuning with Dynamic Deficiency Calibration for Advanced Tool Use EMNLP 2025

MEPT: Mixture of Expert Prompt Tuning as a Manifold Mapper EMNLP 2025

Breaking the Attention Trap in Code LLMs: A Rejection Sampling Approach to Enhance Code Execution Prediction EMNLP 2025

ModRWKV: Transformer Multimodality in Linear Time EMNLP 2025