Theory

4950 directly classified papers

Papers

Benchmarking Language Model Creativity: A Case Study on Code Generation NAACL 2025

Self-Improvement Towards Pareto Optimality: Mitigating Preference Conflicts in Multi-Objective Alignment ACL 2025

Examining False Positives under Inference Scaling for Mathematical Reasoning EMNLP 2025

On the Distortion of Committee Election with 1-Euclidean Preferences and Few Distance Queries AAAI 2025

Hedging and Approximate Truthfulness in Traditional Forecasting Competitions AAAI 2025

Understanding the Information Propagation Effects of Communication Topologies in LLM-based Multi-Agent Systems EMNLP 2025

On the Relation Between Fine-Tuning, Topological Properties, and Task Performance in Sense-Enhanced Embeddings ACL 2025

ReEvalMed: Rethinking Medical Report Evaluation by Aligning Metrics with Real-World Clinical Judgment EMNLP 2025

Formal Synthesis of Safe Kolmogorov-Arnold Network Controllers with Barrier Certificates IJCAI 2025

VCSearch: Bridging the Gap Between Well-Defined and Ill-Defined Problems in Mathematical Reasoning EMNLP 2025

Metric Calculating Benchmark: Code-Verifiable Complicate Instruction Following Benchmark for Large Language Models EMNLP 2025

Stress-Testing the Reasoning Competence of Language Models With Formal Proofs EMNLP 2025

Do Influence Functions Work on Large Language Models? EMNLP 2025

The Role of Deductive and Inductive Reasoning in Large Language Models ACL 2025

VerifiAgent: a Unified Verification Agent in Language Model Reasoning EMNLP 2025

URO-Bench: Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models EMNLP 2025

Scaling Laws Are Unreliable for Downstream Tasks: A Reality Check EMNLP 2025

Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation EMNLP 2025

ESGenius: Benchmarking LLMs on Environmental, Social, and Governance (ESG) and Sustainability Knowledge EMNLP 2025

Current Semantic-change Quantification Methods Struggle with Discovery in the Wild EMNLP 2025

Dynamic Pseudo Labeling via Gradient Cutting for High-Low Entropy Exploration CVPR 2025

Benchmark Profiling: Mechanistic Diagnosis of LLM Benchmarks EMNLP 2025

Calibration Across Layers: Understanding Calibration Evolution in LLMs EMNLP 2025

Reliable Evaluation and Benchmarks for Statement Autoformalization EMNLP 2025

O(d/T) Convergence Theory for Diffusion Probabilistic Models under Minimal Assumptions JMLR 2025