Reinforcement Learning

2932 directly classified papers

Papers

ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models EMNLP 2025

Hierarchical Reward Modeling for Fault Localization in Large Code Repositories EMNLP 2025

INREACT: An Inspire-Then-Reinforce Training Framework For Multimodal GUI Agent EMNLP 2025

RAISE: Reinforced Adaptive Instruction Selection For Large Language Models EMNLP 2025

Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise Reward EMNLP 2025

Legal Mathematical Reasoning with LLMs: Procedural Alignment through Two-Stage Reinforcement Learning EMNLP 2025

Beyond the First Error: Process Reward Models for Reflective Mathematical Reasoning EMNLP 2025

The Hallucination Tax of Reinforcement Finetuning EMNLP 2025

DCRM: A Heuristic to Measure Response Pair Quality in Preference Optimization EMNLP 2025

Alpha-GPT: Human-AI Interactive Alpha Mining for Quantitative Investment EMNLP 2025

Auto-Weighted Group Relative Preference Optimization for Multi-Objective Text Generation Tasks EMNLP 2025

CTR-Guided Generative Query Suggestion in Conversational Search EMNLP 2025

Beyond Correctness: Confidence-Aware Reward Modeling for Enhancing Large Language Model Reasoning EMNLP 2025

ThinkTuning: Instilling Cognitive Reflections without Distillation EMNLP 2025

Step-level Verifier-guided Hybrid Test-Time Scaling for Large Language Models EMNLP 2025

Beyond Static Testbeds: An Interaction-Centric Agent Simulation Platform for Dynamic Recommender Systems EMNLP 2025

PLLuM-Align: Polish Preference Dataset for Large Language Model Alignment EMNLP 2025

RAG-Zeval: Enhancing RAG Responses Evaluator through End-to-End Reasoning and Ranking-Based Reinforcement Learning EMNLP 2025

SMART: Simulated Students Aligned with Item Response Theory for Question Difficulty Prediction EMNLP 2025

AgentPro: Enhancing LLM Agents with Automated Process Supervision EMNLP 2025

iTool: Reinforced Fine-Tuning with Dynamic Deficiency Calibration for Advanced Tool Use EMNLP 2025

Governance in Motion: Co-evolution of Constitutions and AI models for Scalable Safety EMNLP 2025

Learning Like Humans: Advancing LLM Reasoning Capabilities via Adaptive Difficulty Curriculum Learning and Expert-Guided Self-Reformulation EMNLP 2025

WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning EMNLP 2025

RAGferee: Building Contextual Reward Models for Retrieval-Augmented Generation EMNLP 2025