Research Explorer
Reinforcement Learning
2932 directly classified papers
Papers per year
2003: 1
2006: 11
2007: 18
2008: 23
2009: 14
2010: 22
2011: 24
2012: 34
2013: 26
2014: 24
2015: 14
2016: 23
2017: 79
2018: 182
2019: 255
2020: 284
2021: 333
2022: 319
2023: 315
2024: 457
2025: 419
2026: 55
Papers
ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models
EMNLP 2025
Hierarchical Reward Modeling for Fault Localization in Large Code Repositories
EMNLP 2025
INREACT: An Inspire-Then-Reinforce Training Framework For Multimodal GUI Agent
EMNLP 2025
RAISE: Reinforced Adaptive Instruction Selection For Large Language Models
EMNLP 2025
Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise Reward
EMNLP 2025
Legal Mathematical Reasoning with LLMs: Procedural Alignment through Two-Stage Reinforcement Learning
EMNLP 2025
Beyond the First Error: Process Reward Models for Reflective Mathematical Reasoning
EMNLP 2025
The Hallucination Tax of Reinforcement Finetuning
EMNLP 2025
DCRM: A Heuristic to Measure Response Pair Quality in Preference Optimization
EMNLP 2025
Alpha-GPT: Human-AI Interactive Alpha Mining for Quantitative Investment
EMNLP 2025
Auto-Weighted Group Relative Preference Optimization for Multi-Objective Text Generation Tasks
EMNLP 2025
CTR-Guided Generative Query Suggestion in Conversational Search
EMNLP 2025
Beyond Correctness: Confidence-Aware Reward Modeling for Enhancing Large Language Model Reasoning
EMNLP 2025
ThinkTuning: Instilling Cognitive Reflections without Distillation
EMNLP 2025
Step-level Verifier-guided Hybrid Test-Time Scaling for Large Language Models
EMNLP 2025
Beyond Static Testbeds: An Interaction-Centric Agent Simulation Platform for Dynamic Recommender Systems
EMNLP 2025
PLLuM-Align: Polish Preference Dataset for Large Language Model Alignment
EMNLP 2025
RAG-Zeval: Enhancing RAG Responses Evaluator through End-to-End Reasoning and Ranking-Based Reinforcement Learning
EMNLP 2025
SMART: Simulated Students Aligned with Item Response Theory for Question Difficulty Prediction
EMNLP 2025
AgentPro: Enhancing LLM Agents with Automated Process Supervision
EMNLP 2025
iTool: Reinforced Fine-Tuning with Dynamic Deficiency Calibration for Advanced Tool Use
EMNLP 2025
Governance in Motion: Co-evolution of Constitutions and AI models for Scalable Safety
EMNLP 2025
Learning Like Humans: Advancing LLM Reasoning Capabilities via Adaptive Difficulty Curriculum Learning and Expert-Guided Self-Reformulation
EMNLP 2025
WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning
EMNLP 2025
RAGferee: Building Contextual Reward Models for Retrieval-Augmented Generation
EMNLP 2025