Research Explorer
Deep RL
3861 directly classified papers
Papers per year
2005: 1
2006: 9
2007: 14
2008: 15
2009: 9
2010: 21
2011: 27
2012: 32
2013: 21
2014: 17
2015: 10
2016: 33
2017: 102
2018: 222
2019: 399
2020: 450
2021: 533
2022: 478
2023: 532
2024: 513
2025: 326
2026: 97
Papers
Partial Identifiability in Inverse Reinforcement Learning for Agents with Non-Exponential Discounting
AAAI 2025
CodeTool: Enhancing Programmatic Tool Invocation of LLMs via Process Supervision
ACL 2025
The Indoor-Training Effect: Unexpected Gains from Distribution Shifts in the Transition Function
AAAI 2025
A Reinforcement Learning Framework for Cross-Lingual Stance Detection Using Chain-of-Thought Alignment
ACL 2025
Adversarial Preference Learning for Robust LLM Alignment
ACL 2025
Removing Prompt-template Bias in Reinforcement Learning from Human Feedback
ACL 2025
Should I Trust You? Detecting Deception in Negotiations using Counterfactual RL
ACL 2025
Text2World: Benchmarking Large Language Models for Symbolic World Model Generation
ACL 2025
Direct Repair Optimization: Training Small Language Models For Educational Program Repair Improves Feedback
ACL 2025
Team XSZ at BioLaySumm2025: Section-Wise Summarization, Retrieval-Augmented LLM, and Reinforcement Learning Fine-Tuning for Lay Summaries
ACL 2025
RL + Transformer = A General-Purpose Problem Solver
ACL 2025
Sparks of Tabular Reasoning via Text2SQL Reinforcement Learning
ACL 2025
Enhancing AMR Parsing with Group Relative Policy Optimization
ACL 2025
LLMSR@XLLM25: A Language Model-Based Pipeline for Structured Reasoning Data Construction
ACL 2025
Self-Rewarding Large Vision-Language Models for Optimizing Prompts in Text-to-Image Generation
ACL 2025
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse
ACL 2025
Efficient and Robust Reinforcement Learning from Human Feedback
AAAI 2025
Deep Implicit Imitation Reinforcement Learning in Heterogeneous Action Settings
AAAI 2025
EQA-RM: A Generative Embodied Reward Model with Test-time Scaling
EMNLP 2025
ActionStudio: A Lightweight Framework for Data and Training of Large Action Models
EMNLP 2025
Novelty-Guided Data Reuse for Efficient and Diversified Multi-Agent Reinforcement Learning
AAAI 2025
DUO: Diverse, Uncertain, On-Policy Query Generation and Selection for Reinforcement Learning from Human Feedback
AAAI 2025
Sketch-to-Skill: Bootstrapping Robot Learning with Human Drawn Trajectory Sketches
RSS 2025
Safety with Agency: Human-Centered Safety Filter with Application to AI-Assisted Motorsports
RSS 2025
IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation
ICCV 2025