Research Explorer
Deep RL
3861 directly classified papers
Papers per year
2005: 1
2006: 9
2007: 14
2008: 15
2009: 9
2010: 21
2011: 27
2012: 32
2013: 21
2014: 17
2015: 10
2016: 33
2017: 102
2018: 222
2019: 399
2020: 450
2021: 533
2022: 478
2023: 532
2024: 513
2025: 326
2026: 97
Papers
RL + Transformer = A General-Purpose Problem Solver
ACL 2025
Stabilizing and Accelerating Autofocus with Expert Trajectory Regularized Deep Reinforcement Learning
CVPR 2025
Vid2Sim: Realistic and Interactive Simulation from Video for Urban Navigation
CVPR 2025
NavQ: Learning a Q-Model for Foresighted Vision-and-Language Navigation
ICCV 2025
IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation
ICCV 2025
Table-R1: Inference-Time Scaling for Table Reasoning Tasks
EMNLP 2025
ReviewRL: Towards Automated Scientific Review with RL
EMNLP 2025
Diffusion Guided Adaptive Augmentation for Generalization in Visual Reinforcement Learning
ICCV 2025
Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents
EMNLP 2025
Enhancing RLHF with Human Gaze Modeling
EMNLP 2025
Steering LLM Reasoning Through Bias-Only Adaptation
EMNLP 2025
FlightGPT: Towards Generalizable and Interpretable UAV Vision-and-Language Navigation with Vision-Language Models
EMNLP 2025
Look Again, Think Slowly: Enhancing Visual Reflection in Vision-Language Models
EMNLP 2025
GRPO-LEAD: A Difficulty-Aware Reinforcement Learning Approach for Concise Mathematical Reasoning in Language Models
EMNLP 2025
ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning
ICCV 2025
Identification of Multiple Logical Interpretations in Counter-Arguments
EMNLP 2025
Let’s Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM’s Math Capability
EMNLP 2025
NL2Lean: Translating Natural Language into Lean 4 through Multi-Aspect Reinforcement Learning
EMNLP 2025
GAPO: Learning Preferential Prompt through Generative Adversarial Policy Optimization
ACL 2025
Optimal Viewpoint Selection for Autonomous Photography Using Reinforcement Learning
AAAI 2025
CodeTool: Enhancing Programmatic Tool Invocation of LLMs via Process Supervision
ACL 2025
Active Geospatial Search for Efficient Tenant Eviction Outreach
AAAI 2025
A Case for Validation Buffer in Pessimistic Actor-Critic
IJCAI 2025
Towards Robust, Efficient, and Practical Decision-Making: From Reward-Maximizing Deep Reinforcement Learning to Reward-Matching GFlowNets
AAAI 2025
Dialogue Systems for Emotional Support via Value Reinforcement
ACL 2025