Deep RL

3861 directly classified papers

Papers

RL + Transformer = A General-Purpose Problem Solver ACL 2025

Stabilizing and Accelerating Autofocus with Expert Trajectory Regularized Deep Reinforcement Learning CVPR 2025

Vid2Sim: Realistic and Interactive Simulation from Video for Urban Navigation CVPR 2025

NavQ: Learning a Q-Model for Foresighted Vision-and-Language Navigation ICCV 2025

IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation ICCV 2025

Table-R1: Inference-Time Scaling for Table Reasoning Tasks EMNLP 2025

ReviewRL: Towards Automated Scientific Review with RL EMNLP 2025

Diffusion Guided Adaptive Augmentation for Generalization in Visual Reinforcement Learning ICCV 2025

Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents EMNLP 2025

Enhancing RLHF with Human Gaze Modeling EMNLP 2025

Steering LLM Reasoning Through Bias-Only Adaptation EMNLP 2025

FlightGPT: Towards Generalizable and Interpretable UAV Vision-and-Language Navigation with Vision-Language Models EMNLP 2025

Look Again, Think Slowly: Enhancing Visual Reflection in Vision-Language Models EMNLP 2025

GRPO-LEAD: A Difficulty-Aware Reinforcement Learning Approach for Concise Mathematical Reasoning in Language Models EMNLP 2025

ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning ICCV 2025

Identification of Multiple Logical Interpretations in Counter-Arguments EMNLP 2025

Let’s Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM’s Math Capability EMNLP 2025

NL2Lean: Translating Natural Language into Lean 4 through Multi-Aspect Reinforcement Learning EMNLP 2025

GAPO: Learning Preferential Prompt through Generative Adversarial Policy Optimization ACL 2025

Optimal Viewpoint Selection for Autonomous Photography Using Reinforcement Learning AAAI 2025

CodeTool: Enhancing Programmatic Tool Invocation of LLMs via Process Supervision ACL 2025

Active Geospatial Search for Efficient Tenant Eviction Outreach AAAI 2025

A Case for Validation Buffer in Pessimistic Actor-Critic IJCAI 2025

Towards Robust, Efficient, and Practical Decision-Making: From Reward-Maximizing Deep Reinforcement Learning to Reward-Matching GFlowNets AAAI 2025

Dialogue Systems for Emotional Support via Value Reinforcement ACL 2025