Research Explorer
Deep RL
3861 directly classified papers
Papers per year
2005: 1
2006: 9
2007: 14
2008: 15
2009: 9
2010: 21
2011: 27
2012: 32
2013: 21
2014: 17
2015: 10
2016: 33
2017: 102
2018: 222
2019: 399
2020: 450
2021: 533
2022: 478
2023: 532
2024: 513
2025: 326
2026: 97
Papers
Logic-Q: Improving Deep Reinforcement Learning-based Quantitative Trading via Program Sketch-based Tuning
AAAI 2025
Query-efficient Attack for Black-box Image Inpainting Forensics via Reinforcement Learning
AAAI 2025
CTD4 – a Deep Continuous Distributional Actor-Critic Agent with a Kalman Fusion of Multiple Critics
AAAI 2025
Noise-Resilient Symbolic Regression with Dynamic Gating Reinforcement Learning
AAAI 2025
Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning
AAAI 2025
Epistemic Bellman Operators
AAAI 2025
Agent-Aware Training for Agent-Agnostic Action Advising in Deep Reinforcement Learning
AAAI 2025
SMoSE: Sparse Mixture of Shallow Experts for Interpretable Reinforcement Learning in Continuous Control Tasks
AAAI 2025
ACECODER: Acing Coder RL via Automated Test-Case Synthesis
ACL 2025
Intelligent OPC Engineer Assistant for Semiconductor Manufacturing
AAAI 2025
SrSv: Integrating Sequential Rollouts with Sequential Value Estimation for Multi-agent Reinforcement Learning
AAAI 2025
Curiosity-Driven Reinforcement Learning from Human Feedback
ACL 2025
ReflectDiffu: Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework
ACL 2025
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization
ACL 2025
NavQ: Learning a Q-Model for Foresighted Vision-and-Language Navigation
ICCV 2025
RAVEN: Robust Advertisement Video Violation Temporal Grounding via Reinforcement Reasoning
ACL 2025
LLM-Enhanced Self-Evolving Reinforcement Learning for Multi-Step E-Commerce Payment Fraud Risk Detection
ACL 2025
Learning Joint Behaviors with Large Variations
AAAI 2025
FLAG-TRADER: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading
ACL 2025
bea-jh at BEA 2025 Shared Task: Evaluating AI-powered Tutors through Pedagogically-Informed Reasoning
ACL 2025
DecEx-RAG: Boosting Agentic Retrieval-Augmented Generation with Decision and Execution Optimization via Process Supervision
EMNLP 2025
Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM Reasoning
EMNLP 2025
IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation
ICCV 2025
Training Language Models to Critique With Multi-agent Feedback
EMNLP 2025
Hierarchical Decision Making Based on Structural Information Principles
JMLR 2025