Research Explorer
Deep RL
3861 directly classified papers
Papers per year
2005: 1
2006: 9
2007: 14
2008: 15
2009: 9
2010: 21
2011: 27
2012: 32
2013: 21
2014: 17
2015: 10
2016: 33
2017: 102
2018: 222
2019: 399
2020: 450
2021: 533
2022: 478
2023: 532
2024: 513
2025: 326
2026: 97
Papers
Robust Adaptive Multi-Step Predictive Shielding (Student Abstract)
AAAI 2026
Memory-Augmented Representation for Efficient Event-based Visuomotor Policy Learning with Adaptive Perception and Control
WACV 2026
G-UBS: Towards Robust Understanding of Implicit Feedback via Group-Aware User Behavior Simulation
AAAI 2026
GRDC: A Unified Graph-Driven Framework for Role Discovery and Communication in Multi-Agent Reinforcement Learning
AAAI 2026
DiffOP: Reinforcement Learning of Optimization-Based Control Policies via Implicit Policy Gradients
AAAI 2026
A Differential Perspective on Distributional Reinforcement Learning
AAAI 2026
Improving Stochastic Action-Constrained Reinforcement Learning via Truncated Distributions
AAAI 2026
MrCoM: A Meta-Regularized World-Model Generalizing Across Multi-Scenarios
AAAI 2026
Variance Reduction via Resampling and Experience Replay
AAAI 2026
Gradient-Protected Value Decomposition for Cooperative Multi-Agent Reinforcement Learning
AAAI 2026
Beyond Monotonicity: Revisiting Factorization Principles in Multi-Agent Q-Learning
AAAI 2026
Learning Branching Policies for MILPs with Proximal Policy Optimization
AAAI 2026
T4NMTD: Transition-Centric Reinforcement Learning for Non-Markovian Task Decomposition
AAAI 2026
Object-Centric World Models for Causality-Aware Reinforcement Learning
AAAI 2026
Hybrid PPO–DQN for Multi-Objective Adaptive Cruise Control in Eco-Driving: Reward Shaping Toward Safety and Sustainability (Student Abstract)
AAAI 2026
ESCA: An Emotional Support Conversation Agent for Enhancing Reasonable Strategy Planning and Effective Expression
AAAI 2026
BLM-Guard: Explainable Multimodal Ad Moderation with Chain-of-Thought and Policy-Aligned Rewards
AAAI 2026
Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs
EACL 2026
ReACT: Reward-informed Autoregressive Decision CAD Transformer
AAAI 2026
Universal Compressed Image Restoration via Codec-Aware Conditioning with Reinforcement Learning
AAAI 2026
Where and What Matters: Sensitivity-Aware Task Vectors for Many-Shot Multimodal In-Context Learning
AAAI 2026
Arabic Dialect Translation with Small LLMs: Enhancing through Reasoning-Oriented Reinforcement Learning
EACL 2026
A Reinforcement Learning Framework for Cross-Lingual Stance Detection Using Chain-of-Thought Alignment
ACL 2025
Highly Imperceptible Black-Box Graph Injection Attacks with Reinforcement Learning
AAAI 2025
GeoExplorer: Active Geo-localization with Curiosity-Driven Exploration
ICCV 2025