Research Explorer
Reinforcement Learning
2932 directly classified papers
Papers per year
2003: 1
2006: 11
2007: 18
2008: 23
2009: 14
2010: 22
2011: 24
2012: 34
2013: 26
2014: 24
2015: 14
2016: 23
2017: 79
2018: 182
2019: 255
2020: 284
2021: 333
2022: 319
2023: 315
2024: 457
2025: 419
2026: 55
Papers
Learning Safety Constraints from Demonstrations with Unknown Rewards
AISTATS 2024
Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning
CVPR 2024
DiffPhyCon: A Generative Approach to Control Complex Physical Systems
NIPS 2024
Graph Diffusion Policy Optimization
NIPS 2024
Implicit Curriculum in Procgen Made Explicit
NIPS 2024
RGMDT: Return-Gap-Minimizing Decision Tree Extraction in Non-Euclidean Metric Space
NIPS 2024
A Fairness-Driven Method for Learning Human-Compatible Negotiation Strategies
EMNLP 2024
Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning
NIPS 2024
Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation
NIPS 2024
Zero-Shot Reinforcement Learning from Low Quality Data
NIPS 2024
BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling
NIPS 2024
Robust Reinforcement Learning with General Utility
NIPS 2024
Statistical Efficiency of Distributional Temporal Difference Learning
NIPS 2024
Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting
NIPS 2024
Rewarding What Matters: Step-by-Step Reinforcement Learning for Task-Oriented Dialogue
EMNLP 2024
Autoregressive Multi-trait Essay Scoring via Reinforcement Learning with Scoring-aware Multiple Rewards
EMNLP 2024
Occupancy-based Policy Gradient: Estimation, Convergence, and Optimality
NIPS 2024
AlphaMath Almost Zero: Process Supervision without Process
NIPS 2024
ABLE: Personalized Disability Support with Politeness and Empathy Integration
EMNLP 2024
Reinforcement Learning with Adaptive Regularization for Safe Control of Critical Systems
NIPS 2024
Amortized Active Causal Induction with Deep Reinforcement Learning
NIPS 2024
Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training
NIPS 2024
Identifying Selections for Unsupervised Subtask Discovery
NIPS 2024
Time-Constrained Robust MDPs
NIPS 2024
Reward Modeling Requires Automatic Adjustment Based on Data Quality
EMNLP 2024