Research Explorer
Reinforcement Learning
2932 directly classified papers
Papers per year
2003: 1
2006: 11
2007: 18
2008: 23
2009: 14
2010: 22
2011: 24
2012: 34
2013: 26
2014: 24
2015: 14
2016: 23
2017: 79
2018: 182
2019: 255
2020: 284
2021: 333
2022: 319
2023: 315
2024: 457
2025: 419
2026: 55
Papers
Flipping-based Policy for Chance-Constrained Markov Decision Processes
NIPS 2024
Imitate the Good and Avoid the Bad: An Incremental Approach to Safe Reinforcement Learning
AAAI 2024
Reinforcement Learning with Lookahead Information
NIPS 2024
Provable Partially Observable Reinforcement Learning with Privileged Information
NIPS 2024
Get a Head Start: On-Demand Pedagogical Policy Selection in Intelligent Tutoring
AAAI 2024
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search
NIPS 2024
Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity
NIPS 2024
Periodic agent-state based Q-learning for POMDPs
NIPS 2024
Beyond Expected Return: Accounting for Policy Reproducibility When Evaluating Reinforcement Learning Algorithms
AAAI 2024
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs
NIPS 2024
SUF: Stabilized Unconstrained Fine-Tuning for Offline-to-Online Reinforcement Learning
AAAI 2024
SurgicAI: A Hierarchical Platform for Fine-Grained Surgical Policy Learning and Benchmarking
NIPS 2024
Beyond Optimism: Exploration With Partially Observable Rewards
NIPS 2024
A PAC Learning Algorithm for LTL and Omega-Regular Objectives in MDPs
AAAI 2024
Learning to Cooperate with Humans using Generative Agents
NIPS 2024
BadRL: Sparse Targeted Backdoor Attack against Reinforcement Learning
AAAI 2024
RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning
NIPS 2024
Model-based Diffusion for Trajectory Optimization
NIPS 2024
Exact Policy Recovery in Offline RL with Both Heavy-Tailed Rewards and Data Corruption
AAAI 2024
Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning
NIPS 2024
On the Model-Misspecification in Reinforcement Learning
AISTATS 2024
Offline Model-Based Optimization via Policy-Guided Gradient Search
AAAI 2024
Geometric-Averaged Preference Optimization for Soft Preference Labels
NIPS 2024
Rethinking Exploration in Reinforcement Learning with Effective Metric-Based Exploration Bonus
NIPS 2024
Regret Analysis of Policy Gradient Algorithm for Infinite Horizon Average Reward Markov Decision Processes
AAAI 2024