Research Explorer
Reinforcement Learning
2932 directly classified papers
Papers per year
2003: 1
2006: 11
2007: 18
2008: 23
2009: 14
2010: 22
2011: 24
2012: 34
2013: 26
2014: 24
2015: 14
2016: 23
2017: 79
2018: 182
2019: 255
2020: 284
2021: 333
2022: 319
2023: 315
2024: 457
2025: 419
2026: 55
Papers
Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model
NIPS 2024
How does Inverse RL Scale to Large State Spaces? A Provably Efficient Approach
NIPS 2024
Recursive Introspection: Teaching Language Model Agents How to Self-Improve
NIPS 2024
When to Act and When to Ask: Policy Learning With Deferral Under Hidden Confounding
NIPS 2024
Iteratively Refined Behavior Regularization for Offline Reinforcement Learning
NIPS 2024
GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning
NIPS 2024
Geometric-Averaged Preference Optimization for Soft Preference Labels
NIPS 2024
Rethinking Exploration in Reinforcement Learning with Effective Metric-Based Exploration Bonus
NIPS 2024
Model-based Diffusion for Trajectory Optimization
NIPS 2024
Learning to Cooperate with Humans using Generative Agents
NIPS 2024
RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning
NIPS 2024
Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning
NIPS 2024
Worst-Case Offline Reinforcement Learning with Arbitrary Data Support
NIPS 2024
Periodic agent-state based Q-learning for POMDPs
NIPS 2024
Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs
NIPS 2024
SurgicAI: A Hierarchical Platform for Fine-Grained Surgical Policy Learning and Benchmarking
NIPS 2024
Provable Partially Observable Reinforcement Learning with Privileged Information
NIPS 2024
Flipping-based Policy for Chance-Constrained Markov Decision Processes
NIPS 2024
Reinforcement Learning with Lookahead Information
NIPS 2024
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search
NIPS 2024
Beyond Optimism: Exploration With Partially Observable Rewards
NIPS 2024
Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity
NIPS 2024
Learning Versatile Skills with Curriculum Masking
NIPS 2024
Adaptive Exploration for Data-Efficient General Value Function Evaluations
NIPS 2024
Physics-Informed Representation and Learning: Control and Risk Quantification
AAAI 2024