Research Explorer
Reinforcement Learning
2932 directly classified papers
Papers per year
2003: 1
2006: 11
2007: 18
2008: 23
2009: 14
2010: 22
2011: 24
2012: 34
2013: 26
2014: 24
2015: 14
2016: 23
2017: 79
2018: 182
2019: 255
2020: 284
2021: 333
2022: 319
2023: 315
2024: 457
2025: 419
2026: 55
Papers
Improved off-policy training of diffusion samplers
NIPS 2024
The surprising efficiency of temporal difference learning for rare event prediction
NIPS 2024
QGFN: Controllable Greediness with Action Values
NIPS 2024
Mitigating Reward Overoptimization via Lightweight Uncertainty Estimation
NIPS 2024
A Structure-Aware Framework for Learning Device Placements on Computation Graphs
NIPS 2024
Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
NIPS 2024
Leveraging Separated World Model for Exploration in Visually Distracted Environments
NIPS 2024
CE-NAS: An End-to-End Carbon-Efficient Neural Architecture Search Framework
NIPS 2024
RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation
NIPS 2024
Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear $q^\pi$-Realizability and Concentrability
NIPS 2024
One-Shot Safety Alignment for Large Language Models via Optimal Dualization
NIPS 2024
Sub-optimal Experts mitigate Ambiguity in Inverse Reinforcement Learning
NIPS 2024
ORPO: Monolithic Preference Optimization without Reference Model
EMNLP 2024
Prior-dependent analysis of posterior sampling reinforcement learning with function approximation
AISTATS 2024
BPO: Staying Close to the Behavior LLM Creates Better Online LLM Alignment
EMNLP 2024
CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making
NIPS 2024
Enhancing Reinforcement Learning with Dense Rewards from Language Model Critic
EMNLP 2024
Adaptive $Q$-Aid for Conditional Supervised Learning in Offline Reinforcement Learning
NIPS 2024
Towards Achieving Sub-linear Regret and Hard Constraint Violation in Model-free RL
AISTATS 2024
Causal Imitation for Markov Decision Processes: a Partial Identification Approach
NIPS 2024
LIONs: An Empirically Optimized Approach to Align Language Models
EMNLP 2024
MetaReflection: Learning Instructions for Language Agents using Past Reflections
EMNLP 2024
Efficient Contextual LLM Cascades through Budget-Constrained Policy Learning
NIPS 2024
When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback
NIPS 2024
Adaptive Important Region Selection with Reinforced Hierarchical Search for Dense Object Detection
NIPS 2024