Research Explorer
Reinforcement Learning
1263 directly classified papers
Papers per year
2006: 1
2007: 2
2008: 3
2009: 2
2010: 1
2011: 2
2012: 3
2013: 2
2014: 3
2015: 2
2016: 8
2017: 44
2018: 95
2019: 134
2020: 123
2021: 131
2022: 143
2023: 127
2024: 194
2025: 240
2026: 3
Papers
Filtered Direct Preference Optimization
EMNLP 2024
Learning Efficient and Robust Multi-Agent Communication via Graph Information Bottleneck
AAAI 2024
MACAROON: Training Vision-Language Models To Be Your Engaged Partners
EMNLP 2024
Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference Data
EMNLP 2024
Enhancing Alignment using Curriculum Learning & Ranked Preferences
EMNLP 2024
Navigating Noisy Feedback: Enhancing Reinforcement Learning with Error-Prone Language Models
EMNLP 2024
Training Diffusion Models Towards Diverse Image Generation with Reinforcement Learning
CVPR 2024
No Prior Mask: Eliminate Redundant Action for Deep Reinforcement Learning
AAAI 2024
CrystalBox: Future-Based Explanations for Input-Driven Deep RL Systems
AAAI 2024
Mean-Field Approximation of Cooperative Constrained Multi-Agent Reinforcement Learning (CMARL)
JMLR 2024
Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation
CVPR 2024
InstructVideo: Instructing Video Diffusion Models with Human Feedback
CVPR 2024
Graph-Based Prediction and Planning Policy Network (GP3Net) for Scalable Self-Driving in Dynamic Environments Using Deep Reinforcement Learning
AAAI 2024
DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization
AAAI 2024
ChatScene: Knowledge-Enabled Safety-Critical Scenario Generation for Autonomous Vehicles
CVPR 2024
Focus-Then-Decide: Segmentation-Assisted Reinforcement Learning
AAAI 2024
Offline Model-Based Optimization via Policy-Guided Gradient Search
AAAI 2024
Rethinking Discount Regularization: New Interpretations, Unintended Consequences, and Solutions for Regularization in Reinforcement Learning
JMLR 2024
PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models
CVPR 2024
Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds
JMLR 2024
Diffusion Model Alignment Using Direct Preference Optimization
CVPR 2024
Action Gaps and Advantages in Continuous-Time Distributional Reinforcement Learning
NeurIPS 2024
When to Trust LLMs: Aligning Confidence with Response Quality
ACL 2024
AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning
CVPR 2024
LM2: A Simple Society of Language Models Solves Complex Reasoning
EMNLP 2024