Research Explorer
Reinforcement Learning
1263 directly classified papers
Papers per year
2006: 1
2007: 2
2008: 3
2009: 2
2010: 1
2011: 2
2012: 3
2013: 2
2014: 3
2015: 2
2016: 8
2017: 44
2018: 95
2019: 134
2020: 123
2021: 131
2022: 143
2023: 127
2024: 194
2025: 240
2026: 3
Papers
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
ACL 2024
Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning Processes
ACL 2024
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents
ACL 2024
AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning
CVPR 2024
ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences
ACL 2024
Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation
CVPR 2024
Training Language Models to Generate Text with Citations via Fine-grained Rewards
ACL 2024
InstructVideo: Instructing Video Diffusion Models with Human Feedback
CVPR 2024
M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions
ACL 2024
Aligning Large Language Models via Fine-grained Supervision
ACL 2024
Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference Data
EMNLP 2024
MACAROON: Training Vision-Language Models To Be Your Engaged Partners
EMNLP 2024
Proofread: Fixes All Errors with One Tap
ACL 2024
Carbon Footprint Reduction for Sustainable Data Centers in Real-Time
AAAI 2024
Reward Certification for Policy Smoothed Reinforcement Learning
AAAI 2024
Optimizing Language Models with Fair and Stable Reward Composition in Reinforcement Learning
EMNLP 2024
Two-Stage Evolutionary Reinforcement Learning for Enhancing Exploration and Exploitation
AAAI 2024
Robust Communicative Multi-Agent Reinforcement Learning with Active Defense
AAAI 2024
Learn to Follow: Decentralized Lifelong Multi-Agent Pathfinding via Planning and Learning
AAAI 2024
GDPO: Learning to Directly Align Language Models with Diversity Using GFlowNets
EMNLP 2024
PMAC: Personalized Multi-Agent Communication
AAAI 2024
TAPE: Leveraging Agent Topology for Cooperative Multi-Agent Policy Gradient
AAAI 2024
ConcaveQ: Non-monotonic Value Function Factorization via Concave Representations in Deep Multi-Agent Reinforcement Learning
AAAI 2024
Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code
EMNLP 2024
Situation-Dependent Causal Influence-Based Cooperative Multi-Agent Reinforcement Learning
AAAI 2024