Research Explorer
Reinforcement Learning
2932 directly classified papers
Papers per year
2003: 1
2006: 11
2007: 18
2008: 23
2009: 14
2010: 22
2011: 24
2012: 34
2013: 26
2014: 24
2015: 14
2016: 23
2017: 79
2018: 182
2019: 255
2020: 284
2021: 333
2022: 319
2023: 315
2024: 457
2025: 419
2026: 55
Papers
Exploring Chain-of-Thought Reasoning for Steerable Pluralistic Alignment
EMNLP 2025
Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward
NAACL 2025
Learning to Summarize from LLM-generated Feedback
NAACL 2025
Modality-Fair Preference Optimization for Trustworthy MLLM Alignment
IJCAI 2025
ROSE: A Reward-Oriented Data Selection Framework for LLM Task-Specific Instruction Tuning
EMNLP 2025
Language Model Based Text-to-Audio Generation: Anti-Causally Aligned Collaborative Residual Transformers
EMNLP 2025
HW-TSC at Multilingual Counterspeech Generation
COLING 2025
Differentiable Information Enhanced Model-Based Reinforcement Learning
AAAI 2025
In-Context Policy Adaptation via Cross-Domain Skill Diffusion
AAAI 2025
PLLuM-Align: Polish Preference Dataset for Large Language Model Alignment
EMNLP 2025
ReachAgent: Enhancing Mobile Agent via Page Reaching and Operation
NAACL 2025
Enhancing Predictive Healthcare Using AI-Driven Early Warning Systems
AAAI 2025
Towards Robust, Efficient, and Practical Decision-Making: From Reward-Maximizing Deep Reinforcement Learning to Reward-Matching GFlowNets
AAAI 2025
CulFiT: A Fine-grained Cultural-aware LLM Training Paradigm via Multilingual Critique Data Synthesis
ACL 2025
ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning
EMNLP 2025
Beyond Static Testbeds: An Interaction-Centric Agent Simulation Platform for Dynamic Recommender Systems
EMNLP 2025
Breaking the Self-Evaluation Barrier: Reinforced Neuro-Symbolic Planning with Large Language Models
IJCAI 2025
Towards Human Understanding of Paraphrase Types in Large Language Models
COLING 2025
Why Does ChatGPT “Delve” So Much? Exploring the Sources of Lexical Overrepresentation in Large Language Models
COLING 2025
Dynamic Uncertainty Ranking: Enhancing Retrieval-Augmented In-Context Learning for Long-Tail Knowledge in LLMs
NAACL 2025
Token-Level Accept or Reject: A Micro Alignment Approach for Large Language Models
IJCAI 2025
FFCG: Effective and Fast Family Column Generation for Solving Large-Scale Linear Program
AAAI 2025
The Indoor-Training Effect: Unexpected Gains from Distribution Shifts in the Transition Function
AAAI 2025
Bounded Rationality Equilibrium Learning in Mean Field Games
AAAI 2025
Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning
NeurIPS 2024