Research Explorer
Reinforcement Learning
2932 directly classified papers
Papers per year
2003: 1
2006: 11
2007: 18
2008: 23
2009: 14
2010: 22
2011: 24
2012: 34
2013: 26
2014: 24
2015: 14
2016: 23
2017: 79
2018: 182
2019: 255
2020: 284
2021: 333
2022: 319
2023: 315
2024: 457
2025: 419
2026: 55
Papers
An Analysis of Scoring Methods for Reranking in Large Language Model Story Generation
NAACL 2025
Interaction-Required Suggestions for Control, Ownership, and Awareness in Human-AI Co-Writing
NAACL 2025
Sparks of Tabular Reasoning via Text2SQL Reinforcement Learning
ACL 2025
ROSE: A Reward-Oriented Data Selection Framework for LLM Task-Specific Instruction Tuning
EMNLP 2025
Faster Machine Translation Ensembling with Reinforcement Learning and Competitive Correction
NAACL 2025
2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision
NAACL 2025
TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods
NAACL 2025
LLMSR@XLLM25: A Language Model-Based Pipeline for Structured Reasoning Data Construction
ACL 2025
Learning Structured World Models From and For Physical Interactions
AAAI 2025
SeqPO-SiMT: Sequential Policy Optimization for Simultaneous Machine Translation
ACL 2025
WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning
EMNLP 2025
Enhancing Reward Models for High-quality Image Generation: Beyond Text-Image Alignment
ICCV 2025
Visual-RFT: Visual Reinforcement Fine-Tuning
ICCV 2025
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
ICCV 2025
Trial-Oriented Visual Rearrangement
ICCV 2025
MOERL: When Mixture-of-Experts Meet Reinforcement Learning for Adverse Weather Image Restoration
ICCV 2025
GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training
ICCV 2025
ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning
ICCV 2025
Reinforcement Learning-Guided Data Selection via Redundancy Assessment
ICCV 2025
Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning
ICCV 2025
DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness
ICCV 2025
Mitigating Object Hallucinations via Sentence-Level Early Intervention
ICCV 2025
EvolvingGrasp: Evolutionary Grasp Generation via Efficient Preference Alignment
ICCV 2025
Training-free Generation of Temporally Consistent Rewards from VLMs
ICCV 2025
Enhancing Machine Translation with Self-Supervised Preference Data
ACL 2025