Research Explorer
Papers
Trends
Conferences
Explore
Authors
Topics
Keywords
Reinforcement Learning
1263 directly classified papers
Papers per year
2006: 1
2007: 2
2008: 3
2009: 2
2010: 1
2011: 2
2012: 3
2013: 2
2014: 3
2015: 2
2016: 8
2017: 44
2018: 95
2019: 134
2020: 123
2021: 131
2022: 143
2023: 127
2024: 194
2025: 240
2026: 3
Papers
DiffChat: Learning to Chat with Text-to-Image Synthesis Models for Interactive Image Creation
ACL 2024
ALaRM: Align Language Models via Hierarchical Rewards Modeling
ACL 2024
Just Ask One More Time! Self-Agreement Improves Reasoning of Language Models in (Almost) All Scenarios
ACL 2024
RePALM: Popular Quote Tweet Generation via Auto-Response Augmentation
ACL 2024
Applying RLAIF for Code Generation with API-usage in Lightweight LLMs
ACL 2024
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
ACL 2024
Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning Processes
ACL 2024
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents
ACL 2024
ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences
ACL 2024
Training Language Models to Generate Text with Citations via Fine-grained Rewards
ACL 2024
M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions
ACL 2024
Unveiling the Art of Heading Design: A Harmonious Blend of Summarization, Neology, and Algorithm
ACL 2024
PACE: Improving Prompt with Actor-Critic Editing for Large Language Model
ACL 2024
Reinforcement Learning-Driven LLM Agent for Automated Attacks on LLMs
ACL 2024
Enhancing Reinforcement Learning with Dense Rewards from Language Model Critic
EMNLP 2024
PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference
NIPS 2024
Learning Formal Mathematics From Intrinsic Motivation
NIPS 2024
Controllable Citation Sentence Generation with Language Models
ACL 2024
Carbon Footprint Reduction for Sustainable Data Centers in Real-Time
AAAI 2024
Reward Certification for Policy Smoothed Reinforcement Learning
AAAI 2024
Two-Stage Evolutionary Reinforcement Learning for Enhancing Exploration and Exploitation
AAAI 2024
Controlled maximal variability along with reliable performance in recurrent neural networks
NIPS 2024
Robust Communicative Multi-Agent Reinforcement Learning with Active Defense
AAAI 2024
Learn to Follow: Decentralized Lifelong Multi-Agent Pathfinding via Planning and Learning
AAAI 2024
PMAC: Personalized Multi-Agent Communication
AAAI 2024