conftrace
reinforcement learning
4122 papers
Also known as
RL
REINFORCE
Co-occurring keywords
large language model (12755)
policy learning (699)
markov decision process (788)
policy gradient (518)
policy optimization (630)
deep reinforcement learning (903)
multi-agent system (1743)
imitation learning (741)
regret bound (1918)
language model (4573)
Papers
Token-Level Accept or Reject: A Micro Alignment Approach for Large Language Models
IJCAI 2025
AdsQA: Towards Advertisement Video Understanding
ICCV 2025
StoryLLaVA: Enhancing Visual Storytelling with Multi-Modal Large Language Models
COLING 2025
Procedural Environment Generation for Tool-Use Agents
EMNLP 2025
Steering LLM Reasoning Through Bias-Only Adaptation
EMNLP 2025
R2D2: Remembering, Replaying and Dynamic Decision Making with a Reflective Agentic Memory
ACL 2025
Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization
ACL 2025
Domain Randomization is Sample Efficient for Linear Quadratic Control
L4DC 2025
Group-Aware Reinforcement Learning for Output Diversity in Large Language Models
EMNLP 2025
DocThinker: Explainable Multimodal Large Language Models with Rule-based Reinforcement Learning for Document Understanding
ICCV 2025
SORREL: Suboptimal-Demonstration-Guided Reinforcement Learning for Learning to Branch
AAAI 2025
Tag-Instruct: Controlled Instruction Complexity Enhancement through Structure-based Augmentation
ACL 2025
FRACTAL: Fine-Grained Scoring from Aggregate Text Labels
ACL 2025
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond
ACL 2025
The Evolving Landscape of LLM- and VLM-Integrated Reinforcement Learning
IJCAI 2025
RL + Transformer = A General-Purpose Problem Solver
ACL 2025
Team XSZ at BioLaySumm2025: Section-Wise Summarization, Retrieval-Augmented LLM, and Reinforcement Learning Fine-Tuning for Lay Summaries
ACL 2025
Efficient and Robust Reinforcement Learning from Human Feedback
AAAI 2025
Grounding Open-Domain Knowledge from LLMs to Real-World Reinforcement Learning Tasks: A Survey
IJCAI 2025
Overview of the BioLaySumm 2025 Shared Task on Lay Summarization of Biomedical Research Articles and Radiology Reports
ACL 2025
bea-jh at BEA 2025 Shared Task: Evaluating AI-powered Tutors through Pedagogically-Informed Reasoning
ACL 2025
EFormer: An Effective Edge-based Transformer for Vehicle Routing Problems
IJCAI 2025
Direct Repair Optimization: Training Small Language Models For Educational Program Repair Improves Feedback
ACL 2025
Text2World: Benchmarking Large Language Models for Symbolic World Model Generation
ACL 2025
PRED: Performance-oriented Random Early Detection for Consistently Stable Performance in Datacenters
NSDI 2025