Deep RL

3861 directly classified papers

Papers

Partial Identifiability in Inverse Reinforcement Learning for Agents with Non-Exponential Discounting AAAI 2025

CodeTool: Enhancing Programmatic Tool Invocation of LLMs via Process Supervision ACL 2025

The Indoor-Training Effect: Unexpected Gains from Distribution Shifts in the Transition Function AAAI 2025

A Reinforcement Learning Framework for Cross-Lingual Stance Detection Using Chain-of-Thought Alignment ACL 2025

Adversarial Preference Learning for Robust LLM Alignment ACL 2025

Removing Prompt-template Bias in Reinforcement Learning from Human Feedback ACL 2025

Should I Trust You? Detecting Deception in Negotiations using Counterfactual RL ACL 2025

Text2World: Benchmarking Large Language Models for Symbolic World Model Generation ACL 2025

Direct Repair Optimization: Training Small Language Models For Educational Program Repair Improves Feedback ACL 2025

Team XSZ at BioLaySumm2025: Section-Wise Summarization, Retrieval-Augmented LLM, and Reinforcement Learning Fine-Tuning for Lay Summaries ACL 2025

RL + Transformer = A General-Purpose Problem Solver ACL 2025

Sparks of Tabular Reasoning via Text2SQL Reinforcement Learning ACL 2025

Enhancing AMR Parsing with Group Relative Policy Optimization ACL 2025

LLMSR@XLLM25: A Language Model-Based Pipeline for Structured Reasoning Data Construction ACL 2025

Self-Rewarding Large Vision-Language Models for Optimizing Prompts in Text-to-Image Generation ACL 2025

JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse ACL 2025

Efficient and Robust Reinforcement Learning from Human Feedback AAAI 2025

Deep Implicit Imitation Reinforcement Learning in Heterogeneous Action Settings AAAI 2025

EQA-RM: A Generative Embodied Reward Model with Test-time Scaling EMNLP 2025

ActionStudio: A Lightweight Framework for Data and Training of Large Action Models EMNLP 2025

Novelty-Guided Data Reuse for Efficient and Diversified Multi-Agent Reinforcement Learning AAAI 2025

DUO: Diverse, Uncertain, On-Policy Query Generation and Selection for Reinforcement Learning from Human Feedback AAAI 2025

Sketch-to-Skill: Bootstrapping Robot Learning with Human Drawn Trajectory Sketches RSS 2025

Safety with Agency: Human-Centered Safety Filter with Application to AI-Assisted Motorsports RSS 2025

IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation ICCV 2025