Reinforcement Learning

2932 directly classified papers

Papers

Training Language Model to Critique for Better Refinement ACL 2025

Visual-RFT: Visual Reinforcement Fine-Tuning ICCV 2025

bea-jh at BEA 2025 Shared Task: Evaluating AI-powered Tutors through Pedagogically-Informed Reasoning ACL 2025

Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences ICCV 2025

Henry at BEA 2025 Shared Task: Improving AI Tutor’s Guidance Evaluation Through Context-Aware Distillation ACL 2025

Trial-Oriented Visual Rearrangement ICCV 2025

Shallow Preference Signals: Large Language Model Aligns Even Better with Truncated Data? ACL 2025

MOERL: When Mixture-of-Experts Meet Reinforcement Learning for Adverse Weather Image Restoration ICCV 2025

The Power of Simplicity in LLM-Based Event Forecasting ACL 2025

GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training ICCV 2025

Optimising Factual Consistency in Summarisation via Preference Learning from Multiple Imperfect Metrics EMNLP 2025

ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning ICCV 2025

DeMAC: Enhancing Multi-Agent Coordination with Dynamic DAG and Manager-Player Feedback EMNLP 2025

Reinforcement Learning-Guided Data Selection via Redundancy Assessment ICCV 2025

Continual SFT Matches Multimodal RLHF with Negative Supervision CVPR 2025

Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning ICCV 2025

When2Call: When (not) to Call Tools NAACL 2025

DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness ICCV 2025

Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization CVPR 2025

Mitigating Object Hallucinations via Sentence-Level Early Intervention ICCV 2025

Narrative Studio: Visual narrative exploration using LLMs and Monte Carlo Tree Search NAACL 2025

EvolvingGrasp: Evolutionary Grasp Generation via Efficient Preference Alignment ICCV 2025

A Survey on the Feedback Mechanism of LLM-based AI Agents IJCAI 2025

Training-free Generation of Temporally Consistent Rewards from VLMs ICCV 2025

Language Model Based Text-to-Audio Generation: Anti-Causally Aligned Collaborative Residual Transformers EMNLP 2025