Deep RL

3861 directly classified papers

Papers

JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse ACL 2025

Identification of Multiple Logical Interpretations in Counter-Arguments EMNLP 2025

CodeTool: Enhancing Programmatic Tool Invocation of LLMs via Process Supervision ACL 2025

M^3PC: Test-time Model Predictive Control using Pretrained Masked Trajectory Model ICLR 2025

Dialogue Systems for Emotional Support via Value Reinforcement ACL 2025

FlightGPT: Towards Generalizable and Interpretable UAV Vision-and-Language Navigation with Vision-Language Models EMNLP 2025

Enhancing RLHF with Human Gaze Modeling EMNLP 2025

All-Optical Nonlinear Diffractive Deep Network for Ultrafast Image Denoising CVPR 2025

Combining Deep Reinforcement Learning and Search with Generative Models for Game-Theoretic Opponent Modeling IJCAI 2025

NavQ: Learning a Q-Model for Foresighted Vision-and-Language Navigation ICCV 2025

Diffusion Guided Adaptive Augmentation for Generalization in Visual Reinforcement Learning ICCV 2025

IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation ICCV 2025

PROGRESSOR: A Perceptually Guided Reward Estimator with Self-Supervised Online Refinement ICCV 2025

Embodied Navigation with Auxiliary Task of Action Description Prediction ICCV 2025

ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning ICCV 2025

Active Geospatial Search for Efficient Tenant Eviction Outreach AAAI 2025

Faster Machine Translation Ensembling with Reinforcement Learning and Competitive Correction NAACL 2025

Reinforcement Learning for Infinite-Dimensional Systems JMLR 2025

Reward-Directed Score-Based Diffusion Models via q-Learning JMLR 2025

Using LLMs to improve RL policies in personalized health adaptive interventions NAACL 2025

ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning EMNLP 2025

Generation of Geodesics with Actor-Critic Reinforcement Learning to Predict Midpoints JMLR 2025

A Case for Validation Buffer in Pessimistic Actor-Critic IJCAI 2025

Towards Robust, Efficient, and Practical Decision-Making: From Reward-Maximizing Deep Reinforcement Learning to Reward-Matching GFlowNets AAAI 2025

PRED: Performance-oriented Random Early Detection for Consistently Stable Performance in Datacenters NSDI 2025