Deep RL

3861 directly classified papers

Papers

Logic-Q: Improving Deep Reinforcement Learning-based Quantitative Trading via Program Sketch-based Tuning AAAI 2025

Query-efficient Attack for Black-box Image Inpainting Forensics via Reinforcement Learning AAAI 2025

CTD4 – a Deep Continuous Distributional Actor-Critic Agent with a Kalman Fusion of Multiple Critics AAAI 2025

Noise-Resilient Symbolic Regression with Dynamic Gating Reinforcement Learning AAAI 2025

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning AAAI 2025

Epistemic Bellman Operators AAAI 2025

Agent-Aware Training for Agent-Agnostic Action Advising in Deep Reinforcement Learning AAAI 2025

SMoSE: Sparse Mixture of Shallow Experts for Interpretable Reinforcement Learning in Continuous Control Tasks AAAI 2025

ACECODER: Acing Coder RL via Automated Test-Case Synthesis ACL 2025

Intelligent OPC Engineer Assistant for Semiconductor Manufacturing AAAI 2025

SrSv: Integrating Sequential Rollouts with Sequential Value Estimation for Multi-agent Reinforcement Learning AAAI 2025

Curiosity-Driven Reinforcement Learning from Human Feedback ACL 2025

ReflectDiffu: Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework ACL 2025

OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization ACL 2025

NavQ: Learning a Q-Model for Foresighted Vision-and-Language Navigation ICCV 2025

RAVEN: Robust Advertisement Video Violation Temporal Grounding via Reinforcement Reasoning ACL 2025

LLM-Enhanced Self-Evolving Reinforcement Learning for Multi-Step E-Commerce Payment Fraud Risk Detection ACL 2025

Learning Joint Behaviors with Large Variations AAAI 2025

FLAG-TRADER: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading ACL 2025

bea-jh at BEA 2025 Shared Task: Evaluating AI-powered Tutors through Pedagogically-Informed Reasoning ACL 2025

DecEx-RAG: Boosting Agentic Retrieval-Augmented Generation with Decision and Execution Optimization via Process Supervision EMNLP 2025

Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM Reasoning EMNLP 2025

IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation ICCV 2025

Training Language Models to Critique With Multi-agent Feedback EMNLP 2025

Hierarchical Decision Making Based on Structural Information Principles JMLR 2025