reinforcement learning

4122 papers

Also known as

RLVR HARL GRPO RL PPO REINFORCE RFT DRL LQR RLHF

Co-occurring keywords

large language model (12755), policy learning (699), markov decision process (788), policy gradient (518), policy optimization (630), deep reinforcement learning (903), multi-agent system (1743), imitation learning (741), regret bound (1918), language model (4573)

Papers

ADPFedGNN: Adaptive Decoupling Personalized Federated Graph Neural Network IJCAI 2025

AlphaGAT: A Two-Stage Learning Approach for Adaptive Portfolio Selection IJCAI 2025

Counterfactual Explanations for Continuous Action Reinforcement Learning IJCAI 2025

Continuous-Time Reward Machines IJCAI 2025

Online Learning Defense against Iterative Jailbreak Attacks via Prompt Optimization AACL 2025

MOERL: When Mixture-of-Experts Meet Reinforcement Learning for Adverse Weather Image Restoration ICCV 2025

Do LLMs Need Inherent Reasoning Before Reinforcement Learning? A Study in Korean Self-Correction AACL 2025

InnateCoder: Learning Programmatic Options with Foundation Models IJCAI 2025

EMO-RL: Emotion-Rule-Based Reinforcement Learning Enhanced Audio-Language Model for Generalized Speech Emotion Recognition EMNLP 2025

Can LLMs Clarify? Investigation and Enhancement of Large Language Models on Argument Claim Optimization COLING 2025

ReGraph: Learning to Reformulate Graph Encodings with Large Language Models AACL 2025

Simulate, Refine and Integrate: Strategy Synthesis for Efficient SMT Solving IJCAI 2025

Token-Level Accept or Reject: A Micro Alignment Approach for Large Language Models IJCAI 2025

HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit RSS 2025

ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning EMNLP 2025

One fish, two fish, but not the whole sea: Alignment reduces language models’ conceptual diversity NAACL 2025

FLAG-TRADER: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading ACL 2025

Exploration-Driven Reinforcement Learning for Expert Routing Improvement in Mixture-of-Experts Language Models EMNLP 2025

Enhancing Reasoning Abilities of Small LLMs with Cognitive Alignment EMNLP 2025

Real-Time Recurrent Reinforcement Learning AAAI 2025

Multi-Teacher Knowledge Distillation with Reinforcement Learning for Visual Recognition AAAI 2025

Training-free Generation of Temporally Consistent Rewards from VLMs ICCV 2025

ACING: Actor-Critic for Instruction Learning in Black-Box LLMs EMNLP 2025

Convert Language Model into a Value-based Strategic Planner ACL 2025

Optimize Battery Control: A Multi-Objective Evolutionary Ensemble Reinforcement Learning Approach IJCAI 2025