reinforcement learning

4122 papers

Also known as

RLVR, HARL, GRPO, RL, PPO, REINFORCE, RFT, DRL, LQR, RLHF

Co-occurring keywords

large language model (12755), policy learning (699), markov decision process (788), policy gradient (518), policy optimization (630), deep reinforcement learning (903), multi-agent system (1743), imitation learning (741), regret bound (1918), language model (4573)

Papers

Continuous Versatile Jumping Using Learned Action Residuals L4DC 2023

Learning Score-based Grasping Primitive for Human-assisting Dexterous Grasping NIPS 2023

Diverse Conventions for Human-AI Collaboration NIPS 2023

Online Nonstochastic Model-Free Reinforcement Learning NIPS 2023

Regret Guarantees for Online Deep Control L4DC 2023

Hierarchical Policy Blending As Optimal Transport L4DC 2023

A Minimal Approach for Natural Language Action Space in Text-based Games CONLL 2023

Belief Projection-Based Reinforcement Learning for Environments with Delayed Feedback NIPS 2023

Information Design in Multi-Agent Reinforcement Learning NIPS 2023

Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning ICML 2023

Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback NIPS 2023

Reinforcement Learning Approaches for Traffic Signal Control under Missing Data IJCAI 2023

Explore to Generalize in Zero-Shot RL NIPS 2023

Learning from Active Human Involvement through Proxy Value Propagation NIPS 2023

State2Explanation: Concept-Based Explanations to Benefit Agent Learning and User Understanding NIPS 2023

Aligning Factual Consistency for Clinical Studies Summarization through Reinforcement Learning ACL 2023

SustainGym: Reinforcement Learning Environments for Sustainable Energy Systems NIPS 2023

User Simulator Assisted Open-ended Conversational Recommendation System ACL 2023

Guiding Large Language Models via Directional Stimulus Prompting NIPS 2023

DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models NIPS 2023

Rewiring Neurons in Non-Stationary Environments NIPS 2023

Reinforced Active Learning for Low-Resource, Domain-Specific, Multi-Label Text Classification ACL 2023

Yes, this Way! Learning to Ground Referring Expressions into Actions with Intra-episodic Feedback from Supportive Teachers ACL 2023

Boosting Event Extraction with Denoised Structure-to-Text Augmentation ACL 2023

Triplet-Free Knowledge-Guided Response Generation ACL 2023