reinforcement learning
4122 papers
Also known as
RLVR
HARL
GRPO
RL
PPO
REINFORCE
RFT
DRL
LQR
RLHF
Papers
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion
EMNLP 2024
ToolPlanner: A Tool Augmented LLM for Multi Granularity Instructions with Path Planning and Feedback
EMNLP 2024
Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems
EMNLP 2024
Teaching Embodied Reinforcement Learning Agents: Informativeness and Diversity of Language Use
EMNLP 2024