reinforcement learning
4122 papers
Also known as
RLVR
HARL
GRPO
RL
PPO
REINFORCE
RFT
DRL
RL NULL
LQR
RLHF
Papers
Towards Pareto-Efficient RLHF: Paying Attention to a Few High-Reward Samples with Reward Dropout
EMNLP 2024
Navigating Noisy Feedback: Enhancing Reinforcement Learning with Error-Prone Language Models
EMNLP 2024