reinforcement learning
4122 papers
Also known as
RLVR
HARL
GRPO
RL
PPO
REINFORCE
RFT
DRL
LQR
RLHF
Papers
Self-Rewarding Large Vision-Language Models for Optimizing Prompts in Text-to-Image Generation
ACL 2025
Hierarchical Multi-Agent Framework for Carbon-Efficient Liquid-Cooled Data Center Clusters
AAAI 2025
SDGO: Self-Discrimination-Guided Optimization for Consistent Safety in Large Language Models
EMNLP 2025
UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models
ACL 2025