Papers
Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models
ACL 2025
RLHF Algorithms Ranked: An Extensive Evaluation Across Diverse Tasks, Rewards, and Hyperparameters
EMNLP 2025
GARLIC: GPT-Augmented Reinforcement Learning with Intelligent Control for Vehicle Dispatching
AAAI 2025