Papers
From General Reward to Targeted Reward: Improving Open-ended Long-context Generation Models
EMNLP 2025
Do LLMs Need Inherent Reasoning Before Reinforcement Learning? A Study in Korean Self-Correction
IJCNLP 2025
Continuous-Time Reward Machines
IJCAI 2025
Causal-aware Large Language Models: Enhancing Decision-Making Through Learning, Adapting and Acting
IJCAI 2025