Papers
Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code
EMNLP 2024
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion
EMNLP 2024
ToolPlanner: A Tool Augmented LLM for Multi Granularity Instructions with Path Planning and Feedback
EMNLP 2024
Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems
EMNLP 2024
LLM-AutoDA: Large Language Model-Driven Automatic Data Augmentation for Long-tailed Problems
NeurIPS 2024
On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control
JMLR 2024