Rafael Rafailov
22 papers · 2021–2026 · 10 top CS/AI conferences
Achievements
Conference Polyglot (9)
Cross-Pollinator (13)
Interdisciplinary Bridge
Keyword Pioneer
Academic Marathon (5)
Taxonomy Completionist (31)
Renaissance Researcher (5)
Dynamic Duo (18)
Triple Crown
Unstoppable (5)
Century Club (21)
Keyword Collector (71)
Prolific Year (10)
Conferences
NIPS (5)
ICLR (4)
ICML (3)
L4DC (3)
CORL (2)
ACL (1)
COLING (1)
CVPR (1)
EACL (1)
EMNLP (1)
Keywords
offline reinforcement learning (6)
direct preference optimization (4)
reward model (4)
model-based reinforcement learning (4)
reinforcement learning from human feedback (4)
large language model (3)
language model alignment (3)
text generation (2)
reinforcement learning (2)
preference modeling (2)
language model (2)
preference optimization (1)
self-supervised learning (1)
few-shot learning (1)
preference learning (1)
transfer learning (1)
sample efficiency (1)
imitation learning (1)
robot learning (1)
text-to-image generation (1)
Papers
LitBench: A Benchmark and Dataset for Reliable Evaluation of Creative Writing
EACL 2026
Collapse or Thrive: Perils and Promises of Synthetic Data in a Self-Generating World
ICML 2025
PERSONA: A Reproducible Testbed for Pluralistic Alignment
COLING 2025
Language Model Detectors Are Easily Optimized Against
ICLR 2024
Contrastive Preference Learning: Learning from Human Feedback without Reinforcement Learning
ICLR 2024
OpenVLA: An Open-Source Vision-Language-Action Model
CORL 2024
Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels
NIPS 2024
Disentangling Length from Quality in Direct Preference Optimization
ACL 2024
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
ICML 2024
Diffusion Model Alignment Using Direct Preference Optimization
CVPR 2024
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
NIPS 2024
Efficient Imitation Learning with Conservative World Models
L4DC 2024
An Emulator for Fine-tuning Large Language Models using Small Language Models
ICLR 2024
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
NIPS 2023
MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning
CORL 2023
Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback
EMNLP 2023
Contrastive Example-Based Control
L4DC 2023
Vision-Based Manipulators Need to Also See from Their Hands
ICLR 2022
Offline Reinforcement Learning from Images with Latent Space Models
L4DC 2021
Offline Meta-Reinforcement Learning with Advantage Weighting
ICML 2021
Visual Adversarial Imitation Learning using Variational Models
NIPS 2021
COMBO: Conservative Offline Model-Based Policy Optimization
NIPS 2021