preference learning

411 papers

Also known as

DPO, PL

Co-occurring keywords

large language model (12755), reinforcement learning (4122), direct preference optimization (317), reinforcement learning from human feedback (261), language model alignment (142), reward model (251), human feedback (161), reward modeling (159), model alignment (219), human preference (120)

Papers

SoFA: Shielded On-the-fly Alignment via Priority Rule Following ACL 2024

Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms NIPS 2024

Panacea: Pareto Alignment via Preference Adaptation for LLMs NIPS 2024

Queueing Matching Bandits with Preference Feedback NIPS 2024

DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling ACL 2024

Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels NIPS 2024

Enhancing Preference-based Linear Bandits via Human Response Time NIPS 2024

Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation ACL 2024

What are the Generator Preferences for End-to-end Task-Oriented Dialog System? EMNLP 2024

Knowledge-to-SQL: Enhancing SQL Generation with Data Expert LLM ACL 2024

Learning Conditional Preference Networks: An Approach Based on the Minimum Description Length Principle IJCAI 2024

A Preference-driven Paradigm for Enhanced Translation with Large Language Models NAACL 2024

Contrastive Preference Learning for Neural Machine Translation NAACL 2024

CURATRON: Complete and Robust Preference Data for Rigorous Alignment of Large Language Models NAACL 2024

The Paradox of Preference: A Study on LLM Alignment Algorithms and Data Acquisition Methods NAACL 2024

Preference-Aware Constrained Multi-Objective Bayesian Optimization (Student Abstract) AAAI 2024

Learning GAI-Decomposable Utility Models for Multiattribute Decision Making AAAI 2024

Optimal Design for Human Preference Elicitation NIPS 2024

Not All Preference Pairs Are Created Equal: A Recipe for Annotation-Efficient Iterative Preference Learning EMNLP 2024

Bandits with Preference Feedback: A Stackelberg Game Perspective NIPS 2024

Direct Preference Optimization with an Offset ACL 2024

HelpSteer 2: Open-source dataset for training top-performing reward models NIPS 2024

Fine-Tuning Language Models with Reward Learning on Policy NAACL 2024

Automated Multi-level Preference for MLLMs NIPS 2024

Teaching Language Models to Self-Improve by Learning from Language Feedback ACL 2024