Papers
On Softmax Direct Preference Optimization for Recommendation
Yuxin Chen, Junfei Tan, An Zhang et al.
Group Robust Preference Optimization in Reward-free RLHF
Shyam Sundhar Ramesh, Yifan Hu, Iason Chaimalas et al.
Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization
Yuanpu Cao, Tianrong Zhang, Bochuan Cao et al.
Discovering Preference Optimization Algorithms with and for Large Language Models
Chris Lu, Samuel Holt, Claudio Fanconi et al.
3D Structure Prediction of Atomic Systems with Flow-based Direct Preference Optimization
Rui Jiao, Xiangzhe Kong, Wenbing Huang et al.
Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment
Teng Xiao, Yige Yuan, Huaisheng Zhu et al.
Iterative Reasoning Preference Optimization
Richard Yuanzhe Pang, Weizhe Yuan, Kyunghyun Cho et al.
Antigen-Specific Antibody Design via Direct Energy-based Preference Optimization
Xiangxin Zhou, Dongyu Xue, Ruizhe Chen et al.
SimPO: Simple Preference Optimization with a Reference-Free Reward
Yu Meng, Mengzhou Xia, Danqi Chen
$\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
Junkang Wu, Yuexiang Xie, Zhengyi Yang et al.
Controllable Protein Sequence Generation with LLM Preference Optimization
Xiangyu Liu, Yi Liu, Silei Chen et al.
AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation
Jingkun An, Yinghao Zhu, Zongjian Li et al.
VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning
Ji Soo Lee, Jongha Kim, Jeehye Na et al.
Radiology Report Generation via Multi-objective Preference Optimization
Ting Xiao, Lei Shi, Peng Liu et al.
Forward KL Regularized Preference Optimization for Aligning Diffusion Policies
Zhao Shan, Chenyou Fan, Shuang Qiu et al.
Multi-Reference Preference Optimization for Large Language Models
Hung Le, Quan Hung Tran, Dung Nguyen et al.
Self-Evolutionary Large Language Models Through Uncertainty-Enhanced Preference Optimization
Jianing Wang, Yang Zhou, Xiaocheng Zhang et al.
Enhancing Audiovisual Speech Recognition Through Bifocal Preference Optimization
Yihan Wu, Yichen Lu, Yifan Peng et al.
KnowPO: Knowledge-Aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models
Ruizhe Zhang, Yongxin Xu, Yuzhen Xiao et al.
Advancing Audio-Based Text Generation with Imbalance Preference Optimization
Zhenghao Zhou, Yongjie Liu, Chen Cao
WEPO: Web Element Preference Optimization for LLM-based Web Navigation
Jiarun Liu, Jia Hao, Chunhong Zhang et al.
JailPO: A Novel Black-Box Jailbreak Framework via Preference Optimization Against Aligned LLMs
Hongyi Li, Jiawei Ye, Jie Wu et al.
Atomic Consistency Preference Optimization for Long-Form Question Answering
Jingfeng Chen, Raghuveer Thirukovalluru, Junlin Wang et al.
MAPO: Advancing Multilingual Reasoning through Multilingual-Alignment-as-Preference Optimization
Shuaijie She, Wei Zou, Shujian Huang et al.
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
Tianduo Wang, Shichen Li, Wei Lu