Papers
261 papers found
Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game
Pengyu Cheng, Yifan Yang, Jian Li et al.
Disentangling Length from Quality in Direct Preference Optimization
Ryan Park, Rafael Rafailov, Stefano Ermon et al.
Direct Preference Optimization with an Offset
Afra Amini, Tim Vieira, Ryan Cotterell
Fine-grained Video Dubbing Duration Alignment with Segment Supervised Preference Optimization
Chaoqun Cui, Liangbin Huang, Shijing Wang et al.
RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation
Shi-Qi Yan, Quan Liu, Zhen-Hua Ling
SDPO: Segment-Level Direct Preference Optimization for Social Agents
Aobo Kong, Wentao Ma, Shiwan Zhao et al.
Enhancing Safe and Controllable Protein Generation via Knowledge Preference Optimization
Yuhao Wang, Keyan Ding, Kehua Feng et al.
DiffPO: Diffusion-styled Preference Optimization for Inference Time Alignment of Large Language Models
Ruizhe Chen, Wenhao Chai, Zhifei Yang et al.
AutoMixAlign: Adaptive Data Mixing for Multi-Task Preference Optimization in LLMs
Nicholas E. Corrado, Julian Katz-Samuels, Adithya M Devraj et al.
Uncovering the Impact of Chain-of-Thought Reasoning for Direct Preference Optimization: Lessons from Text-to-SQL
Hanbing Liu, Haoyang Li, Xiaokang Zhang et al.
Optimal Transport-Based Token Weighting scheme for Enhanced Preference Optimization
Meng Li, Guangda Huzhang, Haibo Zhang et al.
World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning
Siyin Wang, Zhaoye Fei, Qinyuan Cheng et al.
IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization
Xinghua Zhang, Haiyang Yu, Cheng Fu et al.
Retrieval-Augmented Fine-Tuning With Preference Optimization For Visual Program Generation
Deokhyung Kang, Jeonghun Cho, Yejin Jeon et al.
Uncertainty-Aware Iterative Preference Optimization for Enhanced LLM Reasoning
Lei Li, Hehuan Liu, Yaxin Zhou et al.
LPOI: Listwise Preference Optimization for Vision Language Models
Fatemeh Pesaran Zadeh, Yoojin Oh, Gunhee Kim
T-REG: Preference Optimization with Token-Level Reward Regularization
Wenxuan Zhou, Shujian Zhang, Lingxiao Zhao et al.
CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation
Guofeng Cui, Pichao Wang, Yang Liu et al.
Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization
Hritik Bansal, Ashima Suvarna, Gantavya Bhatt et al.
K-order Ranking Preference Optimization for Large Language Models
Shihao Cai, Chongming Gao, Yang Zhang et al.
ASPO: Adaptive Sentence-Level Preference Optimization for Fine-Grained Multimodal Reasoning
Yeyuan Wang, Dehong Gao, Rujiao Long et al.
Robust Preference Optimization via Dynamic Target Margins
Jie Sun, Junkang Wu, Jiancan Wu et al.
Expectation Confirmation Preference Optimization for Multi-Turn Conversational Recommendation Agent
Xueyang Feng, Jingsen Zhang, Jiakai Tang et al.
Probability-Consistent Preference Optimization for Enhanced LLM Reasoning
Yunqiao Yang, Houxing Ren, Zimu Lu et al.