Papers
261 papers found
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
Ziang Yan, Zhilin Li, Yinan He et al.
SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization
Hongrui Jia, Chaoya Jiang, Haiyang Xu et al.
Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization
Zefeng Zhang, Hengzhu Tang, Jiawei Sheng et al.
Curriculum Direct Preference Optimization for Diffusion and Consistency Models
Florinel-Alin Croitoru, Vlad Hondru, Radu Tudor Ionescu et al.
Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation
Aishik Konwer, Zhijian Yang, Erhan Bas et al.
Boost Your Human Image Generation Model via Direct Preference Optimization
Sanghyeon Na, Yonggyu Kim, Hyunjoon Lee
Calibrated Multi-Preference Optimization for Aligning Diffusion Models
Kyungmin Lee, Xiahong Li, Qifei Wang et al.
InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment
Yunhong Lu, Qichao Wang, Hengyuan Cao et al.
Eliminating Biased Length Reliance of Direct Preference Optimization via Down-Sampled KL Divergence
Junru Lu, Jiazheng Li, Siyu An et al.
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment
Yiju Guo, Ganqu Cui, Lifan Yuan et al.
Direct Multi-Turn Preference Optimization for Language Agents
Wentao Shi, Mengqi Yuan, Junkang Wu et al.
EPO: Hierarchical LLM Agents with Environment Preference Optimization
Qi Zhao, Haotian Fu, Chen Sun et al.
mDPO: Conditional Preference Optimization for Multimodal Large Language Models
Fei Wang, Wenxuan Zhou, James Y. Huang et al.
WPO: Enhancing RLHF with Weighted Preference Optimization
Wenxuan Zhou, Ravi Agrawal, Shujian Zhang et al.
ORPO: Monolithic Preference Optimization without Reference Model
Jiwoo Hong, Noah Lee, James Thorne
RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
John Dang, Arash Ahmadian, Kelly Marchisio et al.
Model-based Preference Optimization in Abstractive Summarization without Human Feedback
Jaepill Choi, Kyubyung Chae, Jiwoo Song et al.
Filtered Direct Preference Optimization
Tetsuro Morimura, Mitsuki Sakamoto, Yuu Jinnai et al.
Knowledge Editing in Language Models via Adapted Direct Preference Optimization
Amit Rozner, Barak Battash, Lior Wolf et al.
Learning to Ask Informative Questions: Enhancing LLMs with Preference Optimization and Expected Information Gain
Davide Mazzaccara, Alberto Testoni, Raffaella Bernardi
Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring
Jiazheng Li, Hainiu Xu, Zhaoyue Sun et al.
BAPO: Base-Anchored Preference Optimization for Overcoming Forgetting in Large Language Models Personalization
Gihun Lee, Minchan Jeong, Yujin Kim et al.
Step-level Value Preference Optimization for Mathematical Reasoning
Guoxin Chen, Minpeng Liao, Chengxi Li et al.
Improving Factual Consistency of News Summarization by Contrastive Preference Optimization
Huawen Feng, Yan Fan, Xiong Liu et al.
V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization
Yuxi Xie, Guanzhen Li, Xiao Xu et al.