Papers
AMoPO: Adaptive Multi-objective Preference Optimization without Reward Models and Reference Models
Qi Liu, Jingqing Ruan, Hao Li et al.
Boosting Vulnerability Detection of LLMs via Curriculum Preference Optimization with Synthetic Reasoning Data
Xin-Cheng Wen, Yijun Yang, Cuiyun Gao et al.
Debate, Reflect, and Distill: Multi-Agent Feedback with Tree-Structured Preference Optimization for Efficient Language Model Enhancement
Xiaofeng Zhou, Heyan Huang, Lizi Liao
Focused-DPO: Enhancing Code Generation Through Focused Preference Optimization on Error-Prone Points
Kechi Zhang, Ge Li, Jia Li et al.
SGDPO: Self-Guided Direct Preference Optimization for Language Model Alignment
Wenqiao Zhu, Ji Liu, Lulu Wang et al.
RoseRAG: Robust Retrieval-augmented Generation with Small-scale LLMs via Margin-aware Preference Optimization
Tianci Liu, Haoxiang Jiang, Tianze Wang et al.
Eeyore: Realistic Depression Simulation via Expert-in-the-Loop Supervised and Preference Optimization
Siyang Liu, Bianca Brie, Wenda Li et al.
PGPO: Enhancing Agent Reasoning via Pseudocode-style Planning Guided Preference Optimization
Zouying Cao, Runze Wang, Yifei Yang et al.
Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization
Yuhan Fu, Ruobing Xie, Xingwu Sun et al.
Reverse Preference Optimization for Complex Instruction Following
Xiang Huang, Ting-En Lin, Feiteng Fang et al.
DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization
Amitava Das, Suranjana Trivedy, Danush Khanna et al.
Full-Step-DPO: Self-Supervised Preference Optimization with Step-wise Rewards for Mathematical Reasoning
Huimin Xu, Xin Mao, Feng-Lin Li et al.
RadQA-DPO: A Radiology Question Answering System with Encoder-Decoder Models Enhanced by Direct Preference Optimization
Md Sultan Al Nahian, Ramakanth Kavuluru
The Fellowship of the LLMs: Multi-Model Workflows for Synthetic Preference Optimization Dataset Generation
Samee Arif, Sualeha Farid, Abdul Hameed Azeemi et al.
RedHit: Adaptive Red-Teaming of Large Language Models via Search, Reasoning, and Preference Optimization
Mohsen Sorkhpour, Abbas Yazdinejad, Ali Dehghantanha
Sakura at SemEval-2025 Task 2: Enhancing Named Entity Translation with Fine-Tuning and Preference Optimization
Alberto Poncelas, Ohnmar Htun
Dataground at SemEval-2025 Task 8: Small LLMs and Preference Optimization for Tabular QA
Giuseppe Attardi, Andrea Nelson Mauro, Daniele Sartiano
Atyaephyra at SemEval-2025 Task 4: Low-Rank Negative Preference Optimization
Jan Bronec, Jindřich Helcl
Using LLMs and Preference Optimization for Agreement-Aware HateWiC Classification
Sebastian Loftus, Adrian Mülthaler, Sanne Hoeken et al.
MPPO: Multi Pair-wise Preference Optimization for LLMs with Arbitrary Negative Samples
Shuo Xie, Fangzhi Zhu, Jiahui Wang et al.
Edit-Wise Preference Optimization for Grammatical Error Correction
Jiehao Liang, Haihui Yang, Shiping Gao et al.
Alternate Preference Optimization for Unlearning Factual Knowledge in Large Language Models
Anmol Mekala, Vineeth Dorna, Shreya Dubey et al.
MDPO: Customized Direct Preference Optimization with a Metric-based Sampler for Question and Answer Generation
Yihang Wang, Bowen Tian, Yueyang Su et al.
Northeastern Uni at Multilingual Counterspeech Generation: Enhancing Counter Speech Generation with LLM Alignment through Direct Preference Optimization
Sahil Wadhwa, Chengtian Xu, Haoming Chen et al.
Diffusion Model Alignment Using Direct Preference Optimization
Bram Wallace, Meihua Dang, Rafael Rafailov et al.