Direct preference optimization

317 papers

Also known as

DPO

Papers