Reinforcement Learning from Human Feedback

143 papers

Papers per year: 1, 13, 60, 55, 14

Papers

Dataground at SemEval-2025 Task 8: Small LLMs and Preference Optimization for Tabular QA ACL 2025

ReNeg: Learning Negative Embedding with Reward Guidance CVPR 2025

A Systematic Analysis of Base Model Choice for Reward Modeling EMNLP 2025

Permutative Preference Alignment from Listwise Ranking of Human Judgments EMNLP 2025

Direct Judgement Preference Optimization EMNLP 2025

Improve LLM-as-a-Judge Ability as a General Ability EMNLP 2025

PLLuM-Align: Polish Preference Dataset for Large Language Model Alignment EMNLP 2025

Logical Reasoning with Outcome Reward Models for Test-Time Scaling EMNLP 2025

Weights-Rotated Preference Optimization for Large Language Models EMNLP 2025

Improving Neutral Point-of-View Generation with Data- and Parameter-Efficient RL EMNLP 2025

Enhancing RLHF with Human Gaze Modeling EMNLP 2025

CARE: Multilingual Human Preference Learning for Cultural Awareness EMNLP 2025

OpenRLHF: A Ray-based Easy-to-use, Scalable and High-performance RLHF Framework EMNLP 2025

Thinking with DistilQwen: A Tale of Four Distilled Reasoning and Reward Model Series EMNLP 2025

Agent-in-the-Loop: A Data Flywheel for Continuous Improvement in LLM-based Customer Support EMNLP 2025

Toward Optimal LLM Alignments Using Two-Player Games EMNLP 2025

Aligning Black-Box LLMs for Aspect Sentiment Quad Prediction EMNLP 2025

DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data EMNLP 2025

CAPO: Confidence Aware Preference Optimization Learning for Multilingual Preferences IJCNLP 2025

On Softmax Direct Preference Optimization for Recommendation NIPS 2024

Fast Best-of-N Decoding via Speculative Rejection NIPS 2024

LeDex: Training LLMs to Better Self-Debug and Explain Code NIPS 2024

Interpreting Learned Feedback Patterns in Large Language Models NIPS 2024

Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback NIPS 2024

Group Robust Preference Optimization in Reward-free RLHF NIPS 2024