Reinforcement Learning

2932 directly classified papers

Papers

Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge EMNLP 2025

Enhancing Persona Consistency for LLMs’ Role-Playing using Persona-Aware Contrastive Learning ACL 2025

Towards Better Robot Learners: Leveraging Implicit and Explicit Human Feedback Together in Human Robot Interactions AAAI 2025

Efficient and Robust Reinforcement Learning from Human Feedback AAAI 2025

iTool: Reinforced Fine-Tuning with Dynamic Deficiency Calibration for Advanced Tool Use EMNLP 2025

Beyond Task-Oriented and Chitchat Dialogues: Proactive and Transition-Aware Conversational Agents EMNLP 2025

START: Self-taught Reasoner with Tools EMNLP 2025

R-PRM: Reasoning-Driven Process Reward Modeling EMNLP 2025

Towards Robust, Efficient, and Practical Decision-Making: From Reward-Maximizing Deep Reinforcement Learning to Reward-Matching GFlowNets AAAI 2025

Representation-driven Option Discovery in Reinforcement Learning AAAI 2025

MarkovType: A Markov Decision Process Strategy for Non-Invasive Brain-Computer Interfaces Typing Systems AAAI 2025

Dynamic Retriever for In-Context Knowledge Editing via Policy Optimization EMNLP 2025

Learning Structured World Models From and For Physical Interactions AAAI 2025

Axioms for AI Alignment from Human Feedback AAAI 2025

Robots Learning Through Physical Interactive Intelligence AAAI 2025

The Hallucination Tax of Reinforcement Finetuning EMNLP 2025

The POWER of Ikigai: Optimizing Life Fulfillment with an Integrated User Simulator and Adaptive Hobby Recommender AAAI 2025

Unilaw-R1: A Large Language Model for Legal Reasoning with Reinforcement Learning and Iterative Inference EMNLP 2025

Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach AAAI 2025

PUER: Boosting Few-shot Positive-Unlabeled Entity Resolution with Reinforcement Learning EMNLP 2025

Spontaneous Giving and Calculated Greed in Language Models EMNLP 2025

Governance in Motion: Co-evolution of Constitutions and AI models for Scalable Safety EMNLP 2025

Identification of Multiple Logical Interpretations in Counter-Arguments EMNLP 2025

Learning Like Humans: Advancing LLM Reasoning Capabilities via Adaptive Difficulty Curriculum Learning and Expert-Guided Self-Reformulation EMNLP 2025

CARMO: Dynamic Criteria Generation for Context Aware Reward Modelling ACL 2025