conftrace_

reinforcement learning

4352 papers

Also known as

RL, REINFORCE

Co-occurring keywords

large language model (13587), policy learning (702), markov decision process (790), policy optimization (657), policy gradient (520), deep reinforcement learning (903), multi-agent system (1819), imitation learning (744), regret bound (1926), language model (4599)

Papers

Sycophancy Mitigation Through Reinforcement Learning with Uncertainty-Aware Adaptive Reasoning Trajectories EMNLP 2025

Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains EMNLP 2025

Online Learning Defense against Iterative Jailbreak Attacks via Prompt Optimization AACL 2025

DeAL: Decoding-time Alignment for Large Language Models ACL 2025

Online Iterative Self-Alignment for Radiology Report Generation ACL 2025

WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning EMNLP 2025

InnateCoder: Learning Programmatic Options with Foundation Models IJCAI 2025

Aligning LLMs with Individual Preferences via Interaction COLING 2025

Large Language Models with Reinforcement Learning from Human Feedback Approach for Enhancing Explainable Sexism Detection COLING 2025

CARFT: Boosting LLM Reasoning via Contrastive Learning with Annotated Chain-of-Thought-based Reinforced Fine-Tuning EMNLP 2025

A Collaborative Reasoning Framework Powered by Reinforcement Learning and Large Language Models for Complex Questions Answering over Knowledge Graph COLING 2025

Robustness to Spurious Correlations via Dynamic Knowledge Transfer IJCAI 2025

ADPFedGNN: Adaptive Decoupling Personalized Federated Graph Neural Network IJCAI 2025

Mutual-Taught for Co-adapting Policy and Reward Models ACL 2025

In-Context Reinforcement Learning with Retrieval-Augmented Generation for Text-to-SQL COLING 2025

EFormer: An Effective Edge-based Transformer for Vehicle Routing Problems IJCAI 2025

Simulate, Refine and Integrate: Strategy Synthesis for Efficient SMT Solving IJCAI 2025

Dialogue Systems for Emotional Support via Value Reinforcement ACL 2025

Survey on Strategic Mining in Blockchain: A Reinforcement Learning Approach IJCAI 2025

AMO: Adaptive Motion Optimization for Hyper-Dexterous Humanoid Whole-Body Control RSS 2025

Demonstrating Berkeley Humanoid Lite: An Open-source, Accessible, and Customizable 3D-printed Humanoid Robot RSS 2025

GARLIC: GPT-Augmented Reinforcement Learning with Intelligent Control for Vehicle Dispatching AAAI 2025

Local Look-Ahead Guidance via Verifier-in-the-Loop for Automated Theorem Proving ACL 2025

AI-Powered Algorithm-Centric Quantum Processor Topology Design AAAI 2025

Generate First, Then Sample: Enhancing Fake News Detection with LLM-Augmented Reinforced Sampling ACL 2025