conftrace
reinforcement learning
4352 papers
Also known as
RL
REINFORCE
Co-occurring keywords
large language model (13587)
policy learning (702)
markov decision process (790)
policy optimization (657)
policy gradient (520)
deep reinforcement learning (903)
multi-agent system (1819)
imitation learning (744)
regret bound (1926)
language model (4599)
Papers
CogDual: Enhancing Dual Cognition of LLMs via Reinforcement Learning with Implicit Rule-Based Rewards
EMNLP 2025
RAVEN++: Pinpointing Fine-Grained Violations in Advertisement Videos with Active Reinforcement Reasoning
EMNLP 2025
DecEx-RAG: Boosting Agentic Retrieval-Augmented Generation with Decision and Execution Optimization via Process Supervision
EMNLP 2025
LORD: Large Models Based Opposite Reward Design for Autonomous Driving
WACV 2025
ReinDiffuse: Crafting Physically Plausible Motions with Reinforced Diffusion Model
WACV 2025
Provoking Multi-modal Few-Shot LVLM via Exploration-Exploitation In-Context Learning
CVPR 2025
Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning
EMNLP 2025
AgentCPM-GUI: Building Mobile-Use Agents with Reinforcement Fine-Tuning
EMNLP 2025
GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill
CVPR 2025
Token-level Proximal Policy Optimization for Query Generation
EMNLP 2025
ACING: Actor-Critic for Instruction Learning in Black-Box LLMs
EMNLP 2025
StoryLLaVA: Enhancing Visual Storytelling with Multi-Modal Large Language Models
COLING 2025
Mixing Inference-time Experts for Enhancing LLM Reasoning
EMNLP 2025
Playpen: An Environment for Exploring Learning From Dialogue Game Feedback
EMNLP 2025
Parrot: A Training Pipeline Enhances Both Program CoT and Natural Language CoT for Reasoning
EMNLP 2025
MrGuard: A Multilingual Reasoning Guardrail for Universal LLM Safety
EMNLP 2025
IntentionFrame: A Semi-Structured, Multi-Aspect Framework for Fine-Grained Conversational Intention Understanding
EMNLP 2025
Enhancing Study-Level Inference from Clinical Trial Papers via Reinforcement Learning-Based Numeric Reasoning
EMNLP 2025
NL2Lean: Translating Natural Language into Lean 4 through Multi-Aspect Reinforcement Learning
EMNLP 2025
Prior Prompt Engineering for Reinforcement Fine-Tuning
EMNLP 2025
SPPD: Self-training with Process Preference Learning Using Dynamic Value Margin
EMNLP 2025
Aligning Dialogue Agents with Global Feedback via Large Language Model Multimodal Reward Decomposition
EMNLP 2025
WebEvolver: Enhancing Web Agent Self-Improvement with Co-evolving World Model
EMNLP 2025
VLP: Vision-Language Preference Learning for Embodied Manipulation
EMNLP 2025
Highly Imperceptible Black-Box Graph Injection Attacks with Reinforcement Learning
AAAI 2025