Reinforcement Learning

1263 directly classified papers

Papers

DiffChat: Learning to Chat with Text-to-Image Synthesis Models for Interactive Image Creation ACL 2024

ALaRM: Align Language Models via Hierarchical Rewards Modeling ACL 2024

Just Ask One More Time! Self-Agreement Improves Reasoning of Language Models in (Almost) All Scenarios ACL 2024

RePALM: Popular Quote Tweet Generation via Auto-Response Augmentation ACL 2024

Applying RLAIF for Code Generation with API-usage in Lightweight LLMs ACL 2024

Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning ACL 2024

Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning Processes ACL 2024

Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents ACL 2024

ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences ACL 2024

Training Language Models to Generate Text with Citations via Fine-grained Rewards ACL 2024

M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions ACL 2024

Unveiling the Art of Heading Design: A Harmonious Blend of Summarization, Neology, and Algorithm ACL 2024

PACE: Improving Prompt with Actor-Critic Editing for Large Language Model ACL 2024

Reinforcement Learning-Driven LLM Agent for Automated Attacks on LLMs ACL 2024

Enhancing Reinforcement Learning with Dense Rewards from Language Model Critic EMNLP 2024

PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference NIPS 2024

Learning Formal Mathematics From Intrinsic Motivation NIPS 2024

Controllable Citation Sentence Generation with Language Models ACL 2024

Carbon Footprint Reduction for Sustainable Data Centers in Real-Time AAAI 2024

Reward Certification for Policy Smoothed Reinforcement Learning AAAI 2024

Two-Stage Evolutionary Reinforcement Learning for Enhancing Exploration and Exploitation AAAI 2024

Controlled maximal variability along with reliable performance in recurrent neural networks NIPS 2024

Robust Communicative Multi-Agent Reinforcement Learning with Active Defense AAAI 2024

Learn to Follow: Decentralized Lifelong Multi-Agent Pathfinding via Planning and Learning AAAI 2024

PMAC: Personalized Multi-Agent Communication AAAI 2024