reinforcement learning

4122 papers

Also known as

RLVR, HARL, GRPO, RL, PPO, REINFORCE, RFT, DRL, LQR, RLHF

Co-occurring keywords

large language model (12755), policy learning (699), markov decision process (788), policy gradient (518), policy optimization (630), deep reinforcement learning (903), multi-agent system (1743), imitation learning (741), regret bound (1918), language model (4573)

Papers

Generation of Policy-Level Explanations for Reinforcement Learning AAAI 2019

Efficiently Combining Human Demonstrations and Interventions for Safe Training of Autonomous Systems in Real-Time AAAI 2019

Unsupervised Controllable Text Formalization AAAI 2019

ShrinkML: End-to-End ASR Model Compression Using Reinforcement Learning INTERSPEECH 2019

Reinforcement Learning for Improved Low Resource Dialogue Generation AAAI 2019

Playing by the Book: An Interactive Game Approach for Action Graph Extraction from Text NAACL 2019

Task Agnostic Meta-Learning for Few-Shot Learning CVPR 2019

Meta-Learning Convolutional Neural Architectures for Multi-Target Concrete Defect Classification With the COncrete DEfect BRidge IMage Dataset CVPR 2019

Seeded self-play for language learning EMNLP 2019

Learning to request guidance in emergent language EMNLP 2019

Reinforcement-based denoising of distantly supervised NER with partial annotation EMNLP 2019

Answer-Supervised Question Reformulation for Enhancing Conversational Machine Comprehension EMNLP 2019

Transfer in Deep Reinforcement Learning Using Knowledge Graphs EMNLP 2019

Generalization in Generation: A closer look at Exposure Bias EMNLP 2019

Provably efficient RL with Rich Observations via Latent State Decoding ICML 2019

Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells CVPR 2019

When to use parametric models in reinforcement learning? NIPS 2019

Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning NIPS 2019

A Theory of State Abstraction for Reinforcement Learning AAAI 2019

Finite-Sample Analysis for SARSA with Linear Function Approximation NIPS 2019

Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function NIPS 2019

Regret Bounds for Learning State Representations in Reinforcement Learning NIPS 2019

Importance Resampling for Off-policy Prediction NIPS 2019

MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies NIPS 2019

On the Utility of Learning about Humans for Human-AI Coordination NIPS 2019