reinforcement learning

4122 papers

Also known as

RLVR HARL GRPO RL PPO REINFORCE RFT DRL LQR RLHF

Co-occurring keywords

large language model (12755), policy learning (699), markov decision process (788), policy gradient (518), policy optimization (630), deep reinforcement learning (903), multi-agent system (1743), imitation learning (741), regret bound (1918), language model (4573)

Papers

Prediction Improves Simultaneous Neural Machine Translation EMNLP 2018

Decoupling Strategy and Generation in Negotiation Dialogues EMNLP 2018

Logician and Orator: Learning from the Duality between Language and Knowledge in Open Domain EMNLP 2018

Automatic Essay Scoring Incorporating Rating Schema via Reinforcement Learning EMNLP 2018

Improving Reinforcement Learning Based Image Captioning with Natural Language Prior EMNLP 2018

Adaptive Multi-pass Decoder for Neural Machine Translation EMNLP 2018

Model-Free Trajectory-based Policy Optimization with Monotonic Improvement JMLR 2018

Towards Sample Efficient Reinforcement Learning IJCAI 2018

Improving Reinforcement Learning with Human Input IJCAI 2018

Scalable Initial State Interdiction for Factored MDPs IJCAI 2018

Learning to Infer Final Plans in Human Team Planning IJCAI 2018

Reinforcing Coherence for Sequence to Sequence Model in Dialogue Generation IJCAI 2018

Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents IJCAI 2018

A Weakly Supervised Method for Topic Segmentation and Labeling in Goal-oriented Dialogues via Reinforcement Learning IJCAI 2018

Learning to Design Games: Strategic Environments in Reinforcement Learning IJCAI 2018

Learning Environmental Calibration Actions for Policy Self-Evolution IJCAI 2018

Master-Slave Curriculum Design for Reinforcement Learning IJCAI 2018

Knowledge-Guided Agent-Tactic-Aware Learning for StarCraft Micromanagement IJCAI 2018

Multi-Level Policy and Reward Reinforcement Learning for Image Captioning IJCAI 2018

Model-free, Model-based, and General Intelligence IJCAI 2018

Ray: A Distributed Framework for Emerging AI Applications OSDI 2018

Watching a Small Portion could be as Good as Watching All: Towards Efficient Video Classification IJCAI 2018

Can Neural Machine Translation be Improved with User Feedback? NAACL 2018

Object Ordering with Bidirectional Matchings for Visual Reasoning NAACL 2018

Reinforced Co-Training NAACL 2018