conftrace_

reinforcement learning

4122 papers

Also known as

RLVR HARL GRPO RL PPO REINFORCE RFT DRL NULL LQR RLHF

Co-occurring keywords

large language model (12755), policy learning (699), markov decision process (788), policy gradient (518), policy optimization (630), deep reinforcement learning (903), multi-agent system (1743), imitation learning (741), regret bound (1918), language model (4573)

Papers

Trading off Utility, Informativeness, and Complexity in Emergent Communication NIPS 2022

Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning NIPS 2022

Mask-based Latent Reconstruction for Reinforcement Learning NIPS 2022

Semantic Exploration from Language Abstractions and Pretrained Representations NIPS 2022

Evaluation beyond Task Performance: Analyzing Concepts in AlphaZero in Hex NIPS 2022

IMED-RL: Regret optimal learning of ergodic Markov decision processes NIPS 2022

Learning General World Models in a Handful of Reward-Free Deployments NIPS 2022

ProtoX: Explaining a Reinforcement Learning Agent via Prototyping NIPS 2022

Human-Robotic Prosthesis as Collaborating Agents for Symmetrical Walking NIPS 2022

QUARK: Controllable Text Generation with Reinforced Unlearning NIPS 2022

Non-Markovian Reward Modelling from Trajectory Labels via Interpretable Multiple Instance Learning NIPS 2022

Pragmatically Learning from Pedagogical Demonstrations in Multi-Goal Environments NIPS 2022

Distributed Influence-Augmented Local Simulators for Parallel MARL in Large Networked Systems NIPS 2022

Assistive Teaching of Motor Control Tasks to Humans NIPS 2022

A Few Expert Queries Suffices for Sample-Efficient RL with Resets and Linear Value Approximation NIPS 2022

Off-Policy Evaluation with Deficient Support Using Side Information NIPS 2022

Scalable Multi-agent Covering Option Discovery based on Kronecker Graphs NIPS 2022

LECO: Learnable Episodic Count for Task-Specific Intrinsic Reward NIPS 2022

First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization NIPS 2022

Provable Benefit of Multitask Representation Learning in Reinforcement Learning NIPS 2022

Modeling Human Exploration Through Resource-Rational Reinforcement Learning NIPS 2022

BYOL-Explore: Exploration by Bootstrapped Prediction NIPS 2022

Factored Adaptation for Non-Stationary Reinforcement Learning NIPS 2022

TarGF: Learning Target Gradient Field to Rearrange Objects without Explicit Goal Specification NIPS 2022

Grounding Aleatoric Uncertainty for Unsupervised Environment Design NIPS 2022