Research Explorer
Papers
Trends
Conferences
Explore
Authors
Topics
Keywords
Achievements
About
Methodology
policy optimization
630 papers
Also known as
GRPO
POLO
MAPO
PO
PPO
Co-occurring keywords
reinforcement learning (4122)
markov decision process (788)
offline reinforcement learning (492)
deep reinforcement learning (903)
model-based reinforcement learning (415)
large language model (12755)
safe reinforcement learning (119)
policy learning (699)
value function (294)
regret bound (1918)
Papers
Rehabilitating Homeless: Dataset and Key Insights
AAAI 2023
Efficient Exploration via Epistemic-Risk-Seeking Policy Optimization
ICML 2023
Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints
AAAI 2023
Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning
AAAI 2023
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning
ICML 2023
User-Oriented Robust Reinforcement Learning
AAAI 2023
Tackling Safe and Efficient Multi-Agent Reinforcement Learning via Dynamic Shielding (Student Abstract)
AAAI 2023
The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms
ICML 2023
Open-Ended Diverse Solution Discovery with Regulated Behavior Patterns for Cross-Domain Adaptation
AAAI 2023
Policy Space Diversity for Non-Transitive Games
NIPS 2023
Wasserstein Gradient Flows for Optimizing Gaussian Mixture Policies
NIPS 2023
Instance-Dependent Confidence and Early Stopping for Reinforcement Learning
JMLR 2023
PiCor: Multi-Task Deep Reinforcement Learning with Policy Correction
AAAI 2023
Hybrid Policy Optimization from Imperfect Demonstrations
NIPS 2023
Adaptation Augmented Model-based Policy Optimization
JMLR 2023
Coordinate Ascent for Off-Policy RL with Global Convergence Guarantees
AISTATS 2023
Augmented Proximal Policy Optimization for Safe Reinforcement Learning
AAAI 2023
Multi-Modal Inverse Constrained Reinforcement Learning from a Mixture of Demonstrations
NIPS 2023
Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees
NIPS 2023
Efficient Diffusion Policies For Offline Reinforcement Learning
NIPS 2023
Survival Instinct in Offline Reinforcement Learning
NIPS 2023
Accelerating Exploration with Unlabeled Prior Data
NIPS 2023
Provable Safe Reinforcement Learning with Binary Feedback
AISTATS 2023
Low-Switching Policy Gradient with Exploration via Online Sensitivity Sampling
ICML 2023
Counterfactual Learning with General Data-Generating Policies
AAAI 2023