Research Explorer
Papers
Trends
Conferences
Explore
Authors
Topics
Keywords
Achievements
About
Methodology
policy optimization
630 papers
Also known as
GRPO
POLO
MAPO
PO
PPO
Co-occurring keywords
reinforcement learning (4122)
markov decision process (788)
offline reinforcement learning (492)
deep reinforcement learning (903)
model-based reinforcement learning (415)
large language model (12755)
safe reinforcement learning (119)
policy learning (699)
value function (294)
regret bound (1918)
Papers
Constrained Reinforcement Learning via Policy Splitting (ACML 2020)
Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance (AAAI 2020)
Deep Model-Based Reinforcement Learning via Estimated Uncertainty and Conservative Policy Optimization (AAAI 2020)
Empirical Likelihood for Contextual Bandits (NIPS 2020)
How to Learn a Useful Critic? Model-based Action-Gradient-Estimator Policy Optimization (NIPS 2020)
Characterizing Optimal Mixed Policies: Where to Intervene and What to Observe (NIPS 2020)
Dynamic Regret of Policy Optimization in Non-Stationary Environments (NIPS 2020)
IPO: Interior-Point Policy Optimization under Constraints (AAAI 2020)
Being Optimistic to Be Conservative: Quickly Learning a CVaR Policy (AAAI 2020)
Lifelong Learning with a Changing Action Set (AAAI 2020)
Constrained Markov Decision Processes via Backward Value Functions (ICML 2020)
Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning (ICML 2020)
Compositional Transfer in Hierarchical Reinforcement Learning (RSS 2020)
Generalized Tsallis Entropy Reinforcement Learning and Its Application to Soft Mobile Robots (RSS 2020)
Explicit Gradient Learning for Black-Box Optimization (ICML 2020)
Taylor Expansion Policy Optimization (ICML 2020)
Optimistic Policy Optimization with Bandit Feedback (ICML 2020)
A Game Theoretic Framework for Model Based Reinforcement Learning (ICML 2020)
Decisions, Counterfactual Explanations and Strategic Behavior (NIPS 2020)
Cost-Effective Incentive Allocation via Structured Counterfactual Inference (AAAI 2020)
A Markov Decision Process Model for Socio-Economic Systems Impacted by Climate Change (ICML 2020)
Efficiently Solving MDPs with Stochastic Mirror Descent (ICML 2020)
Adaptive Smoothing for Path Integral Control (JMLR 2020)
Balancing Quality and Human Involvement: An Effective Approach to Interactive Neural Machine Translation (AAAI 2020)
Separating value functions across time-scales (ICML 2019)