Research Explorer
Papers
Trends
Conferences
Explore
Authors
Topics
Keywords
Achievements
About
Methodology
policy optimization
630 papers
Also known as
GRPO
POLO
MAPO
PO
PPO
Co-occurring keywords
reinforcement learning (4122)
markov decision process (788)
offline reinforcement learning (492)
deep reinforcement learning (903)
model-based reinforcement learning (415)
large language model (12755)
safe reinforcement learning (119)
policy learning (699)
value function (294)
regret bound (1918)
Papers
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
NIPS 2023
How to Fine-tune the Model: Unified Model Shift and Model Bias Policy Optimization
NIPS 2023
Multi-Modal Inverse Constrained Reinforcement Learning from a Mixture of Demonstrations
NIPS 2023
Learning Adversarial Low-rank Markov Decision Processes with Unknown Transition and Full-information Feedback
NIPS 2023
Anytime-Competitive Reinforcement Learning with Policy Prior
NIPS 2023
Provable Safe Reinforcement Learning with Binary Feedback
AISTATS 2023
Bi-Level Offline Policy Optimization with Limited Exploration
NIPS 2023
Learning from Active Human Involvement through Proxy Value Propagation
NIPS 2023
Counterfactual Learning with General Data-Generating Policies
AAAI 2023
Reinforcement Learning with Stepwise Fairness Constraints
AISTATS 2023
Active Exploration via Experiment Design in Markov Chains
AISTATS 2023
A Reinforcement Learning Look at Risk-Sensitive Linear Quadratic Gaussian Control
L4DC 2023
Heuristic Search for Multi-Objective Probabilistic Planning
AAAI 2023
Tempo Adaptation in Non-stationary Reinforcement Learning
NIPS 2023
Safe Exploration in Reinforcement Learning: A Generalized Formulation and Algorithms
NIPS 2023
Towards Robust and Safe Reinforcement Learning with Benign Off-policy Data
ICML 2023
Boosted Off-Policy Learning
AISTATS 2023
DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm
ICML 2023
Levin Tree Search with Context Models
IJCAI 2023
PAC-Bayesian Offline Contextual Bandits With Guarantees
ICML 2023
Policy Mirror Ascent for Efficient and Independent Learning in Mean Field Games
ICML 2023
Adversarial Learning of Distributional Reinforcement Learning
ICML 2023
Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards
ICML 2023
The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms
ICML 2023
Distributional Multi-Objective Decision Making
IJCAI 2023