Philip S. Thomas
25 papers · 2011–2024 · 6 conferences · across top CS/AI conferences
Achievements
Hot Topic Early Bird
Interdisciplinary Bridge
Keyword Pioneer
Taxonomy Completionist (12)
Conference Polyglot (6)
Keyword Champion (2)
Grand Slam
Keyword Collector (98)
Conference Pioneer
Trend Setter
Century Club (25)
Unstoppable (8)
Prolific Year (5)
Conferences
NIPS (16)
AAAI (3)
ICML (2)
IJCAI (2)
ICLR (1)
JMLR (1)
Keywords
off-policy evaluation (9)
reinforcement learning (6)
variance estimation (3)
policy evaluation (3)
policy search (3)
markov decision process (3)
safe policy improvement (2)
temporal difference learning (2)
behavior policy (2)
offline reinforcement learning (2)
mean squared error (2)
safe reinforcement learning (2)
importance sampling (2)
counterfactual reasoning (2)
sequential decision making (2)
policy gradient (2)
temporal abstraction (2)
natural gradient (2)
transfer learning (1)
policy optimization (1)
Papers
Abstract Reward Processes: Leveraging State Abstraction for Consistent Off-Policy Evaluation
NIPS 2024
Position: Benchmarking is Limited in Reinforcement Learning Research
ICML 2024
Data-Efficient Policy Evaluation Through Behavior Policy Search
JMLR 2024
From Past to Future: Rethinking Eligibility Traces
AAAI 2024
Behavior Alignment via Reward Function Optimization
NIPS 2023
Fairness Guarantees under Demographic Shift
ICLR 2022
Off-Policy Evaluation for Action-Dependent Non-stationary Environments
NIPS 2022
Structural Credit Assignment in Neural Networks using Reinforcement Learning
NIPS 2021
Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs
NIPS 2021
SOPE: Spectrum of Off-Policy Estimators
NIPS 2021
Universal Off-Policy Evaluation
NIPS 2021
High-Confidence Off-Policy (or Counterfactual) Variance Estimation
AAAI 2021
Towards Safe Policy Improvement for Non-Stationary MDPs
NIPS 2020
Security Analysis of Safe and Seldonian Reinforcement Learning Algorithms
NIPS 2020
Offline Contextual Bandits with High Probability Fairness Guarantees
NIPS 2019
A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning
NIPS 2019
Natural Option Critic
AAAI 2019
Importance Sampling for Fair Policy Selection
IJCAI 2018
Using Options and Covariance Testing for Long Horizon Off-Policy Policy Evaluation
NIPS 2017
Data-Efficient Policy Evaluation Through Behavior Policy Search
ICML 2017
Personalized Ad Recommendation Systems for Life-Time Value Optimization with Guarantees
IJCAI 2015
Policy Evaluation Using the Ω-Return
NIPS 2015
Projected Natural Actor-Critic
NIPS 2013
TD_gamma: Re-evaluating Complex Backups in Temporal Difference Learning
NIPS 2011
Policy Gradient Coagent Networks
NIPS 2011