Masatoshi Uehara
34 papers · 2019–2025 · 6 conferences · across top CS/AI conferences
Achievements
Interdisciplinary Bridge
Keyword Pioneer
Cross-Pollinator (9)
Taxonomy Completionist (32)
Conference Polyglot (6)
Academic Marathon (6)
Dynamic Duo (15)
Triple Crown
Deep Specialist (14)
Keyword Champion (2)
Trend Setter
Century Club (34)
Prolific Year (7)
Unstoppable (7)
Keyword Collector (105)
Conferences
ICML (10)
NIPS (8)
ICLR (7)
AISTATS (3)
COLT (3)
JMLR (3)
Keywords
off-policy evaluation (8)
reinforcement learning (6)
function approximation (4)
causal inference (4)
doubly robust (4)
offline reinforcement learning (3)
off-policy learning (3)
importance sampling (3)
partially observable markov decision process (3)
unnormalized model (2)
regret bound (2)
noise contrastive estimation (2)
policy learning (2)
diffusion model (2)
policy gradient (2)
score matching (2)
markov decision process (2)
covariate shift (2)
minimax optimization (2)
generative model (2)
Papers
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design
ICLR 2025
Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design
ICML 2025
Adding Conditional Control to Diffusion Models with Reinforcement Learning
ICLR 2025
Localized Debiased Machine Learning: Efficient Inference on Quantile Treatment Effects and Beyond
JMLR 2024
Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models
NIPS 2024
Functional Graphical Models: Structure Enables Offline Data-Driven Optimization
AISTATS 2024
Provable Offline Preference-Based Reinforcement Learning
ICLR 2024
Provable Reward-Agnostic Preference-Based Reinforcement Learning
ICLR 2024
Feedback Efficient Online Fine-Tuning of Diffusion Models
ICML 2024
Inference on Strongly Identified Functionals of Weakly Identified Functions
COLT 2023
Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
NIPS 2023
PAC Reinforcement Learning for Predictive State Representations
ICLR 2023
Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings
ICML 2023
Distributional Offline Policy Evaluation with Predictive Error Guarantees
ICML 2023
Minimax Instrumental Variable Regression and $L_2$ Convergence Guarantees without Identification or Closedness
COLT 2023
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage
NIPS 2023
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage
ICLR 2022
Representation Learning for Online and Offline RL in Low-rank MDPs
ICLR 2022
A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes
ICML 2022
Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems
NIPS 2022
Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach
ICML 2022
Optimal Off-Policy Evaluation from Multiple Logging Policies
ICML 2021
Fast Rates for the Regret of Offline Reinforcement Learning
COLT 2021
Information criteria for non-normalized models
JMLR 2021
Mitigating Covariate Shift in Imitation Learning via Offline Data With Partial Coverage
NIPS 2021
A Unified Statistically Efficient Estimation Framework for Unnormalized Models
AISTATS 2020
Off-Policy Evaluation and Learning for External Validity under a Covariate Shift
NIPS 2020
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes
JMLR 2020
Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies
NIPS 2020
Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation
ICML 2020
Statistically Efficient Off-Policy Policy Gradients
ICML 2020
Minimax Weight and Q-Function Learning for Off-Policy Evaluation
ICML 2020
Imputation estimators for unnormalized models with missing data
AISTATS 2020
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
NIPS 2019