Doina Precup
114 papers · 2008–2026 · across 16 top CS/AI conferences
Achievements
Taxonomy Completionist (39)
Keyword Pioneer
Interdisciplinary Bridge
Renaissance Researcher (10)
Hot Topic Early Bird
Keyword Trendsetter Combo (4)
Conference Loyalist (40)
Keyword Champion (6)
Deep Specialist (19)
Dynamic Duo (10)
Grand Slam
Triple Crown
Topic Pioneer
The Questioner (3)
Prolific Year (15)
Trend Setter
Keyword Collector (161)
Conference Pioneer
Century Club (113)
Unstoppable (18)
Conferences
NIPS (40)
ICML (23)
AAAI (11)
AISTATS (11)
ICLR (9)
IJCAI (6)
JMLR (3)
EMNLP (2)
UAI (2)
ACL (1)
ACML (1)
CORL (1)
CVPR (1)
INTERSPEECH (1)
MIDL (1)
NAACL (1)
Keywords
reinforcement learning (36)
value function (16)
markov decision process (14)
temporal difference learning (9)
representation learning (8)
function approximation (8)
transfer learning (6)
temporal abstraction (6)
continuous control (6)
value function approximation (6)
deep reinforcement learning (5)
hierarchical reinforcement learning (5)
policy evaluation (5)
partial observability (5)
bellman error (5)
neural network (5)
policy gradient (4)
sequential decision making (4)
continual learning (4)
policy optimization (4)
Papers
Bootstrapping Personalized Insulin Therapy via Model-Based Reinforcement Learning: An In Silico Study
AAAI 2026
Training Language Models to Self-Correct via Reinforcement Learning
ICLR 2025
Selective Unlearning via Representation Erasure Using Domain Adversarial Training
ICLR 2025
MaestroMotif: Skill Design from Artificial Intelligence Feedback
ICLR 2025
Rejecting Hallucinated State Targets during Planning
ICML 2025
Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning
ICLR 2025
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo
ICLR 2024
Finding Increasingly Large Extremal Graphs with AlphaZero and Tabu Search
IJCAI 2024
Code as Reward: Empowering Reinforcement Learning with VLMs
ICML 2024
Mixtures of Experts Unlock Parameter Scaling for Deep RL
ICML 2024
Nash Learning from Human Feedback
ICML 2024
On the Privacy of Selection Mechanisms with Gaussian Noise
AISTATS 2024
Discrete Probabilistic Inference as Control in Multi-path Environments
UAI 2024
On learning history-based policies for controlling Markov decision processes
AISTATS 2024
On the Limits of Multi-modal Meta-Learning with Auxiliary Task Modulation Using Conditional Batch Normalization
NAACL 2024
Policy Gradient Methods in the Presence of Symmetries and State Abstractions
JMLR 2024
Efficient Reinforcement Learning by Discovering Neural Pathways
NIPS 2024
Learning Successor Features the Simple Way
NIPS 2024
Adaptive Exploration for Data-Efficient General Value Function Evaluations
NIPS 2024
Offline Multitask Representation Learning for Reinforcement Learning
NIPS 2024
QGFN: Controllable Greediness with Action Values
NIPS 2024
Parseval Regularization for Continual Reinforcement Learning
NIPS 2024
Conditions on Preference Relations that Guarantee the Existence of Optimal Policies
AISTATS 2024
Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning
ICLR 2024
ReactZyme: A Benchmark for Enzyme-Reaction Prediction
NIPS 2024
Towards Safe Mechanical Ventilation Treatment Using Deep Offline Reinforcement Learning
AAAI 2023
On the Challenges of Using Reinforcement Learning in Precision Drug Dosing: Delay and Prolongedness of Action Effects
AAAI 2023
For SALE: State-Action Representation Learning for Deep Reinforcement Learning
NIPS 2023
Prediction and Control in Continual Reinforcement Learning
NIPS 2023
When Do Graph Neural Networks Help with Node Classification? Investigating the Homophily Principle on Node Distinguishability
NIPS 2023
A Definition of Continual Reinforcement Learning
NIPS 2023
Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation
AISTATS 2023
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets
ICML 2023
Temporal Abstraction in Reinforcement Learning with the Successor Representation
JMLR 2023
Continuous MDP Homomorphisms and Homomorphic Policy Gradient
NIPS 2022
COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation
ICLR 2022
Policy Gradients Incorporating the Future
ICLR 2022
Constructing a Good Behavior Basis for Transfer using Generalized Policy Updates
ICLR 2022
Improving Robustness against Real-World and Worst-Case Distribution Shifts through Decision Region Quantification
ICML 2022
Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error
ICML 2022
Towards painless policy optimization for constrained MDPs
UAI 2022
Revisiting Heterophily For Graph Neural Networks
NIPS 2022
Proving Theorems using Incremental Learning and Hindsight Experience Replay
ICML 2022
On the Expressivity of Markov Reward (Extended Abstract)
IJCAI 2022
A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation
ICML 2021
Randomized Exploration in Reinforcement Learning with General Value Function Approximation
ICML 2021
Gradient Starvation: A Learning Proclivity in Neural Networks
NIPS 2021
Variance Penalized On-Policy and Off-Policy Actor-Critic
AAAI 2021
Self-Supervised Attention-Aware Reinforcement Learning
AAAI 2021
Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation
NIPS 2021
On the Expressivity of Markov Reward
NIPS 2021
Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards
ICML 2021
Flexible Option Learning
NIPS 2021
Temporally Abstract Partial Models
NIPS 2021
A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning
NIPS 2021
Preferential Temporal Difference Learning
ICML 2021
What can I do here? A Theory of Affordances in Reinforcement Learning
ICML 2020
Forethought and Hindsight in Credit Assignment
NIPS 2020
On Efficiency in Hierarchical Reinforcement Learning
NIPS 2020
Value-driven Hindsight Modelling
NIPS 2020
Reward Propagation Using Graph Convolutional Networks
NIPS 2020
An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay
NIPS 2020
Algorithmic Improvements for Deep Reinforcement Learning Applied to Interactive Fiction
AAAI 2020
Options of Interest: Temporal Abstraction with Interest Functions
AAAI 2020
Gifting in Multi-Agent Reinforcement Learning (Student Abstract)
AAAI 2020
Value Preserving State-Action Abstractions
AISTATS 2020
Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning
AISTATS 2020
A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms
AISTATS 2020
Interference and Generalization in Temporal Difference Learning
ICML 2020
Invariant Causal Prediction for Block MDPs
ICML 2020
SVRG for Policy Evaluation with Fewer Gradient Evaluations
IJCAI 2020
Leveraging Observations in Bandits: Between Risks and Benefits
AAAI 2019
Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments
CORL 2019
Hindsight Credit Assignment
NIPS 2019
The Option Keyboard: Combining Skills in Reinforcement Learning
NIPS 2019
Break the Ceiling: Stronger Multi-scale Deep Graph Convolutional Networks
NIPS 2019
The Termination Critic
AISTATS 2019
Learning Options with Interest Functions
AAAI 2019
Combined Reinforcement Learning via Abstract Representations
AAAI 2019
Prediction of Disease Progression in Multiple Sclerosis Patients using Deep Learning Analysis of MRI Data
MIDL 2019
Neural Transfer Learning for Cry-Based Diagnosis of Perinatal Asphyxia
INTERSPEECH 2019
Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning
AISTATS 2019
Off-Policy Deep Reinforcement Learning without Exploration
ICML 2019
Per-Decision Option Discounting
ICML 2019
Nonlinear Weighted Finite Automata
AISTATS 2018
Learning Safe Policies with Expert Guidance
NIPS 2018
Convergent Tree Backup and Retrace with Function Approximation
ICML 2018
Temporal Regularization for Markov Decision Process
NIPS 2018
World Knowledge for Reading Comprehension: Rare Entity Prediction with Hierarchical LSTMs Using External Descriptions
EMNLP 2017
Approximate Value Iteration with Temporally Extended Actions (Extended Abstract)
IJCAI 2017
Leveraging Lexical Resources for Learning Entity Embeddings in Multi-Relational Data
ACL 2016
Differentially Private Policy Evaluation
ICML 2016
Learning Multi-Step Predictive State Representations
IJCAI 2016
Practical Kernel-Based Reinforcement Learning
JMLR 2016
Verb Phrase Ellipsis Resolution Using Discriminative and Margin-Infused Algorithms
EMNLP 2016
An Expectation-Maximization Algorithm to Compute a Stochastic Factorization From Data
IJCAI 2015
Variational Generative Stochastic Networks with Collaborative Shaping
ICML 2015
Basis refinement strategies for linear value function approximation in MDPs
NIPS 2015
Data Generation as Sequential Decision Making
NIPS 2015
Optimizing Energy Production Using Policy Search and Predictive State Representations
NIPS 2014
Sample-based approximate regularization
ICML 2014
A new Q(lambda) with interim forward view and Monte Carlo equivalence
ICML 2014
Learning with Pseudo-Ensembles
NIPS 2014
Iterative Multilevel MRF Leveraging Context and Voxel Information for Brain Tumour Segmentation in MRI
CVPR 2014
Learning from Limited Demonstrations
NIPS 2013
Average Reward Optimization Objective In Partially Observable Domains
ICML 2013
Bellman Error Based Feature Generation using Random Projections on Sparse Spaces
NIPS 2013
Value Pursuit Iteration
NIPS 2012
On Average Reward Policy Evaluation in Infinite-State Partially Observable Systems
AISTATS 2012
On-line Reinforcement Learning Using Incremental Kernel-Based Stochastic Factorization
NIPS 2012
Reinforcement Learning using Kernel-Based Stochastic Factorization
NIPS 2011
A Study of Approximate Inference in Probabilistic Relational Models
ACML 2010
Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation
NIPS 2009
Bounding Performance Loss in Approximate MDP Homomorphisms
NIPS 2008