A. Rupam Mahmood
14 papers · 2014–2024 · 7 conferences across top CS/AI venues
Achievements
Academic Marathon (10)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (7)
Cross-Pollinator (14)
Taxonomy Completionist (14)
Triple Crown
Topic Pioneer
Prolific Year (5)
Trend Setter
Century Club (14)
Conferences
JMLR (4)
ICML (3)
ICLR (2)
NIPS (2)
CORL (1)
IJCAI (1)
UAI (1)
Keywords
reinforcement learning (5)
off-policy learning (4)
value function (3)
function approximation (3)
temporal-difference learning (3)
temporal difference learning (2)
policy gradient (2)
continuous control (2)
eligibility trace (2)
robotic manipulation (1)
importance sampling (1)
Markov decision process (1)
discount factor (1)
Bellman equation (1)
policy optimization (1)
sample efficiency (1)
incremental learning (1)
KL divergence (1)
variance reduction (1)
online learning (1)
Papers
Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers
NIPS 2024
Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation
ICML 2024
Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning
ICML 2024
Addressing Loss of Plasticity and Catastrophic Forgetting in Continual Learning
ICLR 2024
Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo
ICLR 2024
Loosely consistent emphatic temporal-difference learning
UAI 2023
Correcting discount-factor mismatch in on-policy policy gradient methods
ICML 2023
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
JMLR 2022
Autoregressive Policies for Continuous Control Deep Reinforcement Learning
IJCAI 2019
On Generalized Bellman Equations and Temporal-Difference Learning
JMLR 2018
Benchmarking Reinforcement Learning Algorithms on Real-World Robots
CORL 2018
True Online Temporal-Difference Learning
JMLR 2016
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
JMLR 2016
Weighted importance sampling for off-policy learning with linear function approximation
NIPS 2014