Mohammad Gheshlaghi Azar
21 papers · 2011–2025 · 6 conferences · across top CS/AI conferences
Achievements
Hot Topic Early Bird
Interdisciplinary Bridge
Keyword Pioneer
Taxonomy Completionist (12)
Conference Polyglot (6)
Dynamic Duo (14)
Keyword Collector (80)
Conference Pioneer
Trend Setter
Century Club (21)
Unstoppable (9)
Conferences
ICML (7)
NIPS (5)
ICLR (4)
AISTATS (2)
JMLR (2)
EMNLP (1)
Keywords
reinforcement learning (6)
self-supervised learning (4)
representation learning (3)
policy optimization (2)
contrastive learning (2)
latent representation (2)
policy iteration (2)
markov decision process (2)
mirror descent (2)
policy gradient (2)
dynamic programming (2)
transfer learning (2)
direct preference optimization (1)
online learning (1)
preference learning (1)
function approximation (1)
reward modeling (1)
policy learning (1)
game theory (1)
minimax optimality (1)
Papers
Self-Improving Robust Preference Optimization
ICLR 2025
A General Theoretical Paradigm to Understand Learning from Human Preferences
AISTATS 2024
Nash Learning from Human Feedback
ICML 2024
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion
EMNLP 2024
An Analysis of Quantile Temporal-Difference Learning
JMLR 2024
Understanding Self-Predictive Learning for Reinforcement Learning
ICML 2023
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice
ICML 2023
Large-Scale Representation Learning on Graphs via Bootstrapping
ICLR 2022
BYOL-Explore: Exploration by Bootstrapped Prediction
NIPS 2022
Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity
NIPS 2021
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning
ICML 2020
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning
NIPS 2020
Fast computation of Nash Equilibria in Imperfect Information Games
ICML 2020
Hindsight Credit Assignment
NIPS 2019
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
ICLR 2018
Noisy Networks For Exploration
ICLR 2018
Minimax Regret Bounds for Reinforcement Learning
ICML 2017
Online Stochastic Optimization under Correlated Bandit Feedback
ICML 2014
Sequential Transfer in Multi-armed Bandit with Finite Set of Models
NIPS 2013
Dynamic Policy Programming
JMLR 2012
Dynamic Policy Programming with Function Approximation
AISTATS 2011