Mohammad Gheshlaghi Azar

21 papers · 2011–2025 · 6 conferences · across top CS/AI conferences

Achievements

🐣 Hot Topic Early Bird 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (12) 🌍 Conference Polyglot (6)

🤝 Dynamic Duo (14) 🗃️ Keyword Collector (80) 🚀 Conference Pioneer 📈 Trend Setter 💎 Century Club (21) 🔥 Unstoppable (9)

Conferences

ICML (7) NIPS (5) ICLR (4) AISTATS (2) JMLR (2) EMNLP (1)

Top co-authors

Rémi Munos (14) Bilal Piot (9) Michal Valko (8) Yunhao Tang (5) Mark Rowland (5) Olivier Pietquin (4) Will Dabney (4) Zhaohan Daniel Guo (4) Matthieu Geist (4) Daniele Calandriello (4)

Keywords

reinforcement learning (6) self-supervised learning (4) representation learning (3) policy optimization (2) contrastive learning (2) latent representation (2) policy iteration (2) markov decision process (2) mirror descent (2) policy gradient (2) dynamic programming (2) transfer learning (2) direct preference optimization (1) online learning (1) preference learning (1) function approximation (1) reward modeling (1) policy learning (1) game theory (1) minimax optimality (1)

Papers

Self-Improving Robust Preference Optimization · ICLR 2025
A General Theoretical Paradigm to Understand Learning from Human Preferences · AISTATS 2024
Nash Learning from Human Feedback · ICML 2024
Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion · EMNLP 2024
An Analysis of Quantile Temporal-Difference Learning · JMLR 2024
Understanding Self-Predictive Learning for Reinforcement Learning · ICML 2023
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice · ICML 2023
Large-Scale Representation Learning on Graphs via Bootstrapping · ICLR 2022
BYOL-Explore: Exploration by Bootstrapped Prediction · NIPS 2022
Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity · NIPS 2021
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning · ICML 2020
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning · NIPS 2020
Fast computation of Nash Equilibria in Imperfect Information Games · ICML 2020
Hindsight Credit Assignment · NIPS 2019
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning · ICLR 2018
Noisy Networks For Exploration · ICLR 2018
Minimax Regret Bounds for Reinforcement Learning · ICML 2017
Online Stochastic Optimization under Correlated Bandit Feedback · ICML 2014
Sequential Transfer in Multi-armed Bandit with Finite Set of Models · NIPS 2013
Dynamic Policy Programming · JMLR 2012
Dynamic Policy Programming with Function Approximation · AISTATS 2011