Alekh Agarwal
85 papers · 2007–2025 · across 12 top CS/AI conferences
Achievements
Taxonomy Completionist (26)
Hot Topic Early Bird
Renaissance Researcher (5)
Keyword Pioneer
Interdisciplinary Bridge
Keyword Trendsetter Combo (3)
Conference Loyalist (26)
Keyword Champion (2)
Deep Specialist (10)
Dynamic Duo (22)
Grand Slam
Triple Crown
Mega-Team (20)
Trend Setter
Prolific Year (10)
The Questioner
Keyword Collector (111)
Conference Pioneer
Century Club (85)
Unstoppable (17)
Conferences
NIPS (26)
ICML (25)
COLT (17)
JMLR (6)
ICLR (3)
AISTATS (2)
AAAI (1)
ACL (1)
ALT (1)
EMNLP (1)
NAACL (1)
UAI (1)
Keywords
regret bound (16)
contextual bandit (13)
online learning (11)
reinforcement learning (8)
sample complexity (8)
representation learning (7)
function approximation (7)
global convergence (6)
convex optimization (6)
cost-sensitive classification (5)
multi-armed bandit (5)
policy optimization (5)
active learning (4)
stochastic optimization (4)
multi-class classification (4)
supervised learning (3)
learning theory (3)
importance sampling (3)
policy gradient (3)
model-based reinforcement learning (3)
Papers
Design Considerations in Offline Preference-based RL
ICML 2025
Theoretical guarantees on the best-of-n alignment policy
ICML 2025
Catoni Contextual Bandits are Robust to Heavy-tailed Rewards
ICML 2025
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning
ICLR 2025
Optimizing Pre-Training Data Mixtures with Mixtures of Data Expert Models
ACL 2025
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
NIPS 2024
Conditional Language Policy: A General Framework For Steerable Multi-Objective Finetuning
EMNLP 2024
Model-Free Representation Learning and Exploration in Low-Rank MDPs
JMLR 2024
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
ICML 2024
Efficient End-to-End Visual Document Understanding with Rationale Distillation
NAACL 2024
A Mechanism for Sample-Efficient In-Context Learning for Sparse Retrieval Tasks
ALT 2024
The Non-linear $F$-Design and Applications to Interactive Learning
ICML 2024
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
ICML 2024
Learning in POMDPs is Sample-Efficient with Hindsight Observability
ICML 2023
VO$Q$L: Towards Optimal Regret in Model-free RL with Nonlinear Function Approximation
COLT 2023
Ordering-based Conditions for Global Convergence of Policy Gradient Methods
NIPS 2023
Stochastic Gradient Succeeds for Bandits
ICML 2023
Provable Benefits of Representational Transfer in Reinforcement Learning
COLT 2023
Non-Linear Reinforcement Learning in Large Action Spaces: Structural Conditions and Sample-efficiency of Posterior Sampling
COLT 2022
On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL
NIPS 2022
Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity
NIPS 2022
Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning approach
ICML 2022
Minimax Regret Optimization for Robust Machine Learning under Distribution Shift
COLT 2022
Provably Filtering Exogenous Distractors using Multistep Inverse Dynamics
ICLR 2022
Adversarially Trained Actor Critic for Offline Reinforcement Learning
ICML 2022
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift
JMLR 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation
COLT 2021
Towards a Dimension-Free Understanding of Adaptive Linear Control
COLT 2021
Bellman-consistent Pessimism for Offline Reinforcement Learning
NIPS 2021
Provably Correct Optimization and Exploration with Non-linear Policies
ICML 2021
A Contextual Bandit Bake-off
JMLR 2021
Safe Reinforcement Learning via Curriculum Induction
NIPS 2020
Metareasoning in Modular Software Systems: On-the-Fly Configuration Using Reinforcement Learning with Rich Contextual Representations
AAAI 2020
Provably Good Batch Off-Policy Reinforcement Learning Without Great Exploration
NIPS 2020
Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes
COLT 2020
Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal
COLT 2020
Taking a hint: How to leverage loss predictors in contextual bandits?
COLT 2020
Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds
ICLR 2020
PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning
NIPS 2020
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
NIPS 2020
Policy Improvement via Imitation of Multiple Oracles
NIPS 2020
Provably efficient RL with Rich Observations via Latent State Decoding
ICML 2019
Fair Regression: Quantitative Definitions and Reduction-Based Algorithms
ICML 2019
Active Learning for Cost-Sensitive Classification
JMLR 2019
Off-Policy Policy Gradient with Stationary Distribution Correction
UAI 2019
Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting
NIPS 2019
Model-based RL in Contextual Decision Processes: PAC bounds and Exponential Improvements over Model-free Approaches
COLT 2019
Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback
ICML 2019
On Oracle-Efficient PAC RL with Rich Observations
NIPS 2018
Efficient Contextual Bandits in Non-stationary Worlds
COLT 2018
Practical Contextual Bandits with Regression Oracles
ICML 2018
A Reductions Approach to Fair Classification
ICML 2018
Open Problem: The Dependence of Sample Complexity Lower Bounds on Planning Horizon
COLT 2018
Hierarchical Imitation and Reinforcement Learning
ICML 2018
Off-policy evaluation for slate recommendation
NIPS 2017
Active Learning for Cost-Sensitive Classification
ICML 2017
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
ICML 2017
Open Problem: First-Order Regret Bounds for Contextual Bandits
COLT 2017
Corralling a Band of Bandit Algorithms
COLT 2017
Contextual Decision Processes with low Bellman rank are PAC-Learnable
ICML 2017
Efficient Second Order Online Learning by Sketching
NIPS 2016
PAC Reinforcement Learning with Rich Observations
NIPS 2016
Contextual semibandits via supervised learning oracles
NIPS 2016
Learning to Search Better than Your Teacher
ICML 2015
A Lower Bound for the Optimization of Finite Sums
ICML 2015
Fast Convergence of Regularized Learning in Games
NIPS 2015
Efficient and Parsimonious Agnostic Active Learning
NIPS 2015
Least Squares Revisited: Scalable Approaches for Multi-class Prediction
ICML 2014
Robust Multi-objective Learning with Mentor Feedback
COLT 2014
Learning Sparsely Used Overcomplete Dictionaries
COLT 2014
Scalable Non-linear Learning with Adaptive Polynomial Expansions
NIPS 2014
A Reliable Effective Terascale Linear Learning System
JMLR 2014
Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits
ICML 2014
Selective sampling algorithms for cost-sensitive multiclass prediction
ICML 2013
Stochastic optimization and sparse statistical recovery: Optimal algorithms for high dimensions
NIPS 2012
Contextual Bandit Learning with Predictable Rewards
AISTATS 2012
Distributed Delayed Stochastic Optimization
NIPS 2011
Stochastic convex optimization with bandit feedback
NIPS 2011
Oracle inequalities for computationally budgeted model selection
COLT 2011
Optimal Allocation Strategies for the Dark Pool Problem
AISTATS 2010
Fast global convergence rates of gradient methods for high-dimensional statistical recovery
NIPS 2010
Distributed Dual Averaging In Networks
NIPS 2010
Message-passing for Graph-structured Linear Programs: Proximal Methods and Rounding Schemes
JMLR 2010
Information-theoretic lower bounds on the oracle complexity of convex optimization
NIPS 2009
An Analysis of Inference with the Universum
NIPS 2007