conftrace_

Alekh Agarwal

85 papers · 2007–2025 · 12 conferences · across top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (26) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🐣 Hot Topic Early Bird 🌟 Keyword Trendsetter Combo (3) 🏠 Conference Loyalist (26) 🏆 Keyword Champion (2) 🔬 Deep Specialist (10) 🤝 Dynamic Duo (22) 🏆 Grand Slam 👑 Triple Crown 👥 Mega-Team (20) 📈 Trend Setter ⚡ Prolific Year (10) ❓ The Questioner 🗃️ Keyword Collector (111) 🚀 Conference Pioneer 💎 Century Club (85) 🔥 Unstoppable (17)

Conferences

NIPS (26) ICML (25) COLT (17) JMLR (6) ICLR (3) AISTATS (2) AAAI (1) ACL (1) ALT (1) EMNLP (1) NAACL (1) UAI (1)

Top co-authors

John Langford (22) Akshay Krishnamurthy (16) Nan Jiang (10) Miroslav Dudík (8) Haipeng Luo (7) Tong Zhang (7) Wen Sun (6) Robert E. Schapire (6) Martin J. Wainwright (5) Hal Daume III (5)

Keywords

regret bound (16) contextual bandit (13) online learning (11) reinforcement learning (8) sample complexity (8) representation learning (7) function approximation (7) global convergence (6) convex optimization (6) cost-sensitive classification (5) multi-armed bandit (5) policy optimization (5) active learning (4) stochastic optimization (4) multi-class classification (4) supervised learning (3) learning theory (3) importance sampling (3) policy gradient (3) model-based reinforcement learning (3)

Papers

Design Considerations in Offline Preference-based RL · ICML 2025
Theoretical guarantees on the best-of-n alignment policy · ICML 2025
Catoni Contextual Bandits are Robust to Heavy-tailed Rewards · ICML 2025
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning · ICLR 2025
Optimizing Pre-Training Data Mixtures with Mixtures of Data Expert Models · ACL 2025
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates · NIPS 2024
Conditional Language Policy: A General Framework For Steerable Multi-Objective Finetuning · EMNLP 2024
Model-Free Representation Learning and Exploration in Low-Rank MDPs · JMLR 2024
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning · ICML 2024
Efficient End-to-End Visual Document Understanding with Rationale Distillation · NAACL 2024
A Mechanism for Sample-Efficient In-Context Learning for Sparse Retrieval Tasks · ALT 2024
The Non-linear $F$-Design and Applications to Interactive Learning · ICML 2024
A Minimaximalist Approach to Reinforcement Learning from Human Feedback · ICML 2024
Learning in POMDPs is Sample-Efficient with Hindsight Observability · ICML 2023
VO$Q$L: Towards Optimal Regret in Model-free RL with Nonlinear Function Approximation · COLT 2023
Ordering-based Conditions for Global Convergence of Policy Gradient Methods · NIPS 2023
Stochastic Gradient Succeeds for Bandits · ICML 2023
Provable Benefits of Representational Transfer in Reinforcement Learning · COLT 2023
Non-Linear Reinforcement Learning in Large Action Spaces: Structural Conditions and Sample-efficiency of Posterior Sampling · COLT 2022
On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL · NIPS 2022
Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity · NIPS 2022
Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning approach · ICML 2022
Minimax Regret Optimization for Robust Machine Learning under Distribution Shift · COLT 2022
Provably Filtering Exogenous Distractors using Multistep Inverse Dynamics · ICLR 2022
Adversarially Trained Actor Critic for Offline Reinforcement Learning · ICML 2022
On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift · JMLR 2021
Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation · COLT 2021
Towards a Dimension-Free Understanding of Adaptive Linear Control · COLT 2021
Bellman-consistent Pessimism for Offline Reinforcement Learning · NIPS 2021
Provably Correct Optimization and Exploration with Non-linear Policies · ICML 2021
A Contextual Bandit Bake-off · JMLR 2021
Safe Reinforcement Learning via Curriculum Induction · NIPS 2020
Metareasoning in Modular Software Systems: On-the-Fly Configuration Using Reinforcement Learning with Rich Contextual Representations · AAAI 2020
Provably Good Batch Off-Policy Reinforcement Learning Without Great Exploration · NIPS 2020
Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes · COLT 2020
Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal · COLT 2020
Taking a hint: How to leverage loss predictors in contextual bandits? · COLT 2020
Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds · ICLR 2020
PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning · NIPS 2020
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs · NIPS 2020
Policy Improvement via Imitation of Multiple Oracles · NIPS 2020
Provably efficient RL with Rich Observations via Latent State Decoding · ICML 2019
Fair Regression: Quantitative Definitions and Reduction-Based Algorithms · ICML 2019
Active Learning for Cost-Sensitive Classification · JMLR 2019
Off-Policy Policy Gradient with Stationary Distribution Correction · UAI 2019
Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting · NIPS 2019
Model-based RL in Contextual Decision Processes: PAC bounds and Exponential Improvements over Model-free Approaches · COLT 2019
Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback · ICML 2019
On Oracle-Efficient PAC RL with Rich Observations · NIPS 2018
Efficient Contextual Bandits in Non-stationary Worlds · COLT 2018
Practical Contextual Bandits with Regression Oracles · ICML 2018
A Reductions Approach to Fair Classification · ICML 2018
Open Problem: The Dependence of Sample Complexity Lower Bounds on Planning Horizon · COLT 2018
Hierarchical Imitation and Reinforcement Learning · ICML 2018
Off-policy evaluation for slate recommendation · NIPS 2017
Active Learning for Cost-Sensitive Classification · ICML 2017
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits · ICML 2017
Open Problem: First-Order Regret Bounds for Contextual Bandits · COLT 2017
Corralling a Band of Bandit Algorithms · COLT 2017
Contextual Decision Processes with low Bellman rank are PAC-Learnable · ICML 2017
Efficient Second Order Online Learning by Sketching · NIPS 2016
PAC Reinforcement Learning with Rich Observations · NIPS 2016
Contextual semibandits via supervised learning oracles · NIPS 2016
Learning to Search Better than Your Teacher · ICML 2015
A Lower Bound for the Optimization of Finite Sums · ICML 2015
Fast Convergence of Regularized Learning in Games · NIPS 2015
Efficient and Parsimonious Agnostic Active Learning · NIPS 2015
Least Squares Revisited: Scalable Approaches for Multi-class Prediction · ICML 2014
Robust Multi-objective Learning with Mentor Feedback · COLT 2014
Learning Sparsely Used Overcomplete Dictionaries · COLT 2014
Scalable Non-linear Learning with Adaptive Polynomial Expansions · NIPS 2014
A Reliable Effective Terascale Linear Learning System · JMLR 2014
Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits · ICML 2014
Selective sampling algorithms for cost-sensitive multiclass prediction · ICML 2013
Stochastic optimization and sparse statistical recovery: Optimal algorithms for high dimensions · NIPS 2012
Contextual Bandit Learning with Predictable Rewards · AISTATS 2012
Distributed Delayed Stochastic Optimization · NIPS 2011
Stochastic convex optimization with bandit feedback · NIPS 2011
Oracle inequalities for computationally budgeted model selection · COLT 2011
Optimal Allocation Strategies for the Dark Pool Problem · AISTATS 2010
Fast global convergence rates of gradient methods for high-dimensional statistical recovery · NIPS 2010
Distributed Dual Averaging In Networks · NIPS 2010
Message-passing for Graph-structured Linear Programs: Proximal Methods and Rounding Schemes · JMLR 2010
Information-theoretic lower bounds on the oracle complexity of convex optimization · NIPS 2009
An Analysis of Inference with the Universum · NIPS 2007