Mark Schmidt

37 papers · 2008–2024 · 9 top CS/AI conferences

Achievements

🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (18) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (9)

🐝 Cross-Pollinator (15) 🌈 Renaissance Researcher (6) 🌟 Keyword Trendsetter Combo (5) 👑 Triple Crown 🔬 Deep Specialist (10) 🏆 Keyword Champion 🗃️ Keyword Collector (62) ⚡ Prolific Year (5) 🚀 Conference Pioneer 📈 Trend Setter ❓ The Questioner 💎 Century Club (37) 🔥 Unstoppable (8)

Conferences

NIPS (11) AISTATS (10) ICML (9) ICLR (2) ECCV (1) IJCAI (1) JMLR (1) UAI (1) WACV (1)

Top co-authors

Sharan Vaswani (6) Frederik Kunstner (6) Mohammad Emtiyaz Khan (4) Wu Lin (4) Issam Laradji (3) Julie Nutini (3) Simon Lacoste-Julien (3) Nicolas L. Roux (2) Reza Babanezhad Harikandeh (2) Frank Wood (2)

Keywords

convergence rate (8) convex optimization (6) variational inference (4) gradient descent (4) stochastic gradient descent (4) regret bound (3) natural gradient (3) stochastic gradient (3) second-order optimization (2) online learning (2) thompson sampling (2) stochastic optimization (2) imitation learning (2) kl divergence (2) bayesian inference (2) deep learning (2) submodular optimization (2) expectation maximization (2) exponential family (2) mirror descent (2)

Papers

Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models NIPS 2024
Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, But Sign Descent Might Be ICLR 2023
Optimistic Thompson Sampling-based algorithms for episodic reinforcement learning UAI 2023
Searching for Optimal Per-Coordinate Step-sizes with Multidimensional Backtracking NIPS 2023
Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models NIPS 2023
BiSLS/SPS: Auto-tune Step Sizes for Stable Bi-level Optimization NIPS 2023
Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning ICML 2023
Target-based Surrogates for Stochastic Optimization ICML 2023
Homeomorphic-Invariance of EM: Non-Asymptotic Convergence in KL Divergence for Exponential Families via Mirror Descent (Extended Abstract) IJCAI 2022
Let's Make Block Coordinate Descent Converge Faster: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence JMLR 2022
Robust Asymmetric Learning in POMDPs ICML 2021
Homeomorphic-Invariance of EM: Non-Asymptotic Convergence in KL Divergence for Exponential Families via Mirror Descent AISTATS 2021
Tractable structured natural-gradient descent using local parameterizations ICML 2021
AutoRetouch: Automatic Professional Face Retouching WACV 2021
Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation AISTATS 2020
Regret Bounds without Lipschitz Continuity: Online Learning with Relative-Lipschitz Losses NIPS 2020
Handling the Positive-Definite Constraint in the Bayesian Learning Rule ICML 2020
Distributed Maximization of "Submodular plus Diversity" Functions for Multi-label Feature Selection on Huge Datasets AISTATS 2019
Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates NIPS 2019
Are we there yet? Manifold identification of gradient-related proximal methods AISTATS 2019
Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron AISTATS 2019
Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations ICML 2019
Online Learning Rate Adaptation with Hypergradient Descent ICLR 2018
SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient NIPS 2018
Where are the blobs: Counting by Localization with Point Supervision ECCV 2018
Model-Independent Online Learning for Influence Maximization ICML 2017
Horde of Bandits using Gaussian Markov Random Fields AISTATS 2017
Stop Wasting My Gradients: Practical SVRG NIPS 2015
Coordinate Descent Converges Faster with the Gauss-Southwell Rule Than Random Selection ICML 2015
Non-Uniform Stochastic Average Gradient Method for Training Conditional Random Fields AISTATS 2015
Block-Coordinate Frank-Wolfe Optimization for Structural SVMs ICML 2013
A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets NIPS 2012
On Sparse, Spectral and Other Parameterizations of Binary Probabilistic Models AISTATS 2012
Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization NIPS 2011
Modeling annotator expertise: Learning when everybody knows a bit of something AISTATS 2010
Convex Structure Learning in Log-Linear Models: Beyond Pairwise Potentials AISTATS 2010
An interior-point stochastic approximation method and an L1-regularized delta rule NIPS 2008