Mark Schmidt
37 papers · 2008–2024 · 9 top CS/AI conferences
Achievements
Hot Topic Early Bird
Taxonomy Completionist (18)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (9)
Cross-Pollinator (15)
Renaissance Researcher (6)
Keyword Trendsetter Combo (5)
Triple Crown
Deep Specialist (10)
Keyword Champion
Keyword Collector (62)
Prolific Year (5)
Conference Pioneer
Trend Setter
The Questioner
Century Club (37)
Unstoppable (8)
Conferences
NIPS (11)
AISTATS (10)
ICML (9)
ICLR (2)
ECCV (1)
IJCAI (1)
JMLR (1)
UAI (1)
WACV (1)
Keywords
convergence rate (8)
convex optimization (6)
variational inference (4)
gradient descent (4)
stochastic gradient descent (4)
regret bound (3)
natural gradient (3)
stochastic gradient (3)
second-order optimization (2)
online learning (2)
thompson sampling (2)
stochastic optimization (2)
imitation learning (2)
kl divergence (2)
bayesian inference (2)
deep learning (2)
submodular optimization (2)
expectation maximization (2)
exponential family (2)
mirror descent (2)
Papers
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models
NIPS 2024
Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, But Sign Descent Might Be
ICLR 2023
Optimistic Thompson Sampling-based algorithms for episodic reinforcement learning
UAI 2023
Searching for Optimal Per-Coordinate Step-sizes with Multidimensional Backtracking
NIPS 2023
Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models
NIPS 2023
BiSLS/SPS: Auto-tune Step Sizes for Stable Bi-level Optimization
NIPS 2023
Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning
ICML 2023
Target-based Surrogates for Stochastic Optimization
ICML 2023
Homeomorphic-Invariance of EM: Non-Asymptotic Convergence in KL Divergence for Exponential Families via Mirror Descent (Extended Abstract)
IJCAI 2022
Let's Make Block Coordinate Descent Converge Faster: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence
JMLR 2022
Robust Asymmetric Learning in POMDPs
ICML 2021
Homeomorphic-Invariance of EM: Non-Asymptotic Convergence in KL Divergence for Exponential Families via Mirror Descent
AISTATS 2021
Tractable structured natural-gradient descent using local parameterizations
ICML 2021
AutoRetouch: Automatic Professional Face Retouching
WACV 2021
Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation
AISTATS 2020
Regret Bounds without Lipschitz Continuity: Online Learning with Relative-Lipschitz Losses
NIPS 2020
Handling the Positive-Definite Constraint in the Bayesian Learning Rule
ICML 2020
Distributed Maximization of "Submodular plus Diversity" Functions for Multi-label Feature Selection on Huge Datasets
AISTATS 2019
Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates
NIPS 2019
Are we there yet? Manifold identification of gradient-related proximal methods
AISTATS 2019
Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron
AISTATS 2019
Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations
ICML 2019
Online Learning Rate Adaptation with Hypergradient Descent
ICLR 2018
SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient
NIPS 2018
Where are the blobs: Counting by Localization with Point Supervision
ECCV 2018
Model-Independent Online Learning for Influence Maximization
ICML 2017
Horde of Bandits using Gaussian Markov Random Fields
AISTATS 2017
Stop Wasting My Gradients: Practical SVRG
NIPS 2015
Coordinate Descent Converges Faster with the Gauss-Southwell Rule Than Random Selection
ICML 2015
Non-Uniform Stochastic Average Gradient Method for Training Conditional Random Fields
AISTATS 2015
Block-Coordinate Frank-Wolfe Optimization for Structural SVMs
ICML 2013
A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets
NIPS 2012
On Sparse, Spectral and Other Parameterizations of Binary Probabilistic Models
AISTATS 2012
Convergence Rates of Inexact Proximal-Gradient Methods for Convex Optimization
NIPS 2011
Modeling annotator expertise: Learning when everybody knows a bit of something
AISTATS 2010
Convex Structure Learning in Log-Linear Models: Beyond Pairwise Potentials
AISTATS 2010
An interior-point stochastic approximation method and an L1-regularized delta rule
NIPS 2008