Suvrit Sra
80 papers · 2005–2025 · 10 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Hot Topic Early Bird
Taxonomy Completionist (19)
Interdisciplinary Bridge
Conference Polyglot (10)
Keyword Trendsetter Combo (4)
Conference Loyalist (28)
Topic Evolution
Dynamic Duo (14)
Grand Slam
Triple Crown
Topic Pioneer
Deep Specialist (30)
Keyword Champion (2)
Prolific Year (6)
Keyword Collector (89)
Trend Setter
Century Club (80)
Unstoppable (14)
The Questioner (5)
Conference Pioneer
Conferences
NIPS (28)
ICML (24)
ICLR (10)
AISTATS (6)
COLT (5)
JMLR (2)
L4DC (2)
AAAI (1)
ACML (1)
CVPR (1)
Keywords
determinantal point process (9)
convex optimization (8)
stochastic gradient descent (8)
nonconvex optimization (8)
stochastic optimization (7)
manifold optimization (6)
riemannian optimization (6)
convergence rate (5)
variance reduction (5)
gradient descent (5)
positive definite matrix (4)
riemannian manifold (4)
markov chain monte carlo (4)
stochastic gradient (4)
non-convex optimization (4)
nonsmooth optimization (4)
first-order method (4)
neural network optimization (3)
riemannian geometry (3)
representation learning (3)
Papers
Graph Transformers Dream of Electric Flow
ICLR 2025
Linear attention is (maybe) all you need (to understand Transformer optimization)
ICLR 2024
Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context
ICML 2024
First-Order Methods for Linearly Constrained Bilevel Optimization
NIPS 2024
How to Escape Sharp Minima with Random Perturbations
ICML 2024
Transformers learn to implement preconditioned gradient descent for in-context learning
NIPS 2023
The Crucial Role of Normalization in Sharpness-Aware Minimization
NIPS 2023
Can Direct Latent Model Learning Solve Linear Quadratic Gaussian Control?
L4DC 2023
On the Training Instability of Shuffling SGD with Batch Normalization
ICML 2023
Global optimality for Euclidean CCCP under Riemannian convexity
ICML 2023
Sign and Basis Invariant Networks for Spectral Graph Representation Learning
ICLR 2023
Understanding the unstable convergence of gradient descent
ICML 2022
Efficient Sampling on Riemannian Manifolds via Langevin MCMC
NIPS 2022
CCCP is Frank-Wolfe in disguise
NIPS 2022
Max-Margin Contrastive Learning
AAAI 2022
Understanding Riemannian Acceleration via a Proximal Extragradient Framework
COLT 2022
Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond
ICLR 2022
Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective
ICML 2022
Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity
ICML 2022
Time Varying Regression with Hidden Linear Dynamics
L4DC 2022
Three Operator Splitting with a Nonconvex Loss Function
ICML 2021
Contrastive Learning with Hard Negative Samples
ICLR 2021
Provably Efficient Algorithms for Multi-Objective Competitive RL
ICML 2021
Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates
NIPS 2021
Coping with Label Shift via Distributionally Robust Optimisation
ICLR 2021
Open Problem: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD?
COLT 2021
Can contrastive learning avoid shortcut solutions?
NIPS 2021
Online Learning in Unknown Markov Games
ICML 2021
Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity
ICLR 2020
Why are Adaptive Methods Good for Attention Models?
NIPS 2020
Geodesically-convex optimization for averaging partially observed covariance matrices
ACML 2020
Complexity of Finding Stationary Points of Nonconvex Nonsmooth Functions
ICML 2020
Strength from Weakness: Fast Learning Using Weak Supervision
ICML 2020
Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes
NIPS 2020
Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition
ICML 2020
From Nesterov's Estimate Sequence to Riemannian Acceleration
COLT 2020
SGD with shuffling: optimal rates without component convexity and large epoch requirements
NIPS 2020
Flexible Modeling of Diversity with Strongly Log-Concave Distributions
NIPS 2019
Small nonlinearities in activation functions create bad local minima in neural networks
ICLR 2019
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity
NIPS 2019
Are deep ResNets provably better than linear predictors?
NIPS 2019
Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator
ICML 2019
Random Shuffling Beats SGD after Finite Epochs
ICML 2019
Escaping Saddle Points with Adaptive Gradient Methods
ICML 2019
Learning Determinantal Point Processes by Corrective Negative Sampling
AISTATS 2019
Efficiently testing local optimality and escaping saddles for ReLU networks
ICLR 2019
A Generic Approach for Escaping Saddle points
AISTATS 2018
An Estimate Sequence for Geodesically Convex Optimization
COLT 2018
Non-Linear Temporal Subspace Representations for Activity Recognition
CVPR 2018
Global Optimality Conditions for Deep Neural Networks
ICLR 2018
Modular Proximal Optimization for Multidimensional Total-Variation Regularization
JMLR 2018
Exponentiated Strongly Rayleigh Distributions
NIPS 2018
Direct Runge-Kutta Discretization Achieves Acceleration
NIPS 2018
Combinatorial Topic Models using Small-Variance Asymptotics
AISTATS 2017
Elementary Symmetric Polynomials for Optimal Experimental Design
NIPS 2017
Polynomial time algorithms for dual volume sampling
NIPS 2017
Gaussian quadrature for matrix inverse forms with applications
ICML 2016
Fast DPP Sampling for Nyström with Application to Kernel Methods
ICML 2016
Geometric Mean Metric Learning
ICML 2016
Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling
NIPS 2016
First-order Methods for Geodesically Convex Optimization
COLT 2016
Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization
NIPS 2016
Stochastic Variance Reduction for Nonconvex Optimization
ICML 2016
Parallel and Distributed Block-Coordinate Frank-Wolfe Algorithms
ICML 2016
Efficient Sampling for k-Determinantal Point Processes
AISTATS 2016
AdaDelay: Delay Adaptive Distributed Stochastic Optimization
AISTATS 2016
Kronecker Determinantal Point Processes
NIPS 2016
Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds
NIPS 2016
Matrix Manifold Optimization for Gaussian Mixtures
NIPS 2015
Fixed-point algorithms for learning determinantal point processes
ICML 2015
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants
NIPS 2015
Data modeling with the elliptical gamma distribution
AISTATS 2015
Towards an optimal stochastic alternating direction method of multipliers
ICML 2014
Efficient Structured Matrix Rank Minimization
NIPS 2014
Randomized Nonlinear Component Analysis
ICML 2014
Reflection methods for user-friendly submodular optimization
NIPS 2013
Geometric optimisation on positive definite matrices for elliptically contoured distributions
NIPS 2013
Scalable nonconvex inexact proximal splitting
NIPS 2012
A new metric on the manifold of kernel matrices with application to matrix geometric means
NIPS 2012
Clustering on the Unit Hypersphere using von Mises-Fisher Distributions
JMLR 2005