Suvrit Sra

80 papers · 2005–2025 · 10 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (19) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (10)

🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird 🧭 Keyword Pioneer 🌟 Keyword Trendsetter Combo (4) 🏠 Conference Loyalist (28) 🧬 Topic Evolution 🤝 Dynamic Duo (14) 🏆 Grand Slam 👑 Triple Crown 🌱 Topic Pioneer 🔬 Deep Specialist (30) 🏆 Keyword Champion (2) ⚡ Prolific Year (6) 🗃️ Keyword Collector (89) 📈 Trend Setter 💎 Century Club (80) 🔥 Unstoppable (14) ❓ The Questioner (5) 🚀 Conference Pioneer

Conferences

NIPS (28) ICML (24) ICLR (10) AISTATS (6) COLT (5) JMLR (2) L4DC (2) AAAI (1) ACML (1) CVPR (1)

Top co-authors

Ali Jadbabaie (14) Stefanie Jegelka (13) Jingzhao Zhang (10) Chulhee Yun (10) Kwangjun Ahn (7) Chengtao Li (5) Xiang Cheng (5) Reshad Hosseini (4) Sashank J. Reddi (4) Alp Yurtsever (4)

Keywords

determinantal point process (9) convex optimization (8) stochastic gradient descent (8) nonconvex optimization (8) stochastic optimization (7) manifold optimization (6) riemannian optimization (6) convergence rate (5) variance reduction (5) gradient descent (5) positive definite matrix (4) riemannian manifold (4) markov chain monte carlo (4) stochastic gradient (4) non-convex optimization (4) nonsmooth optimization (4) first-order method (4) neural network optimization (3) riemannian geometry (3) representation learning (3)

Papers

Graph Transformers Dream of Electric Flow · ICLR 2025
Linear attention is (maybe) all you need (to understand Transformer optimization) · ICLR 2024
Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context · ICML 2024
First-Order Methods for Linearly Constrained Bilevel Optimization · NIPS 2024
How to Escape Sharp Minima with Random Perturbations · ICML 2024
Transformers learn to implement preconditioned gradient descent for in-context learning · NIPS 2023
The Crucial Role of Normalization in Sharpness-Aware Minimization · NIPS 2023
Can Direct Latent Model Learning Solve Linear Quadratic Gaussian Control? · L4DC 2023
On the Training Instability of Shuffling SGD with Batch Normalization · ICML 2023
Global optimality for Euclidean CCCP under Riemannian convexity · ICML 2023
Sign and Basis Invariant Networks for Spectral Graph Representation Learning · ICLR 2023
Understanding the unstable convergence of gradient descent · ICML 2022
Efficient Sampling on Riemannian Manifolds via Langevin MCMC · NIPS 2022
CCCP is Frank-Wolfe in disguise · NIPS 2022
Max-Margin Contrastive Learning · AAAI 2022
Understanding Riemannian Acceleration via a Proximal Extragradient Framework · COLT 2022
Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond · ICLR 2022
Neural Network Weights Do Not Converge to Stationary Points: An Invariant Measure Perspective · ICML 2022
Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity · ICML 2022
Time Varying Regression with Hidden Linear Dynamics · L4DC 2022
Three Operator Splitting with a Nonconvex Loss Function · ICML 2021
Contrastive Learning with Hard Negative Samples · ICLR 2021
Provably Efficient Algorithms for Multi-Objective Competitive RL · ICML 2021
Three Operator Splitting with Subgradients, Stochastic Gradients, and Adaptive Learning Rates · NIPS 2021
Coping with Label Shift via Distributionally Robust Optimisation · ICLR 2021
Open Problem: Can Single-Shuffle SGD be Better than Reshuffling SGD and GD? · COLT 2021
Can contrastive learning avoid shortcut solutions? · NIPS 2021
Online Learning in Unknown Markov Games · ICML 2021
Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity · ICLR 2020
Why are Adaptive Methods Good for Attention Models? · NIPS 2020
Geodesically-convex optimization for averaging partially observed covariance matrices · ACML 2020
Complexity of Finding Stationary Points of Nonconvex Nonsmooth Functions · ICML 2020
Strength from Weakness: Fast Learning Using Weak Supervision · ICML 2020
Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes · NIPS 2020
Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition · ICML 2020
From Nesterov’s Estimate Sequence to Riemannian Acceleration · COLT 2020
SGD with shuffling: optimal rates without component convexity and large epoch requirements · NIPS 2020
Flexible Modeling of Diversity with Strongly Log-Concave Distributions · NIPS 2019
Small nonlinearities in activation functions create bad local minima in neural networks · ICLR 2019
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity · NIPS 2019
Are deep ResNets provably better than linear predictors? · NIPS 2019
Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator · ICML 2019
Random Shuffling Beats SGD after Finite Epochs · ICML 2019
Escaping Saddle Points with Adaptive Gradient Methods · ICML 2019
Learning Determinantal Point Processes by Corrective Negative Sampling · AISTATS 2019
Efficiently testing local optimality and escaping saddles for ReLU networks · ICLR 2019
A Generic Approach for Escaping Saddle points · AISTATS 2018
An Estimate Sequence for Geodesically Convex Optimization · COLT 2018
Non-Linear Temporal Subspace Representations for Activity Recognition · CVPR 2018
Global Optimality Conditions for Deep Neural Networks · ICLR 2018
Modular Proximal Optimization for Multidimensional Total-Variation Regularization · JMLR 2018
Exponentiated Strongly Rayleigh Distributions · NIPS 2018
Direct Runge-Kutta Discretization Achieves Acceleration · NIPS 2018
Combinatorial Topic Models using Small-Variance Asymptotics · AISTATS 2017
Elementary Symmetric Polynomials for Optimal Experimental Design · NIPS 2017
Polynomial time algorithms for dual volume sampling · NIPS 2017
Gaussian quadrature for matrix inverse forms with applications · ICML 2016
Fast DPP Sampling for Nystrom with Application to Kernel Methods · ICML 2016
Geometric Mean Metric Learning · ICML 2016
Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling · NIPS 2016
First-order Methods for Geodesically Convex Optimization · COLT 2016
Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization · NIPS 2016
Stochastic Variance Reduction for Nonconvex Optimization · ICML 2016
Parallel and Distributed Block-Coordinate Frank-Wolfe Algorithms · ICML 2016
Efficient Sampling for k-Determinantal Point Processes · AISTATS 2016
AdaDelay: Delay Adaptive Distributed Stochastic Optimization · AISTATS 2016
Kronecker Determinantal Point Processes · NIPS 2016
Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds · NIPS 2016
Matrix Manifold Optimization for Gaussian Mixtures · NIPS 2015
Fixed-point algorithms for learning determinantal point processes · ICML 2015
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants · NIPS 2015
Data modeling with the elliptical gamma distribution · AISTATS 2015
Towards an optimal stochastic alternating direction method of multipliers · ICML 2014
Efficient Structured Matrix Rank Minimization · NIPS 2014
Randomized Nonlinear Component Analysis · ICML 2014
Reflection methods for user-friendly submodular optimization · NIPS 2013
Geometric optimisation on positive definite matrices for elliptically contoured distributions · NIPS 2013
Scalable nonconvex inexact proximal splitting · NIPS 2012
A new metric on the manifold of kernel matrices with application to matrix geometric means · NIPS 2012
Clustering on the Unit Hypersphere using von Mises-Fisher Distributions · JMLR 2005