Matus Telgarsky

26 papers · 2010–2025 · 6 conferences · across top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐝 Cross-Pollinator (10) 🌍 Conference Polyglot (6) 🏃 Academic Marathon (15) 🌈 Renaissance Researcher (5)

🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (21) 🐺 Lone Wolf (7) 🏆 Keyword Champion (2) 💎 Century Club (26) 🗃️ Keyword Collector (68) 🔥 Unstoppable (7) 🚀 Conference Pioneer

Conferences

COLT (8) ICLR (7) ICML (6) AISTATS (2) JMLR (2) ALT (1)

Top co-authors

Ziwei Ji (9) Daniel Hsu (4) Jingfeng Wu (2) Miroslav Dudík (2) Bin Yu (2) Alexander Rakhlin (1) Yuzheng Hu (1) Justin D. Li (1) Rong Ge (1) Lan Wang (1)

Keywords

gradient descent (5) empirical risk minimization (3) convex optimization (3) convergence rate (3) optimal transport (2) logistic regression (2) approximation theory (2) mirror descent (2) generalization bound (2) logistic loss (2) dual optimization (2) implicit bias (2) stochastic optimization (2) margin maximization (2) maximum margin (2) convergence analysis (1) primal-dual optimization (1) non-convex optimization (1) batch normalization (1) statistical consistency (1)

Papers

Benefits of Early Stopping in Gradient Descent for Overparameterized Logistic Regression · ICML 2025
Spectrum Extraction and Clipping for Implicitly Linear Layers · AISTATS 2024
Large Stepsize Gradient Descent for Logistic Loss: Non-Monotonicity of the Loss Improves Optimization Efficiency · COLT 2024
Transformers, parallel computation, and logarithmic depth · ICML 2024
On Achieving Optimal Adversarial Test Error · ICLR 2023
Feature selection and low test error in shallow low-rotation ReLU networks · ICLR 2023
Actor-critic is implicitly biased towards high entropy optimal policies · ICLR 2022
Stochastic linear optimization never overfits with quadratically-bounded losses on general data · COLT 2022
Fast margin maximization via dual acceleration · ICML 2021
Characterizing the implicit bias via a primal-dual analysis · ALT 2021
Generalization bounds via distillation · ICLR 2021
Neural tangent kernels, transportation mappings, and universal approximation · ICLR 2020
Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks · ICLR 2020
Gradient descent follows the regularization path for general losses · COLT 2020
Gradient descent aligns the layers of deep linear networks · ICLR 2019
A Gradual, Semi-Discrete Approach to Generative Network Training via Explicit Wasserstein Minimization · ICML 2019
The implicit bias of gradient descent on nonseparable data · COLT 2019
Neural Networks and Rational Functions · ICML 2017
Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis · COLT 2017
Benefits of depth in neural networks · COLT 2016
Convex Risk Minimization and Conditional Probability Estimation · COLT 2015
Tensor Decompositions for Learning Latent Variable Models · JMLR 2014
Margins, Shrinkage, and Boosting · ICML 2013
Boosting with the Logistic Loss is Consistent · COLT 2013
A Primal-Dual Convergence Analysis of Boosting · JMLR 2012
Hartigan’s Method: k-means Clustering without Voronoi · AISTATS 2010