Shai Shalev-Shwartz

58 papers · 2005–2025 · 8 conferences · across top CS/AI conferences

Achievements

🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (24) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (8) 🐝 Cross-Pollinator (15) 🌟 Keyword Trendsetter Combo (3) 🌱 Topic Pioneer 👥 Mega-Team (61) 🔬 Deep Specialist (10) 🏆 Keyword Champion (4) 🗃️ Keyword Collector (111) ⚡ Prolific Year (5) 🚀 Conference Pioneer 📈 Trend Setter ❓ The Questioner 💎 Century Club (58) 🔥 Unstoppable (13)

Conferences

JMLR (18) NIPS (14) ICML (11) COLT (8) ICLR (4) ACL (1) AISTATS (1) ALT (1)

Top co-authors

Amit Daniely (9) Ohad Shamir (8) Eran Malach (7) Alon Gonen (7) Sivan Sabato (6) Yoram Singer (6) Tong Zhang (4) Elad Hazan (3) Yonatan Wexler (3) Amir Globerson (3)

Keywords

online learning (8) convex optimization (8) sample complexity (7) empirical risk minimization (7) stochastic optimization (6) multiclass classification (6) gradient descent (5) stochastic gradient descent (5) support vector machine (4) computational complexity (4) learning theory (4) feature selection (4) non-convex optimization (4) active learning (3) boosting algorithm (3) halfspace learning (3) matrix completion (3) collaborative filtering (3) stochastic dual coordinate ascent (3) regularized loss minimization (3)

Papers

Jamba: Hybrid Transformer-Mamba Language Models · ICLR 2025
Knowledge Distillation: Bad Models Can Be Good Role Models · NIPS 2022
When Hardness of Approximation Meets Hardness of Learning · JMLR 2022
Computational Separation Between Convolutional and Fully-Connected Networks · ICLR 2021
SenseBERT: Driving Some Sense into BERT · ACL 2020
The Implications of Local Correlation on Learning Some Deep Functions · NIPS 2020
Distribution Free Learning with Local Queries · ALT 2020
The Implicit Bias of Depth: How Incremental Learning Drives Generalization · ICLR 2020
Is Deeper Better only when Shallow is Good? · NIPS 2019
Average Stability is Invariant to Data Preconditioning. Implications to Exp-concave Empirical Risk Minimization · JMLR 2018
SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data · ICLR 2018
Effective Semisupervised Learning on Manifolds · COLT 2017
Fast Rates for Empirical Risk Minimization of Strict Saddle Problems · COLT 2017
Decoupling "when to update" from "how to update" · NIPS 2017
Failures of Gradient-Based Deep Learning · ICML 2017
On Graduated Optimization for Stochastic Non-Convex Problems · ICML 2016
Learning a Metric Embedding for Face Recognition using the Multibatch Method · NIPS 2016
On Lower and Upper Bounds in Smooth and Strongly Convex Optimization · JMLR 2016
Subspace Learning with Partial Information · JMLR 2016
Solving Ridge Regression using Sketched Preconditioned SVRG · ICML 2016
Minimizing the Maximal Loss: How and Why · ICML 2016
SDCA without Duality, Regularization, and Individual Convexity · ICML 2016
Complexity Theoretic Limitations on Learning DNF’s · COLT 2016
Strongly Adaptive Online Learning · ICML 2015
Beyond Convexity: Stochastic Quasi-Convex Optimization · NIPS 2015
Learning Sparse Low-Threshold Linear Classifiers · JMLR 2015
Multiclass Learnability and the ERM Principle · JMLR 2015
On the Computational Efficiency of Training Neural Networks · NIPS 2014
The Complexity of Learning Halfspaces using Generalized Linear Methods · COLT 2014
Optimal learners for multiclass problems · COLT 2014
Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization · ICML 2014
K-means recovers ICA filters when independent components are sparse · ICML 2014
Matrix Completion with the Trace Norm: Learning, Bounding, and Transducing · JMLR 2014
Efficient Active Learning of Halfspaces: an Aggressive Approach · ICML 2013
Accelerated Mini-Batch Stochastic Dual Coordinate Ascent · NIPS 2013
More data speeds up training time in learning halfspaces over sparse vectors · NIPS 2013
Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization · JMLR 2013
Learning Optimally Sparse Support Vector Machines · ICML 2013
Efficient Active Learning of Halfspaces: An Aggressive Approach · JMLR 2013
Vanishing Component Analysis · ICML 2013
Regularization Techniques for Learning with Matrices · JMLR 2012
Using More Data to Speed-up Training Time · AISTATS 2012
Near-Optimal Algorithms for Online Matrix Prediction · COLT 2012
Multiclass Learnability and the ERM principle · COLT 2011
Collaborative Filtering with the Trace Norm: Learning, Bounding, and Transducing · COLT 2011
Stochastic Methods for ℓ1-regularized Loss Minimization · JMLR 2011
Efficient Learning with Partially Observed Attributes · JMLR 2011
ShareBoost: Efficient multiclass learning with feature sharing · NIPS 2011
Learnability, Stability and Uniform Convergence · JMLR 2010
Mind the Duality Gap: Logarithmic regret algorithms for online optimization · NIPS 2008
Ranking Categorical Features Using Generalization Properties · JMLR 2008
Online Learning of Complex Prediction Problems Using Simultaneous Projections · JMLR 2008
Fast Rates for Regularized Objectives · NIPS 2008
Efficient Learning of Label Ranking by Soft Projections onto Polyhedra · JMLR 2006
Online Passive-Aggressive Algorithms · JMLR 2006
Online Classification for Complex Problems Using Simultaneous Projections · NIPS 2006
Convex Repeated Games and Fenchel Duality · NIPS 2006
Smooth ε-Insensitive Regression by Loss Symmetrization · JMLR 2005