Shai Shalev-Shwartz
58 papers · 2005–2025 · 8 conferences · across top CS/AI conferences
Achievements
Hot Topic Early Bird
Taxonomy Completionist (24)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (8)
Cross-Pollinator (15)
Keyword Trendsetter Combo (3)
Topic Pioneer
Mega-Team (61)
Deep Specialist (10)
Keyword Champion (4)
Keyword Collector (111)
Prolific Year (5)
Conference Pioneer
Trend Setter
The Questioner
Century Club (58)
Unstoppable (13)
Conferences
JMLR (18)
NIPS (14)
ICML (11)
COLT (8)
ICLR (4)
ACL (1)
AISTATS (1)
ALT (1)
Keywords
online learning (8)
convex optimization (8)
sample complexity (7)
empirical risk minimization (7)
stochastic optimization (6)
multiclass classification (6)
gradient descent (5)
stochastic gradient descent (5)
support vector machine (4)
computational complexity (4)
learning theory (4)
feature selection (4)
non-convex optimization (4)
active learning (3)
boosting algorithm (3)
halfspace learning (3)
matrix completion (3)
collaborative filtering (3)
stochastic dual coordinate ascent (3)
regularized loss minimization (3)
Papers
Jamba: Hybrid Transformer-Mamba Language Models
ICLR 2025
Knowledge Distillation: Bad Models Can Be Good Role Models
NIPS 2022
When Hardness of Approximation Meets Hardness of Learning
JMLR 2022
Computational Separation Between Convolutional and Fully-Connected Networks
ICLR 2021
SenseBERT: Driving Some Sense into BERT
ACL 2020
The Implications of Local Correlation on Learning Some Deep Functions
NIPS 2020
Distribution Free Learning with Local Queries
ALT 2020
The Implicit Bias of Depth: How Incremental Learning Drives Generalization
ICLR 2020
Is Deeper Better only when Shallow is Good?
NIPS 2019
Average Stability is Invariant to Data Preconditioning. Implications to Exp-concave Empirical Risk Minimization
JMLR 2018
SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data
ICLR 2018
Effective Semisupervised Learning on Manifolds
COLT 2017
Fast Rates for Empirical Risk Minimization of Strict Saddle Problems
COLT 2017
Decoupling "when to update" from "how to update"
NIPS 2017
Failures of Gradient-Based Deep Learning
ICML 2017
On Graduated Optimization for Stochastic Non-Convex Problems
ICML 2016
Learning a Metric Embedding for Face Recognition using the Multibatch Method
NIPS 2016
On Lower and Upper Bounds in Smooth and Strongly Convex Optimization
JMLR 2016
Subspace Learning with Partial Information
JMLR 2016
Solving Ridge Regression using Sketched Preconditioned SVRG
ICML 2016
Minimizing the Maximal Loss: How and Why
ICML 2016
SDCA without Duality, Regularization, and Individual Convexity
ICML 2016
Complexity Theoretic Limitations on Learning DNF's
COLT 2016
Strongly Adaptive Online Learning
ICML 2015
Beyond Convexity: Stochastic Quasi-Convex Optimization
NIPS 2015
Learning Sparse Low-Threshold Linear Classifiers
JMLR 2015
Multiclass Learnability and the ERM Principle
JMLR 2015
On the Computational Efficiency of Training Neural Networks
NIPS 2014
The Complexity of Learning Halfspaces using Generalized Linear Methods
COLT 2014
Optimal learners for multiclass problems
COLT 2014
Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization
ICML 2014
K-means recovers ICA filters when independent components are sparse
ICML 2014
Matrix Completion with the Trace Norm: Learning, Bounding, and Transducing
JMLR 2014
Efficient Active Learning of Halfspaces: an Aggressive Approach
ICML 2013
Accelerated Mini-Batch Stochastic Dual Coordinate Ascent
NIPS 2013
More data speeds up training time in learning halfspaces over sparse vectors
NIPS 2013
Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization
JMLR 2013
Learning Optimally Sparse Support Vector Machines
ICML 2013
Efficient Active Learning of Halfspaces: An Aggressive Approach
JMLR 2013
Vanishing Component Analysis
ICML 2013
Regularization Techniques for Learning with Matrices
JMLR 2012
Using More Data to Speed-up Training Time
AISTATS 2012
Near-Optimal Algorithms for Online Matrix Prediction
COLT 2012
Multiclass Learnability and the ERM principle
COLT 2011
Collaborative Filtering with the Trace Norm: Learning, Bounding, and Transducing
COLT 2011
Stochastic Methods for ℓ1-regularized Loss Minimization
JMLR 2011
Efficient Learning with Partially Observed Attributes
JMLR 2011
ShareBoost: Efficient multiclass learning with feature sharing
NIPS 2011
Learnability, Stability and Uniform Convergence
JMLR 2010
Mind the Duality Gap: Logarithmic regret algorithms for online optimization
NIPS 2008
Ranking Categorical Features Using Generalization Properties
JMLR 2008
Online Learning of Complex Prediction Problems Using Simultaneous Projections
JMLR 2008
Fast Rates for Regularized Objectives
NIPS 2008
Efficient Learning of Label Ranking by Soft Projections onto Polyhedra
JMLR 2006
Online Passive-Aggressive Algorithms
JMLR 2006
Online Classification for Complex Problems Using Simultaneous Projections
NIPS 2006
Convex Repeated Games and Fenchel Duality
NIPS 2006
Smooth ε-Insensitive Regression by Loss Symmetrization
JMLR 2005