Papers
Faster Projection-free Online Learning
Elad Hazan, Edgar Minasyan
Fast Rates for Online Prediction with Abstention
Gergely Neu, Nikita Zhivotovskiy
Finite Regret and Cycles with Fixed Step-Size via Alternating Gradient Descent-Ascent
James P. Bailey, Gauthier Gidel, Georgios Piliouras
Finite-Time Analysis of Asynchronous Stochastic Approximation and $Q$-Learning
Guannan Qu, Adam Wierman
Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise
Maxim Kaledin, Eric Moulines, Alexey Naumov et al.
Free Energy Wells and Overlap Gap Property in Sparse PCA
Gérard Ben Arous, Alexander S. Wein, Ilias Zadik
From Nesterov’s Estimate Sequence to Riemannian Acceleration
Kwangjun Ahn, Suvrit Sra
From tree matching to sparse graph alignment
Luca Ganassali, Laurent Massoulié
Gradient descent algorithms for Bures-Wasserstein barycenters
Sinho Chewi, Tyler Maunu, Philippe Rigollet et al.
Gradient descent follows the regularization path for general losses
Ziwei Ji, Miroslav Dudík, Robert E. Schapire et al.
Hardness of Identity Testing for Restricted Boltzmann Machines and Potts models
Antonio Blanca, Zongchen Chen, Daniel Štefankovič et al.
Hierarchical Clustering: A 0.585 Revenue Approximation
Noga Alon, Yossi Azar, Danny Vainstein
Highly smooth minimization of non-smooth problems
Brian Bullins
High probability guarantees for stochastic convex optimization
Damek Davis, Dmitriy Drusvyatskiy
How Good is SGD with Random Shuffling?
Itay Safran, Ohad Shamir
How to Trap a Gradient Flow
Sébastien Bubeck, Dan Mikulincer
ID3 Learns Juntas for Smoothed Product Distributions
Alon Brutzkus, Amit Daniely, Eran Malach
Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss
Lénaïc Chizat, Francis Bach
Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process
Guy Blanc, Neha Gupta, Gregory Valiant et al.
Improper Learning for Non-Stochastic Control
Max Simchowitz, Karan Singh, Elad Hazan
Information Directed Sampling for Linear Partial Monitoring
Johannes Kirschner, Tor Lattimore, Andreas Krause
Information Theoretic Optimal Learning of Gaussian Graphical Models
Sidhant Misra, Marc Vuffray, Andrey Y. Lokhov
Kernel and Rich Regimes in Overparametrized Models
Blake Woodworth, Suriya Gunasekar, Jason D. Lee et al.
Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems
Noah Golowich, Sarath Pattathil, Constantinos Daskalakis et al.