stochastic gradient descent

1091 papers

Also known as

SGD

Co-occurring keywords

non-convex optimization (547)
distributed learning (563)
neural network optimization (1293)
convergence rate (607)
convex optimization (1321)
neural network (6616)
variance reduction (523)
stochastic optimization (1060)
convergence analysis (395)
differential privacy (1016)

Papers

GCN meets GPU: Decoupling “When to Sample” from “How to Sample” NIPS 2020

Stochastic Approximate Gradient Descent via the Langevin Algorithm AAAI 2020

Anchor Box Optimization for Object Detection WACV 2020

Control Batch Size and Learning Rate to Generalize Well: Theoretical and Empirical Evidence NIPS 2019

Surfing: Iterative Optimization Over Incrementally Trained Deep Networks NIPS 2019

Momentum-Based Variance Reduction in Non-Convex SGD NIPS 2019

SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points NIPS 2019

New Convergence Aspects of Stochastic Gradient Algorithms JMLR 2019

Continuous-time Models for Stochastic Optimization Algorithms NIPS 2019

Combining Global Sparse Gradients with Local Gradients in Distributed Neural Network Training EMNLP 2019

Making Asynchronous Stochastic Gradient Descent Work for Transformers EMNLP 2019

Scalable and Efficient Pairwise Learning to Achieve Statistical Accuracy AAAI 2019

Active Mini-Batch Sampling Using Repulsive Point Processes AAAI 2019

Feature Grouping as a Stochastic Regularizer for High-Dimensional Structured Data ICML 2019

Self-similar Epochs: Value in arrangement ICML 2019

Accelerated Linear Convergence of Stochastic Momentum Methods in Wasserstein Distances ICML 2019

Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication ICML 2019

SAGA with Arbitrary Sampling ICML 2019

SGD: General Analysis and Improved Rates ICML 2019

DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-pass Error-Compensated Compression ICML 2019

Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD ICML 2019

AdaGrad Stepsizes: Sharp Convergence Over Nonconvex Landscapes ICML 2019

On the Computation and Communication Complexity of Parallel SGD with Dynamic Batch Sizes for Stochastic Non-Convex Optimization ICML 2019

Surrogate Losses for Online Learning of Stepsizes in Stochastic Non-Convex Optimization ICML 2019

Error Feedback Fixes SignSGD and other Gradient Compression Schemes ICML 2019