Neural Network Optimization

3648 directly classified papers

Papers

Edit Probability for Scene Text Recognition CVPR 2018

Is it Time to Swish? Comparing Deep Learning Activation Functions Across NLP tasks EMNLP 2018

Multi-Head Attention with Disagreement Regularization EMNLP 2018

N-ary Relation Extraction using Graph-State LSTM EMNLP 2018

Breaking the Activation Function Bottleneck through Adaptive Parameterization NIPS 2018

Deep Convolutional Neural Networks with Merge-and-Run Mappings IJCAI 2018

Building Sparse Deep Feedforward Networks using Tree Receptive Fields IJCAI 2018

Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations COLT 2018

Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent COLT 2018

Decoupling Gradient-Like Learning Rules from Representations ICML 2018

Accelerating Natural Gradient with Higher-Order Invariance ICML 2018

Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam ICML 2018

Understanding and Simplifying One-Shot Architecture Search ICML 2018

Differentiable Dynamic Programming for Structured Prediction and Attention ICML 2018

Towards Binary-Valued Gates for Robust LSTM Training ICML 2018

Fast Gradient-Based Methods with Exponential Rate: A Hybrid Control Framework ICML 2018

Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solution for Nonconvex Distributed Optimization Over Networks ICML 2018

Characterizing Implicit Bias in Terms of Optimization Geometry ICML 2018

Faster Derivative-Free Stochastic Algorithm for Shared Memory Machines ICML 2018

Entropy-SGD optimizes the prior of a PAC-Bayes bound: Generalization properties of Entropy-SGD and data-dependent priors ICML 2018

Randomized Block Cubic Newton Method ICML 2018

On Acceleration with Noise-Corrupted Gradients ICML 2018

Long-Term Human Motion Prediction by Modeling Motion Context and Enhancing Motion Dynamics IJCAI 2018

SADAGRAD: Strongly Adaptive Stochastic Gradient Methods ICML 2018

GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks ICML 2018