Theory

1072 directly classified papers

Papers

Local Regularizer Improves Generalization AAAI 2020

Sample Complexity Bounds for RNNs with Application to Combinatorial Graph Problems (Student Abstract) AAAI 2020

VECA: A Method for Detecting Overfitting in Neural Networks (Student Abstract) AAAI 2020

A Formal Hierarchy of RNN Architectures ACL 2020

Stolen Probability: A Structural Weakness of Neural Language Models ACL 2020

High-Frequency Component Helps Explain the Generalization of Convolutional Neural Networks CVPR 2020

An Internal Covariate Shift Bounding Algorithm for Deep Neural Networks by Unitizing Layers' Outputs CVPR 2020

TESA: Tensor Element Self-Attention via Matricization CVPR 2020

TBT: Targeted Neural Network Attack With Bit Trojan CVPR 2020

What Deep CNNs Benefit From Global Covariance Pooling: An Optimization Perspective CVPR 2020

RNNs can generate bounded hierarchical languages with optimal memory EMNLP 2020

Understanding the Difficulty of Training Transformers EMNLP 2020

Byte Pair Encoding is Suboptimal for Language Model Pretraining EMNLP 2020

An information theoretic view on selecting linguistic probes EMNLP 2020

On the Ability and Limitations of Transformers to Recognize Formal Languages EMNLP 2020

Neural Path Features and Neural Path Kernel: Understanding the role of gates in deep learning NIPS 2020

On the training dynamics of deep networks with $L_2$ regularization NIPS 2020

The Statistical Complexity of Early-Stopped Mirror Descent NIPS 2020

Infinitely deep neural networks as diffusion processes AISTATS 2020

Understanding Generalization in Deep Learning via Tensor Methods AISTATS 2020

Asymptotic normality and confidence intervals for derivatives of 2-layers neural network in the random features model NIPS 2020

Batch normalization provably avoids ranks collapse for randomly initialised deep networks NIPS 2020

Directional convergence and alignment in deep learning NIPS 2020

Chaining Meets Chain Rule: Multilevel Entropic Regularization and Training of Neural Networks JMLR 2020

Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics NIPS 2020