Neural Network Optimization

902 directly classified papers

Papers

Forget-free Continual Learning with Winning Subnetworks ICML 2022

Training Your Sparse Neural Network Better with Any Mask ICML 2022

Large-Scale Graph Neural Architecture Search ICML 2022

Multi-Level Firing with Spiking DS-ResNet: Enabling Better and Deeper Directly-Trained Spiking Neural Networks IJCAI 2022

Smooth Maximum Unit: Smooth Activation Function for Deep Networks Using Smoothing Maximum Technique CVPR 2022

Controllable Dynamic Multi-Task Architectures CVPR 2022

Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs CVPR 2022

Gradient Flow in Sparse Neural Networks and How Lottery Tickets Win AAAI 2022

Revisiting Random Channel Pruning for Neural Network Compression CVPR 2022

What Works and Doesn’t Work, A Deep Decoder for Neural Machine Translation ACL 2022

DyRep: Bootstrapping Training With Dynamic Re-Parameterization CVPR 2022

NSGZero: Efficiently Learning Non-exploitable Policy in Large-Scale Network Security Games with Neural Monte Carlo Tree Search AAAI 2022

KOALA: A Kalman Optimization Algorithm with Loss Adaptivity AAAI 2022

Modality-specific Learning Rates for Effective Multimodal Additive Late-fusion ACL 2022

DF-ResNet: Boosting Speaker Verification Performance with Depth-First Design INTERSPEECH 2022

NAS-VAD: Neural Architecture Search for Voice Activity Detection INTERSPEECH 2022

A Closer Look at Parameter Contributions When Training Neural Language and Translation Models COLING 2022

Adversarial Branch Architecture Search for Unsupervised Domain Adaptation WACV 2022

Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models EMNLP 2022

Gradient and Magnitude Based Pruning for Sparse Deep Neural Networks AAAI 2022

Does the Geometry of the Data Control the Geometry of Neural Predictions? (Student Abstract) AAAI 2022

Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration JMLR 2022

Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training JMLR 2022

Confidence Calibration for Intent Detection via Hyperspherical Space and Rebalanced Accuracy-Uncertainty Loss AAAI 2022

A Momentumized, Adaptive, Dual Averaged Gradient Method JMLR 2022