Neural Network Optimization

3648 directly classified papers

Papers

How Two-Layer Neural Networks Learn, One (Giant) Step at a Time JMLR 2024

QT-ViT: Improving Linear Attention in ViT with Quadratic Taylor Expansion NIPS 2024

Deep linear networks for regression are implicitly regularized towards flat minima NIPS 2024

The Closeness of In-Context Learning and Weight Shifting for Softmax Regression NIPS 2024

Towards Geometric Normalization Techniques in SE(3) Equivariant Graph Neural Networks for Physical Dynamics Simulations IJCAI 2024

Global Convergence in Training Large-Scale Transformers NIPS 2024

Understanding the Generalization Benefits of Late Learning Rate Decay AISTATS 2024

Neural Echos: Depthwise Convolutional Filters Replicate Biological Receptive Fields WACV 2024

Physics-Informed Neural Networks: Minimizing Residual Loss with Wide Networks and Effective Activations IJCAI 2024

Over-Reasoning and Redundant Calculation of Large Language Models EACL 2024

Vision Mamba Mender NIPS 2024

Provable Acceleration of Nesterov’s Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks IJCAI 2024

Extreme Fine-tuning: A Novel and Fast Fine-tuning Approach for Text Classification EACL 2024

Causality-enhanced Discreted Physics-informed Neural Networks for Predicting Evolutionary Equations IJCAI 2024

The Implicit Bias of Adam on Separable Data NIPS 2024

On the Sparsity of the Strong Lottery Ticket Hypothesis NIPS 2024

Activating Self-Attention for Multi-Scene Absolute Pose Regression NIPS 2024

Dynamic Masking Rate Schedules for MLM Pretraining EACL 2024

Open Problem: Black-Box Reductions and Adaptive Gradient Methods for Nonconvex Optimization COLT 2024

SynSP: Synergy of Smoothness and Precision in Pose Sequences Refinement CVPR 2024

Local to Global: Learning Dynamics and Effect of Initialization for Transformers NIPS 2024

Exact Mean Square Linear Stability Analysis for SGD COLT 2024

InceptionNeXt: When Inception Meets ConvNeXt CVPR 2024

Ordered Momentum for Asynchronous SGD NIPS 2024

Deep model-free KKL observer: A switching approach L4DC 2024