Neural Network Optimization

3,648 papers

Papers

TorchOpt: An Efficient Library for Differentiable Optimization JMLR 2023

On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms NIPS 2023

Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning ICML 2023

TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer ICCV 2023

Few-bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction ICML 2023

Mind the (optimality) Gap: A Gap-Aware Learning Rate Scheduler for Adversarial Nets AISTATS 2023

Two Examples are Better than One: Context Regularization for Gradient-based Prompt Tuning ACL 2023

Re2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization CVPR 2023

Alias-Free Convnets: Fractional Shift Invariance via Polynomial Activations CVPR 2023

DSGD-CECA: Decentralized SGD with Communication-Optimal Exact Consensus Algorithm ICML 2023

Pareto Frontiers in Deep Feature Learning: Data, Compute, Width, and Luck NIPS 2023

Sequence-Based Plan Feasibility Prediction for Efficient Task and Motion Planning RSS 2023

Incorporating Syntactic Knowledge into Pre-trained Language Model using Optimization for Overcoming Catastrophic Forgetting EMNLP 2023

Decoder Tuning: Efficient Language Understanding as Decoding ACL 2023

Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers ACL 2023

On Learning Rates and Schrödinger Operators JMLR 2023

Continual Learning with Scaled Gradient Projection AAAI 2023

Maximal Initial Learning Rates in Deep ReLU Networks ICML 2023

Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts ICCV 2023

Towards More Efficient Insertion Transformer with Fractional Positional Encoding EACL 2023

Revisiting Non-Autoregressive Translation at Scale ACL 2023

Toward Edge-Efficient Dense Predictions With Synergistic Multi-Task Neural Architecture Search WACV 2023

Continuous-time Analysis of Anchor Acceleration NIPS 2023

To Stay or Not to Stay in the Pre-train Basin: Insights on Ensembling in Transfer Learning NIPS 2023

Making Scalable Meta Learning Practical NIPS 2023