Model Compression

1674 directly classified papers

Papers

Winning the Lottery Ahead of Time: Efficient Early Network Pruning ICML 2022

POET: Training Neural Networks on Tiny Devices with Integrated Rematerialization and Paging ICML 2022

Overcoming Oscillations in Quantization-Aware Training ICML 2022

PAC-Net: A Model Pruning Approach to Inductive Transfer Learning ICML 2022

SDQ: Stochastic Differentiable Quantization with Mixed Precision ICML 2022

DAdaQuant: Doubly-adaptive quantization for communication-efficient Federated Learning ICML 2022

Sparse Double Descent: Where Network Pruning Aggravates Overfitting ICML 2022

The State of Sparse Training in Deep Reinforcement Learning ICML 2022

Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training NAACL 2022

Leaner and Faster: Two-Stage Model Compression for Lightweight Text-Image Retrieval NAACL 2022

Adaptable Adapters NAACL 2022

Towards Efficient NLP: A Standard Evaluation and A Strong Baseline NAACL 2022

KroneckerBERT: Significant Compression of Pre-trained Language Models Through Kronecker Decomposition and Knowledge Distillation NAACL 2022

MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation NAACL 2022

GRAM: Fast Fine-tuning of Pre-trained Language Models for Content-based Collaborative Filtering NAACL 2022

Pro-KD: Progressive Distillation by Following the Footsteps of the Teacher COLING 2022

Accelerating Inference for Pretrained Language Models by Unified Multi-Perspective Early Exiting COLING 2022

Token and Head Adaptive Transformers for Efficient Natural Language Processing COLING 2022

Parameter-Efficient Mixture-of-Experts Architecture for Pre-trained Language Models COLING 2022

Implicit Feature Decoupling With Depthwise Quantization CVPR 2022

Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation CVPR 2022

Focal and Global Knowledge Distillation for Detectors CVPR 2022

Dataset Distillation by Matching Training Trajectories CVPR 2022

DyRep: Bootstrapping Training With Dynamic Re-Parameterization CVPR 2022

Knowledge Distillation With the Reused Teacher Classifier CVPR 2022