Model Compression

1674 directly classified papers

Papers

Model Cascading: Towards Jointly Improving Efficiency and Accuracy of NLP Systems EMNLP 2022

Zero-Shot Dynamic Quantization for Transformer Inference EMNLP 2022

Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT EMNLP 2022

RedApt: An Adaptor for wav2vec 2 Encoding Faster and Smaller Speech Translation without Quality Compromise EMNLP 2022

SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters EMNLP 2022

Quadapter: Adapter for GPT-2 Quantization EMNLP 2022

Partially-Random Initialization: A Smoking Gun for Binarization Hypothesis of BERT EMNLP 2022

AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models EMNLP 2022

Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models EMNLP 2022

Revisiting Locality Sensitive Hashing for Vocabulary Selection in Fast Neural Machine Translation EMNLP 2022

Low-bit Shift Network for End-to-End Spoken Language Understanding INTERSPEECH 2022

A Passive Similarity based CNN Filter Pruning for Efficient Acoustic Scene Classification INTERSPEECH 2022

Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization INTERSPEECH 2022

MISRNet: Lightweight Neural Vocoder Using Multi-Input Single Shared Residual Blocks INTERSPEECH 2022

EPIC TTS Models: Empirical Pruning Investigations Characterizing Text-To-Speech Models INTERSPEECH 2022

Binary Early-Exit Network for Adaptive Inference on Low-Resource Devices INTERSPEECH 2022

Regularizing Transformer-based Acoustic Models by Penalizing Attention Weights INTERSPEECH 2022

Optimizing Reusable Knowledge for Continual Learning via Metalearning NIPS 2021

Channel Permutations for N:M Sparsity NIPS 2021

Demystifying and Generalizing BinaryConnect NIPS 2021

YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design AAAI 2021

SA-BNN: State-Aware Binary Neural Network AAAI 2021

Dynamic Encoder Transducer: A Flexible Solution for Trading Off Accuracy for Latency INTERSPEECH 2021

On-Device Streaming Transformer-Based End-to-End Speech Recognition INTERSPEECH 2021

Large Scale Private Learning via Low-rank Reparametrization ICML 2021