Model Compression

1674 directly classified papers

Papers

Pit One Against Many: Leveraging Attention-head Embeddings for Parameter-efficient Multi-head Attention EMNLP 2023

On the Dimensionality of Sentence Embeddings EMNLP 2023

Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules EMNLP 2023

Multilingual Lottery Tickets to Pretrain Language Models EMNLP 2023

The Cost of Compression: Investigating the Impact of Compression on Parametric Knowledge in Language Models EMNLP 2023

FaLA: Fast Linear Adaptation for Replacing Backbone Models on Edge Devices EMNLP 2023

MUX-PLMs: Data Multiplexing for High-throughput Language Models EMNLP 2023

Length-Adaptive Distillation: Customizing Small Language Model for Dynamic Token Pruning EMNLP 2023

HadSkip: Homotopic and Adaptive Layer Skipping of Pre-trained Language Models for Efficient Inference EMNLP 2023

Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation EMNLP 2023

Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer EMNLP 2023

AdaTranS: Adapting with Boundary-based Shrinking for End-to-End Speech Translation EMNLP 2023

Leap-of-Thought: Accelerating Transformers via Dynamic Token Routing EMNLP 2023

Hybrid Inverted Index Is a Robust Accelerator for Dense Retrieval EMNLP 2023

Gradient-based Gradual Pruning for Language-Specific Multilingual Neural Machine Translation EMNLP 2023

Dynamic Low-rank Estimation for Transformer-based Language Models EMNLP 2023

Co-training and Co-distillation for Quality Improvement and Compression of Language Models EMNLP 2023

Dynamic Stashing Quantization for Efficient Transformer Training EMNLP 2023

LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models EMNLP 2023

Complexity-Guided Slimmable Decoder for Efficient Deep Video Compression CVPR 2023

Q-DETR: An Efficient Low-Bit Quantized Detection Transformer CVPR 2023

Lossless 4-bit Quantization of Architecture Compressed Conformer ASR Systems on the 300-hr Switchboard Corpus INTERSPEECH 2023

Compressed MoE ASR Model Based on Knowledge Distillation and Quantization INTERSPEECH 2023

EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization ICCV 2023

ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation ICCV 2023