Research Explorer
Model Compression
1674 directly classified papers
Papers per year
2012: 1
2013: 2
2014: 2
2015: 7
2016: 9
2017: 27
2018: 51
2019: 79
2020: 189
2021: 165
2022: 206
2023: 207
2024: 325
2025: 399
2026: 5
Papers
Pit One Against Many: Leveraging Attention-head Embeddings for Parameter-efficient Multi-head Attention
EMNLP 2023
On the Dimensionality of Sentence Embeddings
EMNLP 2023
Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules
EMNLP 2023
Multilingual Lottery Tickets to Pretrain Language Models
EMNLP 2023
The Cost of Compression: Investigating the Impact of Compression on Parametric Knowledge in Language Models
EMNLP 2023
FaLA: Fast Linear Adaptation for Replacing Backbone Models on Edge Devices
EMNLP 2023
MUX-PLMs: Data Multiplexing for High-throughput Language Models
EMNLP 2023
Length-Adaptive Distillation: Customizing Small Language Model for Dynamic Token Pruning
EMNLP 2023
HadSkip: Homotopic and Adaptive Layer Skipping of Pre-trained Language Models for Efficient Inference
EMNLP 2023
Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation
EMNLP 2023
Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer
EMNLP 2023
AdaTranS: Adapting with Boundary-based Shrinking for End-to-End Speech Translation
EMNLP 2023
Leap-of-Thought: Accelerating Transformers via Dynamic Token Routing
EMNLP 2023
Hybrid Inverted Index Is a Robust Accelerator for Dense Retrieval
EMNLP 2023
Gradient-based Gradual Pruning for Language-Specific Multilingual Neural Machine Translation
EMNLP 2023
Dynamic Low-rank Estimation for Transformer-based Language Models
EMNLP 2023
Co-training and Co-distillation for Quality Improvement and Compression of Language Models
EMNLP 2023
Dynamic Stashing Quantization for Efficient Transformer Training
EMNLP 2023
LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models
EMNLP 2023
Complexity-Guided Slimmable Decoder for Efficient Deep Video Compression
CVPR 2023
Q-DETR: An Efficient Low-Bit Quantized Detection Transformer
CVPR 2023
Lossless 4-bit Quantization of Architecture Compressed Conformer ASR Systems on the 300-hr Switchboard Corpus
INTERSPEECH 2023
Compressed MoE ASR Model Based on Knowledge Distillation and Quantization
INTERSPEECH 2023
EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization
ICCV 2023
ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation
ICCV 2023