Research Explorer
Model Compression
1928 directly classified papers
Papers per year
2013: 2
2014: 1
2015: 6
2016: 4
2017: 13
2018: 47
2019: 81
2020: 114
2021: 172
2022: 191
2023: 272
2024: 370
2025: 489
2026: 166
Papers
DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs
NIPS 2024
On the Impact of Calibration Data in Post-training Quantization and Pruning
ACL 2024
How Sparse Can We Prune A Deep Network: A Fundamental Limit Perspective
NIPS 2024
MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer
CVPR 2024
Adversarial Distillation Based on Slack Matching and Attribution Region Alignment
CVPR 2024
Acquiring Clean Language Models from Backdoor Poisoned Datasets by Downscaling Frequency Space
ACL 2024
On the social bias of speech self-supervised models
INTERSPEECH 2024
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
NIPS 2024
Exploring compressibility of transformer based text-to-music (TTM) models
INTERSPEECH 2024
SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget
ACL 2024
Learn and Don't Forget: Adding a New Language to ASR Foundation Models
INTERSPEECH 2024
FedMef: Towards Memory-efficient Federated Dynamic Pruning
CVPR 2024
Efficient CNNs with Quaternion Transformations and Pruning for Audio Tagging
INTERSPEECH 2024
WRP: Weight Recover Prune for Structured Sparsity
ACL 2024
Streamlining Speech Enhancement DNNs: an Automated Pruning Method Based on Dependency Graph with Advanced Regularized Loss Strategies
INTERSPEECH 2024
FM-Delta: Lossless Compression for Storing Massive Fine-tuned Foundation Models
NIPS 2024
MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning
CVPR 2024
LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment
NIPS 2024
SparseFlow: Accelerating Transformers by Sparsifying Information Flows
ACL 2024
Compact 3D Gaussian Representation for Radiance Field
CVPR 2024
Data-Free Quantization via Pseudo-label Filtering
CVPR 2024
Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression
ACL 2024
Efficient Multi-task LLM Quantization and Serving for Multiple LoRA Adapters
NIPS 2024
Surgical Feature-Space Decomposition of LLMs: Why, When and How?
ACL 2024
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
ACL 2024