Research Explorer
Model Compression
1674 directly classified papers
Papers per year
2012: 1
2013: 2
2014: 2
2015: 7
2016: 9
2017: 27
2018: 51
2019: 79
2020: 189
2021: 165
2022: 206
2023: 207
2024: 325
2025: 399
2026: 5
Papers
DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs
NIPS 2024
F³-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis
AAAI 2024
UltraSparseBERT: 99% Conditionally Sparse Language Modelling
ACL 2024
Structured Unrestricted-Rank Matrices for Parameter Efficient Finetuning
NIPS 2024
Streamlining Speech Enhancement DNNs: an Automated Pruning Method Based on Dependency Graph with Advanced Regularized Loss Strategies
INTERSPEECH 2024
MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter
ACL 2024
Lightweight Transducer Based on Frame-Level Criterion
INTERSPEECH 2024
Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression
ACL 2024
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
NIPS 2024
PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA
ACL 2024
Expanding Sparse Tuning for Low Memory Usage
NIPS 2024
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
NIPS 2024
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
NIPS 2024
QUIK: Towards End-to-end 4-Bit Inference on Generative Large Language Models
EMNLP 2024
SequentialAttention++ for Block Sparsification: Differentiable Pruning Meets Combinatorial Optimization
NIPS 2024
CoMERA: Computing- and Memory-Efficient Training via Rank-Adaptive Tensor Optimization
NIPS 2024
Cross-model Control: Improving Multiple Large Language Models in One-time Training
NIPS 2024
Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation
NIPS 2024
Reasons and Solutions for the Decline in Model Performance after Editing
NIPS 2024
2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution
NIPS 2024
FM-Delta: Lossless Compression for Storing Massive Fine-tuned Foundation Models
NIPS 2024
OneBit: Towards Extremely Low-bit Large Language Models
NIPS 2024
ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification
NIPS 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
NIPS 2024
Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models
NIPS 2024