Research Explorer
Papers
Trends
Conferences
Explore
Authors
Topics
Keywords
Model Compression
1674 directly classified papers
Papers per year
2012: 1
2013: 2
2014: 2
2015: 7
2016: 9
2017: 27
2018: 51
2019: 79
2020: 189
2021: 165
2022: 206
2023: 207
2024: 325
2025: 399
2026: 5
Papers
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
NIPS 2024
MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization
NIPS 2024
SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget
ACL 2024
CoMERA: Computing- and Memory-Efficient Training via Rank-Adaptive Tensor Optimization
NIPS 2024
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
NIPS 2024
DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization
NIPS 2024
Understanding the Role of the Projector in Knowledge Distillation
AAAI 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
NIPS 2024
Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward
IJCAI 2024
FasterVD: On Acceleration of Video Diffusion Models
IJCAI 2024
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
ACL 2024
2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution
NIPS 2024
DISP-LLM: Dimension-Independent Structural Pruning for Large Language Models
NIPS 2024
Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation
NIPS 2024
SequentialAttention++ for Block Sparsification: Differentiable Pruning Meets Combinatorial Optimization
NIPS 2024
UniPTS: A Unified Framework for Proficient Post-Training Sparsity
CVPR 2024
Reasons and Solutions for the Decline in Model Performance after Editing
NIPS 2024
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
NIPS 2024
What Makes Quantization for Large Language Model Hard? An Empirical Study from the Lens of Perturbation
AAAI 2024
Clockwork Diffusion: Efficient Generation With Model-Step Distillation
CVPR 2024
Pick-or-Mix: Dynamic Channel Sampling for ConvNets
CVPR 2024
OneBit: Towards Extremely Low-bit Large Language Models
NIPS 2024
Towards More Accurate Diffusion Model Acceleration with A Timestep Tuner
CVPR 2024
Efficient Stitchable Task Adaptation
CVPR 2024
FM-Delta: Lossless Compression for Storing Massive Fine-tuned Foundation Models
NIPS 2024