Research Explorer
Model Compression
1928 directly classified papers
Papers per year
2013: 2
2014: 1
2015: 6
2016: 4
2017: 13
2018: 47
2019: 81
2020: 114
2021: 172
2022: 191
2023: 272
2024: 370
2025: 489
2026: 166
Papers
Efficient Multi-task LLM Quantization and Serving for Multiple LoRA Adapters
NIPS 2024
Surgical Feature-Space Decomposition of LLMs: Why, When and How?
ACL 2024
Adversarial Distillation Based on Slack Matching and Attribution Region Alignment
CVPR 2024
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
ACL 2024
Black-Box Forgetting
NIPS 2024
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
ACL 2024
MagR: Weight Magnitude Reduction for Enhancing Post-Training Quantization
NIPS 2024
Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch
CVPR 2024
DisCEdit: Model Editing by Identifying Discriminative Components
NIPS 2024
MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer
CVPR 2024
What Makes Quantization for Large Language Model Hard? An Empirical Study from the Lens of Perturbation
AAAI 2024
Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion
NIPS 2024
FlowerFormer: Empowering Neural Architecture Encoding using a Flow-aware Graph Transformer
CVPR 2024
Training Binary Neural Networks via Gaussian Variational Inference and Low-Rank Semidefinite Programming
NIPS 2024
BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
NIPS 2024
Dense Vision Transformer Compression with Few Samples
CVPR 2024
QT-ViT: Improving Linear Attention in ViT with Quadratic Taylor Expansion
NIPS 2024
EnOF-SNN: Training Accurate Spiking Neural Networks via Enhancing the Output Feature
NIPS 2024
Quantization of Large Language Models with an Overdetermined Basis
UAI 2024
ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification
NIPS 2024
Knowledge Editing for Large Language Models
COLING 2024
Finding Transformer Circuits With Edge Pruning
NIPS 2024
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
NIPS 2024
Shaving Weights with Occam's Razor: Bayesian Sparsification for Neural Networks using the Marginal Likelihood
NIPS 2024
FedMef: Towards Memory-efficient Federated Dynamic Pruning
CVPR 2024