Research Explorer
Model Compression
1674 directly classified papers
Papers per year
2012: 1
2013: 2
2014: 2
2015: 7
2016: 9
2017: 27
2018: 51
2019: 79
2020: 189
2021: 165
2022: 206
2023: 207
2024: 325
2025: 399
2026: 5
Papers
DCSF-KD: Dynamic Channel-wise Spatial Feature Knowledge Distillation for Object Detection
AAAI 2025
Unifying Uniform and Binary-coding Quantization for Accurate Compression of Large Language Models
ACL 2025
Q-Mamba: Towards more efficient Mamba models via post-training quantization
ACL 2025
FPE2M2: Approaching Lossless and Efficient Quantization with Native Floating Point
ACL 2025
Numerical Pruning for Efficient Autoregressive Models
AAAI 2025
Pretraining Context Compressor for Large Language Models with Embedding-Based Memory
ACL 2025
Activating Distributed Visual Region within LLMs for Efficient and Effective Vision-Language Training and Inference
ACL 2025
FocusLLM: Precise Understanding of Long Context by Dynamic Condensing
ACL 2025
Bitnet.cpp: Efficient Edge Inference for Ternary LLMs
ACL 2025
Run LoRA Run: Faster and Lighter LoRA Implementations
ACL 2025
ProCut: LLM Prompt Compression via Attribution Estimation
EMNLP 2025
Parameter-Efficient Fine-Tuning via Circular Convolution
ACL 2025
LLMs on a Budget? Say HOLA
EMNLP 2025
Harmonizing Diverse Models: A Layer-wise Merging Strategy for Consistent Generation
EMNLP 2025
Multi-Task Pre-Finetuning of Lightweight Transformer Encoders for Text Classification and NER
EMNLP 2025
Scaling Down, Serving Fast: Compressing and Deploying Efficient LLMs for Recommendation Systems
EMNLP 2025
Low-Rank Interconnected Adaptation across Layers
ACL 2025
Maximum Score Routing For Mixture-of-Experts
ACL 2025
GenPTQ: Green Post-Training Quantization for Large-Scale ASR Models with Mixed-Precision Bit Allocation
EMNLP 2025
Revisiting Pruning vs Quantization for Small Language Models
EMNLP 2025
SwiftPrune: Hessian-Free Weight Pruning for Large Language Models
EMNLP 2025
Sensitivity-LoRA: Low-Load Sensitivity-Based Fine-Tuning for Large Language Models
EMNLP 2025
1+1>2: A Synergistic Sparse and Low-Rank Compression Method for Large Language Models
EMNLP 2025
BcQLM: Efficient Vision-Language Understanding with Distilled Q-Gated Cross-Modal Fusion
EMNLP 2025
WAVE: Weight Templates for Adaptive Initialization of Variable-sized Models
CVPR 2025