Research Explorer
Model Compression
1928 directly classified papers
Papers per year
2013: 2
2014: 1
2015: 6
2016: 4
2017: 13
2018: 47
2019: 81
2020: 114
2021: 172
2022: 191
2023: 272
2024: 370
2025: 489
2026: 166
Papers
Beyond Dynamic Quantization: An Efficient Static Hierarchical Mix-precision Framework for Near-Lossless LLM Compression
EMNLP 2025
Advancing Weight and Channel Sparsification with Enhanced Saliency
WACV 2025
Automatic Joint Structured Pruning and Quantization for Efficient Neural Network Training and Compression
CVPR 2025
ReGLA: Refining Gated Linear Attention
NAACL 2025
Positional Overload: Positional Debiasing and Context Window Extension for Large Language Models using Set Encoding
ACL 2025
OptiPrune: Effective Pruning Approach for Every Target Sparsity
COLING 2025
DecDEC: A Systems Approach to Advancing Low-Bit LLM Quantization
OSDI 2025
Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection
CVPR 2025
PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
ACL 2025
Adapters Selector: Cross-domains and Multi-tasks LoRA Modules Integration Usage Method
COLING 2025
Multilingual Iterative Model Pruning: What Matters?
AACL 2025
HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models
AAAI 2025
GradOT: Training-free Gradient-preserving Offsite-tuning for Large Language Models
ACL 2025
MiLoRA: Harnessing Minor Singular Components for Parameter-Efficient LLM Finetuning
NAACL 2025
Multimodal Promptable Token Merging for Diffusion Models
AAAI 2025
BitNet: 1-bit Pre-training for Large Language Models
JMLR 2025
Ego-VPA: Egocentric Video Understanding with Parameter-Efficient Adaptation
WACV 2025
MimicGait: A Model Agnostic Approach for Occluded Gait Recognition using Correlational Knowledge Distillation
WACV 2025
LiteLMGuard: Seamless and Lightweight On-Device Guardrails for Small Language Models against Quantization Vulnerabilities
IJCNLP 2025
Recover-LoRA: Data-Free Accuracy Recovery of Degraded Language Models via Low-Rank Adaptation
EMNLP 2025
Bitnet.cpp: Efficient Edge Inference for Ternary LLMs
ACL 2025
SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression
NAACL 2025
SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs?
ACL 2025
DualGuard: A Parameter Space Transformation Approach for Bidirectional Defense in Split-Based LLM Fine-Tuning
ACL 2025
ECHO-LLaMA: Efficient Caching for High-Performance LLaMA Training
EMNLP 2025