Research Explorer
Model Compression
1928 directly classified papers
Papers per year
2013: 2
2014: 1
2015: 6
2016: 4
2017: 13
2018: 47
2019: 81
2020: 114
2021: 172
2022: 191
2023: 272
2024: 370
2025: 489
2026: 166
Papers
WET: Overcoming Paraphrasing Vulnerabilities in Embeddings-as-a-Service with Linear Transformation Watermarks
ACL 2025
AdaTP: Attention-Debiased Token Pruning for Video Large Language Models
EMNLP 2025
PIP: Perturbation-based Iterative Pruning for Large Language Models
EMNLP 2025
Learning What to Remember: Adaptive Probabilistic Memory Retention for Memory-Efficient Language Models
EMNLP 2025
Variable Layerwise Quantization: A Simple and Effective Approach to Quantize LLMs
ACL 2025
Beyond the Surface: A Solution-Aware Retrieval Model for Competition-level Code Generation
EMNLP 2025
EmByte: Decomposition and Compression Learning for Small yet Private NLP
EMNLP 2025
Efficiently Editing Mixture-of-Experts Models with Compressed Experts
EMNLP 2025
Q-Mamba: Towards more efficient Mamba models via post-training quantization
ACL 2025
Talking Head Anime 4: Distillation for Real-Time Performance
WACV 2025
CARVQ: Corrective Adaptor with Group Residual Vector Quantization for LLM Embedding Compression
EMNLP 2025
AdaEdit: Advancing Continuous Knowledge Editing For Large Language Models
ACL 2025
QSpec: Speculative Decoding with Complementary Quantization Schemes
EMNLP 2025
A Drop-In Solution for On-the-Fly Adaptation of Speculative Decoding in Large Language Models
ACL 2025
RESF: Regularized-Entropy-Sensitive Fingerprinting for Black-Box Tamper Detection of Large Language Models
EMNLP 2025
Squeezed Attention: Accelerating Long Context Length LLM Inference
ACL 2025
DART: Distilling Autoregressive Reasoning to Silent Thought
EMNLP 2025
FPE2M2: Approaching Lossless and Efficient Quantization with Native Floating Point
ACL 2025
QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models
EMNLP 2025
Less, but Better: Efficient Multilingual Expansion for LLMs via Layer-wise Mixture-of-Experts
ACL 2025
Assigning Distinct Roles to Quantized and Low-Rank Matrices Toward Optimal Weight Decomposition
ACL 2025
LoSiA: Efficient High-Rank Fine-Tuning via Subnet Localization and Optimization
EMNLP 2025
Alignment-Augmented Speculative Decoding with Alignment Sampling and Conditional Verification
EMNLP 2025
Rethinking Low-Rank Adaptation in Vision: Exploring Head-Level Responsiveness Across Diverse Tasks
WACV 2025
LVLM-Compress-Bench: Benchmarking the Broader Impact of Large Vision-Language Model Compression
NAACL 2025