Research Explorer
Model Compression
1928 directly classified papers
Papers per year
2013: 2
2014: 1
2015: 6
2016: 4
2017: 13
2018: 47
2019: 81
2020: 114
2021: 172
2022: 191
2023: 272
2024: 370
2025: 489
2026: 166
Papers
AnalyticKWS: Towards Exemplar-Free Analytic Class Incremental Learning for Small-footprint Keyword Spotting
ACL 2025
MLWQ: Efficient Small Language Model Deployment via Multi-Level Weight Quantization
EMNLP 2025
A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs
CVPR 2025
AnchorAttention: Difference-Aware Sparse Attention with Stripe Granularity
EMNLP 2025
FREE: Fast and Robust Vision Language Models with Early Exits
ACL 2025
Not All Parameters Are Created Equal: Smart Isolation Boosts Fine-Tuning Performance
EMNLP 2025
Recoverable Compression: A Multimodal Vision Token Recovery Mechanism Guided by Text Information
AAAI 2025
Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework
EMNLP 2025
Demystifying Small Language Models for Edge Deployment
ACL 2025
Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge
EMNLP 2025
Revisiting Pruning vs Quantization for Small Language Models
EMNLP 2025
Speculative Decoding for Multi-Sample Inference
EMNLP 2025
SwiftPrune: Hessian-Free Weight Pruning for Large Language Models
EMNLP 2025
LAVa: Layer-wise KV Cache Eviction with Dynamic Budget Allocation
EMNLP 2025
1+1>2: A Synergistic Sparse and Low-Rank Compression Method for Large Language Models
EMNLP 2025
Human-Inspired Obfuscation for Model Unlearning: Local and Global Strategies with Hyperbolic Representations
EMNLP 2025
Dropping Experts, Recombining Neurons: Retraining-Free Pruning for Sparse Mixture-of-Experts LLMs
EMNLP 2025
FuzzAug: Data Augmentation by Coverage-guided Fuzzing for Neural Test Generation
EMNLP 2025
TwT: Thinking without Tokens by Habitual Reasoning Distillation with Multi-Teachers’ Guidance
EMNLP 2025
FAEDKV: Infinite-Window Fourier Transform for Unbiased KV Cache Compression
EMNLP 2025
MONAQ: Multi-Objective Neural Architecture Querying for Time-Series Analysis on Resource-Constrained Devices
EMNLP 2025
KurTail: Kurtosis-based LLM Quantization
EMNLP 2025
FLRC: Fine-grained Low-Rank Compressor for Efficient LLM Inference
EMNLP 2025
Controllable Memorization in LLMs via Weight Pruning
EMNLP 2025
One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments
ACL 2025