Research Explorer
Model Compression
1928 directly classified papers
Papers per year
2013: 2
2014: 1
2015: 6
2016: 4
2017: 13
2018: 47
2019: 81
2020: 114
2021: 172
2022: 191
2023: 272
2024: 370
2025: 489
2026: 166
Papers
Adapters Selector: Cross-domains and Multi-tasks LoRA Modules Integration Usage Method
COLING 2025
Distilling Rule-based Knowledge into Large Language Models
COLING 2025
Enhancing One-Shot Pruned Pre-trained Language Models through Sparse-Dense-Sparse Mechanism
COLING 2025
Word Salad Chopper: Reasoning Models Waste A Ton Of Decoding Budget On Useless Repetitions, Self-Knowingly
EMNLP 2025
PruneCD: Contrasting Pruned Self Model to Improve Decoding Factuality
EMNLP 2025
A2ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization
ACL 2025
Bias Vector: Mitigating Biases in Language Models with Task Arithmetic Approach
COLING 2025
LoRACoE: Improving Large Language Model via Composition-based LoRA Expert
EMNLP 2025
Parameter-Efficient Fine-Tuning of Large Language Models via Deconvolution in Subspace
COLING 2025
LeanK: Learnable K Cache Channel Pruning for Efficient Decoding
EMNLP 2025
Talking Head Anime 4: Distillation for Real-Time Performance
WACV 2025
Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models?
ACL 2025
Quantized but Deceptive? A Multi-Dimensional Truthfulness Evaluation of Quantized LLMs
EMNLP 2025
Split-Merge: Scalable and Memory-Efficient Merging of Expert LLMs
EMNLP 2025
TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting
CVPR 2025
FISTAPruner: Layer-wise Post-training Pruning for Large Language Models
EMNLP 2025
CLMTracing: Black-box User-level Watermarking for Code Language Model Tracing
EMNLP 2025
FlexiGPT: Pruning and Extending Large Language Models with Low-Rank Weight Sharing
NAACL 2025
Probe-Free Low-Rank Activation Intervention
NAACL 2025
Binarized Neural Network for Multi-spectral Image Fusion
CVPR 2025
Pruning One More Token is Enough: Leveraging Latency-Workload Non-Linearities for Vision Transformers on the Edge
WACV 2025
REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space
ACL 2025
AMXFP4: Taming Activation Outliers with Asymmetric Microscaling Floating-Point for 4-bit LLM Inference
ACL 2025
TWIST: Text-encoder Weight-editing for Inserting Secret Trojans in Text-to-Image Models
ACL 2025
DynamicKV: Task-Aware Adaptive KV Cache Compression for Long Context LLMs
EMNLP 2025