Research Explorer
Model Compression
1928 directly classified papers
Papers per year
2013: 2
2014: 1
2015: 6
2016: 4
2017: 13
2018: 47
2019: 81
2020: 114
2021: 172
2022: 191
2023: 272
2024: 370
2025: 489
2026: 166
Papers
AROMA: Autonomous Rank-one Matrix Adaptation
EMNLP 2025
WINS: Winograd Structured Pruning for Fast Winograd Convolution
ICCV 2025
Hopscotch: Discovering and Skipping Redundancies in Language Models
EMNLP 2025
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs
ACL 2025
Prompt Compression for Large Language Models: A Survey
NAACL 2025
All You Need in Knowledge Distillation Is a Tailored Coordinate System
AAAI 2025
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models
ACL 2025
HydraOpt: Navigating the Efficiency-Performance Trade-off of Adapter Merging
EMNLP 2025
MPPQ: Enhancing Post-Training Quantization for LLMs via Mixed Supervision, Proxy Rounding, and Pre-Searching
IJCAI 2025
DAM: Dynamic Attention Mask for Long-Context Large Language Model Inference Acceleration
ACL 2025
Libra-Merging: Importance-redundancy and Pruning-merging Trade-off for Acceleration Plug-in in Large Vision-Language Model
CVPR 2025
A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs
CVPR 2025
VoCo-LLaMA: Towards Vision Compression with Large Language Models
CVPR 2025
Why Do Some Inputs Break Low-Bit LLM Quantization?
EMNLP 2025
OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models
EMNLP 2025
Hardware-Aware Parallel Prompt Decoding for Memory-Efficient Acceleration of LLM Inference
EMNLP 2025
CASP: Compression of Large Multimodal Models Based on Attention Sparsity
CVPR 2025
EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models
CVPR 2025
Split-Merge: Scalable and Memory-Efficient Merging of Expert LLMs
EMNLP 2025
Quantized but Deceptive? A Multi-Dimensional Truthfulness Evaluation of Quantized LLMs
EMNLP 2025
Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers
CVPR 2025
SpecCoT: Accelerating Chain-of-Thought Reasoning through Speculative Exploration
EMNLP 2025
Multimodal Promptable Token Merging for Diffusion Models
AAAI 2025
UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping
CVPR 2025
Optimal Transport-Based Token Weighting scheme for Enhanced Preference Optimization
ACL 2025