Research Explorer
Model Compression
1928 directly classified papers
Papers per year
2013: 2
2014: 1
2015: 6
2016: 4
2017: 13
2018: 47
2019: 81
2020: 114
2021: 172
2022: 191
2023: 272
2024: 370
2025: 489
2026: 166
Papers
MoE-I2: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition
EMNLP 2024
LoRASC: Expressive and Generalizable Low-rank Adaptation for Large Models via Slow Cascaded Learning
EMNLP 2024
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
EMNLP 2024
Further Compressing Distilled Language Models via Frequency-aware Partial Sparse Coding of Embeddings
EMNLP 2024
Less is Fed More: Sparsity Reduces Feature Distortion in Federated Learning
EMNLP 2024
STTATTS: Unified Speech-To-Text And Text-To-Speech Model
EMNLP 2024
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging
EMNLP 2024
Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper
EMNLP 2024
Edge Inference With Fully Differentiable Quantized Mixed Precision Neural Networks
WACV 2024
Mini but Mighty: Finetuning ViTs With Mini Adapters
WACV 2024
PATROL: Privacy-Oriented Pruning for Collaborative Inference Against Model Inversion Attacks
WACV 2024
Task-Agnostic Self-Distillation for Few-Shot Action Recognition
IJCAI 2024
Multimodal Large Language Models with Fusion Low Rank Adaptation for Device Directed Speech Detection
INTERSPEECH 2024
Language-Specific Pruning for Efficient Reduction of Large Language Models
COLING 2024
Adaptive Rank Selections for Low-Rank Approximation of Language Models
NAACL 2024
PEMA: An Offsite-Tunable Plug-in External Memory Adaptation for Language Models
NAACL 2024
Investigating Acceleration of LLaMA Inference by Enabling Intermediate Layer Decoding via Instruction Tuning with ‘LITE’
NAACL 2024
PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation
EACL 2024
Style Vectors for Steering Generative Large Language Models
EACL 2024
Parameter-Efficient Fine-Tuning: Is There An Optimal Subset of Parameters to Tune?
EACL 2024
Resource-Efficient Neural Networks for Embedded Systems
JMLR 2024
Elastic Weight Removal for Faithful and Abstractive Dialogue Generation
NAACL 2024
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs
NeurIPS 2024
LM-HT SNN: Enhancing the Performance of SNN to ANN Counterpart through Learnable Multi-hierarchical Threshold Model
NeurIPS 2024
DEPrune: Depth-wise Separable Convolution Pruning for Maximizing GPU Parallelism
NeurIPS 2024