Research Explorer
Model Compression
1928 directly classified papers
Papers per year
2013: 2
2014: 1
2015: 6
2016: 4
2017: 13
2018: 47
2019: 81
2020: 114
2021: 172
2022: 191
2023: 272
2024: 370
2025: 489
2026: 166
Papers
PrivCirNet: Efficient Private Inference via Block Circulant Transformation
NIPS 2024
FIARSE: Model-Heterogeneous Federated Learning via Importance-Aware Submodel Extraction
NIPS 2024
S2HPruner: Soft-to-Hard Distillation Bridges the Discretization Gap in Pruning
NIPS 2024
Activation Map Compression through Tensor Decomposition for Deep Learning
NIPS 2024
Spectral Adapter: Fine-Tuning in Spectral Space
NIPS 2024
NVRC: Neural Video Representation Compression
NIPS 2024
Learn more, but bother less: parameter efficient continual learning
NIPS 2024
Learn To be Efficient: Build Structured Sparsity in Large Language Models
NIPS 2024
SlimGPT: Layer-wise Structured Pruning for Large Language Models
NIPS 2024
Adaptive Layer Sparsity for Large Language Models via Activation Correlation Assessment
NIPS 2024
xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token
NIPS 2024
Adversarial Moment-Matching Distillation of Large Language Models
NIPS 2024
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention
NIPS 2024
Bridging The Gap between Low-rank and Orthogonal Adaptation via Householder Reflection Adaptation
NIPS 2024
Q-VLM: Post-training Quantization for Large Vision-Language Models
NIPS 2024
LoQT: Low-Rank Adapters for Quantized Pretraining
NIPS 2024
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design
NIPS 2024
SLTrain: a sparse plus low rank approach for parameter and memory efficient pretraining
NIPS 2024
Unveiling LoRA Intrinsic Ranks via Salience Analysis
NIPS 2024
Refusal in Language Models Is Mediated by a Single Direction
NIPS 2024
Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models
NIPS 2024
Search for Efficient Large Language Models
NIPS 2024
Uncovering the Redundancy in Graph Self-supervised Learning Models
NIPS 2024
The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information
NIPS 2024
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
NIPS 2024