Research Explorer
Model Compression
1928 directly classified papers
Papers per year
2013: 2
2014: 1
2015: 6
2016: 4
2017: 13
2018: 47
2019: 81
2020: 114
2021: 172
2022: 191
2023: 272
2024: 370
2025: 489
2026: 166
Papers
Outlier Suppression+: Accurate quantization of large language models by equivalent and effective shifting and scaling
EMNLP 2023
DUnE: Dataset for Unified Editing
EMNLP 2023
DEPN: Detecting and Editing Privacy Neurons in Pretrained Language Models
EMNLP 2023
Sparse Low-rank Adaptation of Pre-trained Language Models
EMNLP 2023
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
EMNLP 2023
PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter
EMNLP 2023
Compressing Context to Enhance Inference Efficiency of Large Language Models
EMNLP 2023
Unified Low-Resource Sequence Labeling by Sample-Aware Dynamic Sparse Finetuning
EMNLP 2023
EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs
EMNLP 2023
CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model
EMNLP 2023
Unlearn What You Want to Forget: Efficient Unlearning for LLMs
EMNLP 2023
Context Compression for Auto-regressive Transformers with Sentinel Tokens
EMNLP 2023
A Frustratingly Easy Post-Training Quantization Scheme for LLMs
EMNLP 2023
Enhancing Computation Efficiency in Large Language Models through Weight and Activation Quantization
EMNLP 2023
Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
EMNLP 2023
A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models
EMNLP 2023
EELBERT: Tiny Models through Dynamic Embeddings
EMNLP 2023
Data Pruning for Efficient Model Pruning in Neural Machine Translation
EMNLP 2023
Approximating Two-Layer Feedforward Networks for Efficient Transformers
EMNLP 2023
Adaptive Smoothing Gradient Learning for Spiking Neural Networks
ICML 2023
UPSCALE: Unconstrained Channel Pruning
ICML 2023
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
ICML 2023
Pruning has a disparate impact on model accuracy
NeurIPS 2022
Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token
EMNLP 2022
Pruning Neural Networks via Coresets and Convex Geometry: Towards No Assumptions
NeurIPS 2022