Model Compression

1928 directly classified papers

Papers

Outlier Suppression+: Accurate quantization of large language models by equivalent and effective shifting and scaling EMNLP 2023

DUnE: Dataset for Unified Editing EMNLP 2023

DEPN: Detecting and Editing Privacy Neurons in Pretrained Language Models EMNLP 2023

Sparse Low-rank Adaptation of Pre-trained Language Models EMNLP 2023

GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints EMNLP 2023

PRCA: Fitting Black-Box Large Language Models for Retrieval Question Answering via Pluggable Reward-Driven Contextual Adapter EMNLP 2023

Compressing Context to Enhance Inference Efficiency of Large Language Models EMNLP 2023

Unified Low-Resource Sequence Labeling by Sample-Aware Dynamic Sparse Finetuning EMNLP 2023

EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs EMNLP 2023

CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model EMNLP 2023

Unlearn What You Want to Forget: Efficient Unlearning for LLMs EMNLP 2023

Context Compression for Auto-regressive Transformers with Sentinel Tokens EMNLP 2023

A Frustratingly Easy Post-Training Quantization Scheme for LLMs EMNLP 2023

Enhancing Computation Efficiency in Large Language Models through Weight and Activation Quantization EMNLP 2023

Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model EMNLP 2023

A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models EMNLP 2023

EELBERT: Tiny Models through Dynamic Embeddings EMNLP 2023

Data Pruning for Efficient Model Pruning in Neural Machine Translation EMNLP 2023

Approximating Two-Layer Feedforward Networks for Efficient Transformers EMNLP 2023

Adaptive Smoothing Gradient Learning for Spiking Neural Networks ICML 2023

UPSCALE: Unconstrained Channel Pruning ICML 2023

UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers ICML 2023

Pruning has a disparate impact on model accuracy NeurIPS 2022

Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token EMNLP 2022

Pruning Neural Networks via Coresets and Convex Geometry: Towards No Assumptions NeurIPS 2022