Model Compression

1503 directly classified papers

Papers

Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning EMNLP 2023

SAMP: A Model Inference Toolkit of Post-Training Quantization for Text Processing via Self-Adaptive Mixed-Precision EMNLP 2023

Gradient-based Gradual Pruning for Language-Specific Multilingual Neural Machine Translation EMNLP 2023

G-SPEED: General SParse Efficient Editing MoDel EMNLP 2023

HadSkip: Homotopic and Adaptive Layer Skipping of Pre-trained Language Models for Efficient Inference EMNLP 2023

NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers CVPR 2023

Model Barrier: A Compact Un-Transferable Isolation Domain for Model Intellectual Property Protection CVPR 2023

Efficient On-Device Training via Gradient Filtering CVPR 2023

Complexity-Guided Slimmable Decoder for Efficient Deep Video Compression CVPR 2023

Q-DETR: An Efficient Low-Bit Quantized Detection Transformer CVPR 2023

MobileTL: On-Device Transfer Learning with Inverted Residual Blocks AAAI 2023

Complement Sparsification: Low-Overhead Model Pruning for Federated Learning AAAI 2023

Predictive Exit: Prediction of Fine-Grained Early Exits for Computation- and Energy-Efficient Inference AAAI 2023

OMPQ: Orthogonal Mixed Precision Quantization AAAI 2023

Dynamic Structure Pruning for Compressing CNNs AAAI 2023

Fast and Accurate Binary Neural Networks Based on Depth-Width Reshaping AAAI 2023

Compressing Transformers: Features Are Low-Rank, but Weights Are Not! AAAI 2023

1% VS 100%: Parameter-Efficient Low Rank Adapter for Dense Predictions CVPR 2023

Cache me if you Can: an Online Cost-aware Teacher-Student framework to Reduce the Calls to Large Language Models EMNLP 2023

Efficient Multilingual Language Model Compression through Vocabulary Trimming EMNLP 2023

Enhancing Scalability of Pre-trained Language Models via Efficient Parameter Sharing EMNLP 2023

Focus on the Core: Efficient Attention via Pruned Token Compression for Document Classification EMNLP 2023

Decomposed Prompt Tuning via Low-Rank Reparameterization EMNLP 2023

Towards Being Parameter-Efficient: A Stratified Sparsely Activated Transformer with Dynamic Capacity EMNLP 2023

Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared Pre-trained Language Models EMNLP 2023