Model Compression

1503 directly classified papers

Papers

Study Selectively: An Adaptive Knowledge Distillation based on a Voting Network for Heart Sound Classification INTERSPEECH 2024

Rethinking Imbalance in Image Super-Resolution for Efficient Inference NIPS 2024

Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging EMNLP 2024

Sketching for Distributed Deep Learning: A Sharper Analysis NIPS 2024

PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression NIPS 2024

TroL: Traversal of Layers for Large Language and Vision Models EMNLP 2024

AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models EMNLP 2024

Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers NIPS 2024

Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices NIPS 2024

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization NIPS 2024

Wino Vidi Vici: Conquering Numerical Instability of 8-Bit Winograd Convolution for Accurate Inference Acceleration on Edge WACV 2024

Exploiting Label Skews in Federated Learning with Model Concatenation AAAI 2024

Stepping Forward on the Last Mile NIPS 2024

WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright Protection ACL 2024

GPT vs RETRO: Exploring the Intersection of Retrieval and Parameter-Efficient Fine-Tuning EMNLP 2024

Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations ACL 2024

Practical Hybrid Gradient Compression for Federated Learning Systems IJCAI 2024

QEFT: Quantization for Efficient Fine-Tuning of LLMs EMNLP 2024

Switchable Representation Learning Framework With Self-Compatibility CVPR 2023

Practical Network Acceleration With Tiny Sets CVPR 2023

CP3: Channel Pruning Plug-In for Point-Based Networks CVPR 2023

Defending Against Patch-Based Backdoor Attacks on Self-Supervised Learning CVPR 2023

MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering CVPR 2023

Re2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization CVPR 2023

Pruning Parameterization With Bi-Level Optimization for Efficient Semantic Segmentation on the Edge CVPR 2023