Model Compression

1503 directly classified papers

Papers

EPSD: Early Pruning with Self-Distillation for Efficient Model Compression AAAI 2024

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding ACL 2024

SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention NIPS 2024

SequentialAttention++ for Block Sparsification: Differentiable Pruning Meets Combinatorial Optimization NIPS 2024

AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models NIPS 2024

CoMERA: Computing- and Memory-Efficient Training via Rank-Adaptive Tensor Optimization NIPS 2024

Structured Unrestricted-Rank Matrices for Parameter Efficient Finetuning NIPS 2024

AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment NIPS 2024

UPS: Unified Projection Sharing for Lightweight Single-Image Super-resolution and Beyond NIPS 2024

TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks NIPS 2024

Cross-model Control: Improving Multiple Large Language Models in One-time Training NIPS 2024

Understanding and Minimising Outlier Features in Transformer Training NIPS 2024

Toward Efficient Inference for Mixture of Experts NIPS 2024

DISP-LLM: Dimension-Independent Structural Pruning for Large Language Models NIPS 2024

TinyLUT: Tiny Look-Up Table for Efficient Image Restoration at the Edge NIPS 2024

2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution NIPS 2024

Data Shunt: Collaboration of Small and Large Models for Lower Costs and Better Performance AAAI 2024

Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model ACL 2024

Compressing Large Language Models using Low Rank and Low Precision Decomposition NIPS 2024

Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation NIPS 2024

How Sparse Can We Prune A Deep Network: A Fundamental Limit Perspective NIPS 2024

United We Stand, Divided We Fall: Fingerprinting Deep Neural Networks via Adversarial Trajectories NIPS 2024

Optimal and Approximate Adaptive Stochastic Quantization NIPS 2024

A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques ACL 2024

Reasons and Solutions for the Decline in Model Performance after Editing NIPS 2024