Model Compression

1674 directly classified papers

Papers

Reasons and Solutions for the Decline in Model Performance after Editing NIPS 2024

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length NIPS 2024

2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution NIPS 2024

Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs ACL 2024

Shaving Weights with Occam's Razor: Bayesian Sparsification for Neural Networks using the Marginal Likelihood NIPS 2024

Expanding Sparse Tuning for Low Memory Usage NIPS 2024

SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention NIPS 2024

Value-Based Deep Multi-Agent Reinforcement Learning with Dynamic Sparse Training NIPS 2024

Orthogonal Adaptation for Modular Customization of Diffusion Models CVPR 2024

Resource-Efficient Transformer Pruning for Finetuning of Large Models CVPR 2024

LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection CVPR 2024

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding ACL 2024

LLM in a flash: Efficient Large Language Model Inference with Limited Memory ACL 2024

How Far Can We Compress Instant-NGP-Based NeRF? CVPR 2024

Towards More Accurate Diffusion Model Acceleration with A Timestep Tuner CVPR 2024

ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition ACL 2024

On the Impact of Calibration Data in Post-training Quantization and Pruning ACL 2024

USDN: A Unified Sample-Wise Dynamic Network With Mixed-Precision and Early-Exit WACV 2024

All Rivers Run to the Sea: Private Learning with Asymmetric Flows CVPR 2024

Adaptive Depth Networks with Skippable Sub-Paths NIPS 2024

Dodo: Dynamic Contextual Compression for Decoder-only LMs ACL 2024

Acquiring Clean Language Models from Backdoor Poisoned Datasets by Downscaling Frequency Space ACL 2024

Learning To Compose SuperWeights for Neural Parameter Allocation Search WACV 2024

NACL: A General and Effective KV Cache Eviction Framework for LLM at Inference Time ACL 2024

SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget ACL 2024