model compression

3283 papers

Also known as

MC

Co-occurring keywords

knowledge distillation (3680), large language model (12755), neural network (6616), efficient computing (779), neural network optimization (1293), transfer learning (5442), convolutional neural network (4216), neural network pruning (265), language model (4573), parameter efficiency (415)

Papers

Federated Model Heterogeneous Matryoshka Representation Learning NIPS 2024

Agile Multi-Source-Free Domain Adaptation AAAI 2024

SparseFlow: Accelerating Transformers by Sparsifying Information Flows ACL 2024

Revisiting Knowledge Distillation for Autoregressive Language Models ACL 2024

Simple and Fast Distillation of Diffusion Models NIPS 2024

HydraViT: Stacking Heads for a Scalable ViT NIPS 2024

DistilVPR: Cross-Modal Knowledge Distillation for Visual Place Recognition AAAI 2024

Layer-Adaptive State Pruning for Deep State Space Models NIPS 2024

BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts NIPS 2024

UniADS: Universal Architecture-Distiller Search for Distillation Gap AAAI 2024

3-in-1: 2D Rotary Adaptation for Efficient Finetuning, Efficient Batching and Composability NIPS 2024

Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models ACL 2024

LLM in a flash: Efficient Large Language Model Inference with Limited Memory ACL 2024

LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding ACL 2024

DB-LLM: Accurate Dual-Binarization for Efficient LLMs ACL 2024

ResLoRA: Identity Residual Mapping in Low-Rank Adaption ACL 2024

AS-ES Learning: Towards efficient CoT learning in small models ACL 2024

A Comprehensive Evaluation of Quantization Strategies for Large Language Models ACL 2024

Differentially Private Knowledge Distillation via Synthetic Text Generation ACL 2024

Sparsity-Accelerated Training for Large Language Models ACL 2024

Generative Model-Based Feature Knowledge Distillation for Action Recognition AAAI 2024

Practical Privacy-Preserving MLaaS: When Compressive Sensing Meets Generative Networks AAAI 2024

A Lightweight U-like Network Utilizing Neural Memory Ordinary Differential Equations for Slimming the Decoder IJCAI 2024

AQ-DETR: Low-Bit Quantized Detection Transformer with Auxiliary Queries AAAI 2024

S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training NIPS 2024