model compression

3283 papers

Also known as

MC

Co-occurring keywords

knowledge distillation (3680); large language model (12755); neural network (6616); efficient computing (779); neural network optimization (1293); transfer learning (5442); convolutional neural network (4216); neural network pruning (265); language model (4573); parameter efficiency (415)

Papers

LoRA-GA: Low-Rank Adaptation with Gradient Approximation NIPS 2024

ScaleKD: Strong Vision Transformers Could Be Excellent Teachers NIPS 2024

SpikedAttention: Training-Free and Fully Spike-Driven Transformer-to-SNN Conversion with Winner-Oriented Spike Shift for Softmax Operation NIPS 2024

On the Inductive Bias of Stacking Towards Improving Reasoning NIPS 2024

How Sparse Can We Prune A Deep Network: A Fundamental Limit Perspective NIPS 2024

QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation NIPS 2024

Consistency Policy: Accelerated Visuomotor Policies via Consistency Distillation RSS 2024

Edge Inference With Fully Differentiable Quantized Mixed Precision Neural Networks WACV 2024

Mini but Mighty: Finetuning ViTs With Mini Adapters WACV 2024

Hundred-Kilobyte Lookup Tables for Efficient Single-Image Super-Resolution IJCAI 2024

D3ETR: Decoder Distillation for Detection Transformer IJCAI 2024

Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought COLING 2024

Language-Specific Pruning for Efficient Reduction of Large Language Models COLING 2024

On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL NAACL 2024

Adaptive Rank Selections for Low-Rank Approximation of Language Models NAACL 2024

MT-PATCHER: Selective and Extendable Knowledge Distillation from Large Language Models for Machine Translation NAACL 2024

PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation EACL 2024

Improved Techniques for Quantizing Deep Networks With Adaptive Bit-Widths WACV 2024

QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs NIPS 2024

CIFD: Controlled Information Flow to Enhance Knowledge Distillation NIPS 2024

PrivCirNet: Efficient Private Inference via Block Circulant Transformation NIPS 2024

FIARSE: Model-Heterogeneous Federated Learning via Importance-Aware Submodel Extraction NIPS 2024

S2HPruner: Soft-to-Hard Distillation Bridges the Discretization Gap in Pruning NIPS 2024

Learning from Offline Foundation Features with Tensor Augmentations NIPS 2024

VeXKD: The Versatile Integration of Cross-Modal Fusion and Knowledge Distillation for 3D Perception NIPS 2024