model compression

3283 papers

Also known as

MC

Co-occurring keywords

knowledge distillation (3680)
large language model (12755)
neural network (6616)
efficient computing (779)
neural network optimization (1293)
transfer learning (5442)
convolutional neural network (4216)
neural network pruning (265)
language model (4573)
parameter efficiency (415)

Papers

REPrune: Channel Pruning via Kernel Representative Selection AAAI 2024

Learning Performance Maximizing Ensembles with Explainability Guarantees AAAI 2024

Integer Is Enough: When Vertical Federated Learning Meets Rounding AAAI 2024

Building Variable-Sized Models via Learngene Pool AAAI 2024

Generative Model-Based Feature Knowledge Distillation for Action Recognition AAAI 2024

Practical Privacy-Preserving MLaaS: When Compressive Sensing Meets Generative Networks AAAI 2024

AQ-DETR: Low-Bit Quantized Detection Transformer with Auxiliary Queries AAAI 2024

BiPFT: Binary Pre-trained Foundation Transformer with Low-Rank Estimation of Binarization Residual Polynomials AAAI 2024

PTMQ: Post-training Multi-Bit Quantization of Neural Networks AAAI 2024

Progressively Knowledge Distillation via Re-parameterizing Diffusion Reverse Process AAAI 2024

ShareBERT: Embeddings Are Capable of Learning Hidden Layers AAAI 2024

Blind-Touch: Homomorphic Encryption-Based Distributed Neural Network Inference for Privacy-Preserving Fingerprint Authentication AAAI 2024

Hear You Say You: An Efficient Framework for Marine Mammal Sounds’ Classification AAAI 2024

VeriCompress: A Tool to Streamline the Synthesis of Verified Robust Compressed Neural Networks from Scratch AAAI 2024

Knowledge Transfer via Compact Model in Federated Learning (Student Abstract) AAAI 2024

Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models ACL 2024

LLM in a flash: Efficient Large Language Model Inference with Limited Memory ACL 2024

Knowledge Distillation from Monolingual to Multilingual Models for Intelligent and Interpretable Multilingual Emotion Detection ACL 2024

Structured Unrestricted-Rank Matrices for Parameter Efficient Finetuning NIPS 2024

Optimal and Approximate Adaptive Stochastic Quantization NIPS 2024

Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation NIPS 2024

Anchor-based Large Language Models ACL 2024

SPIN: Sparsifying and Integrating Internal Neurons in Large Language Models for Text Classification ACL 2024

ELAD: Explanation-Guided Large Language Models Active Distillation ACL 2024

LoRAPrune: Structured Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning ACL 2024