model compression

3283 papers

Also known as

MC

Co-occurring keywords

knowledge distillation (3680), large language model (12755), neural network (6616), efficient computing (779), neural network optimization (1293), transfer learning (5442), convolutional neural network (4216), neural network pruning (265), language model (4573), parameter efficiency (415)

Papers

Exploring the Trade-Offs: Quantization Methods, Task Difficulty, and Model Size in Large Language Models From Edge to Giant IJCAI 2025

GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference IJCNLP 2025

PUP 3D-GS: Principled Uncertainty Pruning for 3D Gaussian Splatting CVPR 2025

FADA: Fast Diffusion Avatar Synthesis with Mixed-Supervised Multi-CFG Distillation CVPR 2025

FBQuant: FeedBack Quantization for Large Language Models IJCAI 2025

Logic Distillation: Learning from Code Function by Function for Decision-making Tasks IJCAI 2025

DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture CVPR 2025

Integrating Independent Layer-Wise Rank Selection with Low-Rank SVD Training for Model Compression: A Theory-Driven Approach IJCAI 2025

EasyDistill: A Comprehensive Toolkit for Effective Knowledge Distillation of Large Language Models EMNLP 2025

Scaling Down, Serving Fast: Compressing and Deploying Efficient LLMs for Recommendation Systems EMNLP 2025

VL2Lite: Task-Specific Knowledge Distillation from Large Vision-Language Models to Lightweight Networks CVPR 2025

Multilingual Iterative Model Pruning: What Matters? IJCNLP 2025

LangCompress: Language-Aware Compression of Large Language Models IJCNLP 2025

Variance-Based Pruning for Accelerating and Compressing Trained Networks ICCV 2025

Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation AAAI 2025

Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization ICLR 2025

EA-ViT: Efficient Adaptation for Elastic Vision Transformer ICCV 2025

AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models EMNLP 2025

One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models CVPR 2025

Efficient Diffusion as Low Light Enhancer CVPR 2025

Toward Adaptive Large Language Models Structured Pruning via Hybrid-grained Weight Importance Assessment AAAI 2025

Parameter-Efficient Fine-Tuning with Circulant and Diagonal Vectors IJCAI 2025

Boost Embodied AI Models with Robust Compression Boundary IJCAI 2025

HydraOpt: Navigating the Efficiency-Performance Trade-off of Adapter Merging EMNLP 2025

Aggregation Mechanism Based Graph Heterogeneous Networks Distillation IJCAI 2025