model compression

3283 papers

Also known as

MC

Co-occurring keywords

knowledge distillation (3680), large language model (12755), neural network (6616), efficient computing (779), neural network optimization (1293), transfer learning (5442), convolutional neural network (4216), neural network pruning (265), language model (4573), parameter efficiency (415)

Papers

FISTAPruner: Layer-wise Post-training Pruning for Large Language Models EMNLP 2025

StepER: Step-wise Knowledge Distillation for Enhancing Reasoning Ability in Multi-Step Retrieval-Augmented Language Models EMNLP 2025

LeanK: Learnable K Cache Channel Pruning for Efficient Decoding EMNLP 2025

Word Salad Chopper: Reasoning Models Waste A Ton Of Decoding Budget On Useless Repetitions, Self-Knowingly EMNLP 2025

AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models EMNLP 2025

EasyDistill: A Comprehensive Toolkit for Effective Knowledge Distillation of Large Language Models EMNLP 2025

Scaling Down, Serving Fast: Compressing and Deploying Efficient LLMs for Recommendation Systems EMNLP 2025

Low-Rank Interconnected Adaptation across Layers ACL 2025

CodecNeRF: Toward Fast Encoding and Decoding, Compact, and High-quality Novel-view Synthesis AAAI 2025

Efficient Vocabulary Reduction for Small Language Models COLING 2025

Mitigating Bias in Machine Learning: A Comprehensive Review and Novel Approaches AAAI 2025

Automated Fine-Grained Mixture-of-Experts Quantization ACL 2025

Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers CVPR 2025

Concept Distillation from Strong to Weak Models via Hypotheses-to-Theories Prompting NAACL 2025

VeriFastScore: Speeding up long-form factuality evaluation EMNLP 2025

SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models NAACL 2025

Towards compact and efficient Slovak summarization models ACL 2025

QPruner: Probabilistic Decision Quantization for Structured Pruning in Large Language Models NAACL 2025

Scalable and Trustworthy Learning in Heterogeneous Networks AAAI 2025

How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? NAACL 2025

DadmaTools V2: an Adapter-Based Natural Language Processing Toolkit for the Persian Language COLING 2025

Efficient Speech Translation through Model Compression and Knowledge Distillation ACL 2025

MPPQ: Enhancing Post-Training Quantization for LLMs via Mixed Supervision, Proxy Rounding, and Pre-Searching IJCAI 2025

RankAdaptor: Hierarchical Rank Allocation for Efficient Fine-Tuning Pruned LLMs via Performance Model NAACL 2025

AAIG at GenAI Detection Task 1: Exploring Syntactically-Aware, Resource-Efficient Small Autoregressive Decoders for AI Content Detection COLING 2025