Model Compression

1928 directly classified papers

Papers

LoSiA: Efficient High-Rank Fine-Tuning via Subnet Localization and Optimization EMNLP 2025

Efficient Speech Translation through Model Compression and Knowledge Distillation ACL 2025

QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models EMNLP 2025

DART: Distilling Autoregressive Reasoning to Silent Thought EMNLP 2025

TBA at BEA 2025 Shared Task: Transfer-Learning from DARE-TIES Merged Models for the Pedagogical Ability Assessment of LLM-Powered Math Tutors ACL 2025

Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency NAACL 2025

RESF: Regularized-Entropy-Sensitive Fingerprinting for Black-Box Tamper Detection of Large Language Models EMNLP 2025

QSpec: Speculative Decoding with Complementary Quantization Schemes EMNLP 2025

Low-Rank Interconnected Adaptation across Layers ACL 2025

DisComp: A Two-Stage Prompt Optimization Framework Combining Task-Agnostic and Task-Aware Compression NAACL 2025

Q-Mamba: Towards more efficient Mamba models via post-training quantization ACL 2025

TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning ACL 2025

CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter ACL 2025

CoT-Valve: Length-Compressible Chain-of-Thought Tuning ACL 2025

Mr. Snuffleupagus at SemEval-2025 Task 4: Unlearning Factual Knowledge from LLMs Using Adaptive RMU ACL 2025

Variable Layerwise Quantization: A Simple and Effective Approach to Quantize LLMs ACL 2025

RQT: Hierarchical Residual Quantization for Multi-Model Compression ACL 2025

ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning ACL 2025

WET: Overcoming Paraphrasing Vulnerabilities in Embeddings-as-a-Service with Linear Transformation Watermarks ACL 2025

GIL-IIMAS UNAM at SemEval-2025 Task 4: LA-Min(E): LLM Unlearning Approaches Under Function Minimizing Evaluation Constraints ACL 2025

TWIST: Text-encoder Weight-editing for Inserting Secret Trojans in Text-to-Image Models ACL 2025

HIGGS: Pushing the Limits of Large Language Model Quantization via the Linearity Theorem NAACL 2025

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect ACL 2025

Not All Adapters Matter: Selective Adapter Freezing for Memory-Efficient Fine-Tuning of Language Models NAACL 2025

CLaSp: In-Context Layer Skip for Self-Speculative Decoding ACL 2025