Model Compression

1503 directly classified papers

Papers

Variable Layerwise Quantization: A Simple and Effective Approach to Quantize LLMs ACL 2025

Parameter-Efficient Fine-Tuning via Circular Convolution ACL 2025

A Semantic-Aware Layer-Freezing Approach to Computation-Efficient Fine-Tuning of Language Models ACL 2025

Q-Mamba: Towards more efficient Mamba models via post-training quantization ACL 2025

AnalyticKWS: Towards Exemplar-Free Analytic Class Incremental Learning for Small-footprint Keyword Spotting ACL 2025

Assigning Distinct Roles to Quantized and Low-Rank Matrices Toward Optimal Weight Decomposition ACL 2025

ProcrustesGPT: Compressing LLMs with Structured Matrices and Orthogonal Transformations ACL 2025

Automated Fine-Grained Mixture-of-Experts Quantization ACL 2025

Enhancing AI-Driven Farming Advisory in Kenya with Efficient RAG Agents via Quantized Fine-Tuned Language Models ACL 2025

Towards compact and efficient Slovak summarization models ACL 2025

Efficient Speech Translation through Model Compression and Knowledge Distillation ACL 2025

From Teacher to Student: Tracking Memorization Through Model Distillation ACL 2025

MT2ST: Adaptive Multi-Task to Single-Task Learning ACL 2025

Aligning Sizes of Intermediate Layers by LoRA Adapter for Knowledge Distillation NAACL 2025

Mitigating Sequential Dependencies: A Survey of Algorithms and Systems for Generation-Refinement Frameworks in Autoregressive Models EMNLP 2025

Do We Really Need All Those Dimensions? An Intrinsic Evaluation Framework for Compressed Embeddings EMNLP 2025

Recover-LoRA: Data-Free Accuracy Recovery of Degraded Language Models via Low-Rank Adaptation EMNLP 2025

Large Language Models Are Overparameterized Text Encoders NAACL 2025

On-device System of Compositional Multi-tasking in Large Language Models EMNLP 2025

Random Conditioning for Diffusion Model Compression with Distillation CVPR 2025

Interpretable Generative Models through Post-hoc Concept Bottlenecks CVPR 2025

Edge-SD-SR: Low Latency and Parameter Efficient On-device Super-Resolution with Stable Diffusion via Bidirectional Conditioning CVPR 2025

LLMs on a Budget? Say HOLA EMNLP 2025

Memory-Efficient Backpropagation for Fine-Tuning LLMs on Resource-Constrained Mobile Devices EMNLP 2025

Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models RSS 2025