model compression

3283 papers

Also known as

MC

Co-occurring keywords

knowledge distillation (3680), large language model (12755), neural network (6616), efficient computing (779), neural network optimization (1293), transfer learning (5442), convolutional neural network (4216), neural network pruning (265), language model (4573), parameter efficiency (415)

Papers

Beyond One-Step Distillation: Bridging the Capacity Gap in Small Language Models via Multi-Step Knowledge Transfer EACL 2026

Iterative Structured Pruning for Large Language Models with Multi-Domain Calibration EACL 2026

KV Pareto: Systems-Level Optimization of KV Cache and Model Compression for Long Context Inference EACL 2026

Hala Technical Report: Building Arabic-Centric Instruction & Translation Models at Scale EACL 2026

DITTO: A Spoofing Attack Framework on Watermarked LLMs via Knowledge Distillation EACL 2026

AfriNLLB: Efficient Translation Models for African Languages EACL 2026

Transferable Backdoor Attacks for Code Models via Sharpness-Aware Adversarial Perturbation AAAI 2026

NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representations AAAI 2026

Distillation Dynamics: Towards Understanding Feature-Based Distillation in Vision Transformers AAAI 2026

Turbo-VAED: Fast and Stable Transfer of Video-VAEs to Mobile Devices AAAI 2026

TOP-RL: Task-Optimized Progressive Token Pruning with Reinforcement Learning for Vision Language Models AAAI 2026

SPEED-Q: Staged Processing with Enhanced Distillation Towards Efficient Low-Bit On-Device VLM Quantization AAAI 2026

Stratified Knowledge-Density Super-Network for Scalable Vision Transformers AAAI 2026

ReLUPruner: Rethinking ReLU Importance with Taylor Expansion for Efficient Private Inference AAAI 2026

Direction Sensitivity–Based Knowledge Distillation: Optimization-Aware Low-Rank Knowledge Transfer AAAI 2026

OccamVTS: Distilling Vision Models to 1% Parameters for Time Series Forecasting AAAI 2026

DQT: Dynamic Quantization Training via Dequantization-Free Nested Integer Arithmetic AAAI 2026

Credal Ensemble Distillation for Uncertainty Quantification AAAI 2026

Federated CLIP for Resource-Efficient Heterogeneous Medical Image Classification AAAI 2026

RoSA: Enhancing Parameter-Efficient Fine-Tuning via RoPE-aware Selective Adaptation in Large Language Models AAAI 2026

Prune&Comp: Free Lunch for Layer-Pruned LLMs via Iterative Pruning with Magnitude Compensation AAAI 2026

CMedBench: A Comprehensive Benchmark for Efficient Medical Large Language Models AAAI 2026

HALO: Hardware-Aware Quantization with Low Critical-Path-Delay Weights for LLM Acceleration AAAI 2026

Learnable Permutation for Structured Sparsity on Transformer Models AAAI 2026

C-GNN-PRUNE: A Unified Graph-Based Framework for Structure-Aware Pruning of Mixture-of-Experts Models AAAI 2026