Papers
Extracting General-use Transformers for Low-resource Languages via Knowledge Distillation
COLING 2025
ZigZagKV: Dynamic KV Cache Compression for Long-context Modeling based on Layer Uncertainty
COLING 2025
Revisiting Disparity from Dual-Pixel Images: Physics-Informed Lightweight Depth Estimation
WACV 2025
Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models
COLING 2025
Iterative Structured Knowledge Distillation: Optimizing Language Models Through Layer-by-Layer Distillation
COLING 2025
DP-FROST: Differentially Private Fine-tuning of Pre-trained Models with Freezing Model Parameters
COLING 2025
Enhancing One-Shot Pruned Pre-trained Language Models through Sparse-Dense-Sparse Mechanism
COLING 2025