Papers
Context-Aware Assistant Selection for Improved Inference Acceleration with Large Language Models
EMNLP 2024
PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning
EMNLP 2024
Pruning before Fine-tuning: A Retraining-free Compression Framework for Pre-trained Language Models
COLING 2024
Probe Then Retrieve and Reason: Distilling Probing and Reasoning Capabilities into Smaller Language Models
COLING 2024
Multilingual Brain Surgeon: Large Language Models Can Be Compressed Leaving No Language behind
COLING 2024
Efficient Audio Captioning with Encoder-Level Knowledge Distillation
INTERSPEECH 2024
DeltaDEQ: Exploiting Heterogeneous Convergence for Accelerating Deep Equilibrium Iterations
NeurIPS 2024
FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation
INTERSPEECH 2024
EAVE: Efficient Product Attribute Value Extraction via Lightweight Sparse-layer Interaction
EMNLP 2024