Papers
Class-Incremental Few-Shot Event Detection
COLING 2024
A Small and Fast BERT for Chinese Medical Punctuation Restoration
INTERSPEECH 2024
Online Knowledge Distillation of Decoder-Only Large Language Models for Efficient Speech Recognition
INTERSPEECH 2024
Continual Compositional Zero-Shot Learning
IJCAI 2024
Align-to-Distill: Trainable Attention Alignment for Knowledge Distillation in Neural Machine Translation
COLING 2024
Choosy Babies Need One Coach: Inducing Mode-Seeking Behavior in BabyLlama with Reverse KL Divergence
CoNLL 2024
GenDistiller: Distilling Pre-trained Language Models based on an Autoregressive Generative Model
INTERSPEECH 2024
Data Shunt: Collaboration of Small and Large Models for Lower Costs and Better Performance
AAAI 2024
Deep Classifier Mimicry without Data Access
AISTATS 2024