EMNLP 2023

Baby Llama: knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty