conftrace_

knowledge distillation

3725 papers

Also known as

KD

Co-occurring keywords

model compression (3302); large language model (13587); transfer learning (5449); domain adaptation (4595); representation learning (6206); neural network (6616); language model (4599); catastrophic forgetting (958); continual learning (1181); contrastive learning (4032)

Papers

An LLM-Enhanced Adversarial Editing System for Lexical Simplification COLING 2024

Class-Incremental Few-Shot Event Detection COLING 2024

Distillation with Explanations from Large Language Models COLING 2024

Effective Distillation of Table-based Reasoning Ability from LLMs COLING 2024

Evolving Knowledge Distillation with Large Language Models and Active Learning COLING 2024

Sinkhorn Distance Minimization for Knowledge Distillation COLING 2024

TAeKD: Teacher Assistant Enhanced Knowledge Distillation for Closed-Source Multilingual Neural Machine Translation COLING 2024

Task-agnostic Distillation of Encoder-Decoder Language Models COLING 2024

When Babies Teach Babies: Can student knowledge sharing outperform Teacher-Guided Distillation on small datasets? CONLL 2024

Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers (Student Abstract) AAAI 2024

COSIGN: Contextual Facts Guided Generation for Knowledge Graph Completion NAACL 2024

A Small and Fast BERT for Chinese Medical Punctuation Restoration INTERSPEECH 2024

Online Knowledge Distillation of Decoder-Only Large Language Models for Efficient Speech Recognition INTERSPEECH 2024

MISA: MIning Saliency-Aware Semantic Prior for Box Supervised Instance Segmentation IJCAI 2024

Knowledge Transfer via Compact Model in Federated Learning (Student Abstract) AAAI 2024

Large Language Model as a Policy Teacher for Training Reinforcement Learning Agents IJCAI 2024

Continual Compositional Zero-Shot Learning IJCAI 2024

Align-to-Distill: Trainable Attention Alignment for Knowledge Distillation in Neural Machine Translation COLING 2024

Self-Knowledge Distillation for Knowledge Graph Embedding COLING 2024

MadEye: Boosting Live Video Analytics Accuracy with Adaptive Camera Configurations NSDI 2024

Choosy Babies Need One Coach: Inducing Mode-Seeking Behavior in BabyLlama with Reverse KL Divergence CONLL 2024

GenDistiller: Distilling Pre-trained Language Models based on an Autoregressive Generative Model INTERSPEECH 2024

Data Shunt: Collaboration of Small and Large Models for Lower Costs and Better Performance AAAI 2024

Prophecy Distillation for Boosting Abstractive Summarization COLING 2024

Deep Classifier Mimicry without Data Access AISTATS 2024