conftrace_

knowledge distillation

3725 papers

Also known as

KD

Co-occurring keywords

model compression (3302), large language model (13587), transfer learning (5449), domain adaptation (4595), representation learning (6206), neural network (6616), language model (4599), catastrophic forgetting (958), continual learning (1181), contrastive learning (4032)

Papers

Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method ACL 2023

Modular Transformers: Compressing Transformers into Modularized Layers for Flexible Efficient Inference ACL 2023

A Study on Knowledge Distillation from Weak Teacher for Scaling Up Pre-trained Language Models ACL 2023

I2R’s End-to-End Speech Translation System for IWSLT 2023 Offline Shared Task ACL 2023

Continual Generalized Intent Discovery: Marching Towards Dynamic and Open-world Intent Recognition EMNLP 2023

Length-Adaptive Distillation: Customizing Small Language Model for Dynamic Token Pruning EMNLP 2023

Application of Knowledge Distillation to Multi-Task Speech Representation Learning INTERSPEECH 2023

Knowledge Distillation on Joint Task End-to-End Speech Translation INTERSPEECH 2023

FerKD: Surgical Label Adaptation for Efficient Distillation ICCV 2023

Connective Prediction for Implicit Discourse Relation Recognition via Knowledge Distillation ACL 2023

FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue ACL 2023

Local or Global: Selective Knowledge Assimilation for Federated Learning with Limited Labels ICCV 2023

Bayesian Optimization Meets Self-Distillation ICCV 2023

BiViT: Extremely Compressed Binary Vision Transformers ICCV 2023

Learning to Distill Global Representation for Sparse-View CT ICCV 2023

Efficient Unified Demosaicing for Bayer and Non-Bayer Patterned Image Sensors ICCV 2023

SEFD: Learning to Distill Complex Pose and Occlusion ICCV 2023

Distill n’ Explain: explaining graph neural networks using simple surrogates AISTATS 2023

A Teacher-Student Approach for Extracting Informative Speaker Embeddings From Speech Mixtures INTERSPEECH 2023

Soft Target-Enhanced Matching Framework for Deep Entity Matching AAAI 2023

Structure Aware Incremental Learning with Personalized Imitation Weights for Recommender Systems AAAI 2023

AIO-P: Expanding Neural Performance Predictors beyond Image Classification AAAI 2023

Auxiliary Modality Learning with Generalized Curriculum Distillation ICML 2023

Label-Guided Knowledge Distillation for Continual Semantic Segmentation on 2D Images and 3D Point Clouds ICCV 2023

Masked Autoencoders Are Stronger Knowledge Distillers ICCV 2023