Papers
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization
IJCNLP 2021
Marginal Utility Diminishes: Exploring the Minimum Knowledge for BERT Knowledge Distillation
IJCNLP 2021
Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor
IJCNLP 2021
Making DensePose Fast and Light
WACV 2021
AdaSGN: Adapting Joint Number and Model Size for Efficient Skeleton-Based Action Recognition
ICCV 2021
Does Knowledge Distillation Really Work?
NeurIPS 2021