Papers
Faster Depth-Adaptive Transformers
AAAI 2021
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization
IJCNLP 2021
Marginal Utility Diminishes: Exploring the Minimum Knowledge for BERT Knowledge Distillation
IJCNLP 2021