Papers
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
NeurIPS 2023
Efficient Training of Neural Transducer for Speech Recognition
INTERSPEECH 2022
Marginal Utility Diminishes: Exploring the Minimum Knowledge for BERT Knowledge Distillation
ACL 2021
GRAD-MATCH: Gradient Matching based Data Subset Selection for Efficient Deep Model Training
ICML 2021
Accelerating Training of Transformer-Based Language Models with Progressive Layer Dropping
NeurIPS 2020