transformer architecture
1555 papers
Also known as
TTE
STTR
TGT
DIT
ENT
BERT
DETR
ROBERTA
TA
Papers
Very Deep Self-Attention Networks for End-to-End Speech Recognition
INTERSPEECH 2019
The Unreasonable Effectiveness of Transformer Language Models in Grammatical Error Correction
ACL 2019
NLPRL at WAT2019: Transformer-based Tamil – English Indic Task Neural Machine Translation System
EMNLP 2019
Text Summarization with Pretrained Encoders
IJCNLP 2019
Widening the Representation Bottleneck in Neural Machine Translation with Lexical Shortcuts
ACL 2019
Adapting Transformer to End-to-End Spoken Language Translation
INTERSPEECH 2019
The Evolved Transformer
ICML 2019
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
ACL 2019