transformer architecture

1555 papers

Also known as

TTE, STTR, TGT, DIT, ENT, BERT, DETR, ROBERTA, TA

Co-occurring keywords

attention mechanism (3975), neural machine translation (2310), transformer model (1988), representation learning (6174), language model (4573), neural network (6616), self-attention mechanism (350), multimodal learning (4622), machine translation (2472), large language model (12755)

Papers

Visualizing and Measuring the Geometry of BERT NIPS 2019

Very Deep Self-Attention Networks for End-to-End Speech Recognition INTERSPEECH 2019

Tree Transformer: Integrating Tree Structures into Self-Attention EMNLP 2019

Question Answering Using Hierarchical Attention on Top of BERT Features EMNLP 2019

SYSTRAN @ WAT 2019: Russian-Japanese News Commentary task EMNLP 2019

FastSpeech: Fast, Robust and Controllable Text to Speech NIPS 2019

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation EMNLP 2019

From Balustrades to Pierre Vinken: Looking for Syntax in Transformer Self-Attentions ACL 2019

The Unreasonable Effectiveness of Transformer Language Models in Grammatical Error Correction ACL 2019

NLPRL at WAT2019: Transformer-based Tamil – English Indic Task Neural Machine Translation System EMNLP 2019

Text Summarization with Pretrained Encoders IJCNLP 2019

Latent Part-of-Speech Sequences for Neural Machine Translation IJCNLP 2019

Widening the Representation Bottleneck in Neural Machine Translation with Lexical Shortcuts ACL 2019

Multi-Source Transformer for Kazakh-Russian-English Neural Machine Translation ACL 2019

Adapting Transformer to End-to-End Spoken Language Translation INTERSPEECH 2019

The Evolved Transformer ICML 2019

The MLLP-UPV Spanish-Portuguese and Portuguese-Spanish Machine Translation Systems for WMT19 Similar Language Translation Task ACL 2019

NICT’s Machine Translation Systems for the WMT19 Similar Language Translation Task ACL 2019

Joey NMT: A Minimalist NMT Toolkit for Novices EMNLP 2019

Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned ACL 2019

Episodic Memory Reader: Learning What to Remember for Question Answering from Streaming Data ACL 2019

Syntactically Supervised Transformers for Faster Neural Machine Translation ACL 2019

Adaptive Attention Span in Transformers ACL 2019

BSC Participation in the WMT Translation of Biomedical Abstracts ACL 2019

Humor Detection: A Transformer Gets the Last Laugh EMNLP 2019