transformer architecture
1555 papers
Also known as
TTE
STTR
TGT
DIT
ENT
BERT
DETR
ROBERTA
TA
Papers
Attention is Turing-Complete
JMLR 2021
Generic Attention-Model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers
ICCV 2021