transformer architecture
1555 papers
Also known as
TTE
STTR
TGT
DIT
ENT
BERT
DETR
ROBERTA
TA
Papers
A Closer Look at Parameter Contributions When Training Neural Language and Translation Models
COLING 2022
ByT5 model for massively multilingual grapheme-to-phoneme conversion
INTERSPEECH 2022