Papers
MMFT-BERT: Multimodal Fusion Transformer with BERT Encodings for Visual Question Answering
EMNLP 2020
CapWAP: Image Captioning with a Purpose
EMNLP 2020
STL-CQA: Structure-based Transformers with Localization and Encoding for Chart Question Answering
EMNLP 2020
VQA With No Questions-Answers Training
CVPR 2020