visual question answering
1000 papers
Also known as
VQAI
OK-VQA
VQA
VIDEOQA
TEXTVQA
IMAGEQA
Papers
Multi-Level Information Retrieval Augmented Generation for Knowledge-based Visual Question Answering
EMNLP 2024
Mask4Align: Aligned Entity Prompting with Color Masks for Multi-Entity Localization Problems
CVPR 2024
Synthesize Step-by-Step: Tools, Templates and LLMs as Data Generators for Reasoning-Based Chart VQA
CVPR 2024
Large Language Models Know What is Key Visual Entity: An LLM-assisted Multimodal Retrieval for VQA
EMNLP 2024