conftrace
visual question answering
1000 papers
Also known as
VQA
Co-occurring keywords
multimodal learning (4622)
vision-language model (2235)
image captioning (728)
vision language model (752)
multi-modal learning (1276)
multimodal large language model (865)
large language model (12755)
visual reasoning (479)
attention mechanism (3975)
benchmark evaluation (1539)
Papers
EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images
NIPS 2023
Affection: Learning Affective Explanations for Real-World Visual Data
CVPR 2023
LoRA: A Logical Reasoning Augmented Dataset for Visual Question Answering
NIPS 2023
Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task
AAAI 2023
Barlow Constrained Optimization for Visual Question Answering
WACV 2023
Guiding Visual Question Answering With Attention Priors
WACV 2023
VLC-BERT: Visual Question Answering With Contextualized Commonsense Knowledge
WACV 2023
Modular Visual Question Answering via Code Generation
ACL 2023
Why Did the Chicken Cross the Road? Rephrasing and Analyzing Ambiguous Questions in VQA
ACL 2023
Language Is Not All You Need: Aligning Perception with Language Models
NIPS 2023
LAVIS: A One-stop Library for Language-Vision Intelligence
ACL 2023
S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning
CVPR 2023
RMLVQA: A Margin Loss Approach for Visual Question Answering With Language Biases
CVPR 2023
I Can't Believe There's No Images! Learning Visual Tasks Using only Language Supervision
ICCV 2023
MPMQA: Multimodal Question Answering on Product Manuals
AAAI 2023
mRedditSum: A Multimodal Abstractive Summarization Dataset of Reddit Threads with Images
EMNLP 2023
Large Language Models are Visual Reasoning Coordinators
NIPS 2023
TOA: Task-oriented Active VQA
NIPS 2023
Exploring the Effect of Primitives for Compositional Generalization in Vision-and-Language
CVPR 2023
LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day
NIPS 2023
Implicit Differentiable Outlier Detection Enable Robust Deep Multimodal Analysis
NIPS 2023
REVEAL: Retrieval-Augmented Visual-Language Pre-Training With Multi-Source Multimodal Knowledge Memory
CVPR 2023
VQACL: A Novel Visual Question Answering Continual Learning Setting
CVPR 2023
Improving Selective Visual Question Answering by Learning From Your Peers
CVPR 2023
Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge
ACL 2023