2018
ACL
SNAG: Spoken Narratives and Gaze Dataset
Abstract
Humans rely on multiple sensory modalities when examining and reasoning over images. In this paper, we describe a new multimodal dataset that consists of gaze measurements and spoken descriptions collected in parallel during an image inspection task. The task was performed by multiple participants on 100 general-domain images showing everyday objects and activities. We demonstrate the usefulness of the dataset by applying an existing visual-linguistic data fusion framework in order to label important image regions with appropriate linguistic labels.