Multi-Modal Learning

3194 directly classified papers

Papers

Multi-Perspective Relevance Matching with Hierarchical ConvNets for Social Media Search AAAI 2019

KVQA: Knowledge-Aware Visual Question Answering AAAI 2019

A Layer-Based Sequential Framework for Scene Generation with GANs AAAI 2019

Hierarchical Photo-Scene Encoder for Album Storytelling AAAI 2019

Differential Networks for Visual Question Answering AAAI 2019

Multilevel Language and Vision Integration for Text-to-Clip Retrieval AAAI 2019

Adversarial Semantic Alignment for Improved Image Captions CVPR 2019

Answer Them All! Toward Universal Visual Question Answering Models CVPR 2019

Unsupervised Multi-Modal Neural Machine Translation CVPR 2019

Complete the Look: Scene-Based Complementary Product Recommendation CVPR 2019

The IIIT-H Gujarati-English Machine Translation System for WMT19 ACL 2019

Faithful Multimodal Explanation for Visual Question Answering ACL 2019

Exploring Deep Multimodal Fusion of Text and Photo for Hate Speech Classification ACL 2019

Multimodal Logical Inference System for Visual-Textual Entailment ACL 2019

Cross-domain and Cross-lingual Abusive Language Detection: A Hybrid Approach with Deep Learning and a Multilingual Lexicon ACL 2019

A Strong and Robust Baseline for Text-Image Matching ACL 2019

Multimodal Transformer for Unaligned Multimodal Language Sequences ACL 2019

Distilling Translations with Visual Awareness ACL 2019

Bridging by Word: Image Grounded Vocabulary Construction for Visual Captioning ACL 2019

CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication ACL 2019

Informative Image Captioning with External Sources of Information ACL 2019

Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog ACL 2019

Symbolic Inductive Bias for Visually Grounded Learning of Spoken Language ACL 2019

What Should I Ask? Using Conversationally Informative Rewards for Goal-oriented Visual Dialog ACL 2019

A Corpus for Reasoning about Natural Language Grounded in Photographs ACL 2019