Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
An End-To-End Network for Generating Social Relationship Graphs
CVPR 2019
Learning Cross-Modal Embeddings With Adversarial Networks for Cooking Recipes and Food Images
CVPR 2019
Weakly Supervised Video Moment Retrieval From Text Queries
CVPR 2019
Towards Multimodal Sarcasm Detection (An _Obviously_ Perfect Paper)
ACL 2019
Robust Neural Machine Translation with Joint Textual and Phonetic Embedding
ACL 2019
Multimodal, Multilingual Grapheme-to-Phoneme Conversion for Low-Resource Languages
EMNLP 2019
Context-aware Interactive Attention for Multi-modal Sentiment and Emotion Analysis
EMNLP 2019
Multi-Head Attention with Diversity for Learning Grounded Multilingual Multimodal Representations
EMNLP 2019
Multi-Interactive Memory Network for Aspect Based Multimodal Sentiment Analysis
AAAI 2019
Joint Representation Learning for Multi-Modal Transportation Recommendation
AAAI 2019
Play as You Like: Timbre-Enhanced Multi-Modal Music Style Transfer
AAAI 2019
Y2Seq2Seq: Cross-Modal Representation Learning for 3D Shape and Text by Joint Reconstruction and Prediction of View and Word Sequences
AAAI 2019
Categorizing and Inferring the Relationship between the Text and Image of Twitter Posts
ACL 2019
Situational Fusion of Visual Representation for Visual Navigation
ICCV 2019
Factor Graph Attention
CVPR 2019
Information Maximizing Visual Question Generation
CVPR 2019
Heterogeneous Memory Enhanced Multimodal Attention Model for Video Question Answering
CVPR 2019
MUREL: Multimodal Relational Reasoning for Visual Question Answering
CVPR 2019
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval
CVPR 2019
2.5D Visual Sound
CVPR 2019
Lipper: Synthesizing Thy Speech Using Multi-View Lipreading
AAAI 2019
Neural Collective Graphical Models for Estimating Spatio-Temporal Population Flow from Aggregated Data
AAAI 2019
Hashtag Recommendation for Photo Sharing Services
AAAI 2019
Cross-Modal Commentator: Automatic Machine Commenting Based on Cross-Modal Information
ACL 2019
Detecting Mismatch Between Speech and Transcription Using Cross-Modal Attention
INTERSPEECH 2019