Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
MMTM: Multimodal Transfer Module for CNN Fusion
CVPR 2020
What Makes Training Multi-Modal Classification Networks Hard?
CVPR 2020
12-in-1: Multi-Task Vision and Language Representation Learning
CVPR 2020
Object Relational Graph With Teacher-Recommended Learning for Video Captioning
CVPR 2020
Texture and Shape Biased Two-Stream Networks for Clothing Classification and Attribute Recognition
CVPR 2020
Temporal-Context Enhanced Detection of Heavily Occluded Pedestrians
CVPR 2020
Learning User Representations for Open Vocabulary Image Hashtag Prediction
CVPR 2020
Referring Image Segmentation via Cross-Modal Progressive Comprehension
CVPR 2020
Image-Chat: Engaging Grounded Conversations
ACL 2020
Multimodal and Multiresolution Speech Recognition with Transformers
ACL 2020
Human Consensus-Oriented Image Captioning
IJCAI 2020
TLPG-Tracker: Joint Learning of Target Localization and Proposal Generation for Visual Tracking
IJCAI 2020
Weakly Supervised Few-shot Object Segmentation using Co-Attention with Visual and Semantic Embeddings
IJCAI 2020
Arbitrary Talking Face Generation via Attentional Audio-Visual Coherence Learning
IJCAI 2020
BlueMemo: Depression Analysis through Twitter Posts
IJCAI 2020
Detecting Entailment in Code-Mixed Hindi-English Conversations
EMNLP 2020
MAST: Multimodal Abstractive Summarization with Trimodal Hierarchical Attention
EMNLP 2020
Unsupervised Keyword Extraction for Full-Sentence VQA
EMNLP 2020
Building a Bridge: A Method for Image-Text Sarcasm Detection Without Pretraining on Image-Text Data
EMNLP 2020
Catplayinginthesnow: Impact of Prior Segmentation on a Model of Visually Grounded Speech
EMNLP 2020
Utilizing Multimodal Feature Consistency to Detect Adversarial Examples on Clinical Summaries
EMNLP 2020
Deep Attentive Learning for Stock Movement Prediction From Social Media Text and Company Correlations
EMNLP 2020
Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings
EMNLP 2020
HENIN: Learning Heterogeneous Neural Interaction Networks for Explainable Cyberbullying Detection on Social Media
EMNLP 2020
Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension
ACL 2020