Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
Caption Enriched Samples for Improving Hateful Memes Detection
EMNLP 2021
Inflate and Shrink: Enriching and Reducing Interactions for Fast Text-Image Retrieval
EMNLP 2021
Visually Grounded Reasoning across Languages and Cultures
EMNLP 2021
A Web Scale Entity Extraction System
EMNLP 2021
Cross-Modal Retrieval Augmentation for Multi-Modal Classification
EMNLP 2021
Generating Mammography Reports from Multi-view Mammograms with BERT
EMNLP 2021
Compositional Networks Enable Systematic Generalization for Grounded Language Understanding
EMNLP 2021
Saliency-based Multi-View Mixed Language Training for Zero-shot Cross-lingual Classification
EMNLP 2021
Entity-level Cross-modal Learning Improves Multi-modal Machine Translation
EMNLP 2021
MIRTT: Learning Multimodal Interaction Representations from Trilinear Transformers for Visual Question Answering
EMNLP 2021
Progressive Transformer-Based Generation of Radiology Reports
EMNLP 2021
Visual Cues and Error Correction for Translation Robustness
EMNLP 2021
MSD: Saliency-aware Knowledge Distillation for Multimodal Understanding
EMNLP 2021
Image Retrieval for Arguments Using Stance-Aware Query Expansion
EMNLP 2021
Empathetic Dialog Generation with Fine-Grained Intents
EMNLP 2021
Coreference by Appearance: Visually Grounded Event Coreference Resolution
EMNLP 2021
FaBULOUS: Fact-checking Based on Understanding of Language Over Unstructured and Structured information
EMNLP 2021
Multi-modal Retrieval of Tables and Texts Using Tri-encoder Models
EMNLP 2021
Discriminative Multi-Modality Speech Recognition
CVPR 2020
Learning Longterm Representations for Person Re-Identification Using Radio Signals
CVPR 2020
Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension
CVPR 2020
Hi-CMD: Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification
CVPR 2020
Speech2Action: Cross-Modal Supervision for Action Recognition
CVPR 2020
Violin: A Large-Scale Dataset for Video-and-Language Inference
CVPR 2020
A Local-to-Global Approach to Multi-Modal Movie Scene Segmentation
CVPR 2020