Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
Fast-CLOCs: Fast Camera-LiDAR Object Candidates Fusion for 3D Object Detection
WACV 2022
Domain Generalization Through Audio-Visual Relative Norm Alignment in First Person Action Recognition
WACV 2022
Learning To Answer Questions in Dynamic Audio-Visual Scenarios
CVPR 2022
X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval
CVPR 2022
XMP-Font: Self-Supervised Cross-Modality Pre-Training for Few-Shot Font Generation
CVPR 2022
ContIG: Self-Supervised Multimodal Contrastive Learning for Medical Imaging With Genetics
CVPR 2022
UMT: Unified Multi-Modal Transformers for Joint Video Moment Retrieval and Highlight Detection
CVPR 2022
M3L: Language-Based Video Editing via Multi-Modal Multi-Level Transformers
CVPR 2022
DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection
CVPR 2022
C3-STISR: Scene Text Image Super-resolution with Triple Clues
IJCAI 2022
MMT: Multi-way Multi-modal Transformer for Multimodal Learning
IJCAI 2022
Recipe2Vec: Multi-modal Recipe Representation Learning with Graph Neural Networks
IJCAI 2022
Data-Efficient Playlist Captioning With Musical and Linguistic Knowledge
EMNLP 2022
Self-Supervised Object Detection From Audio-Visual Correspondence
CVPR 2022
CroMo: Cross-Modal Learning for Monocular Depth Estimation
CVPR 2022
Visual Acoustic Matching
CVPR 2022
MERLOT Reserve: Neural Script Knowledge Through Vision and Language and Sound
CVPR 2022
3MASSIV: Multilingual, Multimodal and Multi-Aspect Dataset of Social Media Short Videos
CVPR 2022
Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset
EMNLP 2022
Multimodal Sarcasm Target Identification in Tweets
ACL 2022
Toolbox for Multimodal Learn (scikit-multimodallearn)
JMLR 2022
Reading To Listen at the Cocktail Party: Multi-Modal Speech Separation
CVPR 2022
An Empirical Study of Training End-to-End Vision-and-Language Transformers
CVPR 2022
When did you become so smart, oh wise one?! Sarcasm Explanation in Multi-modal Multi-party Dialogues
ACL 2022
Grafting Pre-trained Models for Multimodal Headline Generation
EMNLP 2022