Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns
CVPR 2024
RedCore: Relative Advantage Aware Cross-Modal Representation Learning for Missing Modalities with Imbalanced Missing Rates
AAAI 2024
XKD: Cross-Modal Knowledge Distillation with Domain Alignment for Video Representation Learning
AAAI 2024
Probabilistic Conformal Distillation for Enhancing Missing Modality Robustness
NeurIPS 2024
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection
CVPR 2024
LangSplat: 3D Language Gaussian Splatting
CVPR 2024
Attention-Induced Embedding Imputation for Incomplete Multi-View Partial Multi-Label Classification
AAAI 2024
Data-Efficient Multimodal Fusion on a Single GPU
CVPR 2024
Step Differences in Instructional Video
CVPR 2024
Noise-Aware Image Captioning with Progressively Exploring Mismatched Words
AAAI 2024
RELI11D: A Comprehensive Multimodal Human Motion Dataset and Method
CVPR 2024
Binding Touch to Everything: Learning Unified Multimodal Tactile Representations
CVPR 2024
Multi-view Aggregation Network for Dichotomous Image Segmentation
CVPR 2024
Complementing Event Streams and RGB Frames for Hand Mesh Reconstruction
CVPR 2024
UniHuman: A Unified Model For Editing Human Images in the Wild
CVPR 2024
Improving Image Restoration through Removing Degradations in Textual Representations
CVPR 2024
U-VAP: User-specified Visual Appearance Personalization via Decoupled Self Augmentation
CVPR 2024
Linguistic-Aware Patch Slimming Framework for Fine-grained Cross-Modal Alignment
CVPR 2024
Open Vocabulary Semantic Scene Sketch Understanding
CVPR 2024
GLaMM: Pixel Grounding Large Multimodal Model
CVPR 2024
On Disentanglement of Asymmetrical Knowledge Transfer for Modality-Task Agnostic Federated Learning
AAAI 2024
ES3: Evolving Self-Supervised Learning of Robust Audio-Visual Speech Representations
CVPR 2024
FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning
AAAI 2024
SmartEdit: Exploring Complex Instruction-based Image Editing with Multimodal Large Language Models
CVPR 2024
CFPL-FAS: Class Free Prompt Learning for Generalizable Face Anti-spoofing
CVPR 2024