Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
NOTA: Multimodal Music Notation Understanding for Visual Large Language Model
NAACL 2025
Uncertainty Quantification for Clinical Outcome Predictions with (Large) Language Models
NAACL 2025
Target-Augmented Shared Fusion-based Multimodal Sarcasm Explanation Generation
NAACL 2025
Beyond Base Predictors: Using LLMs to Resolve Ambiguities in Akkadian Lemmatization
NAACL 2025
CUET-NLP_Big_O@DravidianLangTech 2025: A Multimodal Fusion-based Approach for Identifying Misogyny Memes
NAACL 2025
SSNTrio @ DravidianLangTech 2025: Hybrid Approach for Hate Speech Detection in Dravidian Languages with Text and Audio Modalities
NAACL 2025
Fired_from_NLP@DravidianLangTech 2025: A Multimodal Approach for Detecting Misogynistic Content in Tamil and Malayalam Memes
NAACL 2025
CUET_Novice@DravidianLangTech 2025: A Multimodal Transformer-Based Approach for Detecting Misogynistic Memes in Malayalam Language
NAACL 2025
teamiic@DravidianLangTech2025-NAACL 2025: Transformer-Based Multimodal Feature Fusion for Misogynistic Meme Detection in Low-Resource Dravidian Language
NAACL 2025
SemanticCuetSync@DravidianLangTech 2025: Multimodal Fusion for Hate Speech Detection - A Transformer Based Approach with Cross-Modal Attention
NAACL 2025
One_by_zero@DravidianLangTech 2025: A Multimodal Approach for Misogyny Meme Detection in Malayalam Leveraging Visual and Textual Features
NAACL 2025
CUET-NLP_MP@DravidianLangTech 2025: A Transformer-Based Approach for Bridging Text and Vision in Misogyny Meme Detection in Dravidian Languages
NAACL 2025
Exploring Multimodal Language Models for Sustainability Disclosure Extraction: A Comparative Study
NAACL 2025
Caption Generation in Cultural Heritage: Crowdsourced Data and Tuning Multimodal Large Language Models
NAACL 2025
Cross-Modal Learning for Music-to-Music-Video Description Generation
NAACL 2025
VLind-Bench: Measuring Language Priors in Large Vision-Language Models
NAACL 2025
Survival Prediction in Lung Cancer through Multi-Modal Representation Learning
WACV 2025
Multimodal Inverse Attention Network with Intrinsic Discriminant Feature Exploitation for Fake News Detection
IJCAI 2025
Bidirectional Multi-Step Domain Generalization for Visible-Infrared Person Re-Identification
WACV 2025
Unified Molecule-Text Language Model with Discrete Token Representation
IJCAI 2025
Temporally Streaming Audio-Visual Synchronization for Real-World Videos
WACV 2025
Consistency-Aware Padding for Incomplete Multi-Modal Alignment Clustering Based on Self-Repellent Greedy Anchor Search
IJCAI 2025
AIDE: Improving 3D Open-Vocabulary Semantic Segmentation by Aligned Vision-Language Learning
WACV 2025
Connecting Giants: Synergistic Knowledge Transfer of Large Multimodal Models for Few-Shot Learning
IJCAI 2025
ProbMED: A Probabilistic Framework for Medical Multimodal Binding
ICCV 2025