Research Explorer
Multimodal Learning
13057 directly classified papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
TABED: Test-Time Adaptive Ensemble Drafting for Robust Speculative Decoding in LVLMs
EACL 2026
SpatialMath: Spatial Comprehension-Infused Symbolic Reasoning for Mathematical Problem-Solving
EACL 2026
VIGiA: Instructional Video Guidance via Dialogue Reasoning and Retrieval
EACL 2026
SCAN: Semantic Document Layout Analysis for Textual and Visual Retrieval-Augmented Generation
EACL 2026
VADER: Towards Causal Video Anomaly Understanding with Relation-Aware Large Language Models
WACV 2026
Generalizing Sports Feedback Generation by Watching Competitions and Reading Books: A Rock Climbing Case Study
WACV 2026
FujiView: Multimodal Late-Fusion for Predicting Scenic Visibility
WACV 2026
Explaining the Unseen: Multimodal Vision-Language Reasoning for Situational Awareness in Underground Mining Disasters
WACV 2026
Towards Fine-Grained Adaptation of CLIP via a Self-Trained Alignment Score
WACV 2026
CoReTab: Improving Multimodal Table Understanding with Code-driven Reasoning
EACL 2026
PromptGAR: Flexible Promptive Group Activity Recognition
WACV 2026
Data-Centric Approach at the LoResMT 2026 Turkic Translation Challenge: Russian-Kyrgyz
EACL 2026
Intra-Class Probabilistic Embeddings for Uncertainty Estimation in Vision-Language Models
WACV 2026
VisAffect at MWE-2026 AdMIRe 2: IMMCAN Idiom Multimodal Cross-Attention Network
EACL 2026
FreeCond: Free Lunch in the Input Conditions of Text-Guided Inpainting
WACV 2026
VRAgent: Self-Refining Agent for Zero-Shot Multimodal Video Retrieval
WACV 2026
Referring Change Detection in Remote Sensing Imagery
WACV 2026
CoreCaption: Core Caption based Text-to-Video Retrieval
WACV 2026
M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models
WACV 2026
VLMs Guided Interpretable Decision Making in Autonomous Driving
WACV 2026
Gene-DML: Dual-Pathway Multi-Level Discrimination for Gene Expression Prediction from Histopathology Images
WACV 2026
Unlocking Vision-Language Models for Video Anomaly Detection via Fine-Grained Prompting
WACV 2026
Instruction Tuning with and without Context: Behavioral Shifts and Downstream Impact
EACL 2026
Do You See Me : A Multidimensional Benchmark for Evaluating Visual Perception in Multimodal LLMs
EACL 2026
When Does Auxiliary Modality Matter in Solving Geometric Problems? A Comprehensive Study of Textual, Formal, and Visual Modalities
EACL 2026