Research Explorer
Multimodal Learning
13057 directly classified papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
UniCalib: Targetless LiDAR-camera Calibration via Probabilistic Flow on Unified Depth Representations
WACV 2026
RegionAligner: Bridging Ego-Exo Views for Object Correspondence via Unified Text-Visual Learning
WACV 2026
PoseGaussian: Pose-Driven Novel View Synthesis for Robust 3D Human Reconstruction
WACV 2026
Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial Imagery
WACV 2026
ORCA: Object Recognition and Comprehension for Archiving Marine Species
WACV 2026
DuPLUS: Dual-Prompt Vision-Language Model for Universal Medical Image Segmentation and Prognosis
WACV 2026
Bridging the Domain Gap in Small Multimodal Models: A Dual-level Alignment Perspective
WACV 2026
Referring Change Detection in Remote Sensing Imagery
WACV 2026
VLMs Guided Interpretable Decision Making in Autonomous Driving
WACV 2026
Large Sign Language Models: Toward 3D American Sign Language Translation
WACV 2026
KFS-Bench: Comprehensive Evaluation of Key Frame Sampling in Long Video Understanding
WACV 2026
Face-LLaVA: Facial Expression and Attribute Understanding through Instruction Tuning
WACV 2026
ITSELF: Attention Guided Fine-Grained Alignment for Vision-Language Retrieval
WACV 2026
SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination
WACV 2026
MapVerse: A Benchmark for Geospatial Question Answering on Diverse Real-World Maps
WACV 2026
VADER: Towards Causal Video Anomaly Understanding with Relation-Aware Large Language Models
WACV 2026
Generalizing Sports Feedback Generation by Watching Competitions and Reading Books: A Rock Climbing Case Study
WACV 2026
Ego-EXTRA: video-language Egocentric Dataset for EXpert-TRAinee assistance
WACV 2026
VRAgent: Self-Refining Agent for Zero-Shot Multimodal Video Retrieval
WACV 2026
M4U: Evaluating Multilingual Understanding and Reasoning for Large Multimodal Models
WACV 2026
BanglaProtha: Evaluating Vision Language Models in Underrepresented Long-tail Cultural Contexts
WACV 2026
One-shot Portrait Stylization via Geometric Alignment
WACV 2026
AuViRe: Audio-visual Speech Representation Reconstruction for Deepfake Temporal Localization
WACV 2026
Patch Your Matcher: Correspondence-Aware Image-to-Image Translation Unlocks Cross-Modal Matching via Single-Modality Priors
WACV 2026
Lost in Time? A Meta-Learning Framework for Time-Shift-Tolerant Physiological Signal Transformation
AAAI 2026