Research Explorer
Multimodal Learning
13057 directly classified papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Anatomy-VLM: A Fine-grained Vision-Language Model for Medical Interpretation
WACV 2026
Cross-Modal Event Encoder: Bridging Image-Text Knowledge to Event Streams
WACV 2026
Dual-Domain Multimodal Hyperbolic Fusion for Cardiopulmonary Disease Diagnosis in Emergency Care
WACV 2026
Training-Free Few-Shot Segmentation via Vision-Language Guided Prompting
WACV 2026
Ordinal-Aware Multimodal Engagement Recognition for Collaborative Learning
WACV 2026
CVP: Central-Peripheral Vision-Inspired Multimodal Model for Spatial Reasoning
WACV 2026
Fused Similarity Measure Based Alignment with Dual-Scale Adaptive Selection for Weakly Supervised Video Anomaly Detection
WACV 2026
mmWEAVER: Environment-Specific mmWave Signal Synthesis from a Photo and Activity Description
WACV 2026
LASER: Lip Landmark Assisted Speaker Detection for Robustness
WACV 2026
Sea-CLIP: Mining Semantic-Aware Representations for Few-Shot Anomaly Detection with CLIP
WACV 2026
Action Anticipation at a Glimpse: To What Extent Can Multimodal Cues Replace Video?
WACV 2026
Robust Multimodal Emotion Recognition from Incomplete Modalities via Query-Based Unimodal and Cross-Modal Learning
WACV 2026
UniCalib: Targetless LiDAR-camera Calibration via Probabilistic Flow on Unified Depth Representations
WACV 2026
RegionAligner: Bridging Ego-Exo Views for Object Correspondence via Unified Text-Visual Learning
WACV 2026
PoseGaussian: Pose-Driven Novel View Synthesis for Robust 3D Human Reconstruction
WACV 2026
Geo3DVQA: Evaluating Vision-Language Models for 3D Geospatial Reasoning from Aerial Imagery
WACV 2026
ORCA: Object Recognition and Comprehension for Archiving Marine Species
WACV 2026
DuPLUS: Dual-Prompt Vision-Language Model for Universal Medical Image Segmentation and Prognosis
WACV 2026
Bridging the Domain Gap in Small Multimodal Models: A Dual-level Alignment Perspective
WACV 2026
Referring Change Detection in Remote Sensing Imagery
WACV 2026
VLMs Guided Interpretable Decision Making in Autonomous Driving
WACV 2026
Large Sign Language Models: Toward 3D American Sign Language Translation
WACV 2026
KFS-Bench: Comprehensive Evaluation of Key Frame Sampling in Long Video Understanding
WACV 2026
Face-LLaVA: Facial Expression and Attribute Understanding through Instruction Tuning
WACV 2026
IPFormer: Instance Prompt-guided Transformer for Multi-modal Multi-shot Video Understanding
AAAI 2026