Research Explorer
Multimodal Learning
13057 directly classified papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
FAMDR: Feature-Aligned Multimodal Denoising for Reliable Diagnostic Reconciliation in Medical Imaging
AAAI 2026
Spatial-Spectral Homogeneous Attacks on Physical-World Large Vision-Language Models
AAAI 2026
Accelerating Controllable Generation via Hybrid-grained Cache
AAAI 2026
Taming the Phantom: Token-Asymmetric Filtering for Hallucination Mitigation in Large Vision-Language Models
AAAI 2026
Image-Text Knowledge Modeling for Unsupervised Multi-Scenario Person Re-Identification
AAAI 2026
Unified Mixture-of-Experts Framework for Joint Cardiac and Vascular Ultrasound Analysis and Report Generation
AAAI 2026
Game Ground Bench: Probing the Limits of LVLMs in Complex Semantic Grounding Across Game Universes
AAAI 2026
RL-U2Net: A Dual-Branch UNet with Reinforcement Learning-Assisted Multimodal Feature Fusion for Accurate 3D Whole-Heart Segmentation
AAAI 2026
PromptMoE: Generalizable Zero-Shot Anomaly Detection via Visually-Guided Prompt Mixtures
AAAI 2026
UniAlignment: Semantic Alignment for Unified Image Generation, Understanding, Manipulation and Perception
AAAI 2026
Not All Tokens and Heads Are Equally Important: Dual-Level Attention Intervention for Hallucination Mitigation
AAAI 2026
Noisy Correspondence Learning with Modality Gap Direction Correction
AAAI 2026
Radar-APLANC: Unsupervised Radar-based Heartbeat Sensing via Augmented Pseudo-Label and Noise Contrast
AAAI 2026
Learning Knowledge from Textual Descriptions for 3D Human Pose Estimation
AAAI 2026
Not Just What’s There: Enabling CLIP to Comprehend Negated Visual Descriptions Without Fine-Tuning
AAAI 2026
NeuSpring: Neural Spring Fields for Reconstruction and Simulation of Deformable Objects from Videos
AAAI 2026
STMI: Segmentation-Guided Token Modulation with Cross-Modal Hypergraph Interaction for Multi-Modal Object Re-Identification
AAAI 2026
RobusTor3D: Robust Multimodal 3D Object Detector for Autonomous Driving by Vision-Language Knowledge Blending
AAAI 2026
Endowing Vision-Language Models with System 2 Thinking for Fine-grained Visual Recognition
AAAI 2026
CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation
AAAI 2026
Frequency-Aware Vision-Language Multimodality Generalization Network for Remote Sensing Image Classification
AAAI 2026
What You See Is What You Reach: Towards Spatial Navigation with High-Level Human Instructions
AAAI 2026
Dual-Path Knowledge-Augmented Contrastive Alignment Network for Spatially Resolved Transcriptomics
AAAI 2026
KineST: A Kinematics-guided Spatiotemporal State Space Model for Human Motion Tracking from Sparse Signals
AAAI 2026
PEOCH: Online Cross-Modal Hashing with Semi-Supervised Streaming Data Driving Prototype Evolution
AAAI 2026