Research Explorer
Multimodal Learning
13057 directly classified papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Revisiting the Data Sampling in Multimodal Post-training from a Difficulty-Distinguish View
AAAI 2026
CR³: Boosting Compositional Reasoning in MLLMs Through Rule-Based Reinforcement Learning
AAAI 2026
URaG: Unified Retrieval and Generation in Multimodal LLMs for Efficient Long Document Understanding
AAAI 2026
KSS-MoE: Knowledge Space Synergy Framework in Mixture of Experts for Continual Visual Instruction Tuning
AAAI 2026
CrossCheck-Bench: Diagnosing Compositional Failures in Multimodal Conflict Resolution
AAAI 2026
X-SAM: From Segment Anything to Any Segmentation
AAAI 2026
Listening Between the Frames: Bridging Temporal Gaps in Large Audio-Language Models
AAAI 2026
LiViBench: An Omnimodal Benchmark for Interactive Livestream Video Understanding
AAAI 2026
ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs
AAAI 2026
On Modality Weighting and Specificity for Multi-Modal Entity Alignment
AAAI 2026
DiagramGPT-Llama3: Enabling Editable, High-Fidelity Diagram Generation with Vision Large Language Models
AAAI 2026
CLER: Improving Multimodal Financial Reasoning by Cross-MLLM Error Reflection
AAAI 2026
Graph-of-Mark: Promote Spatial Reasoning in Multimodal Language Models with Graph-Based Visual Prompting
AAAI 2026
SageLM: A Multi-aspect and Explainable Large Language Model for Speech Judgement
AAAI 2026
Uncovering and Mitigating Transient Blindness in Multimodal Model Editing
AAAI 2026
Format Matters: The Robustness of Multimodal LLMs in Reviewing Evidence from Tables and Charts
AAAI 2026
Do Language Models Associate Sound with Meaning? A Multimodal Study of Sound Symbolism
AAAI 2026
MDF: A Modality-Aware Disentanglement and Fusion Framework for Multimodal Sentiment Analysis
AAAI 2026
Reinforce Trustworthiness in Multimodal Emotional Support System
AAAI 2026
HPSU: A Benchmark for Human-Level Perception in Real-World Spoken Speech Understanding
AAAI 2026
MacVQA: Adaptive Memory Allocation and Global Noise Filtering for Continual Visual Question Answering
AAAI 2026
Easy for Children, Hard for AI: The Limits of Multimodal LLMs in Early Childhood Learning
AAAI 2026
Towards Authentic Movie Dubbing with Retrieve-Augmented Director-Actor Interaction Learning
AAAI 2026
Look as You Think: Unifying Reasoning and Visual Evidence Attribution for Verifiable Document RAG via Reinforcement Learning
AAAI 2026
MCA-Bench: A Multimodal Benchmark for Evaluating CAPTCHA Robustness Against VLM-based Attacks
AAAI 2026