conftrace
Multimodal Learning
13,057 papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Measuring Cross-Modal Interactions in Multimodal Models
AAAI 2025
Spiking Point Transformer for Point Cloud Classification
AAAI 2025
Pilot: Building the Federated Multimodal Instruction Tuning Framework
AAAI 2025
Quality over Quantity: Boosting Data Efficiency Through Ensembled Multimodal Data Curation
AAAI 2025
Towards Multimodal Sentiment Analysis via Hierarchical Correlation Modeling with Semantic Distribution Constraints
AAAI 2025
Explanation Bottleneck Models
AAAI 2025
Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models
AAAI 2025
Exploring the Better Multimodal Synergy Strategy for Vision-Language Models
AAAI 2025
S²MILE: Semantic-and-Structure-Aware Music-Driven Lyric Generation
AAAI 2025
BiMAC: Bidirectional Multimodal Alignment in Contrastive Learning
AAAI 2025
Zero-Shot Image Captioning with Multi-type Entity Representations
AAAI 2025
Text-Guided Nonverbal Enhancement Based on Modality-Invariant and -Specific Representations for Video Speaking Style Recognition
AAAI 2025
DiffCLIP: Few-shot Language-driven Multimodal Classifier
AAAI 2025
A-VL: Adaptive Attention for Large Vision-Language Models
AAAI 2025
MalDetectFormer: Leveraging Sparse SpatioTemporal Information for Effective Malicious Traffic Detection
AAAI 2025
Attention Bootstrapping for Multi-Modal Test-Time Adaptation
AAAI 2025
ProtCLIP: Function-Informed Protein Multi-Modal Learning
AAAI 2025
GVMGen: A General Video-to-Music Generation Model with Hierarchical Attentions
AAAI 2025
Mitigating Hallucinations in Large Vision-Language Models by Adaptively Constraining Information Flow
AAAI 2025
Text2midi: Generating Symbolic Music from Captions
AAAI 2025
MEDSAGE: Enhancing Robustness of Medical Dialogue Summarization to ASR Errors with LLM-generated Synthetic Dialogues
AAAI 2025
CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models
AAAI 2025
Audio Entailment: Assessing Deductive Reasoning for Audio Understanding
AAAI 2025
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
AAAI 2025
DEQA: Descriptions Enhanced Question-Answering Framework for Multimodal Aspect-Based Sentiment Analysis
AAAI 2025