conftrace
Multimodal Learning
13,057 papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Cross-Modal Few-Shot Learning with Second-Order Neural Ordinary Differential Equations
AAAI 2025
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference
AAAI 2025
Excluding the Impossible for Open Vocabulary Semantic Segmentation
AAAI 2025
Training-free Open-Vocabulary Semantic Segmentation via Diverse Prototype Construction and Sub-region Matching
AAAI 2025
Audio-Visual Adaptive Fusion Network for Question Answering Based on Contrastive Learning
AAAI 2025
Hierarchical Cross-Modal Alignment for Open-Vocabulary 3D Object Detection
AAAI 2025
Heterogeneous Prompt-Guided Entity Inferring and Distilling for Scene-Text Aware Cross-Modal Retrieval
AAAI 2025
ContextualStory: Consistent Visual Storytelling with Spatially-Enhanced and Storyline Context
AAAI 2025
MMPF: Multi-Modal Perception Framework for Abnormal Medical Condition Detection
AAAI 2025
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios
AAAI 2025
Position-Aware Guided Point Cloud Completion with CLIP Model
AAAI 2025
Core-to-Global Reasoning for Compositional Visual Question Answering
AAAI 2025
Expanding the Scope of Negatives: Boosting Image-Text Matching with Negatives Distribution Guided Learning
AAAI 2025
Dense Audio-Visual Event Localization Under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration
AAAI 2025
TAMER: Tree-Aware Transformer for Handwritten Mathematical Expression Recognition
AAAI 2025
A Comprehensive Overhaul of Multimodal Assistant with Small Language Models
AAAI 2025
MUC: Mixture of Uncalibrated Cameras for Robust 3D Human Body Reconstruction
AAAI 2025
ST3: Accelerating Multimodal Large Language Model by Spatial-Temporal Visual Token Trimming
AAAI 2025
RhythmMamba: Fast, Lightweight, and Accurate Remote Physiological Measurement
AAAI 2025
L-Man: A Large Multi-modal Model Unifying Human-centric Tasks
AAAI 2025
Revisiting Multimodal Emotion Recognition in Conversation from the Perspective of Graph Spectrum
AAAI 2025
Cross-modal Multi-task Learning for Multimedia Event Extraction
AAAI 2025
In-context Prompt-augmented Micro-video Popularity Prediction
AAAI 2025
Global Attribute-Association Pattern Aggregation for Graph Fraud Detection
AAAI 2025
Mixed-Curvature Multi-Modal Knowledge Graph Completion
AAAI 2025