conftrace
Multimodal Learning
13,057 papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
Multimodal Commonsense Knowledge Distillation for Visual Question Answering (Student Abstract)
AAAI 2025
Utilizing Vision-Language Models for Detection of Leaf-Based Diseases in Tomatoes
AAAI 2025
AI-Driven Multicultural Identity Preservation
AAAI 2025
Visual Question Answering for Peruvian Cuisine in Regional Spanish
AAAI 2025
AI-Driven Personalized Fall Prevention for Older Adults
AAAI 2025
Falcon Medical Visual Question Answering
AAAI 2025
MAFT: Multimodal Automated Fact-Checking via Textualization
AAAI 2025
Rewind and Render: Towards Factually Accurate Text-to-Video Generation with Distilled Knowledge Retrieval
AAAI 2025
Pic2Prep: A Multimodal Conversational Agent for Cooking Assistance
AAAI 2025
Speech Is Not Enough: Interpreting Nonverbal Indicators of Common Knowledge and Engagement
AAAI 2025
GODDS: The Global Online Deepfake Detection System
AAAI 2025
StarVector: Generating Scalable Vector Graphics Code from Images and Text
AAAI 2025
AutoMV: An Autonomous Agent Framework for Real Estate Marketing Video Generation
AAAI 2025
Feature Decomposition-Augmentation Network for Multimodal Sentiment Analysis
AACL 2025
Chain of Functions: A Programmatic Pipeline for Fine-Grained Chart Reasoning Data Generation
AACL 2025
Breaking Language Barriers or Reinforcing Bias? A Study of Gender and Racial Disparities in Multilingual Contrastive Vision Language Models
AACL 2025
Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches
AACL 2025
ASAudio: A Survey of Advanced Spatial Audio Research
AACL 2025
StuD: A Multimodal Approach for Stuttering Detection with RAG and Fusion Strategies
AACL 2025
Captions Speak Louder than Images: Generalizing Foundation Models for E-commerce from High-quality Multimodal Instruction Data
AACL 2025
EcomMMMU: Strategic Utilization of Visuals for Robust Multimodal E-commerce Models
AACL 2025
HiPPO: Exploring A Novel Hierarchical Pronunciation Assessment Approach for Spoken Languages
AACL 2025
Multimodal Language Models for Financial Forecasting from Interleaved Sequences of Text and Time Series
AACL 2025
Beyond Classification: Towards Speech Emotion Reasoning with Multitask AudioLLMs
AACL 2025
R²-CoD: Understanding Text-Graph Complementarity in Relational Reasoning via Knowledge Co-Distillation
AACL 2025