conftrace
Multimodal Learning
13,057 papers
Papers per year
2003: 1
2006: 3
2007: 6
2008: 2
2009: 5
2010: 2
2011: 3
2012: 6
2013: 24
2014: 20
2015: 46
2016: 109
2017: 205
2018: 299
2019: 622
2020: 675
2021: 987
2022: 1084
2023: 1697
2024: 2500
2025: 3654
2026: 1107
Papers
SongEditor: Adapting Zero-Shot Song Generation Language Model as a Multi-Task Editor
AAAI 2025
UniMuMo: Unified Text, Music, and Motion Generation
AAAI 2025
MSE-Adapter: A Lightweight Plugin Endowing LLMs with the Capability to Perform Multimodal Sentiment Analysis and Emotion Recognition
AAAI 2025
StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching
AAAI 2025
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
AAAI 2025
SongGLM: Lyric-to-Melody Generation with 2D Alignment Encoding and Multi-Task Pre-Training
AAAI 2025
MEPNet: Medical Entity-Balanced Prompting Network for Brain CT Report Generation
AAAI 2025
Defeasible Visual Entailment: Benchmark, Evaluator, and Reward-Driven Optimization
AAAI 2025
Prototype-Guided Multimodal Relation Extraction based on Entity Attributes
AAAI 2025
Multi-Granular Multimodal Clue Fusion for Meme Understanding
AAAI 2025
One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models
AAAI 2025
Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning
AAAI 2025
Mitigating Pervasive Modality Absence Through Multimodal Generalization and Refinement
AAAI 2025
Internal Activation Revision: Safeguarding Vision Language Models Without Parameter Update
AAAI 2025
Retention Score: Quantifying Jailbreak Risks for Vision Language Models
AAAI 2025
SYNAPSE: SYmbolic Neural-Aided Preference Synthesis Engine
AAAI 2025
Generalizing Alignment Paradigm of Text-to-Image Generation with Preferences Through f-Divergence Minimization
AAAI 2025
MMJ-Bench: A Comprehensive Study on Jailbreak Attacks and Defenses for Vision Language Models
AAAI 2025
Enhance Modality Robustness in Text-Centric Multimodal Alignment with Adversarial Prompting
AAAI 2025
Dust-Mamba: An Efficient Dust Storm Detection Network with Multiple Data Sources
AAAI 2025
FoMo: Multi-Modal, Multi-Scale and Multi-Task Remote Sensing Foundation Models for Forest Monitoring
AAAI 2025
PhishAgent: A Robust Multimodal Agent for Phishing Webpage Detection
AAAI 2025
Leveraging Computer Vision and Visual LLMs for Cost-Effective and Consistent Street Food Safety Assessment in Kolkata India
AAAI 2025
Enhancing Vision-Language Models with Morphological and Taxonomic Knowledge: Towards Coral Recognition for Ocean Health
AAAI 2025
UrbanVLP: Multi-Granularity Vision-Language Pretraining for Urban Socioeconomic Indicator Prediction
AAAI 2025