Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
Solar Power Generation Forecasting via Multimodal Feature Fusion (Student Abstract)
AAAI 2024
Bidirectional Contrastive Split Learning for Visual Question Answering
AAAI 2024
Early Detection of Extreme Storm Tide Events Using Multimodal Data Processing
AAAI 2024
Automated Defect Report Generation for Enhanced Industrial Quality Control
AAAI 2024
Visual Hallucination Elevates Speech Recognition
AAAI 2024
Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media
AAAI 2024
CALVIN: Improved Contextual Video Captioning via Instruction Tuning
NIPS 2024
Mitigating Idiom Inconsistency: A Multi-Semantic Contrastive Learning Method for Chinese Idiom Reading Comprehension
AAAI 2024
Mixture of In-Context Experts Enhance LLMs' Long Context Awareness
NIPS 2024
Unity by Diversity: Improved Representation Learning for Multimodal VAEs
NIPS 2024
GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping
NIPS 2024
DIUSum: Dynamic Image Utilization for Multimodal Summarization
AAAI 2024
DevBench: A multimodal developmental benchmark for language learning
NIPS 2024
IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos
NIPS 2024
AI-Based Energy Transportation Safety: Pipeline Radial Threat Estimation Using Intelligent Sensing System
AAAI 2024
Deep Correlated Prompting for Visual Recognition with Missing Modalities
NIPS 2024
BoostAdapter: Improving Vision-Language Test-Time Adaptation via Regional Bootstrapping
NIPS 2024
Learning Representations for Robust Human-Robot Interaction
AAAI 2024
HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data
NIPS 2024
All-in-One Image Coding for Joint Human-Machine Vision with Multi-Path Aggregation
NIPS 2024
RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models
NIPS 2024
Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions
NIPS 2024
Calibrated Self-Rewarding Vision Language Models
NIPS 2024
HENASY: Learning to Assemble Scene-Entities for Interpretable Egocentric Video-Language Model
NIPS 2024
E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection
NIPS 2024