Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
Bidirectional Contrastive Split Learning for Visual Question Answering
AAAI 2024
AI-Based Energy Transportation Safety: Pipeline Radial Threat Estimation Using Intelligent Sensing System
AAAI 2024
Mitigating Idiom Inconsistency: A Multi-Semantic Contrastive Learning Method for Chinese Idiom Reading Comprehension
AAAI 2024
DeWinder: Single-Channel Wind Noise Reduction using Ultrasound Sensing
INTERSPEECH 2024
DIUSum: Dynamic Image Utilization for Multimodal Summarization
AAAI 2024
Video Event Extraction with Multi-View Interaction Knowledge Distillation
AAAI 2024
Automated Defect Report Generation for Enhanced Industrial Quality Control
AAAI 2024
Learning Representations for Robust Human-Robot Interaction
AAAI 2024
Towards Holistic, Pragmatic and Multimodal Conversational Systems
AAAI 2024
Mol2Lang-VLM: Vision- and Text-Guided Generative Pre-trained Language Models for Advancing Molecule Captioning through Multimodal Fusion
ACL 2024
SciMind: A Multimodal Mixture-of-Experts Model for Advancing Pharmaceutical Sciences
ACL 2024
CLASP: Cross-modal Alignment Using Pre-trained Unimodal Models
ACL 2024
Enhancing Adverse Drug Event Detection with Multimodal Dataset: Corpus Creation and Model Development
ACL 2024
Multi-modal Stance Detection: New Datasets and Model
ACL 2024
MODDP: A Multi-modal Open-domain Chinese Dataset for Dialogue Discourse Parsing
ACL 2024
II-MMR: Identifying and Improving Multi-modal Multi-hop Reasoning in Visual Question Answering
ACL 2024
Enhanced BioT5+ for Molecule-Text Translation: A Three-Stage Approach with Data Distillation, Diverse Training, and Voting Ensemble
ACL 2024
TransFace: Unit-Based Audio-Visual Speech Synthesizer for Talking Head Translation
ACL 2024
Visual Hallucinations of Multi-modal Large Language Models
ACL 2024
An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models
ACL 2024
Mitigating Hallucinations in Large Vision-Language Models (LVLMs) via Language-Contrastive Decoding (LCD)
ACL 2024
Joint Inference of Retrieval and Generation for Passage Re-ranking
EACL 2024
MS2SL: Multimodal Spoken Data-Driven Continuous Sign Language Production
ACL 2024
ChartInstruct: Instruction Tuning for Chart Comprehension and Reasoning
ACL 2024
Calibrated Self-Rewarding Vision Language Models
NeurIPS 2024