Research Explorer
Papers
Trends
Conferences
Explore
Authors
Topics
Keywords
Achievements
About
Methodology
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
DCU ADAPT at WMT24: English to Low-resource Multi-Modal Translation Task
EMNLP 2024
Multilingual Synopses of Movie Narratives: A Dataset for Vision-Language Story Understanding
EMNLP 2024
Video Discourse Parsing and Its Application to Multimodal Summarization: A Dataset and Baseline Approaches
EMNLP 2024
MMAR: Multilingual and Multimodal Anaphora Resolution in Instructional Videos
EMNLP 2024
Diversify, Rationalize, and Combine: Ensembling Multiple QA Strategies for Zero-shot Knowledge-based VQA
EMNLP 2024
Improving Hierarchical Text Clustering with LLM-guided Multi-view Cluster Representation
EMNLP 2024
Retrieval-enriched zero-shot image classification in low-resource domains
EMNLP 2024
Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?
EMNLP 2024
VIEWS: Entity-Aware News Video Captioning
EMNLP 2024
VHASR: A Multimodal Speech Recognition System With Vision Hotwords
EMNLP 2024
Kiss up, Kick down: Exploring Behavioral Changes in Multi-modal Large Language Models with Assigned Visual Personas
EMNLP 2024
CARER - ClinicAl Reasoning-Enhanced Representation for Temporal Health Risk Prediction
EMNLP 2024
MMoE: Enhancing Multimodal Models with Mixtures of Multimodal Interaction Experts
EMNLP 2024
MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding
EMNLP 2024
Bridging Modalities: Enhancing Cross-Modality Hate Speech Detection with Few-Shot In-Context Learning
EMNLP 2024
Divide and Conquer Radiology Report Generation via Observation Level Fine-grained Pretraining and Prompt Tuning
EMNLP 2024
Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale
EMNLP 2024
SciMind: A Multimodal Mixture-of-Experts Model for Advancing Pharmaceutical Sciences
ACL 2024
Embracing Language Inclusivity and Diversity in CLIP through Continual Language Learning
AAAI 2024
Mol2Lang-VLM: Vision- and Text-Guided Generative Pre-trained Language Models for Advancing Molecule Captioning through Multimodal Fusion
ACL 2024
CLASP: Cross-modal Alignment Using Pre-trained Unimodal Models
ACL 2024
Enhancing Adverse Drug Event Detection with Multimodal Dataset: Corpus Creation and Model Development
ACL 2024
Multi-modal Stance Detection: New Datasets and Model
ACL 2024
Multi-Modal Prompting for Open-Vocabulary Video Visual Relationship Detection
AAAI 2024
Enhanced BioT5+ for Molecule-Text Translation: A Three-Stage Approach with Data Distillation, Diverse Training, and Voting Ensemble
ACL 2024