Research Explorer
Multi-Modal Learning
3194 directly classified papers
Papers per year
2003: 1
2010: 1
2011: 1
2013: 5
2014: 3
2015: 9
2016: 23
2017: 49
2018: 78
2019: 158
2020: 223
2021: 261
2022: 354
2023: 471
2024: 705
2025: 835
2026: 17
Papers
Chitranuvad: Adapting Multi-lingual LLMs for Multimodal Translation
EMNLP 2024
DCU ADAPT at WMT24: English to Low-resource Multi-Modal Translation Task
EMNLP 2024
Multilingual Synopses of Movie Narratives: A Dataset for Vision-Language Story Understanding
EMNLP 2024
Video Discourse Parsing and Its Application to Multimodal Summarization: A Dataset and Baseline Approaches
EMNLP 2024
MMAR: Multilingual and Multimodal Anaphora Resolution in Instructional Videos
EMNLP 2024
Diversify, Rationalize, and Combine: Ensembling Multiple QA Strategies for Zero-shot Knowledge-based VQA
EMNLP 2024
Improving Hierarchical Text Clustering with LLM-guided Multi-view Cluster Representation
EMNLP 2024
Retrieval-enriched zero-shot image classification in low-resource domains
EMNLP 2024
Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?
EMNLP 2024
VIEWS: Entity-Aware News Video Captioning
EMNLP 2024
VHASR: A Multimodal Speech Recognition System With Vision Hotwords
EMNLP 2024
Kiss up, Kick down: Exploring Behavioral Changes in Multi-modal Large Language Models with Assigned Visual Personas
EMNLP 2024
CARER - ClinicAl Reasoning-Enhanced Representation for Temporal Health Risk Prediction
EMNLP 2024
MMoE: Enhancing Multimodal Models with Mixtures of Multimodal Interaction Experts
EMNLP 2024
MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding
EMNLP 2024
Bridging Modalities: Enhancing Cross-Modality Hate Speech Detection with Few-Shot In-Context Learning
EMNLP 2024
Divide and Conquer Radiology Report Generation via Observation Level Fine-grained Pretraining and Prompt Tuning
EMNLP 2024
Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale
EMNLP 2024
VIXEN: Visual Text Comparison Network for Image Difference Captioning
AAAI 2024
DIUSum: Dynamic Image Utilization for Multimodal Summarization
AAAI 2024
Adaptive Graph Learning for Multimodal Conversational Emotion Detection
AAAI 2024
Video Event Extraction with Multi-View Interaction Knowledge Distillation
AAAI 2024
JoLT: Jointly Learned Representations of Language and Time-Series for Clinical Time-Series Interpretation (Student Abstract)
AAAI 2024
Bootstrapping Large Language Models for Radiology Report Generation
AAAI 2024
Hierarchical Aligned Multimodal Learning for NER on Tweet Posts
AAAI 2024