Research Explorer
Multi-Modal Learning
1213 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 2
2012: 5
2013: 5
2014: 1
2015: 5
2016: 8
2017: 21
2018: 42
2019: 42
2020: 69
2021: 72
2022: 149
2023: 143
2024: 258
2025: 370
2026: 17
Papers
GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation
EMNLP 2024
Eliciting Better Multilingual Structured Reasoning from LLMs through Code
ACL 2024
Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval
CVPR 2024
Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight
NIPS 2024
Unity by Diversity: Improved Representation Learning for Multimodal VAEs
NIPS 2024
Successfully Guiding Humans with Imperfect Instructions by Highlighting Potential Errors and Suggesting Corrections
EMNLP 2024
FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion
NIPS 2024
Cross-modal Representation Flattening for Multi-modal Domain Generalization
NIPS 2024
When LLMs Meets Acoustic Landmarks: An Efficient Approach to Integrate Speech into Large Language Models for Depression Detection
EMNLP 2024
HEALNet: Multimodal Fusion for Heterogeneous Biomedical Data
NIPS 2024
Facilitating Multimodal Classification via Dynamically Learning Modality Gap
NIPS 2024
Findings of WMT 2024’s MultiIndic22MT Shared Task for Machine Translation of 22 Indian Languages
EMNLP 2024
MmCows: A Multimodal Dataset for Dairy Cattle Monitoring
NIPS 2024
Density-based User Representation using Gaussian Process Regression for Multi-interest Personalized Retrieval
NIPS 2024
RecomMind: Movie Recommendation Dialogue with Seeker’s Internal State
EMNLP 2024
Samsung R&D Institute Philippines @ WMT 2024 Indic MT Task
EMNLP 2024
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
NIPS 2024
Exploiting Descriptive Completeness Prior for Cross Modal Hashing with Incomplete Labels
NIPS 2024
Crisis counselor language and perceived genuine concern in crisis conversations
EMNLP 2024
II-Bench: An Image Implication Understanding Benchmark for Multimodal Large Language Models
NIPS 2024
Interfacing Foundation Models' Embeddings
NIPS 2024
Visual Pivoting Unsupervised Multimodal Machine Translation in Low-Resource Distant Language Pairs
EMNLP 2024
Financial Forecasting from Textual and Tabular Time Series
EMNLP 2024
Generative Hierarchical Materials Search
NIPS 2024
Text to Blind Motion
NIPS 2024