Papers

498 papers found
A Large-Scale Chinese Multimodal NER Dataset with Speech Clues
Dianbo Sui, Zhengkun Tian, Yubo Chen et al.
2021 ACL
Large Language Models and Multimodal Retrieval for Visual Word Sense Disambiguation
Anastasia Kritharoula, Maria Lymperaiou, Giorgos Stamou
2023 EMNLP
MMAT-1M: A Large Reasoning Dataset for Multimodal Agent Tuning
Tianhong Gao, Yannian Fu, Weiqun Wu et al.
2025 ICCV
TerraMind: Large-Scale Generative Multimodality for Earth Observation
Johannes Jakubik, Felix Yang, Benedikt Blumenstiel et al.
2025 ICCV
2025 ICLR
A Large-Scale Chinese Multimodal NER Dataset with Speech Clues
Dianbo Sui, Zhengkun Tian, Yubo Chen et al.
2021 IJCNLP
2024 NAACL
2025 SemEval
MEVA: A Large-Scale Multiview, Multimodal Video Dataset for Activity Detection
Kellie Corona, Katie Osterdahl, Roderic Collins et al.
2021 WACV
Multimodal and Multilingual Embeddings for Large-Scale Speech Mining
Paul-Ambroise Duquenne, Hongyu Gong, Holger Schwenk
2021 NeurIPS