Research Explorer

DrivAerNet++: A Large-Scale Multimodal Car Dataset with Computational Fluid Dynamics Simulations and Deep Learning Benchmarks

Mohamed Elrefaie, Florin Morar, Angela Dai et al.

2024 NeurIPS

The Multimodal Universe: Enabling Large-Scale Machine Learning with 100 TB of Astronomical Scientific Data

Eirini Angeloudi, Jeroen Audenaert, Micah Bowles et al.

2024 NeurIPS

Integrating Multimodal Information in Large Pretrained Transformers

Wasifur Rahman, Md Kamrul Hasan, Sangwu Lee et al.

2020 ACL

Large Scale Generative Multimodal Attribute Extraction for E-commerce Attributes

Anant Khandelwal, Happy Mittal, Shreyas Kulkarni et al.

2023 ACL

CaMML: Context-Aware Multimodal Learner for Large Models

Yixin Chen, Shuai Zhang, Boran Han et al.

2024 ACL

Beyond Text: Unveiling Multimodal Proficiency of Large Language Models with MultiAPI Benchmark

Xiao Liu, Jianfeng Lin, Jiawei Zhang

2024 ACL

Speaking Beyond Language: A Large-Scale Multimodal Dataset for Learning Nonverbal Cues from Video-Grounded Dialogues

Youngmin Kim, Jiwan Chung, Jisoo Kim et al.

2025 ACL

AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models

Yuhang Wu, Wenmeng Yu, Yean Cheng et al.

2025 ACL

Testing Spatial Intuitions of Humans and Large Language and Multimodal Models in Analogies

Ivo Bueno, Anna Bavaresco, João Miguel Cunha et al.

2025 ACL

How2Sign: A Large-Scale Multimodal Dataset for Continuous American Sign Language

Amanda Duarte, Shruti Palaskar, Lucas Ventura et al.

2021 CVPR

Active Data Curation Effectively Distills Large-Scale Multimodal Models

Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem et al.

2025 CVPR

M3GYM: A Large-Scale Multimodal Multi-view Multi-person Pose Dataset for Fitness Activity Understanding in Real-world Settings

Qingzheng Xu, Ru Cao, Xin Shen et al.

2025 CVPR

Aligning Dialogue Agents with Global Feedback via Large Language Model Multimodal Reward Decomposition

Dong Won Lee, Hae Won Park, Cynthia Breazeal et al.

2025 EMNLP

GenieBlue: Integrating both Linguistic and Multimodal Capabilities for Large Language Models on Mobile Devices

Xudong Lu, Yinghao Chen, Renshou Wu et al.

2025 ICCV

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

Yunfei Xie, Ce Zhou, Lang Gao et al.

2025 ICLR

DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances

Sreyan Ghosh, Samden Lepcha, S Sakshi et al.

2022 INTERSPEECH

SciOL and MuLMS-Img: Introducing a Large-Scale Multimodal Scientific Dataset and Models for Image-Text Tasks in the Scientific Domain

Tim Tarsi, Heike Adel, Jan Hendrik Metzen et al.

2024 WACV

PerVL-Bench: Benchmarking Multimodal Personalization for Large Vision-Language Models

Minsung Kim

2026 WACV

Advancing Multimodal LLMs by Large-Scale 3D Visual Instruction Dataset Generation

Liu He, Xiao Zeng, Yizhi Song et al.

2026 WACV

Subspace-Aware Graph Construction and Contrastive Alignment for Multimodal Recommendation with Large Language Models

Haodong Li, Lianyong Qi, Weiming Liu et al.

2026 AAAI

DMGIN: How Multimodal LLMs Enhance Large Recommendation Models for Lifelong User Post-click Behaviors

Zhuoxing Wei, Qingchen Xie, Qi Liu et al.

2026 AAAI

Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images

Junhua Mao, Jiajing Xu, Kevin Jing et al.

2016 NIPS

IMPACT: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents

Homaira Huda Shomee, Zhu Wang, Sourav Medya et al.

2024 NeurIPS

Learning Multimodal Volumetric Features for Large-Scale Neuron Tracing

Qihua Chen, Xuejin Chen, Chenxuan Wang et al.

2024 AAAI

GPT4MTS: Prompt-based Large Language Model for Multimodal Time-series Forecasting

Furong Jia, Kevin Wang, Yixiang Zheng et al.

2024 AAAI
