Papers
498 papers found
DrivAerNet++: A Large-Scale Multimodal Car Dataset with Computational Fluid Dynamics Simulations and Deep Learning Benchmarks
Mohamed Elrefaie, Florin Morar, Angela Dai et al.
The Multimodal Universe: Enabling Large-Scale Machine Learning with 100 TB of Astronomical Scientific Data
Eirini Angeloudi, Jeroen Audenaert, Micah Bowles et al.
Integrating Multimodal Information in Large Pretrained Transformers
Wasifur Rahman, Md Kamrul Hasan, Sangwu Lee et al.
Large Scale Generative Multimodal Attribute Extraction for E-commerce Attributes
Anant Khandelwal, Happy Mittal, Shreyas Kulkarni et al.
CaMML: Context-Aware Multimodal Learner for Large Models
Yixin Chen, Shuai Zhang, Boran Han et al.
Beyond Text: Unveiling Multimodal Proficiency of Large Language Models with MultiAPI Benchmark
Xiao Liu, Jianfeng Lin, Jiawei Zhang
Speaking Beyond Language: A Large-Scale Multimodal Dataset for Learning Nonverbal Cues from Video-Grounded Dialogues
Youngmin Kim, Jiwan Chung, Jisoo Kim et al.
AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models
Yuhang Wu, Wenmeng Yu, Yean Cheng et al.
Testing Spatial Intuitions of Humans and Large Language and Multimodal Models in Analogies
Ivo Bueno, Anna Bavaresco, João Miguel Cunha et al.
How2Sign: A Large-Scale Multimodal Dataset for Continuous American Sign Language
Amanda Duarte, Shruti Palaskar, Lucas Ventura et al.
Active Data Curation Effectively Distills Large-Scale Multimodal Models
Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem et al.
M3GYM: A Large-Scale Multimodal Multi-view Multi-person Pose Dataset for Fitness Activity Understanding in Real-world Settings
Qingzheng Xu, Ru Cao, Xin Shen et al.
Aligning Dialogue Agents with Global Feedback via Large Language Model Multimodal Reward Decomposition
Dong Won Lee, Hae Won Park, Cynthia Breazeal et al.
GenieBlue: Integrating both Linguistic and Multimodal Capabilities for Large Language Models on Mobile Devices
Xudong Lu, Yinghao Chen, Renshou Wu et al.
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine
Yunfei Xie, Ce Zhou, Lang Gao et al.
DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances
Sreyan Ghosh, Samden Lepcha, S Sakshi et al.
SciOL and MuLMS-Img: Introducing a Large-Scale Multimodal Scientific Dataset and Models for Image-Text Tasks in the Scientific Domain
Tim Tarsi, Heike Adel, Jan Hendrik Metzen et al.
Advancing Multimodal LLMs by Large-Scale 3D Visual Instruction Dataset Generation
Liu He, Xiao Zeng, Yizhi Song et al.
Subspace-Aware Graph Construction and Contrastive Alignment for Multimodal Recommendation with Large Language Models
Haodong Li, Lianyong Qi, Weiming Liu et al.
DMGIN: How Multimodal LLMs Enhance Large Recommendation Models for Lifelong User Post-click Behaviors
Zhuoxing Wei, Qingchen Xie, Qi Liu et al.
Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images
Junhua Mao, Jiajing Xu, Kevin Jing et al.
IMPACT: A Large-scale Integrated Multimodal Patent Analysis and Creation Dataset for Design Patents
Homaira Huda Shomee, Zhu Wang, Sourav Medya et al.
Learning Multimodal Volumetric Features for Large-Scale Neuron Tracing
Qihua Chen, Xuejin Chen, Chenxuan Wang et al.
GPT4MTS: Prompt-based Large Language Model for Multimodal Time-series Forecasting
Furong Jia, Kevin Wang, Yixiang Zheng et al.