Papers
498 papers found
HotelMatch-LLM: Joint Multi-Task Training of Small and Large Language Models for Efficient Multimodal Hotel Retrieval
Arian Askari, Emmanouil Stergiadis, Ilya Gusev et al.
mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus
Matthieu Futeral, Armel Randy Zebaze, Pedro Ortiz Suarez et al.
SignAlignLM: Integrating Multimodal Sign Language Processing into Large Language Models
Mert Inan, Anthony Sicilia, Malihe Alikhani
MEIT: Multimodal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation
Zhongwei Wan, Che Liu, Xin Wang et al.
MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding
Baixuan Xu, Weiqi Wang, Haochen Shi et al.
MM-ChatAlign: A Novel Multimodal Reasoning Framework based on Large Language Models for Entity Alignment
Xuhui Jiang, Yinghan Shen, Zhichao Shi et al.
Recent Advances in Online Hate Speech Moderation: Multimodality and the Role of Large Models
Ming Shan Hee, Shivam Sharma, Rui Cao et al.
Synergizing Multimodal Temporal Knowledge Graphs and Large Language Models for Social Relation Recognition
Haorui Wang, Zheng Wang, Yuxuan Zhang et al.
Zenseact Open Dataset: A Large-Scale and Diverse Multimodal Dataset for Autonomous Driving
Mina Alibeigi, William Ljungbergh, Adam Tonderski et al.
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
Peng Xia, Siwei Han, Shi Qiu et al.
WangLab at MEDIQA-M3G 2024: Multimodal Medical Answer Generation using Large Language Models
Ronald Xie, Steven Palayew, Augustin Toma et al.
Advancing Multimodal Teacher Sentiment Analysis: The Large-Scale T-MED Dataset & the Effective AAM-TSA Model
Zhiyi Duan, Xiangren Wang, Hongyu Yuan et al.
Expanding Large Pre-Trained Unimodal Models With Multimodal Information Injection for Image-Text Multimodal Classification
Tao Liang, Guosheng Lin, Mingyang Wan et al.
MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation
Haochen Xue, Feilong Tang, Ming Hu et al.
Multimodal Autoregressive Pre-training of Large Vision Encoders
Enrico Fini, Mustafa Shukor, Xiujun Li et al.
YouMakeup: A Large-Scale Domain-Specific Multimodal Dataset for Fine-Grained Semantic Comprehension
Weiying Wang, Yongcheng Wang, Shizhe Chen et al.
UnCo: Uncertainty-Driven Collaborative Framework of Large and Small Models for Grounded Multimodal NER
Jielong Tang, Yang Yang, Jianxing Yu et al.
Looking Beyond Text: Reducing Language Bias in Large Vision-Language Models via Multimodal Dual-Attention and Soft-Image Guidance
Haozhe Zhao, Shuzheng Si, Liang Chen et al.
LIFTED: Multimodal Clinical Trial Outcome Prediction via Large Language Models and Mixture-of-Experts
Wenhao Zheng, Liaoyaqi Wang, Dongshen Peng et al.
MIntRec2.0: A Large-scale Benchmark Dataset for Multimodal Intent Recognition and Out-of-scope Detection in Conversations
Hanlei Zhang, Xin Wang, Hua Xu et al.
ZooProbe: A Data Engine for Evaluating, Exploring, and Evolving Large-scale Training Data for Multimodal LLMs
Yi-Kai Zhang, Shiyin Lu, Qing-Guo Chen et al.
YouMakeup: A Large-Scale Domain-Specific Multimodal Dataset for Fine-Grained Semantic Comprehension
Weiying Wang, Yongcheng Wang, Shizhe Chen et al.
Beyond Guardrails: Advanced Safety for Large Language Models — Monolingual, Multilingual and Multimodal Frontiers
Somnath Banerjee, Rima Hazra, Animesh Mukherjee
NOTA: Multimodal Music Notation Understanding for Visual Large Language Model
Mingni Tang, Jiajia Li, Lu Yang et al.