Research Explorer

Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models

Qingni Wang, Tiantian Geng, Zhiyuan Wang et al.

2025 ICLR

Grounding Multimodal Large Language Model in GUI World

Weixian Lei, Difei Gao, Mike Zheng Shou

2025 ICLR

Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality

Guanyu Zhou, Yibo Yan, Xin Zou et al.

2025 ICLR

MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection

Xi Jiang, Jian Li, Hanqiu Deng et al.

2025 ICLR

Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models

Gen Luo, Yiyi Zhou, Yuxin Zhang et al.

2025 ICLR

$\gamma$-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models

Yaxin Luo, Gen Luo, Jiayi Ji et al.

2025 ICLR

Interpretable Bilingual Multimodal Large Language Model for Diverse Biomedical Tasks

Lehan Wang, Haonan Wang, Honglong Yang et al.

2025 ICLR

Safety of Multimodal Large Language Models on Images and Text

Xin Liu, Yichen Zhu, Yunshi Lan et al.

2024 IJCAI

Incorporating Visual Experts to Resolve the Information Loss in Multimodal Large Language Models

Xin He, Longhui Wei, Lingxi Xie et al.

2025 IJCAI

Words Over Pixels? Rethinking Vision in Multimodal Large Language Models

Anubhooti Jain, Mayank Vatsa, Richa Singh

2025 IJCAI

Multimodal Large Language Models with Fusion Low Rank Adaptation for Device Directed Speech Detection

Shruti Palaskar, Ognjen Rudovic, Sameer Dharur et al.

2024 INTERSPEECH

Multimodal large language models for inclusive collaboration learning tasks

Armanda Lewis

2022 NAACL

Gemini Goes to Med School: Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations

Ankit Pal, Malaikannan Sankarasubbu

2024 NAACL

DeepPavlov at SemEval-2024 Task 3: Multimodal Large Language Models in Emotion Reasoning

Julia Belikova, Dmitrii Kosenko

2024 NAACL

Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench

Zheyuan Liu, Guangyao Dou, Mengzhao Jia et al.

2025 NAACL

DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models

Jianyu Liu, Hangyu Guo, Ranjie Duan et al.

2025 NAACL

WHEN TOM EATS KIMCHI: Evaluating Cultural Awareness of Multimodal Large Language Models in Cultural Mixture Contexts

Jun Seong Kim, Kyaw Ye Thu, Javad Ismayilzada et al.

2025 NAACL

Caption Generation in Cultural Heritage: Crowdsourced Data and Tuning Multimodal Large Language Models

Artem Reshetnikov, Maria-Cristina Marinescu

2025 NAACL

DeepPavlov at SemEval-2024 Task 3: Multimodal Large Language Models in Emotion Reasoning

Julia Belikova, Dmitrii Kosenko

2024 SEMEVAL

MLLM-LLaVA-FL: Multimodal Large Language Model Assisted Federated Learning

Jianyi Zhang, Hao Yang, Ang Li et al.

2025 WACV

MLLM-Tool: A Multimodal Large Language Model for Tool Agent Learning

Chenyu Wang, Weixin Luo, Sixun Dong et al.

2025 WACV

FG-TRACER: Tracing Information Flow in Multimodal Large Language Models in Free-Form Generation

Alessia Saporita, Vittorio Pipoli, Federico Bolelli et al.

2026 WACV

SmokeBench: Evaluating Multimodal Large Language Models for Wildfire Smoke Detection

Tianye Qi, Weihao Li, Nick Barnes

2026 WACV

ImageChain: Advancing Sequential Image-to-Text Reasoning in Multimodal Large Language Models

Danae Sanchez Villegas, Ingo Ziegler, Desmond Elliott

2026 WACV

ST-Think: How Multimodal Large Language Models Reason About 4D Worlds from Ego-Centric Videos

Peiran Wu, Yunze Liu, Miao Liu et al.

2026 WACV
