Papers
Mixture-of-Trees: Learning to Select and Weigh Reasoning Paths for Efficient LLM Inference
Yangbo Wei, Zhen Huang, Shaoqiang Lu et al.
MizanQA: A Benchmark for Multi-Answer Moroccan Legal QA
Adil Bahaj, Mounir Ghogho
MLLM Enriched Explainable Multiple Clustering
Shan Zhang, Liangrui Ren, Qiaoyu Tan et al.
mllm-shap: A Shapley Value Explainability Platform for Text-Audio Multimodal Large Language Models
Jakub Muszyński, Paweł Pozorski, Maria Ganzha
M-Loss: Quantifying Model Merging Compatibility with Limited Unlabeled Data
Tiantong Wang, Yiyang Duan, Haoyu Chen et al.
MM4Rec: Multi-Source and Multi-Scenario Recommender for Unified User Preference
Chu-Chun Yu, Ming-Yi Hong, Miao-Chen Chiang et al.
MMAC: A Multilingual, Multimodal Alignment Framework for Cultural Grounding Evaluation
Weihua Zheng, Zhengyuan Liu, Tanmoy Chakraborty et al.
MMAU-Pro: A Challenging and Comprehensive Benchmark for Holistic Evaluation of Audio General Intelligence
Sonal Kumar, Šimon Sedláček, Vaibhavi Lokegaonkar et al.
MMBERT: Scaled Mixture-of-Experts Multimodal BERT for Robust Chinese Hate Speech Detection Under Cloaking Perturbations
Qiyao Xue, Yuchen Dou, Zheyuan Ryan Shi et al.
MM-BizRAG: Rethinking Multimodal Retrieval-Augmented Generation for General Purpose Enterprise Q&A
Hanoz Bhathena, Parin Rajesh Jhaveri, Rohan Mittal et al.
MMCLIP: Cross-Modal Attention Masked Modelling for Medical Language-Image Pre-Training
Biao Wu, Yutong Xie, Zeyu Zhang et al.
MMCM: Multimodality-aware Metric using Clustering-based Modes for Probabilistic Human Motion Prediction
Kyotaro Tokoro, Hiromu Taketsugu, Norimichi Ukita
MMErroR: A Benchmark for Erroneous Reasoning in Vision-Language Models
Yang Shi, Yifeng Xie, Minzhe Guo et al.
MME-SCI: A Comprehensive and Challenging Science Benchmark for Multimodal Large Language Models
Jiacheng Ruan, Dan Jiang, Xian Gao et al.
MMG-Vid: Maximizing Marginal Gains at Segment-level and Token-level for Efficient Video LLMs
Junpeng Ma, Qizhe Zhang, Ming Lu et al.
MMG-VL: A Vision-Language Driven Approach for Multi-Person Motion Generation
Songyuan Yang, Wanrong Huang, Yinuo Liu et al.
MMHOI: Modeling Complex 3D Multi-Human Multi-Object Interactions
Kaen Kogashi, Anoop Cherian, Meng-Yu Jennifer Kuo
MMhops-R1: Multimodal Multi-hop Reasoning
Tao Zhang, Ziqi Zhang, Zongyang Ma et al.
MMIFEvol: Towards Evolutionary Multimodal Instruction Following
Haoyu Wang, Sihang Jiang, Xiangru Zhu et al.
M-MiniGPT4: Multilingual VLLM Alignment via Translated Data
Seung Hun Eddie Han, Youssef Mohamed, Mohamed Elhoseiny
mmJEPA-ECG: Cross-Posture Robust Contactless Electrocardiogram Monitoring via Millimeter Wave Radar Sensing
Ziyang Liu, Siyuan He, Feng Liang et al.
MM-JudgeBias: A Benchmark for Evaluating Compositional Biases in MLLM-as-a-Judge
Sua Lee, Sanghee Park, Jinbae Im
MMMamba: A Versatile Cross-Modal in Context Fusion Framework for Pan-Sharpening and Zero-Shot Image Enhancement
Yingying Wang, Xuanhua He, Chen Wu et al.
MMPG: MoE-based Adaptive Multi-Perspective Graph Fusion for Protein Representation Learning
Yusong Wang, Jialun Shen, Zhihao Wu et al.
MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Knowledge Poisoning Attacks
Hyeonjeong Ha, Qiusi Zhan, Jeonghwan Kim et al.