Papers
Mixture of Ranks with Degradation-Aware Routing for One-Step Real-World Image Super-Resolution
Xiao He, Zhijun Tu, Kun Cheng et al.
Mixture-of-Trees: Learning to Select and Weigh Reasoning Paths for Efficient LLM Inference
Yangbo Wei, Zhen Huang, Shaoqiang Lu et al.
MizanQA: A Benchmark for Multi-Answer Moroccan Legal QA
Adil Bahaj, Mounir Ghogho
MLLM Enriched Explainable Multiple Clustering
Shan Zhang, Liangrui Ren, Qiaoyu Tan et al.
M-Loss: Quantifying Model Merging Compatibility with Limited Unlabeled Data
Tiantong Wang, Yiyang Duan, Haoyu Chen et al.
MM4Rec: Multi-Source and Multi-Scenario Recommender for Unified User Preference
Chu-Chun Yu, Ming-Yi Hong, Miao-Chen Chiang et al.
MMAU-Pro: A Challenging and Comprehensive Benchmark for Holistic Evaluation of Audio General Intelligence
Sonal Kumar, Šimon Sedláček, Vaibhavi Lokegaonkar et al.
MMBERT: Scaled Mixture-of-Experts Multimodal BERT for Robust Chinese Hate Speech Detection Under Cloaking Perturbations
Qiyao Xue, Yuchen Dou, Zheyuan Ryan Shi et al.
MMCM: Multimodality-aware Metric using Clustering-based Modes for Probabilistic Human Motion Prediction
Kyotaro Tokoro, Hiromu Taketsugu, Norimichi Ukita
MME-SCI: A Comprehensive and Challenging Science Benchmark for Multimodal Large Language Models
Jiacheng Ruan, Dan Jiang, Xian Gao et al.
MMG-Vid: Maximizing Marginal Gains at Segment-level and Token-level for Efficient Video LLMs
Junpeng Ma, Qizhe Zhang, Ming Lu et al.
MMG-VL: A Vision-Language Driven Approach for Multi-Person Motion Generation
Songyuan Yang, Wanrong Huang, Yinuo Liu et al.
MMHOI: Modeling Complex 3D Multi-Human Multi-Object Interactions
Kaen Kogashi, Anoop Cherian, Meng-Yu Jennifer Kuo
MMhops-R1: Multimodal Multi-hop Reasoning
Tao Zhang, Ziqi Zhang, Zongyang Ma et al.
MMIFEvol: Towards Evolutionary Multimodal Instruction Following
Haoyu Wang, Sihang Jiang, Xiangru Zhu et al.
M-MiniGPT4: Multilingual VLLM Alignment via Translated Data
Seung Hun Eddie Han, Youssef Mohamed, Mohamed Elhoseiny
mmJEPA-ECG: Cross-Posture Robust Contactless Electrocardiogram Monitoring via Millimeter Wave Radar Sensing
Ziyang Liu, Siyuan He, Feng Liang et al.
MMMamba: A Versatile Cross-Modal in Context Fusion Framework for Pan-Sharpening and Zero-Shot Image Enhancement
Yingying Wang, Xuanhua He, Chen Wu et al.
MMPG: MoE-based Adaptive Multi-Perspective Graph Fusion for Protein Representation Learning
Yusong Wang, Jialun Shen, Zhihao Wu et al.
mmPred: Radar-based Human Motion Prediction in the Dark
Junqiao Fan, Haocong Rao, Jiarui Zhang et al.
MM-R1: Unleashing the Power of Unified Multimodal Large Language Models for Personalized Image Generation
Qian Liang, Yujia Wu, Kuncheng Li et al.
MMRA: A Benchmark for Evaluating Multi-Granularity and Multi-Image Relational Association Capabilities in Large Visual Language Models
Siwei Wu, King Zhu, Yu Bai et al.
MMRAG-RFT: Two-stage Reinforcement Fine-tuning for Explainable Multi-modal Retrieval-augmented Generation
Shengwei Zhao, Jingwen Yao, Sitong Wei et al.
MM-TS: Multi-Modal Temperature and Margin Schedules for Contrastive Learning with Long-Tail Data
Siarhei Sheludzko, Dhimitrios Duka, Bernt Schiele et al.
MMUIE: Massive Multi-Domain Universal Information Extraction for Long Documents
Shuyi Zhang, Zhenbin Chen, Shuting Li et al.