Papers
11,015 papers found
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Wei Huang, Yue Liao, Jianhui Liu et al.
Mixture-of-Agents Enhances Large Language Model Capabilities
Junlin Wang, Jue WANG, Ben Athiwaratkun et al.
Mixture of Attentions For Speculative Decoding
Matthieu Zimmer, Milan Gritta, Gerasimos Lampouras et al.
Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models
Jun Luo, Chen Chen, Shandong Wu
Mixture of In-Context Prompters for Tabular PFNs
Derek Qiang Xu, F Olcay Cirit, Reza Asadi et al.
Mixture of Parrots: Experts improve memorization more than reasoning
Samy Jelassi, Clara Mohri, David Brandfonbrener et al.
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
Jun Shern Chan, Neil Chowdhury, Oliver Jaffe et al.
MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents
Junpeng Yue, Xinrun Xu, Börje F. Karlsson et al.
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
Chenxi Wang, Xiang Chen, Ningyu Zhang et al.
MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs
Jiarui Zhang, Mahyar Khayatkhoei, Prateek Chhikara et al.
MLPs Learn In-Context on Regression and Classification Tasks
William Lingxiao Tong, Cengiz Pehlevan
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
Haotian Zhang, Mingfei Gao, Zhe Gan et al.
MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection
Xi Jiang, Jian Li, Hanqiu Deng et al.
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark
S Sakshi, Utkarsh Tyagi, Sonal Kumar et al.
MMDisCo: Multi-Modal Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation
Akio Hayakawa, Masato Ishii, Takashi Shibuya et al.
MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models
Chejian Xu, Jiawei Zhang, Zhaorun Chen et al.
MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models
Peng Xia, Kangyu Zhu, Haoran Li et al.
MMEgo: Towards Building Egocentric Multimodal LLMs for Video QA
Hanrong Ye, Haotian Zhang, Erik Daxberger et al.
MM-EMBED: UNIVERSAL MULTIMODAL RETRIEVAL WITH MULTIMODAL LLMS
Sheng-Chieh Lin, Chankyu Lee, Mohammad Shoeybi et al.
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
YiFan Zhang, Huanyu Zhang, Haochen Tian et al.
MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs
Xuannan Liu, Zekun Li, Pei Pei Li et al.
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
Peng Xia, Siwei Han, Shi Qiu et al.
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
Fanqing Meng, Jin Wang, Chuanhao Li et al.
MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge
Yuntao Du., Kailin Jiang, Zhi Gao et al.
MMQA: Evaluating LLMs with Multi-Table Multi-Hop Complex Questions
Jian Wu, Linyi Yang, Dongyuan Li et al.