Papers
Mixture of Experts Guided by Gaussian Splatters Matters: A new Approach to Weakly-Supervised Video Anomaly Detection
Giacomo D'Amicantonio, Snehashis Majhi, Quan Kong et al.
Mixture-of-Scores: Robust Image-Text Data Valuation via Three Lines of Code
Sitong Wu, Haoru Tan, Yukang Chen et al.
MMAD: Multi-label Micro-Action Detection in Videos
Kun Li, Pengyu Liu, Dan Guo et al.
MMAIF: Multi-task and Multi-degradation All-in-One for Image Fusion with Language Guidance
Zihan Cao, Yu Zhong, Ziqi Wang et al.
MMAT-1M: A Large Reasoning Dataset for Multimodal Agent Tuning
Tianhong Gao, Yannian Fu, Weiqun Wu et al.
mmCooper: A Multi-agent Multi-stage Communication-efficient and Collaboration-robust Cooperative Perception Framework
Bingyi Liu, Jian Teng, Hongfei Xue et al.
MMCR: Benchmarking Cross-Source Reasoning in Scientific Papers
Yang Tian, Zheng Lu, Mingqi Gao et al.
MMGeo: Multimodal Compositional Geo-Localization for UAVs
Yuxiang Ji, Boyong He, Zhuoyue Tan et al.
MM-IFEngine: Towards Multimodal Instruction Following
Shengyuan Ding, Shenxi Wu, Xiangyu Zhao et al.
MMOne: Representing Multiple Modalities in One Scene
Zhifeng Gu, Bing Wang
MMReason: An Open-Ended Multi-Modal Multi-Step Reasoning Benchmark for MLLMs Toward AGI
Huanjin Yao, Jiaxing Huang, Yawen Qiu et al.
MM-Spatial: Exploring 3D Spatial Understanding in Multimodal LLMs
Erik Daxberger, Nina Wenzel, David Griffiths et al.
M-Net: MRI Brain Tumor Sequential Segmentation Network via Mesh-Cast
Jiacheng Lu, Hui Ding, Shiyu Zhang et al.
MobileIE: An Extremely Lightweight and Effective ConvNet for Real-Time Image Enhancement on Mobile Devices
Hailong Yan, Ao Li, Xiangtao Zhang et al.
MobileViCLIP: An Efficient Video-Text Model for Mobile Devices
Min Yang, Zihan Jia, Zhilin Dai et al.
Mobile Video Diffusion
Haitam Ben Yahia, Denis Korzhenkov, Ioannis Lelekas et al.
MOBIUS: Big-to-Mobile Universal Instance Segmentation via Multi-modal Bottleneck Fusion and Calibrated Decoder Pruning
Mattia Segu, Marta Tintore Gazulla, Yongqin Xian et al.
ModalTune: Fine-Tuning Slide-Level Foundation Models with Multi-Modal Information for Multi-task Learning in Digital Pathology
Vishwesh Ramanathan, Tony Xu, Pushpak Pati et al.
Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction
Giuseppe Cartella, Vittorio Cuculo, Alessandro D'Amelio et al.
Modeling Saliency Dataset Bias
Matthias Kümmerer, Harneet Singh Khanuja, Matthias Bethge
Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models
Xuran Ma, Yexin Liu, Yaofu Liu et al.
Moderating the Generalization of Score-based Generative Model
Wan Jiang, He Wang, Xin Zhang et al.
ModSkill: Physical Character Skill Modularization
Yiming Huang, Zhiyang Dou, Lingjie Liu
MOERL: When Mixture-of-Experts Meet Reinforcement Learning for Adverse Weather Image Restoration
Tao Wang, Peiwen Xia, Bo Li et al.
MoFRR: Mixture of Diffusion Models for Face Retouching Restoration
Jiaxin Liu, Qichao Ying, Zhenxing Qian et al.