Papers
MirageRoom: 3D Scene Segmentation with 2D Pre-trained Models by Mirage Projection
Haowen Sun, Yueqi Duan, Juncheng Yan et al.
Mirasol3B: A Multimodal Autoregressive Model for Time-Aligned and Contextual Modalities
AJ Piergiovanni, Isaac Noble, Dahun Kim et al.
Misalignment-Robust Frequency Distribution Loss for Image Transformation
Zhangkai Ni, Juncheng Wu, Zian Wang et al.
Mitigating Motion Blur in Neural Radiance Fields with Events and Frames
Marco Cannici, Davide Scaramuzza
Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning
Zihua Zhao, Mengxi Chen, Tianjie Dai et al.
Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange
Yanhao Wu, Tong Zhang, Wei Ke et al.
Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
Sicong Leng, Hang Zhang, Guanzheng Chen et al.
Mixed-Precision Quantization for Federated Learning on Resource-Constrained Heterogeneous Devices
Huancheng Chen, Haris Vikalo
MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning
Zhe Li, Laurence T. Yang, Bocheng Ren et al.
MLP Can Be A Good Transformer Learner
Sihao Lin, Pumeng Lyu, Dongrui Liu et al.
MMA-Diffusion: MultiModal Attack on Diffusion Models
Yijun Yang, Ruiyuan Gao, Xiaosen Wang et al.
MMA: Multi-Modal Adapter for Vision-Language Models
Lingxiao Yang, Ru-Yuan Zhang, Yanchen Wang et al.
MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models
Yanting Wang, Hongye Fu, Wei Zou et al.
MMM: Generative Masked Motion Model
Ekkasit Pinyoanuntapong, Pu Wang, Minwoo Lee et al.
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
Xiang Yue, Yuansheng Ni, Kai Zhang et al.
MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning
Chaoyi Zhang, Kevin Lin, Zhengyuan Yang et al.
MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos
Jielin Qiu, Jiacheng Zhu, William Han et al.
MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors
He Zhang, Shenghao Ren, Haolei Yuan et al.
M&M VTO: Multi-Garment Virtual Try-On and Editing
Luyang Zhu, Yingwei Li, Nan Liu et al.
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training
Pavan Kumar Anasosalu Vasu, Hadi Pouransari, Fartash Faghri et al.
MoCha-Stereo: Motif Channel Attention Network for Stereo Matching
Ziyang Chen, Wei Long, He Yao et al.
Modality-agnostic Domain Generalizable Medical Image Segmentation by Multi-Frequency in Multi-Scale Attention
Ju-Hyeon Nam, Nur Suriza Syazwany, Su Jung Kim et al.
Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration
Tony C. W. Mok, Zi Li, Yunhao Bai et al.
Modality-Collaborative Test-Time Adaptation for Action Recognition
Baochen Xiong, Xiaoshan Yang, Yaguang Song et al.