Papers
Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training
Gen Luo, Xue Yang, Wenhan Dou et al.
MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection
Rishubh Parihar, Srinjay Sarkar, Sarthak Vora et al.
MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models
Yifan Liu, Keyu Fan, Weihao Yu et al.
MonoTAKD: Teaching Assistant Knowledge Distillation for Monocular 3D Object Detection
Hou-I Liu, Christine Wu, Jen-Hao Cheng et al.
MonSter: Marry Monodepth to Stereo Unleashes Power
Junda Cheng, Longliang Liu, Gangwei Xu et al.
Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization
Jamie Wynn, Zawar Qureshi, Jakub Powierza et al.
Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation
Junha Lee, Chunghyun Park, Jaesung Choe et al.
Mosaic of Modalities: A Comprehensive Benchmark for Multimodal Graph Learning
Jing Zhu, Yuhang Zhou, Shengyi Qian et al.
MOS-Attack: A Scalable Multi-objective Adversarial Attack Framework
Ping Guo, Cheng Gong, Xi Lin et al.
MoSca: Dynamic Gaussian Fusion from Casual Videos via 4D Motion Scaffolds
Jiahui Lei, Yijia Weng, Adam W. Harley et al.
MOS: Modeling Object-Scene Associations in Generalized Category Discovery
Zhengyuan Peng, Jinpeng Ma, Zhimin Sun et al.
MoST: Efficient Monarch Sparse Tuning for 3D Representation Learning
Xu Han, Yuan Tang, Jinfeng Xu et al.
MotiF: Making Text Count in Image Animation with Motion Focal Loss
Shijie Wang, Samaneh Azadi, Rohit Girdhar et al.
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models
Wenyi Hong, Yean Cheng, Zhuoyi Yang et al.
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
Andong Deng, Tongjia Chen, Shoubin Yu et al.
MotionMap: Representing Multimodality in Human Pose Forecasting
Reyhaneh Hosseininejad, Megh Shukla, Saeed Saadatnejad et al.
Motion Modes: What Could Happen Next?
Karran Pandey, Yannick Hold-Geoffroy, Matheus Gadelha et al.
MotionPro: A Precise Motion Controller for Image-to-Video Generation
Zhongwei Zhang, Fuchen Long, Zhaofan Qiu et al.
MotionPRO: Exploring the Role of Pressure in Human MoCap and Beyond
Shenghao Ren, Yi Lu, Jiayi Huang et al.
Motion Prompting: Controlling Video Generation with Motion Trajectories
Daniel Geng, Charles Herrmann, Junhwa Hur et al.
Motions as Queries: One-Stage Multi-Person Holistic Human Motion Capture
Kenkun Liu, Yurong Fu, Weihao Yuan et al.
MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation
Shuwei Shi, Biao Gong, Xi Chen et al.
Move-in-2D: 2D-Conditioned Human Motion Generation
Hsin-Ping Huang, Yang Zhou, Jui-Hsien Wang et al.
MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders
Jiajun Cao, Yuan Zhang, Tao Huang et al.
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Weijia Wu, Mingyu Liu, Zeyu Zhu et al.