Papers
Mind the Label Shift of Augmentation-Based Graph OOD Generalization
Junchi Yu, Jian Liang, Ran He
Minimizing Maximum Model Discrepancy for Transferable Black-Box Targeted Attacks
Anqi Zhao, Tong Chu, Yahao Liu et al.
Minimizing the Accumulated Trajectory Error To Improve Dataset Distillation
Jiawei Du, Yidi Jiang, Vincent Y. F. Tan et al.
MISC210K: A Large-Scale Dataset for Multi-Instance Semantic Correspondence
Yixuan Sun, Yiwen Huang, Haijing Guo et al.
MIST: Multi-Modal Iterative Spatial-Temporal Transformer for Long-Form Video Question Answering
Difei Gao, Luowei Zhou, Lei Ji et al.
Mitigating Task Interference in Multi-Task Learning via Explicit Task Routing With Non-Learnable Primitives
Chuntao Ding, Zhichao Lu, Shangguang Wang et al.
Mixed Autoencoder for Self-Supervised Visual Representation Learning
Kai Chen, Zhili Liu, Lanqing Hong et al.
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
Jihao Liu, Xin Huang, Jinliang Zheng et al.
MixNeRF: Modeling a Ray With Mixture Density for Novel View Synthesis From Sparse Inputs
Seunghyeon Seo, Donghoon Han, Yeonjin Chang et al.
MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering
Jingjing Jiang, Nanning Zheng
MixSim: A Hierarchical Framework for Mixed Reality Traffic Simulation
Simon Suo, Kelvin Wong, Justin Xu et al.
MixTeacher: Mining Promising Labels With Mixed Scale Teacher for Semi-Supervised Object Detection
Liang Liu, Boshen Zhang, Jiangning Zhang et al.
(ML)$^2$P-Encoder: On Exploration of Channel-Class Correlation for Multi-Label Zero-Shot Learning
Ziming Liu, Song Guo, Xiaocheng Lu et al.
MM-3DScene: 3D Scene Understanding by Customizing Masked Modeling With Informative-Preserved Reconstruction and Self-Distilled Consistency
Mingye Xu, Mutian Xu, Tong He et al.
MMANet: Margin-Aware Distillation and Modality-Aware Regularization for Incomplete Multimodal Learning
Shicai Wei, Chunbo Luo, Yang Luo
MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
Ludan Ruan, Yiyang Ma, Huan Yang et al.
MMG-Ego4D: Multimodal Generalization in Egocentric Action Recognition
Xinyu Gong, Sreyas Mohan, Naina Dhingra et al.
MMVC: Learned Multi-Mode Video Compression With Block-Based Prediction Mode Selection and Density-Adaptive Entropy Coding
Bowen Liu, Yu Chen, Rakesh Chowdary Machineni et al.
MobileBrick: Building LEGO for 3D Reconstruction on Mobile Devices
Kejie Li, Jia-Wang Bian, Robert Castle et al.
MobileNeRF: Exploiting the Polygon Rasterization Pipeline for Efficient Neural Field Rendering on Mobile Architectures
Zhiqin Chen, Thomas Funkhouser, Peter Hedman et al.
MobileOne: An Improved One Millisecond Mobile Backbone
Pavan Kumar Anasosalu Vasu, James Gabriel, Jeff Zhu et al.
Mobile User Interface Element Detection via Adaptively Prompt Tuning
Zhangxuan Gu, Zhuoer Xu, Haoxing Chen et al.
MobileVOS: Real-Time Video Object Segmentation Contrastive Learning Meets Knowledge Distillation
Roy Miles, Mehmet Kerim Yucel, Bruno Manganelli et al.
Modality-Agnostic Debiasing for Single Domain Generalization
Sanqing Qu, Yingwei Pan, Guang Chen et al.
Modality-Invariant Visual Odometry for Embodied Vision
Marius Memmel, Roman Bachmann, Amir Zamir