Papers
MACE: Mass Concept Erasure in Diffusion Models
Shilin Lu, Zilan Wang, Leyang Li et al.
MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer
Jianjian Cao, Peng Ye, Shengze Li et al.
MAFA: Managing False Negatives for Vision-Language Pre-training
Jaeseok Byun, Dohoon Kim, Taesup Moon
MaGGIe: Masked Guided Gradual Human Instance Matting
Chuong Huynh, Seoung Wug Oh, Abhinav Shrivastava et al.
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew et al.
MAGICK: A Large-scale Captioned Dataset from Matting Generated Images using Chroma Keying
Ryan D. Burgert, Brian L. Price, Jason Kuen et al.
Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification
Pingping Zhang, Yuhao Wang, Yang Liu et al.
Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text
Junshu Tang, Yanhong Zeng, Ke Fan et al.
Make Me a BNN: A Simple Strategy for Estimating Bayesian Uncertainty from Pre-trained Models
Gianni Franchi, Olivier Laurent, Maxence Leguery et al.
Make Pixels Dance: High-Dynamic Video Generation
Yan Zeng, Guoqiang Wei, Jiani Zheng et al.
Makeup Prior Models for 3D Facial Makeup Estimation and Applications
Xingchao Yang, Takafumi Taketomi, Yuki Endo et al.
Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework
Ziyao Huang, Fan Tang, Yong Zhang et al.
Making Vision Transformers Truly Shift-Equivariant
Renan A. Rojas-Gomez, Teck-Yian Lim, Minh N. Do et al.
Making Visual Sense of Oracle Bones for You and Me
Runqi Qiao, Lan Yang, Kaiyue Pang et al.
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Bo He, Hengduo Li, Young Kyun Jang et al.
ManiFPT: Defining and Analyzing Fingerprints of Generative Models
Hae Jin Song, Mahyar Khayatkhoei, Wael AbdAlmageed
ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation
Xiaoqi Li, Mingxu Zhang, Yiran Geng et al.
MANUS: Markerless Grasp Capture using Articulated 3D Gaussians
Chandradeep Pokhariya, Ishaan Nikhil Shah, Angela Xing et al.
MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding
Xu Cao, Tong Zhou, Yunsheng Ma et al.
MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection
Boyang Peng, Sanqing Qu, Yong Wu et al.
Map-Relative Pose Regression for Visual Re-Localization
Shuai Chen, Tommaso Cavallari, Victor Adrian Prisacariu et al.
MAPSeg: Unified Unsupervised Domain Adaptation for Heterogeneous Medical Image Segmentation Based on 3D Masked Autoencoding and Pseudo-Labeling
Xuzhe Zhang, Yuhao Wu, Elsa Angelini et al.
MarkovGen: Structured Prediction for Efficient Text-to-Image Generation
Sadeep Jayasumana, Daniel Glasner, Srikumar Ramalingam et al.
MART: Masked Affective RepresenTation Learning via Masked Temporal Distribution Distillation
Zhicheng Zhang, Pancheng Zhao, Eunil Park et al.
Mask4Align: Aligned Entity Prompting with Color Masks for Multi-Entity Localization Problems
Haoquan Zhang, Ronggang Huang, Yi Xie et al.