Papers
WAVE: Weight Templates for Adaptive Initialization of Variable-sized Models
Fu Feng, Yucheng Xie, Jing Wang et al.
Weakly Supervised Contrastive Adversarial Training for Learning Robust Features from Semi-supervised Data
Lilin Zhang, Chengpei Wu, Ning Yang
Weakly Supervised Semantic Segmentation via Progressive Confidence Region Expansion
Xiangfeng Xu, Pinyi Zhang, Wenxuan Huang et al.
Weakly Supervised Temporal Action Localization via Dual-Prior Collaborative Learning Guided by Multimodal Large Language Models
Quan Zhang, Jinwei Fang, Rui Yuan et al.
WeakMCN: Multi-task Collaborative Network for Weakly Supervised Referring Expression Comprehension and Segmentation
Silin Cheng, Yang Liu, Xinwei He et al.
WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion
Yang Wu, Yun Zhu, Kaihua Zhang et al.
WeGen: A Unified Model for Interactive Multimodal Generation as We Chat
Zhipeng Huang, Shaobin Zhuang, Canmiao Fu et al.
WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model
Zongjian Li, Bin Lin, Yang Ye et al.
What Makes a Good Dataset for Knowledge Distillation?
Logan Frank, Jim Davis
What's in the Image? A Deep-Dive into the Vision of Vision Language Models
Omri Kaduri, Shai Bagon, Tali Dekel
When Domain Generalization meets Generalized Category Discovery: An Adaptive Task-Arithmetic Driven Approach
Vaibhav Rathore, Shubhranil B, Saikat Dutta et al.
When the Future Becomes the Past: Taming Temporal Correspondence for Self-supervised Video Representation Learning
Yang Liu, Qianqian Xu, Peisong Wen et al.
Where's the Liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content
Haoyue Bai, Yiyou Sun, Wei Cheng et al.
Where the Devil Hides: Deepfake Detectors Can No Longer Be Trusted
Shuaiwei Yuan, Junyu Dong, Yuezun Li
Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Instructional Videos
Sagnik Majumder, Tushar Nagarajan, Ziad Al-Halah et al.
WildAvatar: Learning In-the-wild 3D Avatars from the Web
Zihao Huang, Shoukang Hu, Guangcong Wang et al.
WildGS-SLAM: Monocular Gaussian Splatting SLAM in Dynamic Environments
Jianhao Zheng, Zihan Zhu, Valentin Bieri et al.
WiLoR: End-to-end 3D Hand Localization and Reconstruction in-the-wild
Rolandos Alexandros Potamias, Jinglei Zhang, Jiankang Deng et al.
WISE: A Framework for Gigapixel Whole-Slide-Image Lossless Compression
Yu Mao, Jun Wang, Nan Guan et al.
WISH: Weakly Supervised Instance Segmentation using Heterogeneous Labels
Hyeokjun Kweon, Kuk-Jin Yoon
WISNet: Pseudo Label Generation on Unbalanced and Patch Annotated Waste Images
Shifan Zhang, Hongzi Zhu, Yinan He et al.
Wonderland: Navigating 3D Scenes from a Single Image
Hanwen Liang, Junli Cao, Vidit Goel et al.
WonderWorld: Interactive 3D Scene Generation from a Single Image
Hong-Xing Yu, Haoyi Duan, Charles Herrmann et al.
Words or Vision: Do Vision-Language Models Have Blind Faith in Text?
Ailin Deng, Tri Cao, Zhirui Chen et al.
World-consistent Video Diffusion with Explicit 3D Modeling
Qihang Zhang, Shuangfei Zhai, Miguel Ángel Bautista Martin et al.