Papers
VS: Reconstructing Clothed 3D Human from Single Image via Vertex Shift
Leyuan Liu, Yuhan Li, Yunqi Gao et al.
VTimeLLM: Empower LLM to Grasp Video Moments
Bin Huang, Xin Wang, Hong Chen et al.
VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning
Kang Chen, Xiangqian Wu
WALT3D: Generating Realistic Training Data from Time-Lapse Imagery for Reconstructing Dynamic Objects Under Occlusion
Khiem Vuong, N Dinesh Reddy, Robert Tamburo et al.
WANDR: Intention-guided Human Motion Generation
Markos Diomataris, Nikos Athanasiou, Omid Taheri et al.
WateRF: Robust Watermarks in Radiance Fields for Protection of Copyrights
Youngdong Jang, Dong In Lee, MinHyuk Jang et al.
Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models
Peifei Zhu, Tsubasa Takahashi, Hirokatsu Kataoka
WaveFace: Authentic Face Restoration with Efficient Frequency Recovery
Yunqi Miao, Jiankang Deng, Jungong Han
Wavelet-based Fourier Information Interaction with Frequency Diffusion Adjustment for Underwater Image Restoration
Chen Zhao, Weiling Cai, Chenyu Dong et al.
WaveMo: Learning Wavefront Modulations to See Through Scattering
Mingyang Xie, Haiyun Guo, Brandon Y. Feng et al.
Weakly Misalignment-free Adaptive Feature Alignment for UAVs-based Multimodal Object Detection
Chen Chen, Jiahao Qi, Xingyue Liu et al.
Weakly-Supervised Audio-Visual Video Parsing with Prototype-based Pseudo-Labeling
Kranthi Kumar Rachavarapu, Kalyan Ramakrishnan, Rajagopalan A. N.
Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation
Xingqun Qi, Jiahao Pan, Peng Li et al.
Weakly Supervised Monocular 3D Detection with a Single-View Image
Xueying Jiang, Sheng Jin, Lewei Lu et al.
Weakly Supervised Point Cloud Semantic Segmentation via Artificial Oracle
Hyeokjun Kweon, Jihun Kim, Kuk-Jin Yoon
Weakly Supervised Video Individual Counting
Xinyan Liu, Guorong Li, Yuankai Qi et al.
Weak-to-Strong 3D Object Detection with X-Ray Distillation
Alexander Gambashidze, Aleksandr Dadukin, Maxim Golyadkin et al.
WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion
Soyong Shin, Juyong Kim, Eni Halilaj et al.
What Do You See in Vehicle? Comprehensive Vision Solution for In-Vehicle Gaze Estimation
Yihua Cheng, Yaning Zhu, Zongji Wang et al.
What How and When Should Object Detectors Update in Continually Changing Test Domains?
Jayeon Yoo, Dongkwan Lee, Inseop Chung et al.
What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models
Letian Zhang, Xiaotong Zhai, Zhongkai Zhao et al.
What Sketch Explainability Really Means for Downstream Tasks?
Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, Ayan Kumar Bhunia et al.
What When and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions
Brian Chen, Nina Shvetsova, Andrew Rouditchenko et al.
What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs
Alex Trevithick, Matthew Chan, Towaki Takikawa et al.
When StyleGAN Meets Stable Diffusion: a W+ Adapter for Personalized Image Generation
Xiaoming Li, Xinyu Hou, Chen Change Loy