Papers
WarpHE4D: Dense 4D Head Map toward Full Head Reconstruction
Jongseob Yun, Yong-Hoon Kwon, Min-Gyu Park et al.
Wasserstein Style Distribution Analysis and Transform for Stylized Image Generation
Xi Yu, Xiang Gu, Zhihao Shi et al.
Wavelet Policy: Lifting Scheme for Policy Learning in Long-Horizon Tasks
Hao Huang, Shuaihang Yuan, Geeta Chandra Raju Bethala et al.
Wave-MambaAD: Wavelet-driven State Space Model for Multi-class Unsupervised Anomaly Detection
Qiao Zhang, Mingwen Shao, Xinyuan Chen et al.
WaveMamba: Wavelet-Driven Mamba Fusion for RGB-Infrared Object Detection
Haodong Zhu, Wenhao Dong, Linlin Yang et al.
WAVE: Warp-Based View Guidance for Consistent Novel View Synthesis Using a Single Image
Jiwoo Park, Tae Eun Choi, Youngjun Jun et al.
Weakly-Supervised Learning of Dense Functional Correspondences
Stefan Stojanov, Linan Zhao, Yunzhi Zhang et al.
Weakly Supervised Visible-Infrared Person Re-Identification via Heterogeneous Expert Collaborative Consistency Learning
Yafei Zhang, Lingqi Kong, Huafeng Li et al.
WeaveSeg: Iterative Contrast-weaving and Spectral Feature-refining for Nuclei Instance Segmentation
Jiajia Li, Huisi Wu, Jing Qin
Web Artifact Attacks Disrupt Vision Language Models
Maan Qraitem, Piotr Teterwak, Kate Saenko et al.
What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning
Chi-Hsi Kung, Frangil Ramirez, Juhyung Ha et al.
What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models
Lorenzo Baraldi, Davide Bucciarelli, Federico Betti et al.
What If: Understanding Motion Through Sparse Interactions
Stefan Andreas Baumann, Nick Stracke, Timy Phan et al.
What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?
Jinhong Ni, Chang-Bin Zhang, Qiang Zhang et al.
What's in a Latent? Leveraging Diffusion Latent Space for Domain Generalization
Xavier Thomas, Deepti Ghadiyaram
What's Making That Sound Right Now? Video-centric Audio-Visual Localization
Hahyeon Choi, Junhoo Lee, Nojun Kwak
What to Distill? Fast Knowledge Distillation with Adaptive Sampling
Byungchul Chae, Seonyeong Heo
What we need is explicit controllability: Training 3D gaze estimator using only facial images
Tingwei Li, Jun Bao, Zhenzhong Kuang et al.
What You Have is What You Track: Adaptive and Robust Multimodal Tracking
Yuedong Tan, Jiawei Shao, Eduard Zamfir et al.
When Anchors Meet Cold Diffusion: A Multi-Stage Approach to Lane Detection
Bo-Lun Huang, Zi-Xiang Ni, Feng-Kai Huang et al.
When and Where do Data Poisons Attack Textual Inversion?
Jeremy Styborski, Mingzhi Lyu, Jiayou Lu et al.
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning
Junwei Luo, Yingying Zhang, Xue Yang et al.
When Lighting Deceives: Exposing Vision-Language Models' Illumination Vulnerability Through Illumination Transformation Attack
Hanqing Liu, Shouwei Ruan, Yao Huang et al.
When Pixel Difference Patterns Meet ViT: PiDiViT for Few-Shot Object Detection
Hongliang Zhou, Yongxiang Liu, Canyu Mo et al.