Papers
Vision Transformer With Deformable Attention
Zhuofan Xia, Xuran Pan, Shiji Song et al.
VISOLO: Grid-Based Space-Time Aggregation for Efficient Online Video Instance Segmentation
Su Ho Han, Sukjun Hwang, Seoung Wug Oh et al.
VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention
Shengheng Deng, Zhihao Liang, Lin Sun et al.
ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval
Mengjun Cheng, Yipeng Sun, Longchao Wang et al.
Visual Abductive Reasoning
Chen Liang, Wenguan Wang, Tianfei Zhou et al.
Visual Acoustic Matching
Changan Chen, Ruohan Gao, Paul Calamia et al.
VisualGPT: Data-Efficient Adaptation of Pretrained Language Models for Image Captioning
Jun Chen, Han Guo, Kai Yi et al.
VisualHow: Multimodal Problem Solving
Jinhui Yang, Xianyu Chen, Ming Jiang et al.
Visual Vibration Tomography: Estimating Interior Material Properties From Monocular Video
Berthy T. Feng, Alexander C. Ogren, Chiara Daraio et al.
VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks
Yi-Lin Sung, Jaemin Cho, Mohit Bansal
VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers
Estelle Aflalo, Meng Du, Shao-Yen Tseng et al.
Vox2Cortex: Fast Explicit Reconstruction of Cortical Surfaces From 3D MRI Scans With Geometric Deep Neural Networks
Fabian Bongratz, Anne-Marie Rickmann, Sebastian Pölsterl et al.
Voxel Field Fusion for 3D Object Detection
Yanwei Li, Xiaojuan Qi, Yukang Chen et al.
Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection From Point Clouds
Chenhang He, Ruihuang Li, Shuai Li et al.
VRDFormer: End-to-End Video Visual Relation Detection With Transformers
Sipeng Zheng, Shizhe Chen, Qin Jin
WALT: Watch and Learn 2D Amodal Representation From Time-Lapse Imagery
N. Dinesh Reddy, Robert Tamburo, Srinivasa G. Narasimhan
WarpingGAN: Warping Multiple Uniform Priors for Adversarial 3D Point Cloud Generation
Yingzhi Tang, Yue Qian, Qijian Zhang et al.
Watch It Move: Unsupervised Discovery of 3D Joints for Re-Posing of Articulated Objects
Atsuhiro Noguchi, Umar Iqbal, Jonathan Tremblay et al.
Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation
Linfeng Zhang, Xin Chen, Xiaobing Tu et al.
Weakly but Deeply Supervised Occlusion-Reasoned Parametric Road Layouts
Buyu Liu, Bingbing Zhuang, Manmohan Chandraker
Weakly Paired Associative Learning for Sound and Image Representations via Bimodal Associative Memory
Sangmin Lee, Hyung-Il Kim, Yong Man Ro
Weakly-Supervised Action Transition Learning for Stochastic Human Motion Prediction
Wei Mao, Miaomiao Liu, Mathieu Salzmann
Weakly-Supervised Generation and Grounding of Visual Descriptions With Conditional Generative Models
Effrosyni Mavroudi, René Vidal
Weakly Supervised High-Fidelity Clothing Model Generation
Ruili Feng, Cheng Ma, Chengji Shen et al.