Papers
VIGOR: Cross-View Image Geo-Localization Beyond One-to-One Retrieval
Sijie Zhu, Taojiannan Yang, Chen Chen
VinVL: Revisiting Visual Representations in Vision-Language Models
Pengchuan Zhang, Xiujun Li, Xiaowei Hu et al.
ViP-DeepLab: Learning Visual Perception With Depth-Aware Video Panoptic Segmentation
Siyuan Qiao, Yukun Zhu, Hartwig Adam et al.
ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search
Lumin Xu, Yingda Guan, Sheng Jin et al.
VirFace: Enhancing Face Recognition via Unlabeled Shallow Data
Wenyu Li, Tianchu Guo, Pengyu Li et al.
VirTex: Learning Visual Representations From Textual Annotations
Karan Desai, Justin Johnson
Virtual Fully-Connected Layer: Training a Large-Scale Face Recognition Dataset With Limited Computational Resources
Pengyu Li, Biao Wang, Lei Zhang
Visualizing Adapted Knowledge in Domain Transfer
Yunzhong Hou, Liang Zheng
Visually Informed Binaural Audio Generation without Binaural Audios
Xudong Xu, Hang Zhou, Ziwei Liu et al.
Visual Navigation With Spatial Attention
Bar Mayo, Tamir Hazan, Ayellet Tal
Visual Room Rearrangement
Luca Weihs, Matt Deitke, Aniruddha Kembhavi et al.
Visual Semantic Role Labeling for Video Understanding
Arka Sadhu, Tanmay Gupta, Mark Yatskar et al.
VisualVoice: Audio-Visual Speech Separation With Cross-Modal Consistency
Ruohan Gao, Kristen Grauman
VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization
Seunghwan Choi, Sunghyun Park, Minsoo Lee et al.
VLN↻BERT: A Recurrent Vision-and-Language BERT for Navigation
Yicong Hong, Qi Wu, Yuankai Qi et al.
VoxelContext-Net: An Octree Based Framework for Point Cloud Compression
Zizheng Que, Guo Lu, Dong Xu
VS-Net: Voting With Segmentation for Visual Localization
Zhaoyang Huang, Han Zhou, Yijin Li et al.
VSPW: A Large-scale Dataset for Video Scene Parsing in the Wild
Jiaxu Miao, Yunchao Wei, Yu Wu et al.
Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs
Xudong Lin, Gedas Bertasius, Jue Wang et al.
Wasserstein Barycenter for Multi-Source Domain Adaptation
Eduardo Fernandes Montesuma, Fred Maurice Ngole Mboula
Wasserstein Contrastive Representation Distillation
Liqun Chen, Dong Wang, Zhe Gan et al.
Watching You: Global-Guided Reciprocal Learning for Video-Based Person Re-Identification
Xuehu Liu, Pingping Zhang, Chenyang Yu et al.
Weakly Supervised Action Selection Learning in Video
Junwei Ma, Satya Krishna Gorti, Maksims Volkovs et al.
Weakly Supervised Instance Segmentation for Videos With Temporal Mask Consistency
Qing Liu, Vignesh Ramanathan, Dhruv Mahajan et al.
Weakly-Supervised Instance Segmentation via Class-Agnostic Learning With Salient Images
Xinggang Wang, Jiapei Feng, Bin Hu et al.