Papers
Viewpoint-Agnostic Change Captioning With Cycle Consistency
Hoeseong Kim, Jongseok Kim, Hyungseok Lee et al.
Viewpoint Invariant Dense Matching for Visual Geolocalization
Gabriele Berton, Carlo Masone, Valerio Paolicelli et al.
VIL-100: A New Dataset and a Baseline Model for Video Instance Lane Detection
Yujun Zhang, Lei Zhu, Wei Feng et al.
Virtual Light Transport Matrices for Non-Line-of-Sight Imaging
Julio Marco, Adrian Jarabo, Ji Hyun Nam et al.
Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction
Bo Xu, Han Huang, Cheng Lu et al.
Vis2Mesh: Efficient Mesh Reconstruction From Unstructured Point Clouds of Large Scenes With Learned Virtual View Visibility
Shuang Song, Zhaopeng Cui, Rongjun Qin
Visformer: The Vision-Friendly Transformer
Zhengsu Chen, Lingxi Xie, Jianwei Niu et al.
Vision-Language Navigation With Random Environmental Mixup
Chong Liu, Fengda Zhu, Xiaojun Chang et al.
Vision-Language Transformer and Query Generation for Referring Segmentation
Henghui Ding, Chang Liu, Suchen Wang et al.
Vision Transformers for Dense Prediction
René Ranftl, Alexey Bochkovskiy, Vladlen Koltun
Vision Transformer With Progressive Sampling
Xiaoyu Yue, Shuyang Sun, Zhanghui Kuang et al.
Visio-Temporal Attention for Multi-Camera Multi-Target Association
Yu-Jhe Li, Xinshuo Weng, Yan Xu et al.
Visual Alignment Constraint for Continuous Sign Language Recognition
Yuecong Min, Aiming Hao, Xiujuan Chai et al.
Visual Distant Supervision for Scene Graph Generation
Yuan Yao, Ao Zhang, Xu Han et al.
Visual Graph Memory With Unsupervised Representation for Visual Navigation
Obin Kwon, Nuri Kim, Yunho Choi et al.
Visual Relationship Detection Using Part-and-Sum Transformers With Composite Queries
Qi Dong, Zhuowen Tu, Haofu Liao et al.
Visual Saliency Transformer
Nian Liu, Ni Zhang, Kaiyuan Wan et al.
Visual Scene Graphs for Audio Source Separation
Moitreya Chatterjee, Jonathan Le Roux, Narendra Ahuja et al.
Visual-Textual Attentive Semantic Consistency for Medical Report Generation
Yi Zhou, Lei Huang, Tao Zhou et al.
Visual Transformers: Where Do Transformers Really Belong in Vision Models?
Bichen Wu, Chenfeng Xu, Xiaoliang Dai et al.
ViViT: A Video Vision Transformer
Anurag Arnab, Mostafa Dehghani, Georg Heigold et al.
VLGrammar: Grounded Grammar Induction of Vision and Language
Yining Hong, Qing Li, Song-Chun Zhu et al.
VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation
Zeyu Hu, Xuyang Bai, Jiaxiang Shang et al.
VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction
Jaesung Choe, Sunghoon Im, Francois Rameau et al.
von Mises-Fisher Loss: An Exploration of Embedding Geometries for Supervised Learning
Tyler R. Scott, Andrew C. Gallagher, Michael C. Mozer