Papers
Video Geo-Localization Employing Geo-Temporal Feature Learning and GPS Trajectory Smoothing
Krishna Regmi, Mubarak Shah
Video Instance Segmentation With a Propose-Reduce Paradigm
Huaijia Lin, Ruizheng Wu, Shu Liu et al.
VideoLT: Large-Scale Long-Tailed Video Recognition
Xing Zhang, Zuxuan Wu, Zejia Weng et al.
Video Matting via Consistency-Regularized Graph Neural Networks
Tiantian Wang, Sifei Liu, Yapeng Tian et al.
Video Object Segmentation With Dynamic Memory Networks and Adaptive Object Alignment
Shuxian Liang, Xu Shen, Jianqiang Huang et al.
Video Pose Distillation for Few-Shot, Fine-Grained Sports Action Recognition
James Hong, Matthew Fisher, Michaël Gharbi et al.
Video Question Answering Using Language-Guided Deep Compressed-Domain Video Feature
Nayoung Kim, Seong Jong Ha, Je-Won Kang
Video Self-Stitching Graph Network for Temporal Action Localization
Chen Zhao, Ali K. Thabet, Bernard Ghanem
VidTr: Video Transformer Without Convolutions
Yanyi Zhang, Xinyu Li, Chunhui Liu et al.
Viewing Graph Solvability via Cycle Consistency
Federica Arrigoni, Andrea Fusiello, Elisa Ricci et al.
ViewNet: Unsupervised Viewpoint Estimation From Conditional Generation
Octave Mariotti, Oisin Mac Aodha, Hakan Bilen
Viewpoint-Agnostic Change Captioning With Cycle Consistency
Hoeseong Kim, Jongseok Kim, Hyungseok Lee et al.
Viewpoint Invariant Dense Matching for Visual Geolocalization
Gabriele Berton, Carlo Masone, Valerio Paolicelli et al.
VIL-100: A New Dataset and a Baseline Model for Video Instance Lane Detection
Yujun Zhang, Lei Zhu, Wei Feng et al.
Virtual Light Transport Matrices for Non-Line-of-Sight Imaging
Julio Marco, Adrian Jarabo, Ji Hyun Nam et al.
Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction
Bo Xu, Han Huang, Cheng Lu et al.
Vis2Mesh: Efficient Mesh Reconstruction From Unstructured Point Clouds of Large Scenes With Learned Virtual View Visibility
Shuang Song, Zhaopeng Cui, Rongjun Qin
Visformer: The Vision-Friendly Transformer
Zhengsu Chen, Lingxi Xie, Jianwei Niu et al.
Vision-Language Navigation With Random Environmental Mixup
Chong Liu, Fengda Zhu, Xiaojun Chang et al.
Vision-Language Transformer and Query Generation for Referring Segmentation
Henghui Ding, Chang Liu, Suchen Wang et al.
Vision Transformers for Dense Prediction
René Ranftl, Alexey Bochkovskiy, Vladlen Koltun
Vision Transformer With Progressive Sampling
Xiaoyu Yue, Shuyang Sun, Zhanghui Kuang et al.
Visio-Temporal Attention for Multi-Camera Multi-Target Association
Yu-Jhe Li, Xinshuo Weng, Yan Xu et al.
Visual Alignment Constraint for Continuous Sign Language Recognition
Yuecong Min, Aiming Hao, Xiujuan Chai et al.
Visual Distant Supervision for Scene Graph Generation
Yuan Yao, Ao Zhang, Xu Han et al.