Papers
Which Model To Transfer? Finding the Needle in the Growing Haystack
Cedric Renggli, André Susano Pinto, Luka Rimanic et al.
Whose Hands Are These? Hand Detection and Hand-Body Association in the Wild
Supreeth Narasimhaswamy, Thanh Nguyen, Mingzhen Huang et al.
Whose Track Is It Anyway? Improving Robustness to Tracking Errors With Affinity-Based Trajectory Prediction
Xinshuo Weng, Boris Ivanovic, Kris Kitani et al.
Why Discard if You Can Recycle?: A Recycling Max Pooling Module for 3D Point Cloud Analysis
Jiajing Chen, Burak Kakillioglu, Huantao Ren et al.
WildNet: Learning Domain Generalized Semantic Segmentation From the Wild
Suhyeon Lee, Hongje Seong, Seongwon Lee et al.
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality
Tristan Thrush, Ryan Jiang, Max Bartolo et al.
Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks
Wenwen Pan, Haonan Shi, Zhou Zhao et al.
XMP-Font: Self-Supervised Cross-Modality Pre-Training for Few-Shot Font Generation
Wei Liu, Fangyue Liu, Fei Ding et al.
X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval
Satya Krishna Gorti, Noël Vouitsis, Junwei Ma et al.
X-Trans2Cap: Cross-Modal Knowledge Transfer Using Transformer for 3D Dense Captioning
Zhihao Yuan, Xu Yan, Yinghong Liao et al.
XYDeblur: Divide and Conquer for Single Image Deblurring
Seo-Won Ji, Jeongmin Lee, Seung-Wook Kim et al.
XYLayoutLM: Towards Layout-Aware Multimodal Networks for Visually-Rich Document Understanding
Zhangxuan Gu, Changhua Meng, Ke Wang et al.
YouMVOS: An Actor-Centric Multi-Shot Video Object Segmentation Dataset
Donglai Wei, Siddhant Kharbanda, Sarthak Arora et al.
ZebraPose: Coarse To Fine Surface Encoding for 6DoF Object Pose Estimation
Yongzhi Su, Mahdi Saleh, Torben Fetzer et al.
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
Yoad Tewel, Yoav Shalev, Idan Schwartz et al.
Zero Experience Required: Plug & Play Modular Transfer Learning for Semantic Visual Navigation
Ziad Al-Halah, Santhosh Kumar Ramakrishnan, Kristen Grauman
Zero-Query Transfer Attacks on Context-Aware Object Detectors
Zikui Cai, Shantanu Rane, Alejandro E. Brito et al.
Zero-Shot Text-Guided Object Generation With Dream Fields
Ajay Jain, Ben Mildenhall, Jonathan T. Barron et al.
ZeroWaste Dataset: Towards Deformable Object Segmentation in Cluttered Scenes
Dina Bashkirova, Mohamed Abdelfattah, Ziliang Zhu et al.
Zoom in and Out: A Mixed-Scale Triplet Network for Camouflaged Object Detection
Youwei Pang, Xiaoqi Zhao, Tian-Zhu Xiang et al.
ZZ-Net: A Universal Rotation Equivariant Architecture for 2D Point Clouds
Georg Bökman, Fredrik Kahl, Axel Flinth
2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition
Hengduo Li, Zuxuan Wu, Abhinav Shrivastava et al.
3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding
Shengheng Deng, Xun Xu, Chaozheng Wu et al.
3DCaricShop: A Dataset and a Baseline Method for Single-View 3D Caricature Face Reconstruction
Yuda Qiu, Xiaojie Xu, Lingteng Qiu et al.
3D CNNs With Adaptive Temporal Feature Resolutions
Mohsen Fayyaz, Emad Bahrami, Ali Diba et al.