Papers
Text-Guided Object Detector for Multi-Modal Video Question Answering
Ruoyue Shen, Nakamasa Inoue, Koichi Shinoda
The Box Size Confidence Bias Harms Your Object Detector
Johannes Gilg, Torben Teepe, Fabian Herzog et al.
The Change You Want To See
Ragav Sachdeva, Andrew Zisserman
The CropAndWeed Dataset: A Multi-Modal Learning Approach for Efficient Crop and Weed Manipulation
Daniel Steininger, Andreas Trondl, Gerardus Croonen et al.
The Fully Convolutional Transformer for Medical Image Segmentation
Athanasios Tragakis, Chaitanya Kaul, Roderick Murray-Smith et al.
THOR-Net: End-to-End Graformer-Based Realistic Two Hands and Object Reconstruction With Self-Supervision
Ahmed Tawfik Aboukhadra, Jameel Malik, Ahmed Elhayek et al.
TI2Net: Temporal Identity Inconsistency Network for Deepfake Detection
Baoping Liu, Bo Liu, Ming Ding et al.
TinyHD: Efficient Video Saliency Prediction With Heterogeneous Decoders Using Hierarchical Maps Distillation
Feiyan Hu, Simone Palazzo, Federica Proietto Salanitri et al.
Token Pooling in Vision Transformers for Image Classification
Dmitrii Marin, Jen-Hao Rick Chang, Anurag Ranjan et al.
Toward Edge-Efficient Dense Predictions With Synergistic Multi-Task Neural Architecture Search
Thanh Vu, Yanqi Zhou, Chunfeng Wen et al.
Towards a Framework for Privacy-Preserving Pedestrian Analysis
Anil Kunchala, Mélanie Bouroche, Bianca Schoen-Phelan
Towards Discriminative and Transferable One-Stage Few-Shot Object Detectors
Karim Guirguis, Mohamed Abdelsamad, George Eskandar et al.
Towards Disturbance-Free Visual Mobile Manipulation
Tianwei Ni, Kiana Ehsani, Luca Weihs et al.
Towards Equivariant Optical Flow Estimation With Deep Learning
Stefano Savian, Pietro Morerio, Alessio Del Bue et al.
Towards Few-Annotation Learning for Object Detection: Are Transformer-Based Models More Efficient?
Quentin Bouniot, Angélique Loesch, Romaric Audigier et al.
Towards Generating Ultra-High Resolution Talking-Face Videos With Lip Synchronization
Anchit Gupta, Rudrabha Mukhopadhyay, Sindhu Balachandra et al.
Towards Interpretable Video Anomaly Detection
Keval Doshi, Yasin Yilmaz
Towards MOOCs for Lipreading: Using Synthetic Talking Heads To Train Humans in Lipreading at Scale
Aditya Agarwal, Bipasha Sen, Rudrabha Mukhopadhyay et al.
Towards Online Domain Adaptive Object Detection
Vibashan VS, Poojan Oza, Vishal M. Patel
Tracking Growth and Decay of Plant Roots in Minirhizotron Images
Alexander Gillert, Bo Peters, Uwe Freiherr von Lukas et al.
Training Auxiliary Prototypical Classifiers for Explainable Anomaly Detection in Medical Image Segmentation
Wonwoo Cho, Jeonghoon Park, Jaegul Choo
Trans4Map: Revisiting Holistic Bird's-Eye-View Mapping From Egocentric Images to Allocentric Semantics With Vision Transformers
Chang Chen, Jiaming Zhang, Kailun Yang et al.
Transformers for Recognition in Overhead Imagery: A Reality Check
Francesco Luzi, Aneesh Gupta, Leslie Collins et al.
TransMOT: Spatial-Temporal Graph Transformer for Multiple Object Tracking
Peng Chu, Jiang Wang, Quanzeng You et al.
TransPillars: Coarse-To-Fine Aggregation for Multi-Frame 3D Object Detection
Zhipeng Luo, Gongjie Zhang, Changqing Zhou et al.