Papers
4,428 papers found
A Spatio-Temporal Representation Learning as an Alternative to Traditional Glosses in Sign Language Translation and Production
Eui Jun Hwang, Sukmin Cho, Huije Lee et al.
Assessing the Quality of 3D Reconstruction in the Absence of Ground Truth: Application to a Multimodal Archaeological Dataset
Benjamin Coupry, Baptiste Brument, Antoine Laurent et al.
Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance
Huakun Shen, Boyue Hu, Krzysztof Czarnecki et al.
Attention-Based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors
Atif Belal, Akhil Meethal, Francisco Perdigon Romero et al.
Attention-Guided Masked Autoencoders for Learning Image Representations
Leon Sick, Dominik Engel, Pedro Hermosilla et al.
Attribute Diffusion: Diffusion Driven Diverse Attribute Editing
Rishubh Parihar, Prasanna Balaji, Raghav Magazine et al.
A Two-Head Loss Function for Deep Average-K Classification
Camille Garcin, Maximilien Servajean, Alexis Joly et al.
Automated Evaluation of Large Vision-Language Models on Self-Driving Corner Cases
Kai Chen, Yanze Li, Wenhua Zhang et al.
Automated Patient Positioning with Learned 3D Hand Gestures
Zhongpai Gao, Abhishek Sharma, Meng Zheng et al.
AutoProSAM: Automated Prompting SAM for 3D Multi-Organ Segmentation
Chengyin Li, Rafi Ibn Sultan, Prashant Khanduri et al.
Autoregressive Adaptive Hypergraph Transformer for Skeleton-Based Activity Recognition
Abhisek Ray, Ayush Raj, Maheshkumar H. Kolekar
A Versatile and Differentiable Hand-Object Interaction Representation
Théo Morales, Omid Taheri, Gerard Lacey
A Video is Worth 10000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval
Matthew Gwilliam, Michael Cogswell, Meng Ye et al.
Background-Aware Moment Detection for Video Moment Retrieval
Minjoon Jung, Youwon Jang, Seongho Choi et al.
Bandit Based Attention Mechanism in Vision Transformers
Amartya Roy Chowdhury, Raghuram Bharadwaj Diddigi, Prabuchandran K J et al.
Bandwidth-Efficient Communication Modelling for Autonomous Vehicle Collaborative Perception
Dinghao Jin, Yuan Zeng, Yi Gong
BASED: Bundle-Adjusting Surgical Endoscopic Dynamic Video Reconstruction using Neural Radiance Fields
Shreya Saha, Zekai Liang, Shan Lin et al.
Bayesian Optimal Latent Projection for Noisy Image Restoration
Ziqiang Shi, Rujie Liu, Jun Takahashi et al.
BeautyBank: Encoding Facial Makeup in Latent Space
Qianwen Lu, Xingchao Yang, Takafumi Taketomi
@BENCH: Benchmarking Vision-Language Models for Human-Centered Assistive Technology
Xin Jiang, Junwei Zheng, Ruiping Liu et al.
Benchmarking VLMs' Reasoning About Persuasive Atypical Images
Sina Malakouti, Aysan Aghazadeh, Ashmit Khandelwal et al.
Beta Sampling is All You Need: Efficient Image Generation Strategy for Diffusion Models using Stepwise Spectral Analysis
Haeil Lee, Hansang Lee, Seoyeon Gye et al.
Beyond Boxes: Mask-Guided Spatio-Temporal Feature Aggregation for Video Object Detection
Khurram Azeem Hashmi, Talha Uddin Sheikh, Didier Stricker et al.
Beyond Grids: Exploring Elastic Input Sampling for Vision Transformers
Adam Pardyl, Grzegorz Kurzejamski, Jan Olszewski et al.