Papers
AuthGuard: Generalizable Deepfake Detection via Language Guidance
Guangyu Shen, Zhihua Li, Xiang Xu et al.
Autocorrelation-based Fiducial Markers for Traceability
Ismail Bencheikh, Max Dunitz, Marie d'Autume et al.
Automated Pore Detection from In-Situ FDM 3D Printing Video: A Comparative Evaluation of Modern Segmentation Models
Abdullah Al Ahad Khan, Md Shariful Islam, Lin Li et al.
Automated Suturing Skill Assessment in Robot-assisted Surgery from Endoscopic Videos using Clinically-guided Evaluation Criteria
Atharva Sunil Deo, Ujjwal Pasupulety, Nicholas Matsumoto et al.
Autoregressive Styled Text Image Generation, but Make it Reliable
Carmine Zaccagnino, Fabio Quattrini, Vittorio Pippi et al.
AutoSew: A Geometric Approach to Stitching Prediction with Graph Neural Networks
Pablo Ríos-Navarro, Elena Garces, Jorge Lopez-Moreno
AuViRe: Audio-visual Speech Representation Reconstruction for Deepfake Temporal Localization
Christos Koutlis, Symeon Papadopoulos
A-V Representation Learning via Audio Shift Prediction for Multimodal Deepfake Detection and Temporal Localization
Ashutosh Anshul, Eng Siong Chng, Deepu Rajan
A Woman with a Knife or A Knife with a Woman? Measuring Directional Bias Amplification in Image Captions
Rahul Nair, Bhanu Tokas, Hannah Kerner
BAFIS: Dataset + Framework to Assess Occupational Bias and Human Preference in Modern Text-to-image Models
Thomas Klassert, Adrian Ulges, Biying Fu
BAFLE-DCT: Bypassing Adversarial Filters via Frequency-Selective Embedding in the DCT Domain
Thilina Mendis, Farah Kandah, Sathyanarayanan N. Aakur
BanglaProtha: Evaluating Vision Language Models in Underrepresented Long-tail Cultural Contexts
Md Fahim, Md Sakib Ul Rahman, Akm Moshiur Rahman et al.
Being Positive about Negative Queries: Exclusion Aware Multimodal Retrieval using Disentangled Representations
Prachi Jha, Sumit Bhatia, Srikanta Bedathur
Better Safe Than Sorry? Overreaction Problem of Vision Language Models in Visual Emergency Recognition
Dasol Choi, Seunghyun Lee, Youngsook Song
Beyond Faces: A Multimodal Person Clustering for Unconstrained Environments
Sahngmin Yoo, Sangwon Lee, Seongin Jo
Beyond Paired Data: Self-Supervised UAV Geo-Localization from Reference Imagery Alone
Tristan Amadei, Enric Meinhardt-Llopis, Benedicte Bascle et al.
Beyond Realism: Learning the Art of Expressive Composition with StickerNet
Haoming Lu, David Kocharian, Humphrey Shi
Beyond Real Weights: Hypercomplex Representations for Stable Quantization
Jawad Ibn Ahad, Maisha Rahman, Amrijit Biswas et al.
Beyond the Encoder: Joint Encoder-Decoder Contrastive Pre-Training Improves Dense Prediction
Sébastien Quetin, Tapotosh Ghosh, Farhad Maleki
Beyond the Highlights: Video Retrieval with Salient and Surrounding Contexts
Jaehun Bang, Moon Ye-Bin, Tae-Hyun Oh et al.
Bi-ICE: An Inner Interpretable Framework for Image Classification via Bi-directional Interactions between Concept and Input Embeddings
Jinyung Hong, Yearim Kim, Keun Hee Park et al.
BiNAR: A Bi-Modal Framework for Non-Aligned RGB-IR 3D Reconstruction via Gaussian Splatting
Zhongwen Wang, Han Ling, Weihao Zhang et al.
BiPO: Bidirectional Partial Occlusion Network for Text-to-Motion Synthesis
Seong-Eun Hong, SooBin Lim, JuYeong Hwang et al.
BlendCLIP: Bridging Synthetic and Real Domains for Zero-Shot 3D Object Classification with Multimodal Pretraining
Ajinkya Khoche, Gergő László Nagy, Maciej Wozniak et al.
Blur2Sharp: Human Novel Pose and View Synthesis with Generative Prior Refinement
Chia-Hern Lai, I-Hsuan Lo, Yen-Ku Yeh et al.