Papers
G2D: From Global to Dense Radiography Representation Learning via Vision-Language Pre-training
NeurIPS 2024
Learning to Count without Annotations
CVPR 2024
Vision Transformer Segmentation for Visual Bird Sound Denoising
INTERSPEECH 2024
Multimodal Segmentation for Vocal Tract Modeling
INTERSPEECH 2024
COCONut: Modernizing COCO Segmentation
CVPR 2024