Papers
Search and Detect: Training-Free Long Tail Object Detection via Web-Image Retrieval
Mankeerat Sidhu, Hetarth Chopra, Ansel Blume et al.
SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks
Shining Wang, Yunlong Wang, Ruiqi Wu et al.
Secret Lies in Color: Enhancing AI-Generated Images Detection with Color Distribution Analysis
Zexi Jia, Chuanwei Huang, Yeshuang Zhu et al.
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration
Jianyi Wang, Zhijie Lin, Meng Wei et al.
See Further When Clear: Curriculum Consistency Model
Yunpeng Liu, Boxiao Liu, Yi Zhang et al.
SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
Rong Li, Shijie Li, Lingdong Kong et al.
Seeing A 3D World in A Grain of Sand
Yufan Zhang, Yu Ji, Yu Guo et al.
Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding
Feilong Tang, Chengzhi Liu, Zhongxing Xu et al.
Seeing More with Less: Human-like Representations in Vision Models
Andrey Gizdov, Shimon Ullman, Daniel Harari
Seeing Speech and Sound: Distinguishing and Locating Audio Sources in Visual Scenes
Hyeonggon Ryu, Seongyu Kim, Joon Son Chung et al.
Seeing the Abstract: Translating the Abstract Language for Vision Language Models
Davide Talon, Federico Girella, Ziyue Liu et al.
Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection
Gensheng Pei, Tao Chen, Yujia Wang et al.
Seek Common Ground While Reserving Differences: Semi-Supervised Image-Text Sentiment Recognition
Wuyou Xia, Guoli Jia, Sicheng Zhao et al.
Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes
Aodi Li, Liansheng Zhuang, Xiao Long et al.
SEEN-DA: SEmantic ENtropy guided Domain-aware Attention for Domain Adaptive Object Detection
Haochen Li, Rui Zhang, Hantao Yao et al.
SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
Muzhi Zhu, Yuzhuo Tian, Hao Chen et al.
SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images
Kaiyu Li, Ruixun Liu, Xiangyong Cao et al.
SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation
Yunxiang Fu, Meng Lou, Yizhou Yu
Segment Any Motion in Videos
Nan Huang, Wenzhao Zheng, Chenfeng Xu et al.
Segment Any-Quality Images with Generative Latent Space Enhancement
Guangqian Guo, Yong Guo, Xuehui Yu et al.
Segment Anything, Even Occluded
Wei-En Tai, Yu-Lin Shih, Cheng Sun et al.
Segmenting Maxillofacial Structures in CBCT Volumes
Federico Bolelli, Kevin Marchesini, Niels van Nistelrooij et al.
Segment This Thing: Foveated Tokenization for Efficient Point-Prompted Segmentation
Tanner Schmidt, Richard Newcombe