Papers
SnowMaster: Comprehensive Real-world Image Desnowing via MLLM with Multi-Model Feedback Optimization
Jianyu Lai, Sixiang Chen, Yunlong Lin et al.
SOAP: Vision-Centric 3D Semantic Scene Completion with Scene-Adaptive Decoder and Occluded Region-Aware View Projection
Hyo-Jun Lee, Yeong Jun Koh, Hanul Kim et al.
SocialGesture: Delving into Multi-person Gesture Understanding
Xu Cao, Pranav Virupaksha, Wenqi Jia et al.
SocialMOIF: Multi-Order Intention Fusion for Pedestrian Trajectory Prediction
Kai Chen, Xiaodong Zhao, Yujie Huang et al.
Soft Self-labeling and Potts Relaxations for Weakly-supervised Segmentation
Zhongwen Zhang, Yuri Boykov
SoftShadow: Leveraging Soft Masks for Penumbra-Aware Shadow Removal
Xinrui Wang, Lanqing Guo, Xiyu Wang et al.
SoftVQ-VAE: Efficient 1-Dimensional Continuous Tokenizer
Hao Chen, Ze Wang, Xiang Li et al.
SOGS: Second-Order Anchor for Advanced 3D Gaussian Splatting
Jiahui Zhang, Fangneng Zhan, Ling Shao et al.
SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters
Jianping Jiang, Weiye Xiao, Zhengyu Lin et al.
SOLVE: Synergy of Language-Vision and End-to-End Networks for Autonomous Driving
Xuesong Chen, Linjiang Huang, Tao Ma et al.
Solving Instance Detection from an Open-World Perspective
Qianqian Shen, Yunhan Zhao, Nahyun Kwon et al.
SoMA: Singular Value Decomposed Minor Components Adaptation for Domain Generalizable Representation Learning
Seokju Yun, Seunghye Chae, Dongheon Lee et al.
Sonata: Self-Supervised Learning of Reliable Point Representations
Xiaoyang Wu, Daniel DeTone, Duncan Frost et al.
Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Xiaozhong Ji, Xiaobin Hu, Zhihong Xu et al.
Sound Bridge: Associating Egocentric and Exocentric Videos via Audio Cues
Sihong Huang, Jiaxin Wu, Xiaoyong Wei et al.
SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding
Mingfei Chen, Israel D. Gebru, Ishwarya Ananthabhotla et al.
SP3D: Boosting Sparsely-Supervised 3D Object Detection via Accurate Cross-Modal Semantic Prompts
Shijia Zhao, Qiming Xia, Xusheng Guo et al.
SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images
Zixuan Huang, Mark Boss, Aaryaman Vasishta et al.
SPARC: Score Prompting and Adaptive Fusion for Zero-Shot Multi-Label Recognition in Vision-Language Models
Kevin Miller, Aditya Gangrade, Samarth Mishra et al.
SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction
Yutao Tang, Yuxiang Guo, Deming Li et al.
Sparse2DGS: Geometry-Prioritized Gaussian Splatting for Surface Reconstruction from Sparse Views
Jiang Wu, Rui Li, Yu Zhu et al.
SparseAlign: a Fully Sparse Framework for Cooperative Object Detection
Yunshuang Yuan, Yan Xia, Daniel Cremers et al.
Sparse Point Cloud Patches Rendering via Splitting 2D Gaussians
Changfeng Ma, Ran Bi, Jie Guo et al.
Sparse Voxels Rasterization: Real-time High-fidelity Radiance Field Rendering
Cheng Sun, Jaesung Choe, Charles Loop et al.
Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Mutimodal Models
Xingrui Wang, Wufei Ma, Tiezheng Zhang et al.