Papers
OmniVTON: Training-Free Universal Virtual Try-On
Zhaotong Yang, Yuhui Li, Shengfeng He et al.
On-Device Diffusion Transformer Policy for Efficient Robot Manipulation
Yiming Wu, Huan Wang, Zhenghao Chen et al.
One Encoder to Rule them All: Representation Learning for Model-free Visual Reinforcement Learning using Fourier Neural Operators
Parag Dutta, Mohd Ayyoob, Shalabh Bhatnagar et al.
OneGT: One-Shot Geometry-Texture Neural Rendering for Head Avatars
Jinshu Chen, Bingchuan Li, Fan Zhang et al.
One Last Attention for Your Vision-Language Model
Liang Chen, Ghazi Shazan Ahmad, Tianjun Yao et al.
One Look is Enough: Seamless Patchwise Refinement for Zero-Shot Monocular Depth Estimation on High-Resolution Images
Byeongjun Kwon, Munchurl Kim
One Object, Multiple Lies: A Benchmark for Cross-task Adversarial Attack on Unified Vision-Language Models
Jiale Zhao, Xinyang Jiang, Junyao Gao et al.
One Perturbation is Enough: On Generating Universal Adversarial Perturbations against Vision-Language Pre-training Models
Hao Fang, Jiawei Kong, Wenbo Yu et al.
One Polyp Identifies All: One-Shot Polyp Segmentation with SAM via Cascaded Priors and Iterative Prompt Evolution
Xinyu Mao, Xiaohan Xing, Fei Meng et al.
One-Shot Knowledge Transfer for Scalable Person Re-Identification
Longhua Li, Lei Qi, Xin Geng
One-Step Specular Highlight Removal with Adapted Diffusion Models
Mahir Atmis, Levent Karacan, Mehmet Sarıgül
One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory
Chenhao Zheng, Jieyu Zhang, Mohammadreza Salehi et al.
On Large Multimodal Models as Open-World Image Classifiers
Alessandro Conti, Massimiliano Mancini, Enrico Fini et al.
Online Dense Point Tracking with Streaming Memory
Qiaole Dong, Yanwei Fu
Online Generic Event Boundary Detection
Hyungrok Jung, Daneul Kim, Seunggyun Lim et al.
Online Language Splatting
Saimouli Katragadda, Cho-Ying Wu, Yuliang Guo et al.
Online Reasoning Video Segmentation with Just-in-Time Digital Twins
Yiqing Shen, Bohan Liu, Chenjia Li et al.
ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models
Zifu Wan, Ce Zhang, Silong Yong et al.
On the Complexity-Faithfulness Trade-off of Gradient-Based Explanations
Amir Mehrpanah, Matteo Gamba, Kevin Smith et al.
On the Generalization of Representation Uncertainty in Earth Observation
Spyros Kondylatos, Nikolaos Ioannis Bountos, Dimitrios Michail et al.
On the Provable Importance of Gradients for Autonomous Language-Assisted Image Clustering
Bo Peng, Jie Lu, Guangquan Zhang et al.
On the Recovery of Cameras from Fundamental Matrices
Rakshith Madhavan, Federica Arrigoni
On the Robustness Tradeoff in Fine-Tuning
Kunyang Li, Jean-Charles Noirot Ferrand, Ryan Sheatsley et al.
OpenAnimals: Revisiting Person Re-Identification for Animals Towards Better Generalization
Saihui Hou, Panjian Huang, Zengbin Wang et al.
Open-ended Hierarchical Streaming Video Understanding with Vision Language Models
Hyolim Kang, Yunsu Park, Youngbeom Yoo et al.