Papers
One-step Diffusion with Distribution Matching Distillation
Tianwei Yin, Michaël Gharbi, Richard Zhang et al.
OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning
Lingyi Hong, Shilin Yan, Renrui Zhang et al.
On Exact Inversion of DPM-Solvers
Seongmin Hong, Kyeonghyun Lee, Suh Yoon Jeon et al.
Online Task-Free Continual Generative and Discriminative Learning via Dynamic Cluster Memory
Fei Ye, Adrian G. Bors
On Scaling Up a Multilingual Vision and Language Model
Xi Chen, Josip Djolonga, Piotr Padlewski et al.
On the Content Bias in Frechet Video Distance
Songwei Ge, Aniruddha Mahapatra, Gaurav Parmar et al.
On the Diversity and Realism of Distilled Dataset: An Efficient Dataset Distillation Paradigm
Peng Sun, Bei Shi, Daiwei Yu et al.
On the Estimation of Image-matching Uncertainty in Visual Place Recognition
Mubariz Zaffar, Liangliang Nan, Julian F. P. Kooij
On the Faithfulness of Vision Transformer Explanations
Junyi Wu, Weitai Kang, Hao Tang et al.
On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving
Kaituo Feng, Changsheng Li, Dongchun Ren et al.
On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation
Agneet Chatterjee, Tejas Gokhale, Chitta Baral et al.
On the Robustness of Large Multimodal Models Against Image Adversarial Attacks
Xuanming Cui, Alejandro Aparcedo, Young Kyun Jang et al.
On the Scalability of Diffusion-based Text-to-Image Generation
Hao Li, Yang Zou, Ying Wang et al.
On the Test-Time Zero-Shot Generalization of Vision-Language Models: Do We Really Need Prompt Learning?
Maxime Zanella, Ismail Ben Ayed
On Train-Test Class Overlap and Detection for Image Retrieval
Chull Hwan Song, Jooyoung Yoon, Taebaek Hwang et al.
OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising
Haichao Zhang, Yi Xu, Hongsheng Lu et al.
Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance
Phuc Nguyen, Tuan Duc Ngo, Evangelos Kalogerakis et al.
Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships
Sebastian Koch, Narunas Vaskevicius, Mirco Colosi et al.
OpenBias: Open-set Bias Detection in Text-to-Image Generative Models
Moreno D'Incà, Elia Peruzzo, Massimiliano Mancini et al.
OpenEQA: Embodied Question Answering in the Era of Foundation Models
Arjun Majumdar, Anurag Ajay, Xiaohan Zhang et al.
OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies
Lingdong Kong, Youquan Liu, Lai Xing Ng et al.
Open-Set Domain Adaptation for Semantic Segmentation
Seun-An Choe, Ah-Hyung Shin, Keon-Hee Park et al.
OpenStreetView-5M: The Many Roads to Global Visual Geolocation
Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis et al.
Open-Vocabulary 3D Semantic Segmentation with Foundation Models
Li Jiang, Shaoshuai Shi, Bernt Schiele
Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models
Pablo Marcos-Manchón, Roberto Alcover-Couso, Juan C. SanMiguel et al.