Papers
Vectorized Conditional Neural Fields: A Framework for Solving Time-dependent Parametric Partial Differential Equations
Jan Hagnberger, Marimuthu Kalimuthu, Daniel Musekamp et al.
Vector Quantization Pretraining for EEG Time Series with Random Projection and Phase Alignment
Haokun Gui, Xiucheng Li, Xinyang Chen
Verification of Machine Unlearning is Fragile
Binchi Zhang, Zihan Chen, Cong Shen et al.
Verifying message-passing neural networks via topology-based bounds tightening
Christopher Hojny, Shiqiang Zhang, Juan S Campos et al.
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
Yang Jin, Zhicheng Sun, Kun Xu et al.
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition
Hao Fei, Shengqiong Wu, Wei Ji et al.
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Dan Kondratyuk, Lijun Yu, Xiuye Gu et al.
VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao, Nitesh Bharadwaj Gundavarapu, Liangzhe Yuan et al.
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
Guangzhi Sun, Wenyi Yu, Changli Tang et al.
Viewing Transformers Through the Lens of Long Convolutions Layers
Itamar Zimerman, Lior Wolf
VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception
Zhaoliang Wan, Yonggen Ling, Senlin Yi et al.
ViP: A Differentially Private Foundation Model for Computer Vision
Yaodong Yu, Maziar Sanjabi, Yi Ma et al.
VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context
Yunxin Li, Baotian Hu, Haoyuan Shi et al.
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Lianghui Zhu, Bencheng Liao, Qian Zhang et al.
Vision Transformers as Probabilistic Expansion from Learngene
Qiufeng Wang, Xu Yang, Haokun Chen et al.
Visual Representation Learning with Stochastic Frame Prediction
Huiwon Jang, Dongyoung Kim, Junsu Kim et al.
Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models
Jinhao Li, Haopeng Li, Sarah Monazam Erfani et al.
Visual Transformer with Differentiable Channel Selection: An Information Bottleneck Inspired Approach
Yancheng Wang, Ping Li, Yingzhen Yang
VNN: Verification-Friendly Neural Networks with Hard Robustness Guarantees
Anahita Baninajjar, Ahmed Rezine, Amir Aminifar
VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model
Pengying Wu, Yao Mu, Bingxian Wu et al.
VQDNA: Unleashing the Power of Vector Quantization for Multi-Species Genomic Sequence Modeling
Siyuan Li, Zedong Wang, Zicheng Liu et al.
WARM: On the Benefits of Weight Averaged Reward Models
Alexandre Rame, Nino Vieillard, Leonard Hussenot et al.
Wasserstein Wormhole: Scalable Optimal Transport Distance with Transformer
Doron Haviv, Russell Zhang Kunes, Thomas Dougherty et al.
Watermarks in the Sand: Impossibility of Strong Watermarking for Language Models
Hanlin Zhang, Benjamin L. Edelman, Danilo Francati et al.