Papers
ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention
Bencheng Liao, Xinggang Wang, Lianghui Zhu et al.
VIoTGPT: Learning to Schedule Vision Tools Towards Intelligent Video Internet of Things
Yaoyao Zhong, Mengshi Qi, Rui Wang et al.
ViPCap: Retrieval Text-Based Visual Prompts for Lightweight Image Captioning
Taewhan Kim, Soeun Lee, Si-Woo Kim et al.
ViPOcc: Leveraging Visual Priors from Vision Foundation Models for Single-View 3D Occupancy Prediction
Yi Feng, Yu Han, Xijing Zhang et al.
Virtual Nodes Can Help: Tackling Distribution Shifts in Federated Graph Learning
Xingbo Fu, Zihan Chen, Yinhan He et al.
Vision-aware Multimodal Prompt Tuning for Uploadable Multi-source Few-shot Domain Adaptation
Kuanghong Liu, Jin Wang, Kangjian He et al.
Vision-Based Generic Potential Function for Policy Alignment in Multi-Agent Reinforcement Learning
Hao Ma, Shijie Wang, Zhiqiang Pu et al.
Vision Transformers Beat WideResNets on Small Scale Datasets Adversarial Robustness
Juntao Wu, Ziyu Song, Xiaoyu Zhang et al.
VisRec: A Semi-Supervised Approach to Visibility Data Reconstruction in Radio Astronomy
Ruoqi Wang, Haitao Wang, Qiong Luo et al.
Visual Perturbation for Text-Based Person Search
Pengcheng Zhang, Xiaohan Yu, Xiao Bai et al.
Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective
Can Jin, Tianjin Huang, Yihua Zhang et al.
Visual Question Answering for Peruvian Cuisine in Regional Spanish
Mariana Risco Cosavalente
Visual Reinforcement Learning with Residual Action
Zhenxian Liu, Peixi Peng, Yonghong Tian
VLScene: Vision-Language Guidance Distillation for Camera-Based 3D Semantic Scene Completion
Meng Wang, Huilong Pi, Ruihui Li et al.
VOILA: Complexity-Aware Universal Segmentation of CT Images by Voxel Interacting with Language
Zishuo Wan, Yu Gao, Wanyuan Pang et al.
Voter Priming Campaigns: Strategies, Equilibria, and Algorithms
Jonathan Shaki, Yonatan Aumann, Sarit Kraus
Vox-UDA: Voxel-wise Unsupervised Domain Adaptation for Cryo-Electron Subtomogram Segmentation with Denoised Pseudo-Labeling
Haoran Li, Xingjian Li, Jiahua Shi et al.
VProChart: Answering Chart Question Through Visual Perception Alignment Agent and Programmatic Solution Reasoning
Muye Huang, Lingling Zhang, Han Lai et al.
VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers
Juncan Deng, Shuaiting Li, Zeyu Wang et al.
VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering
Chun-Mei Feng, Yang Bai, Tao Luo et al.
VQLTI: Long-Term Tropical Cyclone Intensity Forecasting with Physical Constraints
Xinyu Wang, Lei Liu, Kang Chen et al.
VQTalker: Towards Multilingual Talking Avatars Through Facial Motion Tokenization
Tao Liu, Ziyang Ma, Qi Chen et al.
VRVVC: Variable-Rate NeRF-Based Volumetric Video Compression
Qiang Hu, Houqiang Zhong, Zihan Zheng et al.
VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding
Yongxin Guo, Jingyu Liu, Mingda Li et al.