Papers
VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models
Yabo Zhang, Yuxiang Wei, Xianhui Lin et al.
Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark
Yongliang Wu, Wenbo Zhu, Jiawang Cao et al.
Video Summarization Using Denoising Diffusion Probabilistic Model
Zirui Shang, Yubo Zhu, Hongxi Li et al.
VidEvent: A Large Dataset for Understanding Dynamic Evolution of Events in Videos
Baoyu Liang, Qile Su, Shoutai Zhu et al.
VidSole: A Multimodal Dataset for Joint Kinetics Quantification and Disease Detection with Deep Learning
Archit Kambhamettu, Samantha Snyder, Maliheh Fakhar et al.
Vietnamese Words Are Not Constructed from Syllables: Rethinking the Role of Word Segmentation in Natural Language Processing for Vietnamese Texts
Nghia Hieu Nguyen, Dat Tien Nguyen, Ngan Luu-Thuy Nguyen
View Transformation Robustness for Multi-View 3D Object Reconstruction with Reconstruction Error-Guided View Selection
Qi Zhang, Zhouhang Luo, Tao Yu et al.
ViFactCheck: A New Benchmark Dataset and Methods for Multi-Domain News Fact-Checking in Vietnamese
Tran Thai Hoa, Tran Quang Duy, Khanh Quoc Tran et al.
ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention
Bencheng Liao, Xinggang Wang, Lianghui Zhu et al.
VIoTGPT: Learning to Schedule Vision Tools Towards Intelligent Video Internet of Things
Yaoyao Zhong, Mengshi Qi, Rui Wang et al.
ViPCap: Retrieval Text-Based Visual Prompts for Lightweight Image Captioning
Taewhan Kim, Soeun Lee, Si-Woo Kim et al.
ViPOcc: Leveraging Visual Priors from Vision Foundation Models for Single-View 3D Occupancy Prediction
Yi Feng, Yu Han, Xijing Zhang et al.
Virtual Nodes Can Help: Tackling Distribution Shifts in Federated Graph Learning
Xingbo Fu, Zihan Chen, Yinhan He et al.
Vision-aware Multimodal Prompt Tuning for Uploadable Multi-source Few-shot Domain Adaptation
Kuanghong Liu, Jin Wang, Kangjian He et al.
Vision-Based Generic Potential Function for Policy Alignment in Multi-Agent Reinforcement Learning
Hao Ma, Shijie Wang, Zhiqiang Pu et al.
Vision Transformers Beat WideResNets on Small Scale Datasets Adversarial Robustness
Juntao Wu, Ziyu Song, Xiaoyu Zhang et al.
VisRec: A Semi-Supervised Approach to Visibility Data Reconstruction in Radio Astronomy
Ruoqi Wang, Haitao Wang, Qiong Luo et al.
Visual Perturbation for Text-Based Person Search
Pengcheng Zhang, Xiaohan Yu, Xiao Bai et al.
Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective
Can Jin, Tianjin Huang, Yihua Zhang et al.
Visual Question Answering for Peruvian Cuisine in Regional Spanish
Mariana Risco Cosavalente
Visual Reinforcement Learning with Residual Action
Zhenxian Liu, Peixi Peng, Yonghong Tian
VLScene: Vision-Language Guidance Distillation for Camera-Based 3D Semantic Scene Completion
Meng Wang, Huilong Pi, Ruihui Li et al.
VOILA: Complexity-Aware Universal Segmentation of CT Images by Voxel Interacting with Language
Zishuo Wan, Yu Gao, Wanyuan Pang et al.
Voter Priming Campaigns: Strategies, Equilibria, and Algorithms
Jonathan Shaki, Yonatan Aumann, Sarit Kraus