Papers
VIP5: Towards Multimodal Foundation Models for Recommendation
Shijie Geng, Juntao Tan, Shuchang Liu et al.
ViPE: Visualise Pretty-much Everything
Hassan Shahmohammadi, Adhiraj Ghosh, Hendrik Lensch
VIPHY: Probing “Visible” Physical Commonsense Knowledge
Shikhar Singh, Ehsan Qasemi, Muhao Chen
Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning
Hao Wang, Xiahua Chen, Rui Wang et al.
VISIT: Visualizing and Interpreting the Semantic Information Flow of Transformers
Shahar Katz, Yonatan Belinkov
ViSoBERT: A Pre-Trained Language Model for Vietnamese Social Media Text Processing
Nam Nguyen, Thang Phan, Duc-Vu Nguyen et al.
VIST5: An Adaptive, Retrieval-Augmented Language Model for Visualization-oriented Dialog
Henrik Voigt, Nuno Carvalhais, Monique Meuschke et al.
VISTA: Visual-Textual Knowledge Graph Representation Learning
Jaejun Lee, Chanyoung Chung, Hochang Lee et al.
ViStruct: Visual Structural Knowledge Extraction via Curriculum Guided Code-Vision Representation
Yangyi Chen, Xingyao Wang, Manling Li et al.
Visually Grounded Continual Language Learning with Selective Specialization
Kyra Ahrens, Lennart Bengtson, Jae Hee Lee et al.
Visually-Situated Natural Language Understanding with Contrastive Reading Model and Frozen Large Language Models
Geewook Kim, Hodong Lee, Daehee Kim et al.
Visual Prediction Improves Zero-Shot Cross-Modal Machine Translation
Tosho Hirasawa, Emanuele Bugliarello, Desmond Elliott et al.
Visual Storytelling with Question-Answer Plans
Danyang Liu, Mirella Lapata, Frank Keller
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer
Huadai Liu, Rongjie Huang, Xuan Lin et al.
VivesDebate-Speech: A Corpus of Spoken Argumentation to Leverage Audio Features for Argument Mining
Ramon Ruiz-Dolz, Javier Iranzo-Sánchez
VKIE: The Application of Key Information Extraction on Video Text
Siyu An, Ye Liu, Haoyuan Peng et al.
VLIS: Unimodal Language Models Guide Multimodal Language Generation
Jiwan Chung, Youngjae Yu
VoxArabica: A Robust Dialect-Aware Arabic Speech Recognition System
Abdul Waheed, Bashar Talafha, Peter Sullivan et al.
Walking a Tightrope – Evaluating Large Language Models in High-Risk Domains
Chia-Chien Hung, Wiem Ben Rim, Lindsay Frost et al.
Watermarking LLMs with Weight Quantization
Linyang Li, Botian Jiang, Pengyu Wang et al.
Watermarking PLMs on Classification Tasks by Combining Contrastive Learning with Weight Perturbation
Chenxi Gu, Xiaoqing Zheng, Jianhan Xu et al.
Weakly-supervised Deep Cognate Detection Framework for Low-Resourced Languages Using Morphological Knowledge of Closely-Related Languages
Koustava Goswami, Priya Rani, Theodorus Fransen et al.
Weakly-Supervised Learning of Visual Relations in Multimodal Pretraining
Emanuele Bugliarello, Aida Nematzadeh, Lisa Hendricks
Weakly Supervised Semantic Parsing with Execution-based Spurious Program Filtering
Kang-il Lee, Segwang Kim, Kyomin Jung