Papers
Advancing High-Resolution Video-Language Representation With Large-Scale Video Transcriptions
CVPR 2022
Voxel-informed Language Grounding
ACL 2022
Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models
AAAI 2022
Visual Definition Modeling: Challenging Vision & Language Models to Define Words and Objects
AAAI 2022
3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds
CVPR 2022
TRIPS: Efficient Vision-and-Language Pre-training with Text-Relevant Image Patch Selection
EMNLP 2022