Luowei Zhou
18 papers · 2018–2025 · 7 conferences · across top CS/AI conferences
Achievements
Interdisciplinary Bridge
Renaissance Researcher
(6)
Conference Polyglot
(7)
Academic Marathon
(7)
Taxonomy Completionist
(40)
Keyword Pioneer
Hot Topic Early Bird
The Namer
Topic Evolution
Trend Setter
Prolific Year
(8)
Unstoppable
(8)
Keyword Collector
(85)
Century Club
(18)
Conferences
CVPR (9)
NIPS (3)
ECCV (2)
AAAI (1)
ACL (1)
ICCV (1)
IJCNLP (1)
Keywords
video understanding
(5)
image captioning
(4)
contrastive learning
(4)
multimodal learning
(4)
vision-language pretraining
(3)
video captioning
(3)
video question answering
(3)
transfer learning
(2)
multi-modal learning
(2)
visual question answering
(2)
vision-language model
(2)
image-text retrieval
(2)
visual grounding
(2)
zero-shot learning
(2)
foundation model
(2)
large language model
(2)
self-supervised learning
(1)
video classification
(1)
dataset creation
(1)
image generation
(1)
Papers
ADIEE: Automatic Dataset Creation and Scorer for Instruction-Guided Image Editing Evaluation
ICCV 2025
AssistGUI: Task-Oriented PC Graphical User Interface Automation
CVPR 2024
MIST: Multi-Modal Iterative Spatial-Temporal Transformer for Long-Form Video Question Answering
CVPR 2023
OmniVL: One Foundation Model for Image-Language and Video-Language Tasks
NIPS 2022
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
NIPS 2022
Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning
NIPS 2022
RegionCLIP: Region-Based Language-Image Pretraining
CVPR 2022
BEVT: BERT Pretraining of Video Transformers
CVPR 2022
CLIP-Event: Connecting Text and Images With Event Structures
CVPR 2022
DNA: Improving Few-Shot Transfer Learning with Low-Rank Decomposition and Alignment
ECCV 2022
Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training
ECCV 2022
UC2: Universal Cross-Lingual Cross-Modal Vision-and-Language Pre-Training
CVPR 2021
Cluster-Former: Clustering-based Sparse Transformer for Question Answering
IJCNLP 2021
Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
CVPR 2021
Cluster-Former: Clustering-based Sparse Transformer for Question Answering
ACL 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
AAAI 2020
Grounded Video Description
CVPR 2019
End-to-End Dense Video Captioning With Masked Transformer
CVPR 2018