Shiwei Zhang

37 papers · 2019–2025 · 8 top CS/AI conferences

Achievements

🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (8) 🏃 Academic Marathon (6) 🗺️ Taxonomy Completionist (52)

🧭 Keyword Pioneer 🐝 Cross-Pollinator (15) 🔬 Deep Specialist (10) 👥 Mega-Team (20) 🤝 Dynamic Duo (19) 🧬 Topic Evolution 🔥 Unstoppable (5) 🗃️ Keyword Collector (168) 💎 Century Club (37) ⚡ Prolific Year (8)

Conferences

CVPR (15) ICCV (11) NIPS (4) ICLR (3) AAAI (1) ACL (1) COLING (1) ECCV (1)

Top co-authors

Yingya Zhang (19) Xiang Wang (18) Zhiwu Qing (13) Hangjie Yuan (9) Nong Sang (9) Changxin Gao (8) Deli Zhao (8) Yujie Wei (8) Mingqian Tang (7) Ziyuan Huang (6)

Keywords

diffusion model (8) action recognition (7) video generation (6) transfer learning (4) temporal modeling (4) few-shot learning (4) text-to-video generation (3) temporal context (3) self-supervised learning (3) contrastive learning (3) video understanding (3) vision-language model (2) video diffusion model (2) generative model (2) zero-shot learning (2) text-to-image generation (2) image synthesis (2) video recognition (2) prompt learning (2) video customization (2)

Papers

Enhancing Zero-shot Object Counting via Text-guided Local Ranking and Number-evoked Global Attention ICCV 2025
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model CVPR 2025
SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models ICCV 2025
Animate-X: Universal Character Image Animation with Enhanced Motion Representation ICLR 2025
FreeMask: Rethinking the Importance of Attention Masks for Zero-Shot Video Editing AAAI 2025
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion ICCV 2025
PersonalVideo: High ID-Fidelity Video Customization without Dynamic and Semantic Degradation ICCV 2025
CountSE: Soft Exemplar Open-set Object Counting ICCV 2025
DreamRelation: Relation-Centric Video Customization ICCV 2025
A Recipe for Scaling up Text-to-Video Generation with Text-free Videos CVPR 2024
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models NIPS 2024
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation CVPR 2024
Check Locate Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation CVPR 2024
InstructVideo: Instructing Video Diffusion Models with Human Feedback CVPR 2024
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion CVPR 2024
Enlarging Instance-Specific and Class-Specific Information for Open-Set Action Recognition CVPR 2023
FaceComposer: A Unified Model for Versatile Facial Content Creation NIPS 2023
The Devil is in the Wrongly-classified Samples: Towards Unified Open-set Recognition ICLR 2023
VideoComposer: Compositional Video Synthesis with Motion Controllability NIPS 2023
Space-time Prompting for Video Class-incremental Learning ICCV 2023
Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning ICCV 2023
RLIPv2: Fast Scaling of Relational Language-Image Pre-Training ICCV 2023
MoLo: Motion-Augmented Long-Short Contrastive Learning for Few-Shot Action Recognition CVPR 2023
LipFormer: High-Fidelity and Generalizable Talking Face Generation With a Pre-Learned Facial Codebook CVPR 2023
Open-World Semantic Segmentation for LIDAR Point Clouds ECCV 2022
Prompt Combines Paraphrase: Teaching Pre-trained Models to Understand Rare Biomedical Words COLING 2022
TCTrack: Temporal Contexts for Aerial Tracking CVPR 2022
Hybrid Relation Guided Set Matching for Few-Shot Action Recognition CVPR 2022
G4: Grounding-guided Goal-oriented Dialogues Generation with Multiple Documents ACL 2022
Learning From Untrimmed Videos: Self-Supervised Video Representation Learning With Hierarchical Consistency CVPR 2022
TAda! Temporally-Adaptive Convolutions for Video Understanding ICLR 2022
Learning a Condensed Frame for Memory-Efficient Video Class-Incremental Learning NIPS 2022
Support-Set Based Cross-Supervision for Video Grounding ICCV 2021
Self-Supervised Motion Learning From Static Images CVPR 2021
Self-Supervised Learning for Semi-Supervised Temporal Action Proposal CVPR 2021
OadTR: Online Action Detection With Transformers ICCV 2021
TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection CVPR 2019