Shiwei Zhang
37 papers · 2019–2025 · 8 top CS/AI conferences
Achievements
Renaissance Researcher (5)
Interdisciplinary Bridge
Conference Polyglot (8)
Academic Marathon (6)
Taxonomy Completionist (52)
Keyword Pioneer
Cross-Pollinator (15)
Deep Specialist (10)
Mega-Team (20)
Dynamic Duo (19)
Topic Evolution
Unstoppable (5)
Keyword Collector (168)
Century Club (37)
Prolific Year (8)
Conferences
CVPR (15)
ICCV (11)
NIPS (4)
ICLR (3)
AAAI (1)
ACL (1)
COLING (1)
ECCV (1)
Keywords
diffusion model (8)
action recognition (7)
video generation (6)
transfer learning (4)
temporal modeling (4)
few-shot learning (4)
text-to-video generation (3)
temporal context (3)
self-supervised learning (3)
contrastive learning (3)
video understanding (3)
vision-language model (2)
video diffusion model (2)
generative model (2)
zero-shot learning (2)
text-to-image generation (2)
image synthesis (2)
video recognition (2)
prompt learning (2)
video customization (2)
Papers
Enhancing Zero-shot Object Counting via Text-guided Local Ranking and Number-evoked Global Attention
ICCV 2025
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
CVPR 2025
SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models
ICCV 2025
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
ICLR 2025
FreeMask: Rethinking the Importance of Attention Masks for Zero-Shot Video Editing
AAAI 2025
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion
ICCV 2025
PersonalVideo: High ID-Fidelity Video Customization without Dynamic and Semantic Degradation
ICCV 2025
CountSE: Soft Exemplar Open-set Object Counting
ICCV 2025
DreamRelation: Relation-Centric Video Customization
ICCV 2025
A Recipe for Scaling up Text-to-Video Generation with Text-free Videos
CVPR 2024
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models
NIPS 2024
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation
CVPR 2024
Check Locate Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation
CVPR 2024
InstructVideo: Instructing Video Diffusion Models with Human Feedback
CVPR 2024
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
CVPR 2024
Enlarging Instance-Specific and Class-Specific Information for Open-Set Action Recognition
CVPR 2023
FaceComposer: A Unified Model for Versatile Facial Content Creation
NIPS 2023
The Devil is in the Wrongly-classified Samples: Towards Unified Open-set Recognition
ICLR 2023
VideoComposer: Compositional Video Synthesis with Motion Controllability
NIPS 2023
Space-time Prompting for Video Class-incremental Learning
ICCV 2023
Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning
ICCV 2023
RLIPv2: Fast Scaling of Relational Language-Image Pre-Training
ICCV 2023
MoLo: Motion-Augmented Long-Short Contrastive Learning for Few-Shot Action Recognition
CVPR 2023
LipFormer: High-Fidelity and Generalizable Talking Face Generation With a Pre-Learned Facial Codebook
CVPR 2023
Open-World Semantic Segmentation for LIDAR Point Clouds
ECCV 2022
Prompt Combines Paraphrase: Teaching Pre-trained Models to Understand Rare Biomedical Words
COLING 2022
TCTrack: Temporal Contexts for Aerial Tracking
CVPR 2022
Hybrid Relation Guided Set Matching for Few-Shot Action Recognition
CVPR 2022
G4: Grounding-guided Goal-oriented Dialogues Generation with Multiple Documents
ACL 2022
Learning From Untrimmed Videos: Self-Supervised Video Representation Learning With Hierarchical Consistency
CVPR 2022
TAda! Temporally-Adaptive Convolutions for Video Understanding
ICLR 2022
Learning a Condensed Frame for Memory-Efficient Video Class-Incremental Learning
NIPS 2022
Support-Set Based Cross-Supervision for Video Grounding
ICCV 2021
Self-Supervised Motion Learning From Static Images
CVPR 2021
Self-Supervised Learning for Semi-Supervised Temporal Action Proposal
CVPR 2021
OadTR: Online Action Detection With Transformers
ICCV 2021
TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection
CVPR 2019