Zheng Zhu
49 papers · 2018–2025 · 8 top CS/AI conferences
Achievements
Interdisciplinary Bridge
Academic Marathon (7)
Conference Polyglot (8)
Renaissance Researcher (7)
Taxonomy Completionist (73)
Keyword Pioneer
Hot Topic Early Bird
Conference Loyalist (21)
Deep Specialist (11)
Dynamic Duo (22)
Keyword Champion (2)
Century Club (49)
Unstoppable (8)
Keyword Collector (193)
The Questioner
Prolific Year (5)
Conference Pioneer
Conferences
CVPR (21)
ICCV (10)
AAAI (7)
ECCV (6)
NIPS (2)
CORL (1)
ICLR (1)
IJCAI (1)
Keywords
autonomous driving (9)
semantic segmentation (5)
vision transformer (5)
3d vision (4)
neural network (4)
video generation (4)
convolutional neural network (4)
world model (4)
self-supervised learning (4)
diffusion model (4)
3d object detection (4)
attention mechanism (4)
object tracking (3)
knowledge distillation (3)
contrastive learning (3)
object detection (2)
scene reconstruction (2)
metric learning (2)
human pose estimation (2)
scene representation (2)
Papers
JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems
CVPR 2025
DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation
CVPR 2025
ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation
ICCV 2025
WonderTurbo: Generating Interactive 3D World in 0.72 Seconds
ICCV 2025
DetRF: Detachable Novel Views Synthesis of Dynamic Scenes Using Backdrop-Driven Neural Radiance Fields
AAAI 2025
DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation
AAAI 2025
HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation
CVPR 2025
ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration
CVPR 2025
DiffusionDepth: Diffusion Denoising Approach for Monocular Depth Estimation
ECCV 2024
DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving
CVPR 2024
OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models
ECCV 2024
Unified Single-Stage Transformer Network for Efficient RGB-T Tracking
IJCAI 2024
DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving
ECCV 2024
One at a Time: Progressive Multi-Step Volumetric Probability Learning for Reliable 3D Scene Perception
AAAI 2024
DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception
AAAI 2024
CompletionFormer: Depth Completion With Convolutions and Vision Transformers
CVPR 2023
Crafting Monocular Cues and Velocity Guidance for Self-Supervised Multi-Frame Depth Learning
AAAI 2023
A Simple Baseline for Multi-Camera 3D Object Detection
AAAI 2023
Are We Ready for Vision-Centric Driving Streaming Perception? The ASAP Benchmark
CVPR 2023
DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation
CVPR 2023
OPERA: Omni-Supervised Representation Learning with Hierarchical Supervisions
ICCV 2023
OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction
ICCV 2023
Token-Label Alignment for Vision Transformers
ICCV 2023
DREAM: Efficient Dataset Distillation by Representative Matching
ICCV 2023
DyGait: Exploiting Dynamic Representations for High-performance Gait Recognition
ICCV 2023
OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception
ICCV 2023
SurroundOcc: Multi-camera 3D Occupancy Prediction for Autonomous Driving
ICCV 2023
Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors
ICLR 2023
Decoupled Multi-Task Learning With Cyclical Self-Regulation for Face Parsing
CVPR 2022
Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis
ECCV 2022
MVSTER: Epipolar Transformer for Efficient Multi-View Stereo
ECCV 2022
OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression
NIPS 2022
Dimension Embeddings for Monocular 3D Object Detection
CVPR 2022
Crafting Better Contrastive Views for Siamese Representation Learning
CVPR 2022
DenseCLIP: Language-Guided Dense Prediction With Context-Aware Prompting
CVPR 2022
An Efficient Training Approach for Very Large Scale Face Recognition
CVPR 2022
Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search
CVPR 2022
SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation
CORL 2022
CAFE: Learning To Condense Dataset by Aligning Features
CVPR 2022
Structure-Aware Face Clustering on a Large-Scale Graph With 10^7 Nodes
CVPR 2021
Global Filter Networks for Image Classification
NIPS 2021
WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition
CVPR 2021
Gait Recognition in the Wild: A Benchmark
ICCV 2021
SIMPLE: SIngle-network with Mimicking and Point Learning for Bottom-up Human Pose Estimation
AAAI 2021
The Devil Is in the Details: Delving Into Unbiased Data Processing for Human Pose Estimation
CVPR 2020
Attention-Guided Unified Network for Panoptic Segmentation
CVPR 2019
Distractor-aware Siamese Networks for Visual Object Tracking
ECCV 2018
High Performance Visual Tracking With Siamese Region Proposal Network
CVPR 2018
End-to-End Flow Correlation Tracking With Spatial-Temporal Attention
CVPR 2018