Hang Zhou
68 papers · 2019–2026 · 12 top CS/AI conferences
Achievements
Conference Polyglot (12)
Academic Marathon (6)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (10)
Renaissance Researcher (10)
Taxonomy Completionist (93)
Topic Evolution
Dynamic Duo (16)
Keyword Champion
Triple Crown
Deep Specialist (12)
Grand Slam
Conference Pioneer
Unstoppable (7)
Keyword Collector (272)
Century Club (66)
Prolific Year (15)
Conferences
CVPR (16)
AAAI (14)
ECCV (9)
NIPS (6)
ICCV (5)
ICML (5)
ICLR (4)
WACV (3)
ACL (2)
INTERSPEECH (2)
EMNLP (1)
IJCAI (1)
Keywords
video generation (6)
adversarial attack (5)
audio-visual learning (5)
lip synchronization (4)
diffusion model (4)
point cloud (4)
multi-modal learning (3)
cross-modal learning (3)
facial animation (3)
gesture generation (3)
co-speech gesture (3)
generative model (3)
contrastive learning (3)
audio-visual representation (3)
audio-driven generation (3)
video anomaly detection (2)
speech enhancement (2)
image synthesis (2)
remote sensing (2)
multimodal learning (2)
Papers
GRAM-R²: Self-Training Generative Foundation Reward Models for Reward Reasoning
AAAI 2026
Inter-Client Dependency Recovery with Hidden Global Components for Federated Traffic Prediction
AAAI 2026
Video Anomaly Detection with Motion and Appearance Guided Patch Diffusion Model
AAAI 2025
Transolver++: An Accurate Neural Solver for PDEs on Million-Scale Geometries
ICML 2025
GA-S3: Comprehensive Social Network Simulation with Group Agents
ACL 2025
Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer
CVPR 2025
Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model
CVPR 2025
Unisolver: PDE-Conditional Transformers Towards Universal Neural PDE Solvers
ICML 2025
CASAGPT: Cuboid Arrangement and Scene Assembly for Interior Design
CVPR 2025
BOOTPLACE: Bootstrapped Object Placement with Detection Transformers
CVPR 2025
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation
ICLR 2025
GestureHYDRA: Semantic Co-speech Gesture Synthesis via Hybrid Modality Diffusion Transformer and Cascaded-Synchronized Retrieval-Augmented Generation
ICCV 2025
Step-level Verifier-guided Hybrid Test-Time Scaling for Large Language Models
EMNLP 2025
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
CVPR 2025
LLM Data Selection and Utilization via Dynamic Bi-level Optimization
ICML 2025
Forest2Seq: Revitalizing Order Prior for Sequential Indoor Scene Synthesis
ECCV 2024
Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning
NIPS 2024
ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling
NIPS 2024
EGODE: An Event-attended Graph ODE Framework for Modeling Rigid Dynamics
NIPS 2024
Coupled Mamba: Enhanced Multimodal Fusion with Coupled State Space Model
NIPS 2024
Attacking Transformers with Feature Diversity Adversarial Perturbation
AAAI 2024
Dynamic Feature Pruning and Consolidation for Occluded Person Re-identification
AAAI 2024
Progressive Text-to-Image Diffusion with Soft Latent Direction
AAAI 2024
ESRL: Efficient Sampling-Based Reinforcement Learning for Sequence Generation
AAAI 2024
Hybrid Alignment Training for Large Language Models
ACL 2024
ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer
ECCV 2024
Let the Avatar Talk using Texts without Paired Training Data
ECCV 2024
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation
ICLR 2024
PGODE: Towards High-quality System Dynamics Modeling
ICML 2024
Transferable Facial Privacy Protection against Blind Face Restoration via Domain-Consistent Adversarial Obfuscation
ICML 2024
StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-Based Generator
CVPR 2023
ReEnFP: Detail-Preserving Face Reconstruction by Encoding Facial Priors
WACV 2023
Exploiting Visual Context Semantics for Sound Source Localization
WACV 2023
Disentangling the Benefits of Self-Supervised Learning to Deployment-Driven Downstream Tasks of Satellite Images (Student Abstract)
AAAI 2023
Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement
ICCV 2023
PARCS: A Deployment-Oriented AI System for Robust Parcel-Level Cropland Segmentation of Satellite Images
AAAI 2023
GoBigger: A Scalable Platform for Cooperative-Competitive Multi-Agent Interactive Simulation
ICLR 2023
TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis
ICLR 2023
SeCo: Separating Unknown Musical Visual Sounds With Consistency Guidance
WACV 2023
Dual Memory Units with Uncertainty Regulation for Weakly Supervised Video Anomaly Detection
AAAI 2023
Robust Video Portrait Reenactment via Personalized Representation Quantization
AAAI 2023
GhostRNN: Reducing State Redundancy in RNN with Cheap Operations
INTERSPEECH 2023
Delving into Sequential Patches for Deepfake Detection
NIPS 2022
Expressive Talking Head Generation With Granular Audio-Visual Control
CVPR 2022
SepFusion: Finding Optimal Fusion Structures for Visual Sound Separation
AAAI 2022
Visual Sound Localization in the Wild by Cross-Modal Interference Erasing
AAAI 2022
Shape-Invariant 3D Adversarial Point Clouds
CVPR 2022
Few-Shot Head Swapping in the Wild
CVPR 2022
Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
CVPR 2022
Audio-Driven Co-Speech Gesture Video Generation
NIPS 2022
StyleSwap: Style-Based Generator Empowers Robust Face Swapping
ECCV 2022
TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers
ECCV 2022
Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
ECCV 2022
Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation
ECCV 2022
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation
CVPR 2021
Energy-Friendly Keyword Spotting System Using Add-Based Convolution
INTERSPEECH 2021
Speech2Talking-Face: Inferring and Driving a Face with Synchronized Audio-Visual Representation
IJCAI 2021
Audio-Driven Emotional Video Portraits
CVPR 2021
Visually Informed Binaural Audio Generation without Binaural Audios
CVPR 2021
Discriminability Distillation in Group Representation Learning
ECCV 2020
Sep-Stereo: Visually Guided Stereophonic Audio Generation by Associating Source Separation
ECCV 2020
Self-Robust 3D Point Recognition via Gather-Vector Guidance
CVPR 2020
LG-GAN: Label Guided Adversarial Network for Flexible Targeted Attack of Point Cloud Based Deep Networks
CVPR 2020
Rotate-and-Render: Unsupervised Photorealistic Face Rotation From Single-View Images
CVPR 2020
Vision-Infused Deep Audio Inpainting
ICCV 2019
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation
AAAI 2019
A Graph-Based Framework to Bridge Movies and Synopses
ICCV 2019
DUP-Net: Denoiser and Upsampler Network for 3D Adversarial Point Clouds Defense
ICCV 2019