Zhuowen Tu
75 papers · 2013–2026 · 13 top CS/AI conferences
Achievements
Interdisciplinary Bridge
Conference Polyglot (13)
Academic Marathon (13)
Renaissance Researcher (6)
Taxonomy Completionist (107)
Keyword Pioneer
Hot Topic Early Bird
Conference Loyalist (29)
Dynamic Duo (12)
Keyword Champion (2)
Deep Specialist (11)
Grand Slam
Topic Evolution
Century Club (75)
The Questioner
Unstoppable (14)
Trend Setter
Keyword Collector (303)
Prolific Year (11)
Conference Pioneer
Conferences
CVPR (29)
ICCV (16)
ECCV (5)
ICLR (4)
NIPS (4)
WACV (4)
EMNLP (3)
ICML (3)
AAAI (2)
AISTATS (2)
ACL (1)
IJCAI (1)
IJCNLP (1)
Keywords
image classification (9)
convolutional neural network (7)
diffusion model (6)
generative model (6)
instance segmentation (6)
representation learning (5)
transformer architecture (5)
semi-supervised learning (4)
object detection (4)
multimodal learning (4)
panoptic segmentation (4)
knowledge distillation (4)
vision transformer (3)
scene understanding (3)
self-supervised learning (3)
image segmentation (3)
image generation (3)
3d reconstruction (3)
3d vision (3)
unsupervised learning (3)
Papers
Gaussian Swaying: Surface-Based Framework for Aerodynamic Simulation with 3D Gaussians
WACV 2026
AuthGuard: Generalizable Deepfake Detection via Language Guidance
WACV 2026
CVP: Central-Peripheral Vision-Inspired Multimodal Model for Spatial Reasoning
WACV 2026
Ground-V: Teaching VLMs to Ground Complex Instructions in Pixels
CVPR 2025
DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion
ICCV 2025
YOLO-Count: Differentiable Object Counting for Text-to-Image Generation
ICCV 2025
Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers
ICCV 2025
On the Scalability of Diffusion-based Text-to-Image Generation
CVPR 2024
HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data
CVPR 2024
Non-autoregressive Sequence-to-Sequence Vision-Language Models
CVPR 2024
TokenCompose: Text-to-Image Diffusion with Token-level Supervision
CVPR 2024
Enhancing Vision-Language Pre-training with Rich Supervisions
CVPR 2024
Bayesian Diffusion Models for 3D Shape Reconstruction
CVPR 2024
Restoration by Generation with Constrained Priors
CVPR 2024
Patched Denoising Diffusion Models For High-Resolution Image Synthesis
ICLR 2024
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
AAAI 2024
When Is Multilinguality a Curse? Language Modeling for 250 High- and Low-Resource Languages
EMNLP 2024
DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models
EMNLP 2024
Dolfin: Diffusion Layout Transformers without Autoencoder
ECCV 2024
Open-World Dynamic Prompt and Continual Visual Representation Learning
ECCV 2024
DiffusionRig: Learning Personalized Priors for Facial Appearance Editing
CVPR 2023
Guided Recommendation for Model Fine-Tuning
CVPR 2023
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability
ICCV 2023
MasQCLIP for Open-Vocabulary Universal Image Segmentation
ICCV 2023
Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction
ICCV 2023
Object-Centric Multiple Object Tracking
ICCV 2023
SkeleTR: Towards Skeleton-based Action Recognition in the Wild
ICCV 2023
DocTr: Document Transformer for Structured Information Extraction in Documents
ICCV 2023
Uni-3D: A Universal Model for Panoptic 3D Scene Reconstruction
ICCV 2023
On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning
ICLR 2023
Open-Vocabulary Universal Image Segmentation with MaskCLIP
ICML 2023
An In-depth Study of Stochastic Backpropagation
NIPS 2022
X-DETR: A Versatile Architecture for Instance-Wise Vision-Language Tasks
ECCV 2022
The Geometry of Multilingual Language Model Representations
EMNLP 2022
Instance Segmentation With Mask-Supervised Polygonal Boundary Transformers
CVPR 2022
Text Spotting Transformers
CVPR 2022
MeMOT: Multi-Object Tracking With Memory
CVPR 2022
Semi-supervised Vision Transformers at Scale
NIPS 2022
ViTGAN: Training GANs with Vision Transformers
ICLR 2022
Co-Scale Conv-Attentional Image Transformers
ICCV 2021
Visual Relationship Detection Using Part-and-Sum Transformers With Composite Queries
ICCV 2021
Exponential Moving Average Normalization for Self-Supervised and Semi-Supervised Learning
CVPR 2021
Attentional Constellation Nets for Few-Shot Learning
ICLR 2021
Long Short-Term Transformer for Online Action Detection
NIPS 2021
Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models
ACL 2021
Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models
IJCNLP 2021
Dual Contradistinctive Generative Autoencoder
CVPR 2021
Line Segment Detection Using Transformers Without Edges
CVPR 2021
Compatibility-Aware Heterogeneous Visual Search
CVPR 2021
Pose Recognition With Cascade Transformers
CVPR 2021
One-Pixel Signature: Characterizing CNN Models for Backdoor Detection
ECCV 2020
Local Binary Pattern Networks
WACV 2020
Guided Variational Autoencoder for Disentanglement Learning
CVPR 2020
Recognizing Objects From Any View With Object and Viewer-Centered Representations
CVPR 2020
Learning Instance Occlusion for Panoptic Segmentation
CVPR 2020
3D Volumetric Modeling with Introspective Neural Networks
AAAI 2019
Attentional ShapeContextNet for Point Cloud Recognition
CVPR 2018
Wasserstein Introspective Neural Networks
CVPR 2018
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification
ECCV 2018
Deep Convolutional Neural Networks with Merge-and-Run Mappings
IJCAI 2018
Deeply Supervised Salient Object Detection With Short Connections
CVPR 2017
Aggregated Residual Transformations for Deep Neural Networks
CVPR 2017
Introspective Neural Networks for Generative Modeling
ICCV 2017
Introspective Classification with Convolutional Nets
NIPS 2017
Generalizing Pooling Functions in Convolutional Neural Networks: Mixed, Gated, and Tree
AISTATS 2016
Deeply-Supervised Nets
AISTATS 2015
Holistically-Nested Edge Detection
ICCV 2015
MILCut: A Sweeping Line Multiple Instance Learning Paradigm for Interactive Image Segmentation
CVPR 2014
Dynamic Label Propagation for Semi-supervised Multi-class Multi-label Classification
ICCV 2013
Action Recognition with Actons
ICCV 2013
Sparse Subspace Denoising for Image Manifolds
CVPR 2013
Robust Estimation of Nonrigid Transformation for Point Set Registration
CVPR 2013
Harvesting Mid-level Visual Concepts from Large-Scale Internet Images
CVPR 2013
Max-Margin Multiple-Instance Dictionary Learning
ICML 2013
Fixed-Point Model For Structured Labeling
ICML 2013