Yujun Shen

91 papers · 2018–2025 · 6 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌍 Conference Polyglot (6) 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (10)

🏃 Academic Marathon (7) 🐝 Cross-Pollinator (14) 🌈 Renaissance Researcher (7) 🏠 Conference Loyalist (43) 👑 Triple Crown 🤝 Dynamic Duo (22) 🏆 Keyword Champion (2) 🔬 Deep Specialist (32) 🧬 Topic Evolution 💎 Century Club (91) 📈 Trend Setter 🔥 Unstoppable (6) ⚡ Prolific Year (22) 🗃️ Keyword Collector (327)

Conferences

CVPR (43) NIPS (17) ICCV (12) ECCV (10) ICLR (5) ICML (4)

Top co-authors

Yinghao Xu (22) Deli Zhao (21) Ceyuan Yang (20) Sida Peng (15) Kecheng Zheng (15) Bolei Zhou (14) Qifeng Chen (14) Xiaowei Zhou (13) Jingren Zhou (12) Jiapeng Zhu (12)

Research topics

Core AI (1)

Keywords

generative adversarial network (20) diffusion model (15) image generation (14) image synthesis (13) image editing (7) latent space (7) novel view synthesis (7) neural radiance field (6) generative model (6) depth estimation (5) 3d reconstruction (4) neural network (4) 3d gaussian splatting (4) self-supervised learning (4) video generation (4) foundation model (4) representation learning (4) neural rendering (3) multimodal learning (3) adversarial training (3)

Papers

Learning Visual Generative Priors without Text · CVPR 2025
MangaNinja: Line Art Colorization with Precise Reference Following · CVPR 2025
AvatarArtist: Open-Domain 4D Avatarization · CVPR 2025
Prometheus: 3D-Aware Latent Diffusion Models for Feed-Forward Text-to-3D Scene Generation · CVPR 2025
ReTracker: Exploring Image Matching for Robust Online Any Point Tracking · ICCV 2025
Edicho: Consistent Image Editing in the Wild · ICCV 2025
Neural Shell Texture Splatting: More Details and Fewer Primitives · ICCV 2025
DiffDoctor: Diagnosing Image Diffusion Models Before Treating · ICCV 2025
SpatialTrackerV2: Advancing 3D Point Tracking with Explicit Camera Motion · ICCV 2025
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models · ICCV 2025
BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation · ICCV 2025
LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis · CVPR 2025
EnvGS: Modeling View-Dependent Appearance with Environment Gaussian · CVPR 2025
MagicQuill: An Intelligent Interactive Image Editing System · CVPR 2025
Ready-to-React: Online Reaction Policy for Two-Character Interaction Generation · ICLR 2025
Framer: Interactive Frame Interpolation · ICLR 2025
UniRestore3D: A Scalable Framework For General Shape Restoration · ICLR 2025
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis · CVPR 2025
Contextual AD Narration with Interleaved Multimodal Sequence · CVPR 2025
PlanarSplatting: Accurate Planar Surface Reconstruction in 3 Minutes · CVPR 2025
AniDoc: Animation Creation Made Easier · CVPR 2025
ScaleLSD: Scalable Deep Line Segment Detection Streamlined · CVPR 2025
Mimir: Improving Video Diffusion Models for Precise Text Understanding · CVPR 2025
FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views · CVPR 2025
Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning · CVPR 2025
Learning Temporally Consistent Video Depth from Video Diffusion Priors · CVPR 2025
Rectified Diffusion Guidance for Conditional Generation · CVPR 2025
CCM: Real-Time Controllable Visual Content Creation Using Text-to-Image Consistency Models · ICML 2024
UKnow: A Unified Knowledge Protocol with Multimodal Knowledge Graph Datasets for Reasoning and Vision-Language Pre-Training · NIPS 2024
LoTLIP: Improving Language-Image Pre-training for Long Text Understanding · NIPS 2024
Zero-shot Image Editing with Reference Imitation · NIPS 2024
BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation · CVPR 2024
A Recipe for Scaling up Text-to-Video Generation with Text-free Videos · CVPR 2024
Towards More Accurate Diffusion Model Acceleration with A Timestep Tuner · CVPR 2024
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following · CVPR 2024
SpatialTracker: Tracking Any 2D Pixels in 3D Space · CVPR 2024
NEAT: Distilling 3D Wireframes from Neural Attraction Fields · CVPR 2024
AnyDoor: Zero-shot Object-level Image Customization · CVPR 2024
CoDeF: Content Deformation Fields for Temporally Consistent Video Processing · CVPR 2024
4K4D: Real-Time 4D View Synthesis at 4K Resolution · CVPR 2024
GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation · ECCV 2024
Learning 3D-aware GANs from Unposed Images with Template Feature Field · ECCV 2024
Language-Image Pre-training with Long Captions · ECCV 2024
LivePhoto: Real Image Animation with Text-guided Motion Control · ECCV 2024
Exploring Guided Sampling of Conditional GANs · ECCV 2024
SAM-guided Graph Cut for 3D Instance Segmentation · ECCV 2024
Real-time 3D-aware Portrait Editing from a Single Image · ECCV 2024
Lipschitz Singularities in Diffusion Models · ICLR 2024
SMaRt: Improving GANs with Score Matching Regularity · ICML 2024
VideoComposer: Compositional Video Synthesis with Motion Controllability · NIPS 2023
Customizable Image Synthesis with Multiple Subjects · NIPS 2023
Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone · NIPS 2023
Towards Smooth Video Composition · ICLR 2023
Composer: Creative and Controllable Image Synthesis with Composable Conditions · ICML 2023
LinkGAN: Linking GAN Latents to Pixels for Controllable Image Synthesis · ICCV 2023
ViM: Vision Middleware for Unified Downstream Transferring · ICCV 2023
One-Shot Generative Domain Adaptation · ICCV 2023
Learning 3D-Aware Image Synthesis With Unknown Pose Distribution · CVPR 2023
Dimensionality-Varying Diffusion Process · CVPR 2023
LipFormer: High-Fidelity and Generalizable Talking Face Generation With a Pre-Learned Facial Codebook · CVPR 2023
Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-Trained Vision-Language Models · ICCV 2023
Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos · ICCV 2023
FaceComposer: A Unified Model for Versatile Facial Content Creation · NIPS 2023
Revisiting the Evaluation of Image Synthesis with GANs · NIPS 2023
Learning Modulated Transformation in GANs · NIPS 2023
Benchmarking and Analyzing 3D-aware Image Synthesis with a Modularized Codebase · NIPS 2023
Balancing Logit Variation for Long-Tailed Semantic Segmentation · CVPR 2023
GLeaD: Improving GANs With a Generator-Leading Task · CVPR 2023
Neural Dependencies Emerging From Learning Massive Categories · CVPR 2023
DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-Aware Scene Synthesis · CVPR 2023
Compact Neural Volumetric Video Representations with Dynamic Codebooks · NIPS 2023
3D-Aware Image Synthesis via Learning Structural and Textural Representations · CVPR 2022
Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels · CVPR 2022
Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition · CVPR 2022
High-Fidelity GAN Inversion with Padding Space · ECCV 2022
3D-Aware Indoor Scene Synthesis with Depth Priors · ECCV 2022
Improving GANs with A Dynamic Discriminator · NIPS 2022
Learning from Future: A Novel Self-Training Framework for Semantic Segmentation · NIPS 2022
A Unified Model for Multi-class Anomaly Detection · NIPS 2022
Region-Based Semantic Factorization in GANs · ICML 2022
Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator · NIPS 2022
Improving GAN Equilibrium by Raising Spatial Awareness · CVPR 2022
Generative Hierarchical Features From Synthesizing Images · CVPR 2021
Glancing at the Patch: Anomaly Localization With Global and Local Feature Comparison · CVPR 2021
Closed-Form Factorization of Latent Semantics in GANs · CVPR 2021
Low-Rank Subspaces in GANs · NIPS 2021
Data-Efficient Instance Generation from Instance Discrimination · NIPS 2021
In-Domain GAN Inversion for Real Image Editing · ECCV 2020
Interpreting the Latent Space of GANs for Semantic Face Editing · CVPR 2020
Image Processing Using Multi-Code GAN Prior · CVPR 2020
FaceID-GAN: Learning a Symmetry Three-Player GAN for Identity-Preserving Face Synthesis · CVPR 2018