Pengfei Wan

40 papers · 2021–2026 · 9 conferences · across top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (62) 🏃 Academic Marathon (5) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (8) 🧭 Keyword Pioneer

🔬 Deep Specialist (10) 🏆 Grand Slam 🤝 Dynamic Duo (20) 🔥 Unstoppable (5) 🗃️ Keyword Collector (176) 💎 Century Club (36) ⚡ Prolific Year (22)

Conferences

CVPR (12) ICCV (10) AAAI (5) ICLR (4) NIPS (4) EMNLP (2) ACL (1) ECCV (1) ICML (1)

Top co-authors

Di Zhang (22) Xin Tao (10) Xintao Wang (9) Wen Zheng (6) Kun Gai (5) Jiahao Wang (4) Xiaoqiang Liu (4) Jiarong Ou (4) Menghan Xia (4) Rui Chen (4)

Keywords

video generation (10) diffusion model (7) generative adversarial network (5) diffusion transformer (4) image generation (4) representation learning (3) point cloud completion (3) contrastive learning (3) multimodal learning (3) 3d shape generation (2) style transfer (2) video diffusion model (2) text-to-video generation (2) video diffusion (2) latent space (2) self-supervised learning (2) gaussian splatting (2) multimodal large language model (2) temporal coherence (2) neural network (2)

Papers

Boosting Resolution Generalization of Diffusion Transformers with Randomized Positional Encodings · AAAI 2026
SegTune: Structured and Fine-Grained Control for Song Generation · ACL 2026
Bridging Cognitive Gap: Hierarchical Description Learning for Artistic Image Aesthetics Assessment · AAAI 2026
FilmWeaver: Weaving Consistent Multi-Shot Videos with Cache-Guided Autoregressive Diffusion · AAAI 2026
RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction · EMNLP 2025
SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs · EMNLP 2025
How Far are AI-generated Videos from Simulating the 3D Visual World: A Learned 3D Evaluation Approach · ICCV 2025
BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation · ICCV 2025
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video · ICCV 2025
GameFactory: Creating New Games with Generative Interactive Videos · ICCV 2025
Imbalance in Balance: Online Concept Balancing in Generation Models · ICCV 2025
GGTalker: Talking Head Systhesis with Generalizable Gaussian Priors and Identity-Specific Adaptation · ICCV 2025
FullDiT: Video Generative Foundation Models with Multimodal Control via Full Attention · ICCV 2025
Scene Graph Guided Generation: Enable Accurate Relations Generation in Text-to-Image Models via Textural Rectification · ICCV 2025
Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control · ICLR 2025
Stable Segment Anything Model · ICLR 2025
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints · ICLR 2025
3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation · ICLR 2025
MODA: MOdular Duplex Attention for Multimodal Perception, Cognition, and Emotion Understanding · ICML 2025
SketchVideo: Sketch-based Video Generation and Editing · CVPR 2025
StyleMaster: Stylize Your Video with Artistic Generation and Translation · CVPR 2025
Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content · CVPR 2025
PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution · CVPR 2025
GPAvatar: High-fidelity Head Avatars by Learning Efficient Gaussian Projections · CVPR 2025
Towards Precise Scaling Laws for Video Diffusion Transformers · CVPR 2025
Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation · CVPR 2025
VideoTetris: Towards Compositional Text-to-Video Generation · NIPS 2024
Agent Attention: On the Integration of Softmax and Linear Attention · ECCV 2024
FEditNet: Few-Shot Editing of Latent Semantics in GAN Spaces · AAAI 2023
Augmentation-Aware Self-Supervision for Data-Efficient GAN Training · NIPS 2023
DVIS: Decoupled Video Instance Segmentation Framework · ICCV 2023
Exploring Set Similarity for Dense Self-Supervised Representation Learning · CVPR 2022
Debiased Self-Training for Semi-Supervised Learning · NIPS 2022
Assessing a Single Image in Reference-Guided Image Synthesis · AAAI 2022
Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation · CVPR 2022
Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration · CVPR 2021
PMP-Net: Point Cloud Completion by Learning Multi-Step Point Moving Paths · CVPR 2021
Cycle4Completion: Unpaired Point Cloud Completion Using Cycle Transformation With Missing Region Coding · CVPR 2021
BlendGAN: Implicitly GAN Blending for Arbitrary Stylized Face Generation · NIPS 2021
SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution With Skip-Transformer · ICCV 2021