YUHUI YUAN

28 papers · 2017–2025 · 6 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (6) 🏃 Academic Marathon (8) 🌈 Renaissance Researcher (6) 🗺️ Taxonomy Completionist (36)

🐣 Hot Topic Early Bird 🧬 Topic Evolution 💎 Century Club (28) ⚡ Prolific Year (7) 🔥 Unstoppable (7) 🗃️ Keyword Collector (85)

Conferences

CVPR (8) ICCV (8) ECCV (6) NIPS (4) AAAI (1) ICLR (1)

Top co-authors

Han Hu (8) Ji Li (8) Chao Zhang (7) Jingdong Wang (5) Jianmin Bao (4) Gao Huang (4) Weicong Liang (4) Baining Guo (4) Chong Luo (3) Yifan Pu (3)

Keywords

diffusion model (6) object detection (6) text-to-image generation (5) semantic segmentation (4) image generation (3) stable diffusion (3) vision transformer (3) text-to-image model (2) image editing (2) dense prediction (2) semi-supervised learning (2) human-object interaction (1) depth estimation (1) object removal (1) multimodal learning (1) self-attention mechanism (1) text-to-image synthesis (1) image inpainting (1) direct preference optimization (1) human pose estimation (1)

Papers

ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation CVPR 2025
Hybrid Layout Control for Diffusion Transformer: Fewer Annotations, Superior Aesthetics ICCV 2025
DesignEdit: Unify Spatial-Aware Image Editing via Training-free Inpainting with a Multi-Layered Latent Diffusion Framework AAAI 2025
Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization CVPR 2025
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation CVPR 2025
Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering ECCV 2024
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation CVPR 2024
LISA: Reasoning Segmentation via Large Language Model CVPR 2024
V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection ICLR 2024
CCEdit: Creative and Controllable Video Editing via Diffusion Models CVPR 2024
FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation ECCV 2024
Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators ECCV 2024
Exploring Predicate Visual Context in Detecting of Human-Object Interactions ICCV 2023
Mask-Attention-Free Transformer for 3D Instance Segmentation ICCV 2023
Space Engage: Collaborative Space Supervision for Contrastive-Based Semi-Supervised Semantic Segmentation ICCV 2023
GlyphControl: Glyph Conditional Control for Visual Text Generation NIPS 2023
DETRs With Hybrid Matching CVPR 2023
DETR Does Not Need Multi-Scale or Locality Design ICCV 2023
Rank-DETR for High Quality Object Detection NIPS 2023
RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation ECCV 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning NIPS 2022
HRFormer: High-Resolution Vision Transformer for Dense Predict NIPS 2021
Semi-Supervised Semantic Segmentation With Cross Pseudo Supervision CVPR 2021
Conditional DETR for Fast Training Convergence ICCV 2021
SegFix: Model-Agnostic Boundary Refinement for Segmentation ECCV 2020
Object-Contextual Representations for Semantic Segmentation ECCV 2020
Beyond Human Parts: Dual Part-Aligned Representations for Person Re-Identification ICCV 2019
Hard-Aware Deeply Cascaded Embedding ICCV 2017