Qihang Yu

29 papers · 2018–2025 · 8 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🏃 Academic Marathon (7) 🌈 Renaissance Researcher (7) 🌍 Conference Polyglot (8) 🗺️ Taxonomy Completionist (49)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 👥 Mega-Team (25) 🔬 Deep Specialist (10) 🏆 Grand Slam 🤝 Dynamic Duo (18) 🔥 Unstoppable (6) 🗃️ Keyword Collector (114) 📈 Trend Setter ⚡ Prolific Year (5) 💎 Century Club (29)

Conferences

CVPR (9) ICCV (6) NIPS (5) ECCV (3) AAAI (2) ICLR (2) ICML (1) WACV (1)

Top co-authors

Liang-Chieh Chen (18) Alan Yuille (12) Xiaohui Shen (10) Ju He (9) Alan L. Yuille (5) Xueqing Deng (5) Siyuan Qiao (4) Hartwig Adam (4) Huiyu Wang (3) Yuyin Zhou (3)

Keywords

semantic segmentation (7) image segmentation (6) medical imaging (4) transformer architecture (4) panoptic segmentation (4) convolutional neural network (3) image generation (3) mask transformer (3) object detection (3) diffusion model (2) computer vision (2) feature extraction (2) flow matching (2) representation learning (2) instance segmentation (2) efficient computing (2) vision transformer (2) neural architecture search (2) model compression (2) text-to-image generation (2)

Papers

Leveraging Panoptic Scene Graph for Evaluating Fine-Grained Text-to-Image Generation · ICCV 2025
Randomized Autoregressive Visual Generation · ICCV 2025
FlowTok: Flowing Seamlessly Across Text and Image Tokens · ICCV 2025
FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching · ICML 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation · ICCV 2025
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens · ICCV 2025
Video-kMaX: A Simple Unified Approach for Online and Near-Online Video Panoptic Segmentation · WACV 2024
An Image is Worth 32 Tokens for Reconstruction and Generation · NIPS 2024
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models and Time-Dependent Layer Normalization · NIPS 2024
ViTamin: Designing Scalable Vision Models in the Vision-Language Era · CVPR 2024
COCONut: Modernizing COCO Segmentation · CVPR 2024
Towards Open-Ended Visual Recognition with Large Language Models · ECCV 2024
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models · ICLR 2023
ReMaX: Relaxing for Better Training on Efficient Panoptic Segmentation · NIPS 2023
CancerUniT: Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans · ICCV 2023
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP · NIPS 2023
Compositor: Bottom-Up Clustering and Compositing for Robust Part and Object Segmentation · CVPR 2023
TubeFormer-DeepLab: Video Mask Transformer · CVPR 2022
PartImageNet: A Large, High-Quality Dataset of Parts · ECCV 2022
k-Means Mask Transformer · ECCV 2022
CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation · CVPR 2022
Shape-Texture Debiased Neural Network Training · ICLR 2021
CAKES: Channel-wise Automatic KErnel Shrinking for Efficient 3D Networks · AAAI 2021
Mask Guided Matting via Progressive Refinement Network · CVPR 2021
Glance-and-Gaze Vision Transformer · NIPS 2021
C2FNAS: Coarse-to-Fine Neural Architecture Search for 3D Medical Image Segmentation · CVPR 2020
When Radiology Report Generation Meets Knowledge Graph · AAAI 2020
Neural Architecture Search for Lightweight Non-Local Networks · CVPR 2020
Recurrent Saliency Transformation Network: Incorporating Multi-Stage Visual Cues for Small Organ Segmentation · CVPR 2018