Gang Yu

60 papers · 2015–2026 · 9 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🏃 Academic Marathon (10) 🌍 Conference Polyglot (9) 🌈 Renaissance Researcher (7) 🗺️ Taxonomy Completionist (76) 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🏠 Conference Loyalist (20) 🔬 Deep Specialist (11) 🤝 Dynamic Duo (15) 💎 Century Club (58) 🗃️ Keyword Collector (232) 🔥 Unstoppable (5) ⚡ Prolific Year (10) 🚀 Conference Pioneer

Conferences

CVPR (20) ICCV (9) AAAI (7) NIPS (7) ECCV (6) ICLR (5) MICCAI (3) ACL (2) WACV (1)

Top co-authors

Tao Chen (16) Xin Chen (15) Chi Zhang (11) Xianfang Zeng (8) Bin Fu (8) Jian Sun (8) Wen Liu (8) Fukun Yin (7) Chao Peng (7) Chunhua Shen (7)

Keywords

semantic segmentation (12) convolutional neural network (7) diffusion model (6) object detection (5) novel view synthesis (5) 3d reconstruction (5) feature learning (4) scene text detection (3) implicit neural representation (3) instance segmentation (3) neural network (3) point cloud (3) scene understanding (3) occlusion handling (2) neural representation (2) object tracking (2) text-to-image generation (2) image segmentation (2) 3d vision (2) video understanding (2)

Papers

SERL: Self-Examining Reinforcement Learning on Open-Domain AAAI 2026
Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers AAAI 2026
DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models CVPR 2025
Reason from Future: Reverse Thought Chain Enhances LLM Reasoning ACL 2025
MT-WilmsNet: A Multi-Level Transformer Fusion Network for Wilms’ Tumor Segmentation and Metastasis Prediction MICCAI 2025
High-Fidelity Unified One-to-Many Medical Image Synthesis via Text-Conditioned Latent Diffusion MICCAI 2025
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers ICLR 2025
SaMer: A Scenario-aware Multi-dimensional Evaluator for Large Language Models ICLR 2025
MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent ICCV 2025
MikuDance: Animating Character Art with Mixed Motion Dynamics ICCV 2025
SC-Captioner: Improving Image Captioning with Self-Correction by Reinforcement Learning ICCV 2025
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D CVPR 2025
PM-INR: Prior-Rich Multi-Modal Implicit Large-Scale Scene Neural Representation AAAI 2024
Cross-Dimensional Medical Self-Supervised Representation Learning Based on a Pseudo-3D Transformation MICCAI 2024
TapMo: Shape-aware Motion Generation of Skeleton-free Characters ICLR 2024
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding Reasoning and Planning CVPR 2024
Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models CVPR 2024
MotionChain: Conversational Motion Controllers via Multimodal Prompts ECCV 2024
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models NIPS 2024
M3DBench: Towards Omni 3D Assistant with Interleaved Multi-modal Instructions ECCV 2024
Disentangled Pre-Training for Image Matting WACV 2024
IT3D: Improved Text-to-3D Generation with Explicit View Synthesis AAAI 2024
Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data ACL 2024
Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation NIPS 2023
A Large-Scale Outdoor Multi-Modal Dataset and Benchmark for Novel View Synthesis and Implicit Scene Reconstruction ICCV 2023
PDF: Point Diffusion Implicit Function for Large-scale Scene Neural Representation NIPS 2023
Executing Your Commands via Motion Diffusion in Latent Space CVPR 2023
Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering ICCV 2023
End-to-End 3D Dense Captioning With Vote2Cap-DETR CVPR 2023
STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection CVPR 2023
Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image ICCV 2023
Capturing the Motion of Every Joint: 3D Human Pose and Shape Estimation with Independent Tokens ICLR 2023
SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation ICLR 2023
MotionGPT: Human Motion as a Foreign Language NIPS 2023
D&D: Learning Human Dynamics from Dynamic Camera ECCV 2022
Coordinates Are NOT Lonely - Codebook Prior Helps Implicit Neural 3D representations NIPS 2022
Hierarchical Normalization for Robust Monocular Depth Estimation NIPS 2022
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation CVPR 2022
SiamFC++: Towards Robust and Accurate Visual Tracking with Target Estimation Guidelines AAAI 2020
State-Aware Tracker for Real-Time Video Object Segmentation CVPR 2020
High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification CVPR 2020
Context Prior for Scene Segmentation CVPR 2020
Attention-Based Multi-Context Guiding for Few-Shot Semantic Segmentation AAAI 2019
ThunderNet: Towards Real-Time Generic Object Detection on Mobile Devices ICCV 2019
Objects365: A Large-Scale, High-Quality Dataset for Object Detection ICCV 2019
Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network ICCV 2019
TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection CVPR 2019
Shape Robust Text Detection With Progressive Scale Expansion Network CVPR 2019
An End-To-End Network for Panoptic Segmentation CVPR 2019
Modeling Local Geometric Structure of 3D Point Clouds Using Geo-CNN CVPR 2019
Scene Text Detection with Supervised Pyramid Context Network AAAI 2019
Learnable Tree Filter for Structure-preserving Feature Transform NIPS 2019
DetNet: Design Backbone for Object Detection ECCV 2018
MegDet: A Large Mini-Batch Object Detector CVPR 2018
BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation ECCV 2018
Associating Inter-Image Salient Instances for Weakly Supervised Semantic Segmentation ECCV 2018
Learning a Discriminative Feature Network for Semantic Segmentation CVPR 2018
Cascaded Pyramid Network for Multi-Person Pose Estimation CVPR 2018
Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network CVPR 2017
Fast Action Proposals for Human Action Detection and Search CVPR 2015