Gang Yu
60 papers · 2015–2026 · 9 conferences · across top CS/AI conferences
Achievements
Interdisciplinary Bridge
Academic Marathon (10)
Conference Polyglot (9)
Renaissance Researcher (7)
Taxonomy Completionist (76)
Keyword Pioneer
Hot Topic Early Bird
Conference Loyalist (20)
Deep Specialist (11)
Dynamic Duo (15)
Century Club (58)
Keyword Collector (232)
Unstoppable (5)
Prolific Year (10)
Conference Pioneer
Conferences
CVPR (20)
ICCV (9)
AAAI (7)
NIPS (7)
ECCV (6)
ICLR (5)
MICCAI (3)
ACL (2)
WACV (1)
Keywords
semantic segmentation (12)
convolutional neural network (7)
diffusion model (6)
object detection (5)
novel view synthesis (5)
3d reconstruction (5)
feature learning (4)
scene text detection (3)
implicit neural representation (3)
instance segmentation (3)
neural network (3)
point cloud (3)
scene understanding (3)
occlusion handling (2)
neural representation (2)
object tracking (2)
text-to-image generation (2)
image segmentation (2)
3d vision (2)
video understanding (2)
Papers
SERL: Self-Examining Reinforcement Learning on Open-Domain
AAAI 2026
Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers
AAAI 2026
DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models
CVPR 2025
Reason from Future: Reverse Thought Chain Enhances LLM Reasoning
ACL 2025
MT-WilmsNet: A Multi-Level Transformer Fusion Network for Wilms' Tumor Segmentation and Metastasis Prediction
MICCAI 2025
High-Fidelity Unified One-to-Many Medical Image Synthesis via Text-Conditioned Latent Diffusion
MICCAI 2025
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
ICLR 2025
SaMer: A Scenario-aware Multi-dimensional Evaluator for Large Language Models
ICLR 2025
MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent
ICCV 2025
MikuDance: Animating Character Art with Mixed Motion Dynamics
ICCV 2025
SC-Captioner: Improving Image Captioning with Self-Correction by Reinforcement Learning
ICCV 2025
MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D
CVPR 2025
PM-INR: Prior-Rich Multi-Modal Implicit Large-Scale Scene Neural Representation
AAAI 2024
Cross-Dimensional Medical Self-Supervised Representation Learning Based on a Pseudo-3D Transformation
MICCAI 2024
TapMo: Shape-aware Motion Generation of Skeleton-free Characters
ICLR 2024
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
CVPR 2024
Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models
CVPR 2024
MotionChain: Conversational Motion Controllers via Multimodal Prompts
ECCV 2024
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
NIPS 2024
M3DBench: Towards Omni 3D Assistant with Interleaved Multi-modal Instructions
ECCV 2024
Disentangled Pre-Training for Image Matting
WACV 2024
IT3D: Improved Text-to-3D Generation with Explicit View Synthesis
AAAI 2024
Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data
ACL 2024
Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation
NIPS 2023
A Large-Scale Outdoor Multi-Modal Dataset and Benchmark for Novel View Synthesis and Implicit Scene Reconstruction
ICCV 2023
PDF: Point Diffusion Implicit Function for Large-scale Scene Neural Representation
NIPS 2023
Executing Your Commands via Motion Diffusion in Latent Space
CVPR 2023
Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering
ICCV 2023
End-to-End 3D Dense Captioning With Vote2Cap-DETR
CVPR 2023
STAR Loss: Reducing Semantic Ambiguity in Facial Landmark Detection
CVPR 2023
Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image
ICCV 2023
Capturing the Motion of Every Joint: 3D Human Pose and Shape Estimation with Independent Tokens
ICLR 2023
SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation
ICLR 2023
MotionGPT: Human Motion as a Foreign Language
NIPS 2023
D&D: Learning Human Dynamics from Dynamic Camera
ECCV 2022
Coordinates Are NOT Lonely - Codebook Prior Helps Implicit Neural 3D representations
NIPS 2022
Hierarchical Normalization for Robust Monocular Depth Estimation
NIPS 2022
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
CVPR 2022
SiamFC++: Towards Robust and Accurate Visual Tracking with Target Estimation Guidelines
AAAI 2020
State-Aware Tracker for Real-Time Video Object Segmentation
CVPR 2020
High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification
CVPR 2020
Context Prior for Scene Segmentation
CVPR 2020
Attention-Based Multi-Context Guiding for Few-Shot Semantic Segmentation
AAAI 2019
ThunderNet: Towards Real-Time Generic Object Detection on Mobile Devices
ICCV 2019
Objects365: A Large-Scale, High-Quality Dataset for Object Detection
ICCV 2019
Efficient and Accurate Arbitrary-Shaped Text Detection With Pixel Aggregation Network
ICCV 2019
TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection
CVPR 2019
Shape Robust Text Detection With Progressive Scale Expansion Network
CVPR 2019
An End-To-End Network for Panoptic Segmentation
CVPR 2019
Modeling Local Geometric Structure of 3D Point Clouds Using Geo-CNN
CVPR 2019
Scene Text Detection with Supervised Pyramid Context Network
AAAI 2019
Learnable Tree Filter for Structure-preserving Feature Transform
NIPS 2019
DetNet: Design Backbone for Object Detection
ECCV 2018
MegDet: A Large Mini-Batch Object Detector
CVPR 2018
BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation
ECCV 2018
Associating Inter-Image Salient Instances for Weakly Supervised Semantic Segmentation
ECCV 2018
Learning a Discriminative Feature Network for Semantic Segmentation
CVPR 2018
Cascaded Pyramid Network for Multi-Person Pose Estimation
CVPR 2018
Large Kernel Matters -- Improve Semantic Segmentation by Global Convolutional Network
CVPR 2017
Fast Action Proposals for Human Action Detection and Search
CVPR 2015