Xuming He

59 papers · 2008–2025 · 10 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (21) 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🐣 Hot Topic Early Bird 🌍 Conference Polyglot (10) 🏠 Conference Loyalist (24) 🌟 Keyword Trendsetter Combo (3) 🤝 Dynamic Duo (12) 🏆 Keyword Champion 🏆 Grand Slam 🔬 Deep Specialist (11) 🧬 Topic Evolution 🚀 Conference Pioneer 🔥 Unstoppable (13) ⚡ Prolific Year (8) 🗃️ Keyword Collector (258) 💎 Century Club (59) ❓ The Questioner 📈 Trend Setter

Conferences

CVPR (24) ICCV (9) NIPS (6) AAAI (5) ECCV (5) IJCAI (4) ICLR (3) ICML (1) MIDL (1) NAACL (1)

Top co-authors

Songyang Zhang (12) Shipeng Yan (9) Yongfei Liu (8) Mathieu Salzmann (8) Rongjie Li (6) Bo Wan (5) Zhitong Gao (4) Chuyu Zhang (4) Longtian Qiu (4) Miaomiao Liu (4)

Keywords

object detection (8) conditional random field (6) vision-language model (6) semantic segmentation (5) scene graph generation (5) graph neural network (5) knowledge distillation (4) zero-shot learning (4) depth estimation (3) contrastive learning (3) convolutional neural network (3) image captioning (3) few-shot learning (3) representation learning (3) cross-modal alignment (3) weakly supervised learning (3) attention mechanism (3) visual grounding (3) scene understanding (3) image classification (3)

Papers

Relation-aware Hierarchical Prompt for Open-vocabulary Scene Graph Generation · AAAI 2025
GeoDistill: Geometry-Guided Self-Distillation for Weakly Supervised Cross-View Localization · ICCV 2025
Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts · NIPS 2024
CryoGEM: Physics-Informed Generative Cryo-Electron Microscopy · NIPS 2024
P$^2$OT: Progressive Partial Optimal Transport for Deep Imbalanced Clustering · ICLR 2024
Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning · CVPR 2024
RealDex: Towards Human-like Grasping for Robotic Dexterous Hand · IJCAI 2024
Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training · AAAI 2024
SPHINX: A Mixer of Weights, Visual Embeddings and Image Scales for Multi-modal Large Language Models · ECCV 2024
Dual-level Adaptive Self-Labeling for Novel Class Discovery in Point Cloud Segmentation · ECCV 2024
From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models · CVPR 2024
DSGG: Dense Relation Transformer for an End-to-end Scene Graph Generation · CVPR 2024
MILD: Modeling the Instance Learning Dynamics for Learning with Noisy Labels · IJCAI 2023
Human-centric Scene Understanding for 3D Large-scale Scenarios · ICCV 2023
Grounded Image Text Matching with Mismatched Relation Reasoning · ICCV 2023
Class-relation Knowledge Distillation for Novel Class Discovery · ICCV 2023
Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning · ICLR 2023
CALIP: Zero-Shot Enhancement of CLIP with Parameter-Free Attention · AAAI 2023
HOICLIP: Efficient Knowledge Transfer for HOI Detection With Vision-Language Models · CVPR 2023
ATTA: Anomaly-aware Test-Time Adaptation for Out-of-Distribution Detection in Segmentation · NIPS 2023
Modeling Multimodal Aleatoric Uncertainty in Segmentation with Mixture of Stochastic Experts · ICLR 2023
KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation · NAACL 2022
General Incremental Learning With Domain-Aware Categorical Representations · CVPR 2022
SGTR: End-to-End Scene Graph Generation With Transformer · CVPR 2022
Learning Semantic Correspondence with Sparse Annotations · ECCV 2022
Generative Negative Text Replay for Continual Vision-Language Pretraining · ECCV 2022
Bipartite Graph Network With Adaptive Message Passing for Unbiased Scene Graph Generation · CVPR 2021
DER: Dynamically Expandable Representation for Class Incremental Learning · CVPR 2021
Relation-aware Instance Refinement for Weakly Supervised Visual Grounding · CVPR 2021
Dynamic Grained Encoder for Vision Transformers · NIPS 2021
Weakly Supervised Volumetric Segmentation via Self-taught Shape Denoising Model · MIDL 2021
GNeRF: GAN-Based Neural Radiance Field Without Posed Camera · ICCV 2021
Learning Implicit Temporal Alignment for Few-shot Video Classification · IJCAI 2021
Distribution Alignment: A Unified Framework for Long-Tail Visual Recognition · CVPR 2021
Learning Cross-Modal Context Graph for Visual Grounding · AAAI 2020
Part-aware Prototype Network for Few-shot Semantic Segmentation · ECCV 2020
A Dual Attention Network with Semantic Embedding for Few-Shot Learning · AAAI 2019
Dynamic Context Correspondence Network for Semantic Alignment · ICCV 2019
Pose-Aware Multi-Level Feature Network for Human Object Interaction Detection · ICCV 2019
LatentGNN: Learning Efficient Non-local Relations for Visual Recognition · ICML 2019
SemStyle: Learning to Generate Stylised Image Captions Using Unaligned Text · CVPR 2018
One-Shot Action Localization by Learning Sequence Matching Network · CVPR 2018
Geometry-Aware Deep Network for Single-Image Novel View Synthesis · CVPR 2018
Indoor Scene Parsing With Instance Segmentation, Semantic Labeling and Support Relationship Inference · CVPR 2017
Boundary-Aware Instance Segmentation · CVPR 2017
Predicting Salient Face in Multiple-Face Videos · CVPR 2017
Deep Free-Form Deformation Network for Object-Mask Registration · ICCV 2017
Learning deep structured network for weakly supervised change detection · IJCAI 2017
Learning to Co-Generate Object Proposals With a Deep Structured Network · CVPR 2016
Separating Objects and Clutter in Indoor Scenes · CVPR 2015
Multiclass Semantic Video Segmentation With Object-Level Active Inference · CVPR 2015
Indoor Scene Structure Analysis for Single Image Depth Estimation · CVPR 2015
Structural Kernel Learning for Large Scale Multiclass Object Co-Detection · ICCV 2015
An Exemplar-based CRF for Multi-instance Object Segmentation · CVPR 2014
Discrete-Continuous Depth Estimation from a Single Image · CVPR 2014
Winding Number for Region-Boundary Consistent Salient Contour Extraction · CVPR 2013
Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning · CVPR 2013
A unified model of short-range and long-range motion perception · NIPS 2010
Learning Hybrid Models for Image Annotation with Partially Labeled Data · NIPS 2008