Haifeng Huang
23 papers · 2020–2025 · 11 top CS/AI conferences
Achievements
🏃 Academic Marathon (5)
🌍 Conference Polyglot (11)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🐝 Cross-Pollinator (10)
🌈 Renaissance Researcher (9)
🤝 Dynamic Duo (11)
🔬 Deep Specialist (11)
🧬 Topic Evolution
🏆 Keyword Champion (3)
🗃️ Keyword Collector (117)
🚀 Conference Pioneer
🔥 Unstoppable (6)
💎 Century Club (23)
⚡ Prolific Year (6)
Conferences
NIPS (5)
AAAI (3)
CVPR (3)
ACL (2)
EMNLP (2)
ICCV (2)
IJCAI (2)
ICML (1)
INTERSPEECH (1)
MICCAI (1)
NAACL (1)
Keywords
multi-modal learning (5)
visual grounding (4)
vision-language model (4)
representation learning (4)
large language model (4)
3d visual grounding (3)
scene understanding (3)
electronic medical record (3)
attention mechanism (3)
contrastive learning (2)
medical diagnosis (2)
robotic manipulation (2)
graph neural network (2)
cross-modal alignment (2)
semantic alignment (2)
visual question answering (2)
point cloud (2)
3d vision (2)
question answering (2)
policy generalization (2)
Papers
Data-Efficiently Learn Large Language Model for Universal 3D Scene Perception
NAACL 2025
Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine
AAAI 2025
Improving Retrieval Augmented Language Model with Self-Reasoning
AAAI 2025
GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation
CVPR 2025
RoboGround: Robotic Manipulation with Grounded Vision-Language Priors
CVPR 2025
SpatialCLIP: Learning 3D-aware Image Representations from Spatially Discriminative Language
CVPR 2025
Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning
ICCV 2025
Extending Multi-modal Contrastive Representations
NIPS 2024
FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion
ICML 2024
MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations
NIPS 2024
Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers
NIPS 2024
Unified Audio Visual Cues for Target Speaker Extraction
INTERSPEECH 2024
A Refer-and-Ground Multimodal Large Language Model for Biomedicine
MICCAI 2024
Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding
ICCV 2023
Scene-robust Natural Language Video Localization via Learning Domain-invariant Representations
ACL 2023
3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding
EMNLP 2023
Connecting Multi-modal Contrastive Representations
NIPS 2023
A Speaker-Aware Co-Attention Framework for Medical Dialogue Information Extraction
EMNLP 2022
Towards Effective Multi-Modal Interchanges in Zero-Resource Sounding Object Localization
NIPS 2022
A Novel Sequence-to-Subgraph Framework for Diagnosis Classification
IJCAI 2021
Towards Interpretable Clinical Diagnosis with Bayesian Network Ensembles Stacked on Entity-Aware CNNs
ACL 2020
The Graph-based Mutual Attentive Network for Automatic Diagnosis
IJCAI 2020
Generative Adversarial Regularized Mutual Information Policy Gradient Framework for Automatic Diagnosis
AAAI 2020