Xuehai He
17 papers · 2021–2026 · 12 conferences · across top CS/AI conferences
Achievements
Academic Marathon (5)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (12)
Cross-Pollinator (7)
Renaissance Researcher (6)
Taxonomy Completionist (32)
Grand Slam
Topic Evolution
The Questioner (2)
Prolific Year (5)
Keyword Collector (63)
Century Club (17)
Unstoppable (6)
Conferences
ACL (3)
ICLR (3)
IJCNLP (2)
AAAI (1)
CVPR (1)
EACL (1)
EMNLP (1)
ICCV (1)
ICML (1)
NAACL (1)
NeurIPS (1)
WACV (1)
Keywords
vision-language model (3)
question answering (2)
dialog generation (2)
multimodal learning (2)
visual question answering (2)
multi-task learning (2)
few-shot learning (2)
pathology imaging (2)
vision language model (2)
model evaluation (1)
event understanding (1)
video generation (1)
in-context learning (1)
knowledge distillation (1)
transfer learning (1)
compositional generalization (1)
medical image analysis (1)
medical imaging (1)
large multimodal model (1)
text-to-image generation (1)
Papers
Interleaved Vision-and-Language Generation via Generative Voken
WACV 2026
VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
ICCV 2025
Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA
ACL 2025
Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation
CVPR 2025
EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing
ICLR 2025
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos
ICLR 2025
ComCLIP: Training-Free Compositional Image and Text Matching
NAACL 2024
Mastering Robot Manipulation with Multimodal Prompts through Pretraining and Multi-task Fine-tuning
ICML 2024
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
ICLR 2023
Parameter-Efficient Model Adaptation for Vision Transformers
AAAI 2023
Multimodal Graph Transformer for Multimodal Question Answering
EACL 2023
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
NeurIPS 2023
CPL: Counterfactual Prompt Learning for Vision and Language Models
EMNLP 2022
Towards Visual Question Answering on Pathology Images
ACL 2021
On the Generation of Medical Dialogs for COVID-19
ACL 2021
Towards Visual Question Answering on Pathology Images
IJCNLP 2021
On the Generation of Medical Dialogs for COVID-19
IJCNLP 2021