Xin Eric Wang
36 papers · 2020–2026 · 14 top CS/AI conferences
Achievements
Conference Polyglot (13)
Academic Marathon (6)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (7)
Renaissance Researcher (7)
Taxonomy Completionist (60)
Deep Specialist (11)
Dynamic Duo (11)
Triple Crown
Keyword Champion (3)
Grand Slam
Prolific Year (8)
Century Club (35)
Unstoppable (5)
The Questioner
Keyword Collector (142)
Conferences
EMNLP (7)
ECCV (6)
ICLR (6)
NeurIPS (3)
ACL (2)
CVPR (2)
ICML (2)
WACV (2)
AAAI (1)
AACL (1)
EACL (1)
ICCV (1)
IJCNLP (1)
NAACL (1)
Keywords
large language model (5)
multimodal learning (4)
multimodal large language model (4)
large reasoning model (3)
adversarial attack (3)
model evaluation (2)
attention mechanism (2)
prompt injection (2)
benchmark evaluation (2)
embodied agent (2)
ai safety (2)
visual grounding (2)
vision language model (2)
safety assessment (2)
in-context learning (2)
video understanding (2)
knowledge distillation (1)
text-to-image synthesis (1)
few-shot learning (1)
medical imaging (1)
Papers
Interleaved Vision-and-Language Generation via Generative Voken
WACV 2026
What's Missing in Vision-Language Models? Probing Their Struggles with Causal Order Reasoning
EACL 2026
Hidden in Plain Sight: Reasoning in Underspecified and Misspecified Scenarios for Multimodal LLMs
EMNLP 2025
The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1
AACL 2025
Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models
ACL 2025
Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA
ACL 2025
SafeKey: Amplifying Aha-Moment Insights for Safety Reasoning
EMNLP 2025
GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration
EMNLP 2025
Dynamic Evaluation for Oversensitivity in LLMs
EMNLP 2025
VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
ICCV 2025
Agent S: An Open Agentic Framework that Uses Computers Like a Human
ICLR 2025
EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing
ICLR 2025
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos
ICLR 2025
Multimodal Situational Safety
ICLR 2025
The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1
IJCNLP 2025
LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models
NAACL 2025
Multimodal Procedural Planning via Dual Text-Image Prompting
EMNLP 2024
Active Listening: Personalized Question Generation in Open-Domain Social Conversation with User Model Based Prompting
EMNLP 2024
NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models
ECCV 2024
SwapAnything: Enabling Arbitrary Object Swapping in Personalized Image Editing
ECCV 2024
Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding
EMNLP 2024
CUDA-GHR: Controllable Unsupervised Domain Adaptation for Gaze and Head Redirection
WACV 2023
Parameter-Efficient Model Adaptation for Vision Transformers
AAAI 2023
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
ICLR 2023
Neuro-Symbolic Procedural Planning with Commonsense Prompting
ICLR 2023
ESC: Exploration with Soft Commonsense Constraints for Zero-shot Object Navigation
ICML 2023
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
NeurIPS 2023
PHOTOSWAP: Personalized Subject Swapping in Images
NeurIPS 2023
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
NeurIPS 2023
FedVLN: Privacy-Preserving Federated Vision-and-Language Navigation
ECCV 2022
M3L: Language-Based Video Editing via Multi-Modal Multi-Level Transformers
CVPR 2022
Compositional Temporal Grounding With Structured Variational Cross-Graph Correspondence Learning
CVPR 2022
Understanding Instance-Level Impact of Fairness Constraints
ICML 2022
Language-Driven Artistic Style Transfer
ECCV 2022
Counterfactual Vision-and-Language Navigation via Adversarial Path Sampler
ECCV 2020
Environment-agnostic Multitask Learning for Natural Language Grounded Navigation
ECCV 2020