Yuhui Zhang

27 papers · 2019–2026 · 10 conferences · across top CS/AI conferences

Achievements

🏃 Academic Marathon (6) 🌍 Conference Polyglot (9) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (13)

🌈 Renaissance Researcher (8) 🗺️ Taxonomy Completionist (62) 👥 Mega-Team (23) 🏆 Keyword Champion (2) 🔬 Deep Specialist (10) 🧬 Topic Evolution 🤝 Dynamic Duo (11) 🗃️ Keyword Collector (121) 💎 Century Club (26) 🔥 Unstoppable (5) ❓ The Questioner (3) ⚡ Prolific Year (7)

Conferences

ACL (6) CVPR (4) EMNLP (4) NIPS (4) ICLR (3) IJCAI (2) EACL (1) ECCV (1) ICML (1) INTERSPEECH (1)

Top co-authors

Serena Yeung-Levy (12) James Burgess (7) Alejandro Lozano (7) Xiaohan Wang (6) Yuchang Su (5) Serena Yeung (4) Ludwig Schmidt (3) Emma Lundberg (3) Yiming Liu (3) Lisa Dunlap (2)

Keywords

vision-language model (4) vision language model (4) benchmark evaluation (4) visual question answering (3) question answering (3) large language model (3) self-supervised learning (2) catastrophic forgetting (2) zero-shot classification (2) biomedical imaging (2) language model (2) contrastive learning (2) multimodal learning (2) negation understanding (2) multi-modal learning (2) multimodal large language model (2) chain-of-thought reasoning (2) named entity recognition (2) lexical semantics (1) zero-shot learning (1)

Papers

PaperSearchQA: Learning to Search and Reason over Scientific Papers with RLVR EACL 2026
Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation CVPR 2025
NegVQA: Can Vision Language Models Understand Negation? ACL 2025
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research CVPR 2025
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature CVPR 2025
MAKAR: a Multi-Agent framework based Knowledge-Augmented Reasoning for Grounded Multimodal Named Entity Recognition EMNLP 2025
EquiBench: Benchmarking Large Language Models’ Reasoning about Program Semantics via Equivalence Checking EMNLP 2025
Data or Language Supervision: What Makes CLIP Better than DINO? EMNLP 2025
Video Action Differencing ICLR 2025
CellFlux: Simulating Cellular Morphology Changes via Flow Matching ICML 2025
AttentionDrag: Exploiting Latent Correlation Knowledge in Pre-trained Diffusion Models for Image Editing IJCAI 2025
Describing Differences in Image Sets with Natural Language CVPR 2024
Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data ICLR 2024
VideoAgent: Long-form Video Understanding with Large Language Model as Agent ECCV 2024
Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation EMNLP 2024
Micro-Bench: A Microscopy Benchmark for Vision-Language Understanding NIPS 2024
Why are Visually-Grounded Language Models Bad at Image Classification? NIPS 2024
MuEP: A Multimodal Benchmark for Embodied Planning with Foundation Models IJCAI 2024
MoCa: Measuring Human-Language Model Alignment on Causal and Moral Judgment Tasks NIPS 2023
Diagnosing and Rectifying Vision Models using Language ICLR 2023
Beyond Positive Scaling: How Negation Impacts Scaling Trends of Language Models ACL 2023
Deep Self-Supervised Learning of Speech Denoising from Noisy Speeches INTERSPEECH 2022
Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning NIPS 2022
Inducing Grammar from Long Short-Term Memory Networks by Shapley Decomposition ACL 2020
Enhancing Transformer with Sememe Knowledge ACL 2020
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages ACL 2020
Jiuge: A Human-Machine Collaborative Chinese Classical Poetry Generation System ACL 2019