Haoyuan Li

26 papers · 2016–2026 · 11 top CS/AI conferences

Achievements

🏃 Academic Marathon (9) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (11) 🐝 Cross-Pollinator (13)

🗺️ Taxonomy Completionist (52) 👥 Mega-Team (30) 🏆 Grand Slam 🤝 Dynamic Duo (11) 🧬 Topic Evolution 💎 Century Club (24) 🔥 Unstoppable (6) 🗃️ Keyword Collector (120) ⚡ Prolific Year (15)

Conferences

ACL (6) AAAI (5) ICLR (3) NAACL (3) CVPR (2) ICCV (2) EMNLP (1) ICML (1) IJCAI (1) NIPS (1) NSDI (1)

Top co-authors

Hao Jiang (11) Wanggui He (8) Zhelun Yu (7) Fangxun Shu (5) Snigdha Chaturvedi (4) Yueting Zhuang (4) Wenqiao Zhang (4) Siliang Tang (4) Ziwei Huang (3) Guanghao Zhang (3)

Keywords

unsupervised learning (4) multi-modal learning (3) multimodal learning (3) image generation (2) visual grounding (2) domain adaptation (2) multi-document summarization (2) text summarization (2) vision-language model (2) multimodal large language model (2) extractive summarization (2) opinion summarization (2) zero-shot learning (2) visual question answering (2) large language model (2) video generation (1) embedding learning (1) object detection (1) transfer learning (1) image retrieval (1)

Papers

CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation AAAI 2026
MAU-GPT: Enhancing Multi-type Industrial Anomaly Understanding via Anomaly-aware and Generalist Experts Adaptation AAAI 2026
Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback AAAI 2025
HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation ICML 2025
Coverage-based Fairness in Multi-document Summarization NAACL 2025
Streaming Video Question-Answering with In-context Video KV-Cache Retrieval ICLR 2025
CorrDetail: Visual Detail Enhanced Self-Correction for Face Forgery Detection IJCAI 2025
TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition ACL 2025
T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts ACL 2025
Improving Fairness of Large Language Models in Multi-document Summarization ACL 2025
Align2LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation ACL 2025
MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis AAAI 2025
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions CVPR 2025
Boundary Matters: Leveraging Structured Text Plots for Long Text Outline Generation EMNLP 2025
Anomaly Detection of Integrated Circuits Package Substrates Using the Large Vision Model SAIC: Dataset Construction, Methodology, and Application ICCV 2025
UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting ICLR 2025
LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation ICLR 2025
T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text ACL 2024
Rationale-based Opinion Summarization NAACL 2024
Coordinate Transformer: Achieving Single-stage Multi-person Mesh Recovery from Videos ICCV 2023
Aspect-aware Unsupervised Extractive Opinion Summarization ACL 2023
DATE: Domain Adaptive Product Seeker for E-Commerce CVPR 2023
Towards Effective Multi-Modal Interchanges in Zero-Resource Sounding Object Localization NIPS 2022
Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation NAACL 2021
Urban2Vec: Incorporating Street View Imagery and POIs for Multi-Modal Urban Neighborhood Embedding AAAI 2020
FairRide: Near-Optimal, Fair Cache Sharing NSDI 2016