Haotian Liu

27 papers · 2019–2025 · 11 top CS/AI conferences

Achievements

🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (7) 🌍 Conference Polyglot (11) 🏃 Academic Marathon (6) 🗺️ Taxonomy Completionist (44) 🤝 Dynamic Duo (12) ⚡ Prolific Year (6) 💎 Century Club (27) 🗃️ Keyword Collector (100)

Conferences

CVPR (7) NIPS (6) ECCV (3) ICCV (3) ICLR (2) AAAI (1) ACL (1) CORL (1) EMNLP (1) IJCAI (1) WACV (1)

Top co-authors

Yong Jae Lee (12) Chunyuan Li (8) Jianfeng Gao (5) Yuheng Li (5) Jianwei Yang (5) Mu Cai (4) Hongteng Xu (3) Yu Zhou (2) Thao Nguyen (2) Xurui Li (2)

Keywords

vision-language model (5) visual question answering (4) multimodal learning (4) large language model (3) large multimodal model (3) few-shot learning (3) multimodal large language model (2) zero-shot learning (2) industrial anomaly (2) transfer learning (2) visual instruction tuning (2) anomaly detection (2) industrial anomaly detection (2) instruction following (2) image generation (2) object detection (2) curriculum learning (1) pose estimation (1) domain adaptation (1) vision transformer (1)

Papers

GPS: A Probabilistic Distributional Similarity with Gumbel Priors for Set-to-Set Matching · ICLR 2025
SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning · ICCV 2025
Fantastic Copyrighted Beasts and How (Not) to Generate Them · ICLR 2025
AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios · CVPR 2025
IMAGINATION POLICY: Using Generative Point Cloud Models for Learning Manipulation Policies · CORL 2024
Aligning Large Multimodal Models with Factually Augmented RLHF · ACL 2024
Computer Vision on the Edge: Individual Cattle Identification in Real-Time With ReadMyCow System · WACV 2024
Yo'LLaVA: Your Personalized Language and Vision Assistant · NIPS 2024
Bridging The Gap between Low-rank and Orthogonal Adaptation via Householder Reflection Adaptation · NIPS 2024
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs · NIPS 2024
ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts · CVPR 2024
Edit One for All: Interactive Batch Image Editing · CVPR 2024
Improved Baselines with Visual Instruction Tuning · CVPR 2024
Generalizable Face Landmarking Guided by Conditional Face Warping · CVPR 2024
Removing Distributional Discrepancies in Captions Improves Image-Text Alignment · ECCV 2024
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents · ECCV 2024
UNICORN: A Unified Causal Video-Oriented Language-Modeling Framework for Temporal Video-Language Tasks · EMNLP 2024
Inferring Iterated Function Systems Approximately from Fractal Images · IJCAI 2024
Visual Instruction Tuning · NIPS 2023
GLIGEN: Open-Set Grounded Text-to-Image Generation · CVPR 2023
Data-Efficient Image Quality Assessment with Attention-Panel Decoder · AAAI 2023
Learning Customized Visual Models With Retrieval-Augmented Knowledge · CVPR 2023
LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day · NIPS 2023
TMA: Temporal Motion Aggregation for Event-based Optical Flow · ICCV 2023
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models · NIPS 2022
Masked Discrimination for Self-Supervised Learning on Point Clouds · ECCV 2022
Identity From Here, Pose From There: Self-Supervised Disentanglement and Generation of Objects Using Unlabeled Videos · ICCV 2019