Boyang Li

46 papers · 2016–2026 · 11 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (11) 🌍 Conference Polyglot (11)

🔬 Deep Specialist (10) 🏆 Keyword Champion (3) 🗃️ Keyword Collector (222) ⚡ Prolific Year (7) 🚀 Conference Pioneer 📈 Trend Setter 💎 Century Club (45) 🔥 Unstoppable (11) ❓ The Questioner (2)

Conferences

CVPR (8) EMNLP (7) AAAI (6) IJCAI (5) ACL (4) ICCV (3) NAACL (3) NIPS (3) WACV (3) COLING (2) ICML (2)

Top co-authors

Chunyan Miao (6) Junnan Li (4) Han Yu (4) Haoxin Li (4) Jun Chen (3) Mohamed Elhoseiny (3) Leonid Sigal (3) Shuai Liu (3) Junqi Zhao (3) Kai Huang (3)

Keywords

vision-language model (8) multimodal learning (7) transfer learning (6) large language model (6) zero-shot learning (6) neural network (4) few-shot learning (3) text classification (3) semantic segmentation (3) pretrained language model (3) image captioning (3) 3d object detection (3) domain adaptation (3) visual question answering (3) multimodal alignment (3) contrastive learning (2) vision transformer (2) knowledge graph (2) language model (2) video-text alignment (2)

Papers

Learning to Animate Images from A Few Videos to Portray Delicate Human Actions · WACV 2026
METP: Multi-Granularity Integration of External Covariates for Temporal Point Processes · AAAI 2026
Zero-to-Strong Generalization: Eliciting Strong Capabilities of Large Language Models Iteratively without Gold Labels · COLING 2025
Local Masked Reconstruction for Efficient Self-Supervised Learning on High-Resolution Images · WACV 2025
CAT Merging: A Training-Free Approach for Resolving Conflicts in Model Merging · ICML 2025
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data · CVPR 2025
FSHNet: Fully Sparse Hybrid Network for 3D Object Detection · CVPR 2025
CharMoral: A Character Morality Dataset for Morally Dynamic Character Analysis in Long-Form Narratives · COLING 2025
Black Swan: Abductive and Defeasible Video Reasoning in Unpredictable Events · CVPR 2025
SPHERE: Unveiling Spatial Blind Spots in Vision-Language Models Through Hierarchical Evaluation · ACL 2025
What Are We Measuring When We Evaluate Large Vision-Language Models? An Analysis of Latent Factors and Biases · NAACL 2024
FFAM: Feature Factorization Activation Map for Explanation of 3D Detectors · NIPS 2024
Distilling Autoregressive Models to Obtain High-Performance Non-autoregressive Solvers for Vehicle Routing Problems with Faster Inference Speed · AAAI 2024
Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models · CVPR 2024
Concept-skill Transferability-based Data Selection for Large Vision-Language Models · EMNLP 2024
Diversify, Rationalize, and Combine: Ensembling Multiple QA Strategies for Zero-shot Knowledge-based VQA · EMNLP 2024
A Training Data Recipe to Accelerate A* Search with Language Models · EMNLP 2024
Multilingual Synopses of Movie Narratives: A Dataset for Vision-Language Story Understanding · EMNLP 2024
Neuro-Symbolic Temporal Point Processes · ICML 2024
DCDet: Dynamic Cross-based 3D Object Detector · IJCAI 2024
Event Causality Is Key to Computational Story Understanding · NAACL 2024
OctFormer: Efficient Octree-Based Transformer for Point Cloud Compression with Local Enhancement · AAAI 2023
Is GPT-3 a Good Data Annotator? · ACL 2023
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning · NIPS 2023
Uncovering Prototypical Knowledge for Weakly Open-Vocabulary Semantic Segmentation · NIPS 2023
Monte Carlo Linear Clustering with Single-Point Supervision is Enough for Infrared Small Target Detection · ICCV 2023
Mitigating and Evaluating Static Bias of Action Representations in the Background and the Foreground · ICCV 2023
From Images to Textual Prompts: Zero-Shot Visual Question Answering With Frozen Large Language Models · CVPR 2023
Toward Knowledge-Enriched Conversational Recommendation Systems · ACL 2022
Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training · EMNLP 2022
VisualGPT: Data-Efficient Adaptation of Pretrained Language Models for Image Captioning · CVPR 2022
History-Aware Hierarchical Transformer for Multi-session Open-domain Dialogue System · EMNLP 2022
Improving the Sample Efficiency of Prompt Tuning with Domain Adaptation · EMNLP 2022
Proof of Learning (PoLe): Empowering Machine Learning with Consensus Building on Blockchains (Demo) · AAAI 2021
Latent-Optimized Adversarial Neural Transfer for Sarcasm Detection · NAACL 2021
Data-Efficient Alignment of Multimodal Sequences by Aligning Gradient Updates and Internal Feature Distributions · WACV 2021
Exploring Long Tail Visual Relationship Recognition With Large Vocabulary · ICCV 2021
HyDRA: Hypergradient Data Relevance Analysis for Interpreting Deep Neural Networks · AAAI 2021
Noise-Resistant Deep Metric Learning With Ranking-Based Instance Selection · CVPR 2021
Simultaneous Arrival Matching for New Spatial Crowdsourcing Platforms · IJCAI 2020
Understanding Actors and Evaluating Personae with Gaussian Embeddings · AAAI 2019
Real-Time Adversarial Attacks · IJCAI 2019
A Neural Multi-Sequence Alignment TeCHnique (NeuMATCH) · CVPR 2018
Predicting the Quality of Short Narratives from Social Media · IJCAI 2017
Game Engine Learning from Video · IJCAI 2017
Multiplicative Representations for Unsupervised Semantic Role Induction · ACL 2016