Qingpei Guo
20 papers · 2021–2026 · 8 conferences · across top CS/AI conferences
Achievements
Cross-Pollinator (13)
Interdisciplinary Bridge
Keyword Pioneer
Conference Polyglot (8)
Academic Marathon (5)
Renaissance Researcher (7)
Dynamic Duo (10)
Unstoppable (5)
Century Club (17)
Keyword Collector (97)
Prolific Year (6)
Conferences
CVPR (6)
AAAI (4)
ICCV (3)
ACL (2)
NIPS (2)
ECCV (1)
ICML (1)
IJCAI (1)
Keywords
multimodal large language model (6)
multimodal learning (3)
image retrieval (2)
multi-modal large language model (2)
representation learning (2)
video understanding (2)
instruction tuning (2)
diffusion model (2)
image generation (2)
large language model (2)
zero-shot learning (1)
curriculum learning (1)
embedding learning (1)
video generation (1)
reinforcement learning (1)
active learning (1)
visual perception (1)
contrastive learning (1)
autoregressive generation (1)
self-supervised learning (1)
Papers
VaccineRAG: Boosting Multimodal Large Language Models' Immunity to Harmful RAG Samples
AAAI 2026
EvoMoE: Expert Evolution in Mixture of Experts for Multimodal Large Language Models
AAAI 2026
SCAN: Self-Calibrated AutoregressioN for High-Quality Visual Generation
AAAI 2026
DynFocus: Dynamic Cooperative Network Empowers LLMs with Video Understanding
CVPR 2025
Attributive Reasoning for Hallucination Diagnosis of Large Language Models
AAAI 2025
VQAGuider: Guiding Multimodal Large Language Models to Answer Complex Video Questions
ACL 2025
SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
CVPR 2025
Social Debiasing for Fair Multi-modal LLMs
ICCV 2025
Engage for All: Making Ordinary Image Descriptions Appealing Again!
ICCV 2025
Unified Video Generation via Next-Set Prediction in Continuous Domain
ICCV 2025
EVE: Efficient Zero-Shot Text-Based Video Editing With Depth Map Guidance and Temporal Consistency Constraints
IJCAI 2024
LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
NIPS 2024
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment
ICML 2024
Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs
CVPR 2024
Referencing Where to Focus: Improving Visual Grounding with Referential Query
NIPS 2024
HOTVCOM: Generating Buzzworthy Comments for Videos
ACL 2024
Boundary-Aware Backward-Compatible Representation via Adversarial Learning in Image Retrieval
CVPR 2023
CNVid-3.5M: Build, Filter, and Pre-Train the Large-Scale Public Chinese Video-Text Dataset
CVPR 2023
Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input
ECCV 2022
LPSNet: A Lightweight Solution for Fast Panoptic Segmentation
CVPR 2021