Qing Li

140 papers · 2004–2026 · 15 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (28) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌈 Renaissance Researcher (5) 🐣 Hot Topic Early Bird

🏠 Conference Loyalist (20) 🏆 Keyword Champion (2) 👑 Triple Crown 🏆 Grand Slam 🔬 Deep Specialist (22) 🧬 Topic Evolution 🤝 Dynamic Duo (17) 🚀 Conference Pioneer ⚡ Prolific Year (26) 🔥 Unstoppable (9) 🗃️ Keyword Collector (63) 💎 Century Club (133) 📈 Trend Setter ❓ The Questioner

Conferences

AAAI (25) ACL (20) CVPR (14) NIPS (13) ICCV (12) IJCAI (11) EMNLP (10) COLING (7) ICLR (7) ECCV (6) ICML (5) MICCAI (4) NAACL (3) IJCNLP (2) INTERSPEECH (1)

Top co-authors

Song-Chun Zhu (18) Siyuan Huang (16) Yi Cai (15) Xiaojian Ma (14) Yong Jiang (11) Haoran Xie (9) Jiahui Geng (7) Yixin Chen (7) Qingbao Huang (6) Derui Zhu (6)

Research topics

Privacy (1) Probability (1)

Keywords

large language model (12) multimodal learning (9) visual question answering (7) graph neural network (6) point cloud (6) diffusion model (6) domain adaptation (5) vision language model (5) vision-language model (5) unsupervised learning (5) adversarial robustness (5) neural network (4) contrastive learning (4) adversarial learning (4) attention mechanism (4) knowledge distillation (4) normal estimation (4) visual grounding (3) text classification (3) instruction following (3)

Papers

CPOStream: Collaborating Prediction and Observation for Flicker-Free Streamable Free-Viewpoint Video with 3DGS AAAI 2026
CoQuIR: A Comprehensive Benchmark for Code Quality-Aware Information Retrieval ACL 2026
ContrastKV: Robust KV Cache Eviction via Contrastive Signal Fusion for Multi-Query Generalization ACL 2026
RatioSketch: Towards More Accurate Frequency Estimation in Data Streams via a Lightweight Neural Network AAAI 2026
AIM: Manifold-based Data Filtering for Representation Finetuning AAAI 2026
TongUI: Internet-Scale Trajectories from Multimodal Web Tutorials for Generalized GUI Agents AAAI 2026
Suit the Remedy to the Retriever: Interpretable Query Optimization with Retriever Preference Alignment for Vision-Language Retrieval AAAI 2026
A Survey on Multi-View Knowledge Graph: Generation, Fusion, Applications and Future Directions IJCAI 2025
Revolutionizing Encrypted Traffic Classification with MH-Net: A Multi-View Heterogeneous Graph Model AAAI 2025
FIRM: Flexible Interactive Reflection ReMoval AAAI 2025
EvdCLIP: Improving Vision-Language Retrieval with Entity Visual Descriptions from Large Language Models AAAI 2025
Efficient Robustness Evaluation via Constraint Relaxation AAAI 2025
CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization AAAI 2025
Explicitly Guided Difficulty-Controllable Visual Question Generation AAAI 2025
Internal Activation Revision: Safeguarding Vision Language Models Without Parameter Update AAAI 2025
HD-NDEs: Neural Differential Equations for Hallucination Detection in LLMs ACL 2025
Explicit and Implicit Data Augmentation for Social Event Detection ACL 2025
Marco-Bench-MIF: On Multilingual Instruction-Following Capability of Large Language ACL 2025
VSCBench: Bridging the Gap in Vision-Language Model Safety Calibration ACL 2025
QueryAttack: Jailbreaking Aligned Large Language Models Using Structured Non-natural Query Language ACL 2025
LLaMA-E: Empowering E-commerce Authoring with Object-Interleaved Instruction Following COLING 2025
Fine-Grained Features-based Code Search for Precise Query-Code Matching COLING 2025
Bridging the Vision-Brain Gap with an Uncertainty-Aware Blur Prior CVPR 2025
SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs CVPR 2025
METASCENES: Towards Automated Replica Creation for Real-world 3D Scans CVPR 2025
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis CVPR 2025
SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design EMNLP 2025
Reasoning under Uncertainty: Efficient LLM Inference via Unsupervised Confidence Dilution and Convergent Adaptive Sampling EMNLP 2025
SpecCoT: Accelerating Chain-of-Thought Reasoning through Speculative Exploration EMNLP 2025
SAUCE: Selective Concept Unlearning in Vision-Language Models with Sparse Autoencoders ICCV 2025
Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding ICCV 2025
Mitigating Catastrophic Overfitting in Fast Adversarial Training via Label Information Elimination ICCV 2025
UIPro: Unleashing Superior Interaction Capability For GUI Agents ICCV 2025
Learning Normals of Noisy Points by Local Gradient-Aware Surface Filtering ICCV 2025
Move to Understand a 3D Scene: Bridging Visual Grounding and Exploration for Efficient and Versatile Embodied Navigation ICCV 2025
Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage ICLR 2025
ESE: Espresso Sentence Embeddings ICLR 2025
MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge ICLR 2025
Falcon: Fast Visuomotor Policies via Partial Denoising ICML 2025
Collaborative Multi-LoRA Experts with Achievement-based Multi-Tasks Loss for Unified Multimodal Information Extraction IJCAI 2025
Tree-of-AdEditor: Heuristic Tree Reasoning for Automated Video Advertisement Editing with Large Language Model IJCAI 2025
From Generalist to Specialist: Distilling a Mixture of Foundation Models for Domain-specific Medical Image Segmentation MICCAI 2025
Multimodal Imputation of Imaging-derived Phenotypes from Genomic and Blood-based Biomarkers Enhances Common Disease Discovery MICCAI 2025
Spatio-temporal Pre-trained Foundation Model for Neural Decoding with Fine-grained Optimization MICCAI 2025
Advancing the Robustness of Large Language Models through Self-Denoised Smoothing NAACL 2024
FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models NIPS 2024
Dissect Black Box: Interpreting for Rule-Based Explanations in Unsupervised Anomaly Detection NIPS 2024
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents NIPS 2024
HoloVIC: Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative CVPR 2024
CLOVA: A Closed-LOop Visual Assistant with Tool Usage and Update CVPR 2024
Cross Initialization for Face Personalization of Text-to-Image Models CVPR 2024
AttnDreamBooth: Towards Text-Aligned Personalized Text-to-Image Generation NIPS 2024
Entity Alignment with Noisy Annotations from Large Language Models NIPS 2024
Bongard-OpenWorld: Few-Shot Reasoning for Free-form Visual Concepts in the Real World ICLR 2024
SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding ECCV 2024
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding ECCV 2024
Unifying 3D Vision-Language Understanding via Promptable Queries ECCV 2024
Neural-Symbolic Recursive Machine for Systematic Generalization ICLR 2024
An Empirical Study on the Fairness of Foundation Models for Multi-Organ Image Segmentation MICCAI 2024
Reference-free Hallucination Detection for Large Vision-Language Models EMNLP 2024
End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations ICML 2024
An Embodied Generalist Agent in 3D World ICML 2024
UltraEdit: Instruction-based Fine-Grained Image Editing at Scale NIPS 2024
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics NAACL 2024
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models ACL 2024
Releasing the Capacity of GANs in Non-Autoregressive Image Captioning COLING 2024
Adversarial Initialization with Universal Adversarial Perturbation: A New Approach to Fast Adversarial Training AAAI 2024
Automated Defect Report Generation for Enhanced Industrial Quality Control AAAI 2024
One-Step Forward and Backtrack: Overcoming Zig-Zagging in Loss-Aware Quantization Training AAAI 2024
Compositional Inversion for Stable Diffusion Models AAAI 2024
3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment ICCV 2023
Metis: Understanding and Enhancing In-Network Regular Expressions NIPS 2023
Recurrent Attention Networks for Long-text Modeling ACL 2023
Rethinking Multimodal Entity and Relation Extraction from a Translation Point of View ACL 2023
Joint Multimodal Entity-Relation Extraction Based on Edge-Enhanced Graph Alignment Network and Word-Pair Relation Tagging AAAI 2023
NeuralGF: Unsupervised Point Normal Estimation by Learning Neural Gradient Function NIPS 2023
SlotGAT: Slot-based Message Passing for Heterogeneous Graphs ICML 2023
Interpreting Unsupervised Anomaly Detection in Security via Rule Extraction NIPS 2023
SQA3D: Situated Question Answering in 3D Scenes ICLR 2023
SHS-Net: Learning Signed Hyper Surfaces for Oriented Normal Estimation of Point Clouds CVPR 2023
SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models NIPS 2023
Learning non-Markovian Decision-Making from State-only Sequences NIPS 2023
A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics ICLR 2023
AMR-TST: Abstract Meaning Representation-based Text Style Transfer ACL 2023
Generative Diffusion Models on Graphs: Methods and Applications IJCAI 2023
Subsequence-based Graph Routing Network for Capturing Multiple Risk Propagation Processes IJCAI 2022
HSurf-Net: Normal Estimation for 3D Point Clouds by Learning Hyper Surfaces NIPS 2022
Fairness Reprogramming NIPS 2022
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing ACL 2022
Exploring Non-Autoregressive Text Style Transfer EMNLP 2021
Merging Statistical Feature via Adaptive Gate for Improved Text Classification AAAI 2021
Story Ending Generation with Multi-Level Graph Convolutional Networks over Dependency Trees AAAI 2021
Collaborative Learning of Bidirectional Decoders for Unsupervised Text Style Transfer EMNLP 2021
Modeling the Momentum Spillover Effect for Stock Prediction via Attribute-Driven Graph Attention Networks AAAI 2021
Learning by Fixing: Solving Math Word Problems with Weak Supervision AAAI 2021
SMART: A Situation Model for Algebra Story Problems via Attributed Grammar AAAI 2021
Entity Guided Question Generation with Contextual Structure and Sequence Information Capturing AAAI 2021
YouRefIt: Embodied Reference Understanding With Language and Gesture ICCV 2021
IgSEG: Image-guided Story Ending Generation ACL 2021
Incorporating Global Information in Local Attention for Knowledge Representation Learning ACL 2021
VLGrammar: Grounded Grammar Induction of Vision and Language ICCV 2021
Tracklet Proposal Network for Multi-Object Tracking on Point Clouds IJCAI 2021
A Comparative Survey: Benchmarking for Pool-based Active Learning IJCAI 2021
Incorporating Global Information in Local Attention for Knowledge Representation Learning IJCNLP 2021
IgSEG: Image-guided Story Ending Generation IJCNLP 2021
An Entity-Aware Adversarial Domain Adaptation Network for Cross-Domain Named Entity Recognition (Student Abstract) AAAI 2021
Bridging Cross-Tasks Gap for Cognitive Assessment via Fine-Grained Domain Adaptation IJCAI 2020
An Investigation of Few-Shot Learning in Spoken Term Classification INTERSPEECH 2020
Task-oriented Domain-specific Meta-Embedding for Text Classification EMNLP 2020
Conditional Causal Relationships between Emotions and Causes in Texts EMNLP 2020
Suppressing Mislabeled Data via Grouping and Self-Attention ECCV 2020
A Competence-aware Curriculum for Visual Concepts Learning via Question Answering ECCV 2020
GaitPart: Temporal Part-Based Model for Gait Recognition CVPR 2020
Point2Node: Correlation Learning of Dynamic-Node for Point Cloud Feature Modeling AAAI 2020
Controllable Abstractive Sentence Summarization with Guiding Entities COLING 2020
A Two-phase Prototypical Network Model for Incremental Few-shot Relation Classification COLING 2020
A Unified Sequence Labeling Model for Emotion Cause Pair Extraction COLING 2020
Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning ICML 2020
Neural Mixed Counting Models for Dispersed Topic Discovery ACL 2020
Aligned Dual Channel Graph Convolutional Network for Visual Question Answering ACL 2020
VizWiz-Priv: A Dataset for Recognizing the Presence and Purpose of Private Visual Information in Images Taken by Blind People CVPR 2019
Deep Adversarial Social Recommendation IJCAI 2019
LO-Net: Deep Real-Time Lidar Odometry CVPR 2019
Why Does a Visual Question Have Different Answers? ICCV 2019
Unpaired Multi-Domain Image Generation via Regularized Conditional GANs IJCAI 2018
VizWiz Grand Challenge: Answering Visual Questions From Blind People CVPR 2018
Unsupervised Cross-Dataset Person Re-Identification by Transfer Learning of Spatial-Temporal Patterns CVPR 2018
VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions ECCV 2018
Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions EMNLP 2018
A Robust Noise Resistant Algorithm for POI Identification from Flickr Data IJCAI 2017
Least Squares Generative Adversarial Networks ICCV 2017
Locally-Transferred Fisher Vectors for Texture Classification ICCV 2017
A Network Framework for Noisy Label Aggregation in Social Media ACL 2017
Fusing Subcategory Probabilities for Texture Classification CVPR 2015
Exploiting Social Relations and Sentiment for Stock Prediction EMNLP 2014
Exploiting Topic based Twitter Sentiment for Stock Prediction ACL 2013
Recommendation in Internet Forums and Blogs ACL 2010
Concept Unification of Terms in Different Languages for IR ACL 2006
Concept Unification of Terms in Different Languages for IR COLING 2006
Converting Text into Agent Animations: Assigning Gestures to Text NAACL 2004