Xinyu Li

42 papers · 2018–2026 · 15 top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (13) 🧭 Keyword Pioneer 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (15) 🌱 Topic Pioneer 🧬 Topic Evolution 🗃️ Keyword Collector (182) ⚡ Prolific Year (5) 🚀 Conference Pioneer 📈 Trend Setter 💎 Century Club (38) 🔥 Unstoppable (9) ❓ The Questioner

Conferences

CVPR (7) WACV (6) ICCV (5) AAAI (4) ACL (4) MICCAI (3) NIPS (3) ECCV (2) INTERSPEECH (2) ACML (1) COLING (1) ICML (1) IJCAI (1) MLHC (1) NAACL (1)

Top co-authors

Joseph Tighe (9) Hao Chen (8) Bing Shuai (7) Ivan Marsic (6) Jue Wang (6) David Fan (6) Davide Modolo (5) Chunhui Liu (5) Zhikang Zhang (5) Vimal Bhat (5)

Keywords

video understanding (8) action recognition (4) object detection (4) transformer architecture (3) cross-modal alignment (3) contrastive learning (3) self-supervised learning (3) video classification (3) causal inference (3) multimodal learning (3) semantic embedding (2) domain adaptation (2) activity recognition (2) feature extraction (2) heterogeneous treatment effect (2) temporal modeling (2) visual question answering (2) attention mechanism (2) spatio-temporal modeling (2) unmeasured confounding (2)

Papers

When AI Meets AI: A Game-Theoretic Defense Framework Against AI Empowered Cyber Threats · AAAI 2026
Looking Beyond the One: Operationalizing and Eliciting Visual Ambiguity in VLLMs · ACL 2026
Is the Attention Matrix Really the Key to Self-Attention in Multivariate Long-Term Time Series Forecasting? · ACL 2026
Learning Compact Video Representations for Efficient Long-form Video Understanding in Large Multimodal Models · WACV 2026
NeuroBridge: Bio-Inspired Self-Supervised EEG-to-Image Decoding via Cognitive Priors and Bidirectional Semantic Alignment · AAAI 2026
Now You See Me: Context-Aware Automatic Audio Description · WACV 2025
SemiVisBooster: Boosting Semi-Supervised Learning for Fine-Grained Classification through Pseudo-Label Semantic Guidance · ICCV 2025
DZAD: Diffusion-based Zero-shot Anomaly Detection · AAAI 2025
Tetra-orientated Mamba with T2-FLAIR Mismatch Features for Glioma Segmentation, IDH Genotyping, and Grading · MICCAI 2025
GEXIA: Granularity Expansion and Iterative Approximation for Scalable Multi-Grained Video-Language Learning · WACV 2025
GarmentDiffusion: 3D Garment Sewing Pattern Generation with Multimodal Diffusion Transformers · IJCAI 2025
DetectDiffuse: Aggregation- and Attention-driven Universal Lesion Detection with Multi-scale Diffusion Model · MICCAI 2025
Unraveling and Mitigating Retriever Inconsistencies in Retrieval-Augmented Large Language Models · ACL 2024
Adapting Animal Models to Assess Sufficiency of Fluid Resuscitation in Humans (Student Abstract) · AAAI 2024
Text-Guided Video Masked Autoencoder · ECCV 2024
SkinCON: Towards consensus for the uncertainty of skin cancer sub-typing through distribution regularized adaptive predictive sets (DRAPS) · MICCAI 2024
Video Token Merging for Long Video Understanding · NIPS 2024
Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token Embeddings to Finite Discrete Tokens · CVPR 2023
Discrete Cosin TransFormer: Image Modeling From Frequency Domain · WACV 2023
MEGA: Multimodal Alignment Aggregation and Distillation For Cinematic Video Segmentation · ICCV 2023
Motion-Guided Masking for Spatiotemporal Representation Learning · ICCV 2023
Difference-in-Differences Meets Tree-based Methods: Heterogeneous Treatment Effects Estimation with Unmeasured Confounding · ICML 2023
Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models · CVPR 2022
NUTA: Non-Uniform Temporal Aggregation for Action Recognition · WACV 2022
Id-Free Person Similarity Learning · CVPR 2022
TubeR: Tubelet Transformer for Video Action Detection · CVPR 2022
What To Look at and Where: Semantic and Spatial Refined Transformer for Detecting Human-Object Interactions · CVPR 2022
Debiased Causal Tree: Heterogeneous Treatment Effects Estimation with Unmeasured Confounding · NIPS 2022
Do Deep Neural Nets Display Human-like Attention in Short Answer Scoring? · NAACL 2022
Robust Direct Learning for Causal Data Fusion · ACML 2022
SSCAP: Self-Supervised Co-Occurrence Action Parsing for Unsupervised Temporal Action Segmentation · WACV 2022
Long Short-Term Transformer for Online Action Detection · NIPS 2021
SiamMOT: Siamese Multi-Object Tracking · CVPR 2021
Multi-Label Activity Recognition Using Activity-Specific Features and Activity Correlations · CVPR 2021
Selective Feature Compression for Efficient Activity Recognition Inference · ICCV 2021
VidTr: Video Transformer Without Convolutions · ICCV 2021
Directional Temporal Modeling for Action Recognition · ECCV 2020
Dynamically Personalized Detection of Hemorrhage · MLHC 2019
Multi-Stream Network with Temporal Attention for Environmental Sound Classification · INTERSPEECH 2019
Speech Audio Super-Resolution for Speech Recognition · INTERSPEECH 2019
Hybrid Attention based Multimodal Network for Spoken Language Classification · COLING 2018
Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment · ACL 2018