Xinyu Li
42 papers · 2018–2026 · 15 top CS/AI conferences
Achievements
Taxonomy Completionist (13)
Keyword Pioneer
Renaissance Researcher (5)
Interdisciplinary Bridge
Conference Polyglot (15)
Topic Pioneer
Topic Evolution
Keyword Collector (182)
Prolific Year (5)
Conference Pioneer
Trend Setter
Century Club (38)
Unstoppable (9)
The Questioner
Conferences
CVPR (7)
WACV (6)
ICCV (5)
AAAI (4)
ACL (4)
MICCAI (3)
NIPS (3)
ECCV (2)
INTERSPEECH (2)
ACML (1)
COLING (1)
ICML (1)
IJCAI (1)
MLHC (1)
NAACL (1)
Keywords
video understanding (8)
action recognition (4)
object detection (4)
transformer architecture (3)
cross-modal alignment (3)
contrastive learning (3)
self-supervised learning (3)
video classification (3)
causal inference (3)
multimodal learning (3)
semantic embedding (2)
domain adaptation (2)
activity recognition (2)
feature extraction (2)
heterogeneous treatment effect (2)
temporal modeling (2)
visual question answering (2)
attention mechanism (2)
spatio-temporal modeling (2)
unmeasured confounding (2)
Papers
When AI Meets AI: A Game-Theoretic Defense Framework Against AI Empowered Cyber Threats
AAAI 2026
Looking Beyond the One: Operationalizing and Eliciting Visual Ambiguity in VLLMs
ACL 2026
Is the Attention Matrix Really the Key to Self-Attention in Multivariate Long-Term Time Series Forecasting?
ACL 2026
Learning Compact Video Representations for Efficient Long-form Video Understanding in Large Multimodal Models
WACV 2026
NeuroBridge: Bio-Inspired Self-Supervised EEG-to-Image Decoding via Cognitive Priors and Bidirectional Semantic Alignment
AAAI 2026
Now You See Me: Context-Aware Automatic Audio Description
WACV 2025
SemiVisBooster: Boosting Semi-Supervised Learning for Fine-Grained Classification through Pseudo-Label Semantic Guidance
ICCV 2025
DZAD: Diffusion-based Zero-shot Anomaly Detection
AAAI 2025
Tetra-orientated Mamba with T2-FLAIR Mismatch Features for Glioma Segmentation, IDH Genotyping, and Grading
MICCAI 2025
GEXIA: Granularity Expansion and Iterative Approximation for Scalable Multi-Grained Video-Language Learning
WACV 2025
GarmentDiffusion: 3D Garment Sewing Pattern Generation with Multimodal Diffusion Transformers
IJCAI 2025
DetectDiffuse: Aggregation- and Attention-driven Universal Lesion Detection with Multi-scale Diffusion Model
MICCAI 2025
Unraveling and Mitigating Retriever Inconsistencies in Retrieval-Augmented Large Language Models
ACL 2024
Adapting Animal Models to Assess Sufficiency of Fluid Resuscitation in Humans (Student Abstract)
AAAI 2024
Text-Guided Video Masked Autoencoder
ECCV 2024
SkinCON: Towards consensus for the uncertainty of skin cancer sub-typing through distribution regularized adaptive predictive sets (DRAPS)
MICCAI 2024
Video Token Merging for Long Video Understanding
NIPS 2024
Revisiting Multimodal Representation in Contrastive Learning: From Patch and Token Embeddings to Finite Discrete Tokens
CVPR 2023
Discrete Cosin TransFormer: Image Modeling From Frequency Domain
WACV 2023
MEGA: Multimodal Alignment Aggregation and Distillation For Cinematic Video Segmentation
ICCV 2023
Motion-Guided Masking for Spatiotemporal Representation Learning
ICCV 2023
Difference-in-Differences Meets Tree-based Methods: Heterogeneous Treatment Effects Estimation with Unmeasured Confounding
ICML 2023
Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models
CVPR 2022
NUTA: Non-Uniform Temporal Aggregation for Action Recognition
WACV 2022
Id-Free Person Similarity Learning
CVPR 2022
TubeR: Tubelet Transformer for Video Action Detection
CVPR 2022
What To Look at and Where: Semantic and Spatial Refined Transformer for Detecting Human-Object Interactions
CVPR 2022
Debiased Causal Tree: Heterogeneous Treatment Effects Estimation with Unmeasured Confounding
NIPS 2022
Do Deep Neural Nets Display Human-like Attention in Short Answer Scoring?
NAACL 2022
Robust Direct Learning for Causal Data Fusion
ACML 2022
SSCAP: Self-Supervised Co-Occurrence Action Parsing for Unsupervised Temporal Action Segmentation
WACV 2022
Long Short-Term Transformer for Online Action Detection
NIPS 2021
SiamMOT: Siamese Multi-Object Tracking
CVPR 2021
Multi-Label Activity Recognition Using Activity-Specific Features and Activity Correlations
CVPR 2021
Selective Feature Compression for Efficient Activity Recognition Inference
ICCV 2021
VidTr: Video Transformer Without Convolutions
ICCV 2021
Directional Temporal Modeling for Action Recognition
ECCV 2020
Dynamically Personalized Detection of Hemorrhage
MLHC 2019
Multi-Stream Network with Temporal Attention for Environmental Sound Classification
INTERSPEECH 2019
Speech Audio Super-Resolution for Speech Recognition
INTERSPEECH 2019
Hybrid Attention based Multimodal Network for Spoken Language Classification
COLING 2018
Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment
ACL 2018