Xinxiao Wu
20 papers · 2013–2026 · 6 top CS/AI conferences
Achievements
Interdisciplinary Bridge
Conference Polyglot (6)
Academic Marathon (12)
Renaissance Researcher (5)
Taxonomy Completionist (40)
Keyword Pioneer
Hot Topic Early Bird
Topic Evolution
Century Club (18)
Conference Pioneer
Unstoppable (8)
The Questioner
Keyword Collector (97)
Conferences
AAAI (10)
IJCAI (4)
ICCV (3)
CVPR (1)
NIPS (1)
WACV (1)
Keywords
video understanding (4)
domain adaptation (4)
video captioning (3)
multimodal learning (3)
image captioning (3)
unsupervised learning (2)
causal inference (2)
graph neural network (2)
prompt tuning (2)
visual relationship detection (2)
transfer learning (2)
video recognition (1)
action recognition (1)
open-vocabulary detection (1)
object detection (1)
few-shot learning (1)
style transfer (1)
adversarial learning (1)
data augmentation (1)
knowledge distillation (1)
Papers
TongUI: Internet-Scale Trajectories from Multimodal Web Tutorials for Generalized GUI Agents
AAAI 2026
What to Trust? A Trust-aware Knowledge-guided Method for Zero-shot Object State Understanding in Videos
AAAI 2026
LLM-enhanced Action-aware Multi-modal Prompt Tuning for Image-Text Matching
ICCV 2025
Video Summarization Using Denoising Diffusion Probabilistic Model
AAAI 2025
METOR: A Unified Framework for Mutual Enhancement of Objects and Relationships in Open-vocabulary Video Visual Relationship Detection
IJCAI 2025
DiffCLIP: Leveraging Stable Diffusion for Language Grounded 3D Classification
WACV 2024
Relational Distant Supervision for Image Captioning without Image-Text Pairs
AAAI 2024
Multi-Modal Prompting for Open-Vocabulary Video Visual Relationship Detection
AAAI 2024
Meta-Causal Learning for Single Domain Generalization
CVPR 2023
Teaching What You Should Teach: A Data-Based Distillation Method
IJCAI 2023
Adaptive Image-to-Video Scene Graph Generation via Knowledge Reasoning and Adversarial Learning
AAAI 2022
Entity-aware and Motion-aware Transformers for Language-driven Action Localization
IJCAI 2022
Multi-modal Dependency Tree for Video Captioning
NIPS 2021
Spatial-temporal Causal Inference for Partial Image-to-video Adaptation
AAAI 2021
Anticipating Future Relations via Graph Growing for Action Prediction
AAAI 2021
MemCap: Memorizing Style Knowledge for Image Captioning
AAAI 2020
Joint Commonsense and Relation Reasoning for Image and Video Captioning
AAAI 2020
Joint Syntax Representation Learning and Visual Cue Translation for Video Captioning
ICCV 2019
Exploiting Images for Video Recognition with Hierarchical Generative Adversarial Networks
IJCAI 2018
Cross-View Action Recognition over Heterogeneous Feature Spaces
ICCV 2013