Xiaohan Wang
44 papers · 2020–2026 · 12 top CS/AI conferences
Achievements
Conference Polyglot (12)
Keyword Pioneer
Renaissance Researcher (5)
Interdisciplinary Bridge
Academic Marathon (5)
Cross-Pollinator (11)
Taxonomy Completionist (81)
Dynamic Duo (20)
Deep Specialist (11)
Topic Evolution
Unstoppable (6)
Keyword Collector (168)
Trend Setter
Century Club (43)
The Questioner (2)
Prolific Year (9)
Conferences
CVPR (14)
ICCV (8)
AAAI (5)
ACL (4)
ICLR (4)
IJCAI (2)
NIPS (2)
ECCV (1)
EMNLP (1)
IJCNLP (1)
UAI (1)
WACV (1)
Keywords
contrastive learning (6)
vision-language model (5)
video understanding (5)
prototype learning (4)
vision language model (3)
scene understanding (3)
multimodal learning (3)
representation learning (3)
point cloud (3)
zero-shot learning (3)
multi-modal learning (2)
transformer architecture (2)
semantic segmentation (2)
cross-modal learning (2)
video recognition (2)
knowledge editing (2)
reinforcement learning (2)
action recognition (2)
transfer learning (2)
domain adaptation (2)
Papers
Modality-Balanced Collaborative Distillation for Multi-Modal Domain Generalization
AAAI 2026
Video Action Differencing
ICLR 2025
Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration
ICCV 2025
Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation
CVPR 2025
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature
CVPR 2025
Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps
ICLR 2025
Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models
WACV 2025
Targeted Learning for Variable Importance
UAI 2025
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision
ICLR 2025
Apollo: An Exploration of Video Understanding in Large Multimodal Models
CVPR 2025
A Category Agnostic Model for Visual Rearrangment
CVPR 2024
Why are Visually-Grounded Language Models Bad at Image Classification?
NIPS 2024
Interpretable3D: An Ad-Hoc Interpretable Classifier for 3D Point Clouds
AAAI 2024
Cross-Sentence Gloss Consistency for Continuous Sign Language Recognition
AAAI 2024
DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval
AAAI 2024
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models
ACL 2024
Describing Differences in Image Sets with Natural Language
CVPR 2024
An Interactive Navigation Method with Effect-oriented Affordance
CVPR 2024
Imagine Before Go: Self-Supervised Generative Map for Object Goal Navigation
CVPR 2024
VideoAgent: Long-form Video Understanding with Large Language Model as Agent
ECCV 2024
Editing Conceptual Knowledge for Large Language Models
EMNLP 2024
Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models
ICLR 2024
Continual Multimodal Knowledge Graph Construction
IJCAI 2024
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition With Pre-Trained Vision-Language Models
CVPR 2023
LambdaKG: A Library for Pre-trained Language Model-Based Knowledge Graph Embeddings
IJCNLP 2023
Gloss-Free End-to-End Sign Language Translation
ACL 2023
Adversarially Masking Synthetic To Mimic Real: Adaptive Noise Injection for Point Cloud Segmentation Adaptation
CVPR 2023
CaMP: Causal Multi-policy Planning for Interactive Navigation in Multi-room Scenes
NIPS 2023
How to Unleash the Power of Large Language Models for Few-shot Relation Extraction?
ACL 2023
Open Anomalous Trajectory Recognition via Probabilistic Metric Learning
IJCAI 2023
LANA: A Language-Capable Navigator for Instruction Following and Generation
CVPR 2023
Global-to-Local Modeling for Video-Based 3D Human Pose and Shape Estimation
CVPR 2023
Bird's-Eye-View Scene Graph for Vision-Language Navigation
ICCV 2023
JOTR: 3D Joint Contrastive Learning with Transformers for Occluded Human Mesh Recovery
ICCV 2023
Action Sensitivity Learning for Temporal Action Localization
ICCV 2023
MAAL: Multimodality-Aware Autoencoder-Based Affordance Learning for 3D Articulated Objects
ICCV 2023
Clustering based Point Cloud Representation Learning for 3D Analysis
ICCV 2023
WhitenedCSE: Whitening-based Contrastive Learning of Sentence Embeddings
ACL 2023
A Simple Episodic Linear Probe Improves Visual Recognition in the Wild
CVPR 2022
Large-Scale Video Panoptic Segmentation in the Wild: A Benchmark
CVPR 2022
PR-RRN: Pairwise-Regularized Residual-Recursive Networks for Non-Rigid Structure-From-Motion
ICCV 2021
Interactive Prototype Learning for Egocentric Action Recognition
ICCV 2021
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
CVPR 2021
Symbiotic Attention with Privileged Information for Egocentric Action Recognition
AAAI 2020