Huiyu Wang
29 papers · 2019–2026 · 8 top CS/AI conferences
Achievements
Renaissance Researcher (6)
Interdisciplinary Bridge
Academic Marathon (7)
Conference Polyglot (8)
Taxonomy Completionist (53)
Keyword Pioneer
Hot Topic Early Bird
Mega-Team (100)
Dynamic Duo (13)
Topic Evolution
Century Club (29)
Unstoppable (8)
Keyword Collector (102)
Prolific Year (6)
Conference Pioneer
Conferences
CVPR (9)
ECCV (7)
ICCV (4)
WACV (3)
ICLR (2)
NeurIPS (2)
EMNLP (1)
ICML (1)
Keywords
video understanding (4)
semantic segmentation (4)
masked autoencoder (4)
zero-shot learning (3)
egocentric video (3)
multimodal learning (3)
mask transformer (2)
panoptic segmentation (2)
procedural activity (2)
self-supervised learning (2)
convolutional neural network (2)
instance segmentation (2)
egocentric vision (2)
activity recognition (2)
video question answering (2)
video segmentation (2)
vision transformer (2)
pose estimation (1)
anomaly detection (1)
domain adaptation (1)
Papers
TimeRefine: Temporal Grounding with Time Refining Video LLM
WACV 2026
Finding Dino: A Plug-and-Play Framework for Zero-Shot Detection of Out-of-Distribution Objects using Prototypes
WACV 2025
BIMBA: Selective-Scan Compression for Long-Range Video Question Answering
CVPR 2025
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation
ICML 2024
VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs
EMNLP 2024
4Diff: 3D-Aware Diffusion Model for Third-to-First Viewpoint Translation
ECCV 2024
Learning to Segment Referred Objects from Narrated Egocentric Videos
CVPR 2024
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
CVPR 2024
Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos
ECCV 2024
Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data
ECCV 2024
Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities
NeurIPS 2023
HT-Step: Aligning Instructional Articles with How-To Videos
NeurIPS 2023
Masked Autoencoders Enable Efficient Knowledge Distillers
CVPR 2023
Diffusion Models as Masked Autoencoders
ICCV 2023
Ego-Only: Egocentric Action Detection without Exocentric Transferring
ICCV 2023
SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-Training
ICCV 2023
Image BERT Pre-training with Online Tokenizer
ICLR 2022
k-Means Mask Transformer
ECCV 2022
CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation
CVPR 2022
In Defense of Image Pre-training for Spatiotemporal Recognition
ECCV 2022
TubeFormer-DeepLab: Video Mask Transformer
CVPR 2022
CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation
ECCV 2022
A Simple Data Mixing Prior for Improving Self-Supervised Learning
CVPR 2022
MaX-DeepLab: End-to-End Panoptic Segmentation With Mask Transformers
CVPR 2021
CO2: Consistent Contrast for Unsupervised Visual Representation Learning
ICLR 2021
Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
ECCV 2020
Combining Compositional Models and Deep Networks For Robust Object Classification under Occlusion
WACV 2020
ELASTIC: Improving CNNs With Dynamic Scaling Policies
CVPR 2019
Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval
ICCV 2019