Huiyu Wang

29 papers · 2019–2026 · 8 top CS/AI conferences

Achievements

🌈 Renaissance Researcher (6) 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (7) 🌍 Conference Polyglot (8) 🗺️ Taxonomy Completionist (53)

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 👥 Mega-Team (100) 🤝 Dynamic Duo (13) 🧬 Topic Evolution 💎 Century Club (29) 🔥 Unstoppable (8) 🗃️ Keyword Collector (102) ⚡ Prolific Year (6) 🚀 Conference Pioneer

Conferences

CVPR (9) ECCV (7) ICCV (4) WACV (3) ICLR (2) NIPS (2) EMNLP (1) ICML (1)

Top co-authors

Alan Yuille (13) Lorenzo Torresani (8) Chen Wei (7) Cihang Xie (6) Tushar Nagarajan (5) Gedas Bertasius (5) Liang-Chieh Chen (5) Hartwig Adam (5) Md Mohaiminul Islam (4) Yukun Zhu (4)

Keywords

video understanding (4) semantic segmentation (4) masked autoencoder (4) zero-shot learning (3) egocentric video (3) multimodal learning (3) mask transformer (2) panoptic segmentation (2) procedural activity (2) self-supervised learning (2) convolutional neural network (2) instance segmentation (2) egocentric vision (2) activity recognition (2) video question answering (2) video segmentation (2) vision transformer (2) pose estimation (1) anomaly detection (1) domain adaptation (1)

Papers

TimeRefine: Temporal Grounding with Time Refining Video LLM WACV 2026
Finding Dino: A Plug-and-Play Framework for Zero-Shot Detection of Out-of-Distribution Objects using Prototypes WACV 2025
BIMBA: Selective-Scan Compression for Long-Range Video Question Answering CVPR 2025
MusicFlow: Cascaded Flow Matching for Text Guided Music Generation ICML 2024
VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs EMNLP 2024
4Diff: 3D-Aware Diffusion Model for Third-to-First Viewpoint Translation ECCV 2024
Learning to Segment Referred Objects from Narrated Egocentric Videos CVPR 2024
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives CVPR 2024
Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos ECCV 2024
Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data ECCV 2024
Ego4D Goal-Step: Toward Hierarchical Understanding of Procedural Activities NIPS 2023
HT-Step: Aligning Instructional Articles with How-To Videos NIPS 2023
Masked Autoencoders Enable Efficient Knowledge Distillers CVPR 2023
Diffusion Models as Masked Autoencoders ICCV 2023
Ego-Only: Egocentric Action Detection without Exocentric Transferring ICCV 2023
SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-Training ICCV 2023
Image BERT Pre-training with Online Tokenizer ICLR 2022
k-Means Mask Transformer ECCV 2022
CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation CVPR 2022
In Defense of Image Pre-training for Spatiotemporal Recognition ECCV 2022
TubeFormer-DeepLab: Video Mask Transformer CVPR 2022
CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation ECCV 2022
A Simple Data Mixing Prior for Improving Self-Supervised Learning CVPR 2022
MaX-DeepLab: End-to-End Panoptic Segmentation With Mask Transformers CVPR 2021
CO2: Consistent Contrast for Unsupervised Visual Representation Learning ICLR 2021
Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation ECCV 2020
Combining Compositional Models and Deep Networks For Robust Object Classification under Occlusion WACV 2020
ELASTIC: Improving CNNs With Dynamic Scaling Policies CVPR 2019
Semantic-Aware Knowledge Preservation for Zero-Shot Sketch-Based Image Retrieval ICCV 2019