Yifei Huang

37 papers · 2018–2026 · 12 top CS/AI conferences

Achievements

🏃 Academic Marathon (7) 🌍 Conference Polyglot (11) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (10)

🌈 Renaissance Researcher (8) 🗺️ Taxonomy Completionist (57) 🤝 Dynamic Duo (14) 👥 Mega-Team (100) 🏆 Keyword Champion (2) 🚀 Conference Pioneer 💎 Century Club (34) 📈 Trend Setter 🗃️ Keyword Collector (133) 🔥 Unstoppable (6) ⚡ Prolific Year (6)

Conferences

CVPR (10) ECCV (6) ICCV (5) ICLR (5) AAAI (2) ACL (2) WACV (2) COLING (1) EMNLP (1) ICML (1) IJCAI (1) MICCAI (1)

Top co-authors

Yoichi Sato (14) Guo Chen (8) Jilan Xu (7) Mingfang Zhang (5) Baoqi Pei (5) Limin Wang (5) Lijin Yang (5) Ryosuke Furuta (4) Weidi Xie (4) Yali Wang (4)

Keywords

video understanding (8) egocentric video (6) multimodal learning (6) egocentric vision (5) contrastive learning (2) action anticipation (2) online action detection (2) activity recognition (2) large language model (2) graph neural network (2) link prediction (1) scene understanding (1) object detection (1) video recognition (1) pose estimation (1) zero-shot learning (1) representation learning (1) domain adaptation (1) temporal reasoning (1) temporal modeling (1)

Papers

Learning Procedural-Aware Video Representations Through State-Grounded Hierarchy Unfolding · AAAI 2026
Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning · ACL 2026
MeepleLM: A Virtual Playtester Simulating Diverse Subjective Experiences · ACL 2026
Learning Streaming Video Representation via Multitask Training · ICCV 2025
Egocentric Object-Interaction Anticipation with Retentive and Predictive Learning · IJCAI 2025
MAGRET: Machine-generated Text Detection with Rewritten Texts · COLING 2025
SiMHand: Mining Similar Hands for Large-Scale 3D Hand Pose Pre-training · ICLR 2025
Beyond Label Semantics: Language-Guided Action Anatomy for Few-shot Action Recognition · ICCV 2025
Egocentric Action-aware Inertial Localization in Point Clouds with Vision-Language Guidance · ICCV 2025
TextCenGen: Attention-Guided Text-Centric Background Adaptation for Text-to-Image Generation · ICML 2025
CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding · ICLR 2025
EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos · ICLR 2025
Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning · ICLR 2025
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding · ECCV 2024
Optimizing Efficiency and Effectiveness in Sequential Prompt Strategy for SAM using Reinforcement Learning · MICCAI 2024
ActionVOS: Actions as Prompts for Video Object Segmentation · ECCV 2024
Retrieval-Augmented Egocentric Video Captioning · CVPR 2024
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives · CVPR 2024
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World · CVPR 2024
Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition · ECCV 2024
Pretraining Language Models with Text-Attributed Heterogeneous Graphs · EMNLP 2023
Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction · CVPR 2023
Weakly Supervised Temporal Sentence Grounding With Uncertainty-Guided Self-Training · CVPR 2023
Memory-and-Anticipation Transformer for Online Action Understanding · ICCV 2023
3D Segmenter: 3D Transformer based Semantic Segmentation via 2D Panoramic Distillation · ICLR 2023
Fine-Grained Affordance Annotation for Egocentric Hand-Object Interaction Videos · WACV 2023
Ego4D: Around the World in 3,000 Hours of Egocentric Video · CVPR 2022
Compound Prototype Matching for Few-Shot Action Recognition · ECCV 2022
CLRNet: Cross Layer Refinement Network for Lane Detection · CVPR 2022
Interact Before Align: Leveraging Cross-Modal Knowledge for Domain Adaptive Action Recognition · CVPR 2022
FACIAL: Synthesizing Dynamic Talking Face With Implicit Attribute Learning · ICCV 2021
Towards Visually Explaining Video Understanding Networks With Perturbation · WACV 2021
Goal-Oriented Gaze Estimation for Zero-Shot Learning · CVPR 2021
Commonsense Knowledge Aware Concept Selection For Diverse and Informative Visual Storytelling · AAAI 2021
Learn to Recover Visible Color for Video Surveillance in a Day · ECCV 2020
Improving Action Segmentation via Graph-Based Temporal Reasoning · CVPR 2020
Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition · ECCV 2018