Wenguan Wang

101 papers · 2015–2026 · across 9 top CS/AI conferences

Achievements

🏃 Academic Marathon (10) 🌍 Conference Polyglot (9) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (10)

🌈 Renaissance Researcher (9) 🗺️ Taxonomy Completionist (106) 🏠 Conference Loyalist (43) 👑 Triple Crown 🏆 Grand Slam 🏆 Keyword Champion (2) 🤝 Dynamic Duo (37) 🔬 Deep Specialist (15) 🗃️ Keyword Collector (375) 💎 Century Club (100) 📈 Trend Setter 🔥 Unstoppable (9) ❓ The Questioner ⚡ Prolific Year (15) 🚀 Conference Pioneer

Conferences

CVPR (43) ICCV (23) ECCV (16) NIPS (8) ICLR (5) AAAI (2) ICML (2) ACL (1) CONLL (1)

Top co-authors

Yi Yang (37) Jianbing Shen (31) Tianfei Zhou (14) Liulei Li (12) Luc Van Gool (10) Rui Liu (7) Xiankai Lu (7) Ling Shao (7) Song-chun Zhu (6) Siyuan Qi (6)

Research topics

Core AI (1)

Keywords

semantic segmentation (13) vision-language navigation (9) representation learning (7) object detection (7) video object segmentation (7) attention mechanism (5) graph neural network (5) video understanding (5) zero-shot learning (5) multimodal learning (5) self-supervised learning (4) instance segmentation (4) point cloud (4) salient object detection (4) contrastive learning (3) agent system (3) human parsing (3) diffusion model (3) scene understanding (3) video segmentation (3)

Papers

History-Enhanced Two-Stage Transformer for Aerial Vision-and-Language Navigation AAAI 2026
Dual Reciprocal Learning of Language-based Human Motion Understanding and Generation ICCV 2025
Do as We Do, Not as You Think: the Conformity of Large Language Models ICLR 2025
Learning Clustering-based Prototypes for Compositional Zero-Shot Learning ICLR 2025
Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation ICLR 2025
Underwater Visual SLAM with Depth Uncertainty and Medium Modeling ICCV 2025
3D Gaussian Map with Open-Set Semantic Grouping for Vision-Language Navigation ICCV 2025
Towards Human-like Virtual Beings: Simulating Human Behavior in 3D Scenes ICCV 2025
A Conditional Probability Framework for Compositional Zero-shot Learning ICCV 2025
Cycle-Consistent Learning for Joint Layout-to-Image Generation and Object Detection ICCV 2025
Gaussian-based World Model: Gaussian Priors for Voxel-Based Occupancy Prediction and Future Motion Prediction ICCV 2025
UNIALIGN: Scaling Multimodal Alignment within One Unified Model CVPR 2025
DiffVsgg: Diffusion-Driven Online Video Scene Graph Generation CVPR 2025
Scene Map-based Prompt Tuning for Navigation Instruction Generation CVPR 2025
TAGA: Self-supervised Learning for Template-free Animatable Gaussian Articulated Model CVPR 2025
Multi-view Reconstruction via SfM-guided Monocular Depth Estimation CVPR 2025
LOGICZSL: Exploring Logic-induced Representation for Compositional Zero-shot Learning CVPR 2025
Neural Clustering based Visual Representation Learning CVPR 2024
Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity CVPR 2024
Poly Kernel Inception Network for Remote Sensing Detection CVPR 2024
Clustering Propagation for Universal Medical Image Segmentation CVPR 2024
Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data ECCV 2024
Controllable Navigation Instruction Generation with Chain of Thought Prompting ECCV 2024
Clustering for Protein Representation Learning CVPR 2024
LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels CVPR 2024
Volumetric Environment Representation for Vision-Language Navigation CVPR 2024
Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models NIPS 2024
Vision-Language Navigation with Energy-Based Policy NIPS 2024
Scene Graph Generation with Role-Playing Large Language Models NIPS 2024
Interpretable3D: An Ad-Hoc Interpretable Classifier for 3D Point Clouds AAAI 2024
MS2SL: Multimodal Spoken Data-Driven Continuous Sign Language Production ACL 2024
Navigation Instruction Generation with BEV Perception and Large Language Models ECCV 2024
Nonverbal Interaction Detection ECCV 2024
Mutual Learning for Acoustic Matching and Dereverberation via Visual Scene-driven Diffusion ECCV 2024
Facing the Elephant in the Room: Visual Prompt Tuning or Full finetuning? ICLR 2024
General and Task-Oriented Video Segmentation ECCV 2024
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent) ICML 2024
IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection CVPR 2024
LANA: A Language-Capable Navigator for Instruction Following and Generation CVPR 2023
Neural-Logic Human-Object Interaction Detection NIPS 2023
ClusterFormer: Clustering As A Universal Visual Learner NIPS 2023
Boosting Video Object Segmentation via Space-Time Correspondence Learning CVPR 2023
Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation CVPR 2023
Bird's-Eye-View Scene Graph for Vision-Language Navigation ICCV 2023
Logic-induced Diagnostic Reasoning for Semi-supervised Semantic Segmentation ICCV 2023
DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation ICCV 2023
Omnidirectional Information Gathering for Knowledge Transfer-Based Audio-Visual Navigation ICCV 2023
LogicSeg: Parsing Visual Semantics with Neural Logic Learning and Reasoning ICCV 2023
E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning ICCV 2023
Large-Scale Person Detection and Localization Using Overhead Fisheye Cameras ICCV 2023
Clustering based Point Cloud Representation Learning for 3D Analysis ICCV 2023
Visual Recognition with Deep Nearest Centroids ICLR 2023
CLUSTSEG: Clustering for Universal Segmentation ICML 2023
Deep Hierarchical Semantic Segmentation CVPR 2022
Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation CVPR 2022
Rethinking Semantic Segmentation: A Prototype View CVPR 2022
Visual Abductive Reasoning CVPR 2022
Learning Equivariant Segmentation with Instance-Unique Querying NIPS 2022
GMMSeg: Gaussian Mixture based Generative Semantic Segmentation Models NIPS 2022
Towards Versatile Embodied Navigation NIPS 2022
Towards Interpretable Video Super-Resolution via Alternating Optimization ECCV 2022
Semi-Supervised 3D Object Detection with Proficient Teachers ECCV 2022
ProposalContrast: Unsupervised Pre-training for LiDAR-Based 3D Object Detection ECCV 2022
Reference-Based Image Super-Resolution with Deformable Attention Transformer ECCV 2022
Locality-Aware Inter- and Intra-Video Reconstruction for Self-Supervised Correspondence Learning CVPR 2022
Exploring Cross-Image Pixel Contrast for Semantic Segmentation ICCV 2021
Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation CVPR 2021
Face Forensics in the Wild CVPR 2021
Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing CVPR 2021
Structured Scene Memory for Vision-Language Navigation CVPR 2021
Hierarchical Human Parsing With Typed Part-Relation Reasoning CVPR 2020
Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation ECCV 2020
Video Object Segmentation with Episodic Graph Memory Networks ECCV 2020
Weakly Supervised 3D Object Detection from Lidar Point Cloud ECCV 2020
Active Visual Information Gathering for Vision-Language Navigation ECCV 2020
A Unified Object Motion and Affinity Model for Online Multi-Object Tracking CVPR 2020
Learning Video Object Segmentation From Unlabeled Videos CVPR 2020
Cascaded Human-Object Interaction Recognition CVPR 2020
Shifting More Attention to Video Salient Object Detection CVPR 2019
Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning ICCV 2019
Learning Compositional Neural Information Fusion for Human Parsing ICCV 2019
Human-Aware Motion Deblurring ICCV 2019
Reasoning Visual Dialogs With Structural and Partial Observations CVPR 2019
An Iterative and Cooperative Top-Down and Bottom-Up Inference Network for Salient Object Detection CVPR 2019
Improving Neural Machine Translation by Achieving Knowledge Transfer with Sentence Alignment Learning CONLL 2019
Optimizing the F-Measure for Threshold-Free Salient Object Detection ICCV 2019
See More, Know More: Unsupervised Video Object Segmentation With Co-Attention Siamese Networks CVPR 2019
Learning Unsupervised Video Object Segmentation Through Visual Attention CVPR 2019
Salient Object Detection With Pyramid Attention and Salient Edges CVPR 2019
Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks ICCV 2019
Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection ECCV 2018
Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification CVPR 2018
Salient Object Detection Driven by Fixation Prediction CVPR 2018
Hyperparameter Optimization for Tracking With Continuous Deep Q-Learning CVPR 2018
Learning Descriptor Networks for 3D Shape Synthesis and Analysis CVPR 2018
Inferring Shared Attention in Social Scene Videos CVPR 2018
Learning Human-Object Interactions by Graph Parsing Neural Networks ECCV 2018
Revisiting Video Saliency: A Large-Scale Benchmark and a New Model CVPR 2018
Super-Trajectory for Video Segmentation ICCV 2017
Deep Cropping via Attention Box Prediction and Aesthetics Assessment ICCV 2017
Saliency-Aware Geodesic Video Object Segmentation CVPR 2015