Wenguan Wang
101 papers · 2015–2026 · 9 top CS/AI conferences
Achievements
Academic Marathon (10)
Conference Polyglot (9)
Keyword Pioneer
Interdisciplinary Bridge
Cross-Pollinator (10)
Renaissance Researcher (9)
Taxonomy Completionist (106)
Conference Loyalist (43)
Triple Crown
Grand Slam
Keyword Champion (2)
Dynamic Duo (37)
Deep Specialist (15)
Keyword Collector (375)
Century Club (100)
Trend Setter
Unstoppable (9)
The Questioner
Prolific Year (15)
Conference Pioneer
Conferences
CVPR (43)
ICCV (23)
ECCV (16)
NIPS (8)
ICLR (5)
AAAI (2)
ICML (2)
ACL (1)
CONLL (1)
Research topics
Keywords
semantic segmentation (13)
vision-language navigation (9)
representation learning (7)
object detection (7)
video object segmentation (7)
attention mechanism (5)
graph neural network (5)
video understanding (5)
zero-shot learning (5)
multimodal learning (5)
self-supervised learning (4)
instance segmentation (4)
point cloud (4)
salient object detection (4)
contrastive learning (3)
agent system (3)
human parsing (3)
diffusion model (3)
scene understanding (3)
video segmentation (3)
Papers
History-Enhanced Two-Stage Transformer for Aerial Vision-and-Language Navigation
AAAI 2026
Dual Reciprocal Learning of Language-based Human Motion Understanding and Generation
ICCV 2025
Do as We Do, Not as You Think: the Conformity of Large Language Models
ICLR 2025
Learning Clustering-based Prototypes for Compositional Zero-Shot Learning
ICLR 2025
Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation
ICLR 2025
Underwater Visual SLAM with Depth Uncertainty and Medium Modeling
ICCV 2025
3D Gaussian Map with Open-Set Semantic Grouping for Vision-Language Navigation
ICCV 2025
Towards Human-like Virtual Beings: Simulating Human Behavior in 3D Scenes
ICCV 2025
A Conditional Probability Framework for Compositional Zero-shot Learning
ICCV 2025
Cycle-Consistent Learning for Joint Layout-to-Image Generation and Object Detection
ICCV 2025
Gaussian-based World Model: Gaussian Priors for Voxel-Based Occupancy Prediction and Future Motion Prediction
ICCV 2025
UNIALIGN: Scaling Multimodal Alignment within One Unified Model
CVPR 2025
DiffVsgg: Diffusion-Driven Online Video Scene Graph Generation
CVPR 2025
Scene Map-based Prompt Tuning for Navigation Instruction Generation
CVPR 2025
TAGA: Self-supervised Learning for Template-free Animatable Gaussian Articulated Model
CVPR 2025
Multi-view Reconstruction via SfM-guided Monocular Depth Estimation
CVPR 2025
LOGICZSL: Exploring Logic-induced Representation for Compositional Zero-shot Learning
CVPR 2025
Neural Clustering based Visual Representation Learning
CVPR 2024
Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity
CVPR 2024
Poly Kernel Inception Network for Remote Sensing Detection
CVPR 2024
Clustering Propagation for Universal Medical Image Segmentation
CVPR 2024
Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data
ECCV 2024
Controllable Navigation Instruction Generation with Chain of Thought Prompting
ECCV 2024
Clustering for Protein Representation Learning
CVPR 2024
LSK3DNet: Towards Effective and Efficient 3D Perception with Large Sparse Kernels
CVPR 2024
Volumetric Environment Representation for Vision-Language Navigation
CVPR 2024
Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models
NIPS 2024
Vision-Language Navigation with Energy-Based Policy
NIPS 2024
Scene Graph Generation with Role-Playing Large Language Models
NIPS 2024
Interpretable3D: An Ad-Hoc Interpretable Classifier for 3D Point Clouds
AAAI 2024
MS2SL: Multimodal Spoken Data-Driven Continuous Sign Language Production
ACL 2024
Navigation Instruction Generation with BEV Perception and Large Language Models
ECCV 2024
Nonverbal Interaction Detection
ECCV 2024
Mutual Learning for Acoustic Matching and Dereverberation via Visual Scene-driven Diffusion
ECCV 2024
Facing the Elephant in the Room: Visual Prompt Tuning or Full finetuning?
ICLR 2024
General and Task-Oriented Video Segmentation
ECCV 2024
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
ICML 2024
IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection
CVPR 2024
LANA: A Language-Capable Navigator for Instruction Following and Generation
CVPR 2023
Neural-Logic Human-Object Interaction Detection
NIPS 2023
ClusterFormer: Clustering As A Universal Visual Learner
NIPS 2023
Boosting Video Object Segmentation via Space-Time Correspondence Learning
CVPR 2023
Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation
CVPR 2023
Bird's-Eye-View Scene Graph for Vision-Language Navigation
ICCV 2023
Logic-induced Diagnostic Reasoning for Semi-supervised Semantic Segmentation
ICCV 2023
DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation
ICCV 2023
Omnidirectional Information Gathering for Knowledge Transfer-Based Audio-Visual Navigation
ICCV 2023
LogicSeg: Parsing Visual Semantics with Neural Logic Learning and Reasoning
ICCV 2023
E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning
ICCV 2023
Large-Scale Person Detection and Localization Using Overhead Fisheye Cameras
ICCV 2023
Clustering based Point Cloud Representation Learning for 3D Analysis
ICCV 2023
Visual Recognition with Deep Nearest Centroids
ICLR 2023
CLUSTSEG: Clustering for Universal Segmentation
ICML 2023
Deep Hierarchical Semantic Segmentation
CVPR 2022
Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation
CVPR 2022
Rethinking Semantic Segmentation: A Prototype View
CVPR 2022
Visual Abductive Reasoning
CVPR 2022
Learning Equivariant Segmentation with Instance-Unique Querying
NIPS 2022
GMMSeg: Gaussian Mixture based Generative Semantic Segmentation Models
NIPS 2022
Towards Versatile Embodied Navigation
NIPS 2022
Towards Interpretable Video Super-Resolution via Alternating Optimization
ECCV 2022
Semi-Supervised 3D Object Detection with Proficient Teachers
ECCV 2022
ProposalContrast: Unsupervised Pre-training for LiDAR-Based 3D Object Detection
ECCV 2022
Reference-Based Image Super-Resolution with Deformable Attention Transformer
ECCV 2022
Locality-Aware Inter- and Intra-Video Reconstruction for Self-Supervised Correspondence Learning
CVPR 2022
Exploring Cross-Image Pixel Contrast for Semantic Segmentation
ICCV 2021
Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation
CVPR 2021
Face Forensics in the Wild
CVPR 2021
Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
CVPR 2021
Structured Scene Memory for Vision-Language Navigation
CVPR 2021
Hierarchical Human Parsing With Typed Part-Relation Reasoning
CVPR 2020
Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation
ECCV 2020
Video Object Segmentation with Episodic Graph Memory Networks
ECCV 2020
Weakly Supervised 3D Object Detection from Lidar Point Cloud
ECCV 2020
Active Visual Information Gathering for Vision-Language Navigation
ECCV 2020
A Unified Object Motion and Affinity Model for Online Multi-Object Tracking
CVPR 2020
Learning Video Object Segmentation From Unlabeled Videos
CVPR 2020
Cascaded Human-Object Interaction Recognition
CVPR 2020
Shifting More Attention to Video Salient Object Detection
CVPR 2019
Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning
ICCV 2019
Learning Compositional Neural Information Fusion for Human Parsing
ICCV 2019
Human-Aware Motion Deblurring
ICCV 2019
Reasoning Visual Dialogs With Structural and Partial Observations
CVPR 2019
An Iterative and Cooperative Top-Down and Bottom-Up Inference Network for Salient Object Detection
CVPR 2019
Improving Neural Machine Translation by Achieving Knowledge Transfer with Sentence Alignment Learning
CONLL 2019
Optimizing the F-Measure for Threshold-Free Salient Object Detection
ICCV 2019
See More, Know More: Unsupervised Video Object Segmentation With Co-Attention Siamese Networks
CVPR 2019
Learning Unsupervised Video Object Segmentation Through Visual Attention
CVPR 2019
Salient Object Detection With Pyramid Attention and Salient Edges
CVPR 2019
Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks
ICCV 2019
Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection
ECCV 2018
Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification
CVPR 2018
Salient Object Detection Driven by Fixation Prediction
CVPR 2018
Hyperparameter Optimization for Tracking With Continuous Deep Q-Learning
CVPR 2018
Learning Descriptor Networks for 3D Shape Synthesis and Analysis
CVPR 2018
Inferring Shared Attention in Social Scene Videos
CVPR 2018
Learning Human-Object Interactions by Graph Parsing Neural Networks
ECCV 2018
Revisiting Video Saliency: A Large-Scale Benchmark and a New Model
CVPR 2018
Super-Trajectory for Video Segmentation
ICCV 2017
Deep Cropping via Attention Box Prediction and Aesthetics Assessment
ICCV 2017
Saliency-Aware Geodesic Video Object Segmentation
CVPR 2015