Tai WANG
32 papers · 2020–2026 · 10 top CS/AI conferences
Achievements
Cross-Pollinator (9)
Keyword Pioneer
Academic Marathon (5)
Conference Polyglot (8)
Renaissance Researcher (9)
Interdisciplinary Bridge
Taxonomy Completionist (43)
Keyword Champion
Deep Specialist (11)
Topic Evolution
Dynamic Duo (19)
Keyword Collector (132)
Prolific Year (12)
Unstoppable (6)
Century Club (30)
Conferences
ICCV (7)
CVPR (6)
ECCV (5)
NIPS (5)
CORL (4)
AAAI (1)
ACL (1)
EMNLP (1)
ICLR (1)
WACV (1)
Keywords
scene understanding (5)
3d object detection (4)
autonomous driving (3)
question answering (3)
large language model (3)
visual grounding (3)
3d scene understanding (3)
depth estimation (3)
robot manipulation (2)
computer vision (2)
3d reconstruction (2)
semantic segmentation (2)
humanoid robot (2)
object detection (2)
embodied ai (2)
robotic manipulation (2)
3d vision (2)
point cloud (2)
multi-modal learning (2)
language grounding (2)
Papers
Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning
ACL 2026
Towards Efficient and Robust Manipulation via Multi-Frame Vision-Language-Action Modeling
AAAI 2026
GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation
CVPR 2025
RoboGround: Robotic Manipulation with Grounded Vision-Language Priors
CVPR 2025
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D Capabilities
ICCV 2025
Language-to-Space Programming for Training-Free 3D Visual Grounding
EMNLP 2025
GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scene
ICCV 2025
Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities
ICCV 2025
VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization
ICCV 2025
Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation
ECCV 2024
Learning to Adapt SAM for Segmenting Cross-domain Point Clouds
ECCV 2024
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities
ECCV 2024
MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations
NIPS 2024
OctreeOcc: Efficient and Multi-Granularity Occupancy Prediction Using Octree Queries
NIPS 2024
CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics
NIPS 2024
Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers
NIPS 2024
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
CORL 2024
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
CVPR 2024
GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction
CVPR 2024
PointLLM: Empowering Large Language Models to Understand Point Clouds
ECCV 2024
Unified Human-Scene Interaction via Prompted Chain-of-Contacts
ICLR 2024
Scene as Occupancy
ICCV 2023
DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking
CORL 2023
GeoMIM: Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding
ICCV 2023
MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection
ICCV 2023
MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training
CVPR 2023
Monocular 3D Object Detection with Depth from Motion
ECCV 2022
SIDE: Center-Based Stereo 3D Detector With Structure-Aware Instance Depth Estimation
WACV 2022
Balanced Chamfer Distance as a Comprehensive Metric for Point Cloud Completion
NIPS 2021
Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation
CVPR 2021
Probabilistic and Geometric Depth: Detecting Objects in Perspective
CORL 2021
Reconfigurable Voxels: A New Representation for LiDAR-Based Point Clouds
CORL 2020