Yuntao Chen

29 papers · 2019–2025 · 7 top CS/AI conferences

Achievements

🏃 Academic Marathon (6) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (7) 🐝 Cross-Pollinator (4) 🌈 Renaissance Researcher (6) 🤝 Dynamic Duo (24) 🔬 Deep Specialist (10) 🏆 Keyword Champion (2) 💎 Century Club (29) ⚡ Prolific Year (9) 🔥 Unstoppable (5) 🗃️ Keyword Collector (135)

Conferences

CVPR (9) ICCV (7) ECCV (4) NIPS (4) ACL (2) ICLR (2) JMLR (1)

Top co-authors

Zhaoxiang Zhang (24) Yuqi Wang (9) Naiyan Wang (7) Jiawei He (6) Lue Fan (6) Hongxin Li (5) ZHAO-XIANG ZHANG (4) Jingran Su (4) Jifeng Dai (4) Xizhou Zhu (4)

Keywords

autonomous driving (6) object detection (5) 3d object detection (4) instance segmentation (4) world model (3) vision-language model (3) semantic segmentation (3) point cloud (3) end-to-end planning (2) unsupervised object detection (2) graphical user interface (2) unsupervised learning (2) lidar point cloud (2) spectral clustering (2) multi-modal learning (2) trajectory planning (2) deformable convolution (1) video prediction (1) autoregressive transformer (1) image generation (1)

Papers

DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers ICCV 2025
UIPro: Unleashing Superior Interaction Capability For GUI Agents ICCV 2025
AutoGUI: Scaling GUI Grounding with Automatic Functionality Annotations from LLMs ACL 2025
Activation Steering Decoding: Mitigating Hallucination in Large Vision-Language Models through Bidirectional Hidden State Intervention ACL 2025
Enhancing End-to-End Autonomous Driving with Latent World Model ICLR 2025
FreeVS: Generative View Synthesis on Free Driving Trajectory ICLR 2025
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy ICCV 2025
Monocular Occupancy Prediction for Scalable Indoor Scenes ECCV 2024
Continual Forgetting for Pre-trained Vision Models CVPR 2024
DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model NIPS 2024
OpenSatMap: A Fine-grained High-resolution Satellite Dataset for Large-scale Map Construction NIPS 2024
OneTrack: Demystifying the Conflict Between Detection and Tracking in End-to-End 3D Trackers ECCV 2024
CSOT: Cross-Scan Object Transfer for Semi-Supervised LiDAR Object Detection ECCV 2024
Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving CVPR 2024
PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation CVPR 2024
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications CVPR 2024
FrustumFormer: Adaptive Instance-Aware Resampling for Multi-View 3D Detection CVPR 2023
BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision CVPR 2023
3D Video Object Detection With Learnable Object-Centric Global Optimization CVPR 2023
Once Detected, Never Lost: Surpassing Human Performance in Offline LiDAR based 3D Object Detection ICCV 2023
SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models NIPS 2023
4D Unsupervised Object Discovery NIPS 2022
GIFS: Neural Implicit Function for General Shape Representation CVPR 2022
Densely Constrained Depth Estimator for Monocular 3D Object Detection ECCV 2022
Unsupervised Object Detection With LIDAR Clues CVPR 2021
Sequence Level Semantics Aggregation for Video Object Detection ICCV 2019
Scale-Aware Trident Networks for Object Detection ICCV 2019
Spectral Feature Transformation for Person Re-Identification ICCV 2019
SimpleDet: A Simple and Versatile Distributed Framework for Object Detection and Instance Recognition JMLR 2019