conftrace_

Hengshuang Zhao

88 papers · 2017–2026 · 9 top CS/AI conferences

Achievements

🏃 Academic Marathon (8) 🧭 Keyword Pioneer 🌍 Conference Polyglot (9) 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (14) 🌈 Renaissance Researcher (7) 🗺️ Taxonomy Completionist (95) 🏠 Conference Loyalist (38) 🏆 Grand Slam 🔬 Deep Specialist (22) 👥 Mega-Team (30) 🏆 Keyword Champion (2) 🤝 Dynamic Duo (24) 💎 Century Club (87) 🗃️ Keyword Collector (316) 🔥 Unstoppable (9) ❓ The Questioner ⚡ Prolific Year (25) 🚀 Conference Pioneer

Conferences

CVPR (38) ECCV (14) NIPS (12) ICCV (11) ICML (6) AAAI (2) ICLR (2) IJCAI (2) EMNLP (1)

Papers

Game Ground Bench: Probing the Limits of LVLMs in Complex Semantic Grounding Across Game Universes · AAAI 2026
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions · CVPR 2025
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models · CVPR 2025
Sonata: Self-Supervised Learning of Reliable Point Representations · CVPR 2025
UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics · CVPR 2025
Empowering Large Language Models with 3D Situation Awareness · CVPR 2025
DriveGPT4-V2: Harnessing Large Language Model Capabilities for Enhanced Closed-Loop Autonomous Driving · CVPR 2025
TGDPO: Harnessing Token-Level Reward Guidance for Enhancing Direct Preference Optimization · ICML 2025
HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding · ICML 2025
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models · ICML 2025
BOOD: Boundary-based Out-Of-Distribution Data Generation · ICML 2025
LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence · ICML 2025
VIP: Vision Instructed Pre-training for Robotic Manipulation · ICML 2025
OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces · ICLR 2025
ViLLa: Video Reasoning Segmentation with Large Language Model · ICCV 2025
DisCo: Towards Distinct and Coherent Visual Encapsulation in Video MLLMs · ICCV 2025
StableDepth: Scene-Consistent and Scale-Invariant Monocular Depth · ICCV 2025
DiffDoctor: Diagnosing Image Diffusion Models Before Treating · ICCV 2025
HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation · ICCV 2025
Enhancing LLM Knowledge Learning through Generalization · EMNLP 2025
PanDA: Towards Panoramic Depth Anything with Unlabeled Panoramas and Mobius Spatial Augmentation · CVPR 2025
SpatialCLIP: Learning 3D-aware Image Representations from Spatially Discriminative Language · CVPR 2025
UniMODE: Unified Monocular 3D Object Detection · CVPR 2024
LION: Linear Group RNN for 3D Object Detection in Point Clouds · NIPS 2024
Depth Anything V2 · NIPS 2024
SyncVIS: Synchronized Video Instance Segmentation · NIPS 2024
One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection · NIPS 2024
Zero-shot Image Editing with Reference Imitation · NIPS 2024
LiT: Unifying LiDAR "Languages" with LiDAR Translator · NIPS 2024
GPT4Point: A Unified Framework for Point-Language Understanding and Generation · CVPR 2024
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation · CVPR 2024
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data · CVPR 2024
Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training · CVPR 2024
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving · CVPR 2024
DreamComposer: Controllable 3D Object Generation via Multi-View Conditions · CVPR 2024
Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding · CVPR 2024
Point Transformer V3: Simpler Faster Stronger · CVPR 2024
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding · CVPR 2024
AnyDoor: Zero-shot Object-level Image Customization · CVPR 2024
LivePhoto: Real Image Animation with Text-guided Motion Control · ECCV 2024
Pixel-GS Density Control with Pixel-aware Gradient for 3D Gaussian Splatting · ECCV 2024
InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping · ECCV 2024
Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models · ECCV 2024
OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation · ECCV 2024
LogoSticker: Inserting Logos into Diffusion Models for Customized Generation · ECCV 2024
OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation · ECCV 2024
Influencer Backdoor Attack on Semantic Segmentation · ICLR 2024
CorresNeRF: Image Correspondence Priors for Neural Radiance Fields · NIPS 2023
Uni3DETR: Unified 3D Detection Transformer · NIPS 2023
TMT-VIS: Taxonomy-aware Multi-dataset Joint Training for Video Instance Segmentation · NIPS 2023
FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models · NIPS 2023
Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners · CVPR 2023
Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning · CVPR 2023
Open-vocabulary Panoptic Segmentation with Embedding Modulation · ICCV 2023
Universal Adaptive Data Augmentation · IJCAI 2023
BT^2: Backward-compatible Training with Basis Transformation · ICCV 2023
Shrinking Class Space for Enhanced Certainty in Semi-Supervised Learning · ICCV 2023
Detecting Everything in the Open World: Towards Universal Object Detection · CVPR 2023
Semantics-Aware Dynamic Localization and Refinement for Referring Image Segmentation · AAAI 2023
FocalClick: Towards Practical Interactive Image Segmentation · CVPR 2022
MTFormer: Multi-task Learning via Transformer and Cross-Task Reasoning · ECCV 2022
SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness · ECCV 2022
DecoupleNet: Decoupled Network for Domain Adaptive Semantic Segmentation · ECCV 2022
Point Transformer V2: Grouped Vector Attention and Partition-based Pooling · NIPS 2022
Stratified Transformer for 3D Point Cloud Segmentation · CVPR 2022
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation · CVPR 2022
PhysFormer: Facial Video-Based Physiological Measurement With Temporal Difference Transformer · CVPR 2022
Generalized Few-Shot Semantic Segmentation · CVPR 2022
Point Transformer · ICCV 2021
Fully Convolutional Networks for Panoptic Segmentation · CVPR 2021
Bidirectional Projection Network for Cross Dimension Scene Understanding · CVPR 2021
Do Different Tracking Tasks Require Different Appearance Models? · NIPS 2021
Rethinking Semantic Segmentation From a Sequence-to-Sequence Perspective With Transformers · CVPR 2021
Semi-Supervised Semantic Segmentation With Directional Context-Aware Consistency · CVPR 2021
Distilling Knowledge via Knowledge Review · CVPR 2021
Dual-Cross Central Difference Network for Face Anti-Spoofing · IJCAI 2021
PAConv: Position Adaptive Convolution With Dynamic Kernel Assembling on Point Clouds · CVPR 2021
Dynamic Divide-and-Conquer Adversarial Training for Robust Semantic Segmentation · ICCV 2021
PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation · CVPR 2020
Exploring Self-Attention for Image Recognition · CVPR 2020
UPSNet: A Unified Panoptic Segmentation Network · CVPR 2019
PointWeb: Enhancing Local Neighborhood Features for Point Cloud Processing · CVPR 2019
Hierarchical Point-Edge Interaction Network for Point Cloud Semantic Segmentation · ICCV 2019
SegStereo: Exploiting Semantic Information for Disparity Estimation · ECCV 2018
ICNet for Real-Time Semantic Segmentation on High-Resolution Images · ECCV 2018
Compositing-aware Image Search · ECCV 2018
PSANet: Point-wise Spatial Attention Network for Scene Parsing · ECCV 2018
Pyramid Scene Parsing Network · CVPR 2017