Di Huang

85 papers · 2018–2026 · 12 top CS/AI conferences

Achievements

🏃 Academic Marathon (7) 🌍 Conference Polyglot (11) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (13)

🌈 Renaissance Researcher (9) 🗺️ Taxonomy Completionist (115) 🏠 Conference Loyalist (21) 🧬 Topic Evolution 🤝 Dynamic Duo (14) 🏆 Keyword Champion (2) 👑 Triple Crown 🏆 Grand Slam 🔬 Deep Specialist (15) 🚀 Conference Pioneer 🗃️ Keyword Collector (321) 🔥 Unstoppable (8) 💎 Century Club (78) ⚡ Prolific Year (13)

Conferences

AAAI (21) CVPR (21) ECCV (12) NIPS (9) ICCV (8) ICLR (7) ICML (2) ACL (1) CORL (1) IJCAI (1) OSDI (1) WACV (1)

Top co-authors

Wanli Ouyang (14) Yunhong Wang (14) Jiaxin Chen (14) Yunji Chen (13) Rui Zhang (13) Zidong Du (12) Tong He (12) Qi Guo (12) Hongyu Yang (11) Jiaming Guo (9)

Research topics

Robotics (1) Optimization (1)

Keywords

large language model (7) object detection (6) point cloud (5) multi-agent system (4) code generation (4) 3d object detection (4) self-supervised learning (4) neural network optimization (3) 3d reconstruction (3) robotic grasping (3) domain adaptation (3) contrastive learning (3) attention mechanism (3) 3d vision (3) adversarial learning (3) representation learning (3) feature learning (3) vision transformer (3) transfer learning (3) model compression (3)

Papers

Safety Alignment of Large Language Models via Contrasting Safe and Harmful Distributions AAAI 2026
StepFun-Formalizer: Unlocking the Autoformalization Potential of LLMs Through Knowledge-Reasoning Fusion AAAI 2026
RMAdapter: Reconstruction-based Multi-Modal Adapter for Vision-Language Models AAAI 2026
Run, Ruminate, and Regulate: A Dual-process Thinking System for Vision-and-Language Navigation AAAI 2026
SceneGenesis: 3D Scene Synthesis via Semantic Structural Priors and Mesh-Guided Video-Geometry Fusion AAAI 2026
QiMeng-CRUX: Narrowing the Gap Between Natural Language and Verilog via Core Refined Understanding eXpression AAAI 2026
QiMeng-PRepair: Precise Code Repair via Edit-Aware Reward Optimization ACL 2026
APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers CVPR 2025
QiMeng-Xpiler: Transcompiling Tensor Programs for Deep Learning Systems with a Neural-Symbolic Approach OSDI 2025
Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction ICLR 2025
Progressive Parameter Efficient Transfer Learning for Semantic Segmentation ICLR 2025
Depth Any Video with Scalable Synthetic Data ICLR 2025
ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction ICLR 2025
GigaGS: 3D Gaussian Based Planar Representation for Large-Scene Surface Reconstruction AAAI 2025
Micro-macro Wavelet-based Gaussian Splatting for 3D Reconstruction from Unconstrained Images AAAI 2025
Unveiling the Knowledge of CLIP for Training-Free Open-Vocabulary Semantic Segmentation AAAI 2025
3D²-Actor: Learning Pose-Conditioned 3D-Aware Denoiser for Realistic Gaussian Avatar Modeling AAAI 2025
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers ICLR 2025
InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct AAAI 2025
ShortFT: Diffusion Model Alignment via Shortcut-based Fine-Tuning ICCV 2025
Constraint-Aware Feature Learning for Parametric Point Cloud ICCV 2025
Towards Training-free Anomaly Detection with Vision and Language Foundation Models CVPR 2025
ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems CVPR 2025
CoSDH: Communication-Efficient Collaborative Perception via Supply-Demand Awareness and Intermediate-Late Hybridization CVPR 2025
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models CVPR 2025
GVGEN: Text-to-3D Generation with Volumetric Representation ECCV 2024
Active Perception for Grasp Detection via Neural Graspness Field NIPS 2024
Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning NIPS 2024
Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learner NIPS 2024
NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction NIPS 2024
MotionGPT: Finetuned LLMs Are General-Purpose Motion Generators AAAI 2024
Hypothesis, Verification, and Induction: Grounding Large Language Models with Self-Driven Skill Learning AAAI 2024
Emergent Communication for Numerical Concepts Generalization AAAI 2024
Generalizing 6-DoF Grasp Detection via Domain Prior Knowledge CVPR 2024
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving CVPR 2024
InitNO: Boosting Text-to-Image Diffusion Models via Initial Noise Optimization CVPR 2024
Agent3D-Zero: An Agent for Zero-shot 3D Understanding ECCV 2024
AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer ECCV 2024
Multi-modal Relation Distillation for Unified 3D Representation Learning ECCV 2024
PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines ECCV 2024
Crowd-SAM: SAM as a smart annotator for object detection in crowded scenes ECCV 2024
Rotation Has Two Sides: Evaluating Data Augmentation for Deep One-class Classification ICLR 2024
FiT: Flexible Vision Transformer for Diffusion Model ICML 2024
BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping WACV 2024
Online Symbolic Regression with Informative Query AAAI 2023
Compressed Video Prompt Tuning NIPS 2023
Seeing is not always believing: Benchmarking Human and Model Perception of AI-Generated Images NIPS 2023
Adaptive Sparse Convolutional Networks With Global Context Enhancement for Faster Object Detection on Drone Images CVPR 2023
NeuFace: Realistic 3D Neural Face Rendering From Multi-View Images CVPR 2023
ANPL: Towards Natural Programming with Interactive Decomposition NIPS 2023
OcTr: Octree-Based Transformer for 3D Object Detection CVPR 2023
Unilaterally Aggregated Contrastive Learning with Hierarchical Augmentation for Anomaly Detection ICCV 2023
Ponder: Point Cloud Pre-training via Neural Rendering ICCV 2023
Denoising Diffusion Autoencoders are Unified Self-supervised Learners ICCV 2023
DR-Tune: Improving Fine-tuning of Pretrained Visual Models by Distribution Regularization with Semantic Calibration ICCV 2023
Emergent Communication for Rules Reasoning NIPS 2023
Learning Polysemantic Spoof Trace: A Multi-Modal Disentanglement Network for Face Anti-spoofing AAAI 2023
Neural Program Synthesis with Query ICLR 2022
ACGNet: Action Complement Graph Network for Weakly-Supervised Temporal Action Localization AAAI 2022
Target-Relevant Knowledge Preservation for Multi-Source Domain Adaptive Object Detection CVPR 2022
ImFace: A Nonlinear 3D Morphable Face Model With Implicit Neural Representations CVPR 2022
Representation Learning for Compressed Video Action Recognition via Attentive Cross-modal Interaction with Motion Enhancement IJCAI 2022
ABPN: Adaptive Blend Pyramid Network for Real-Time Local Retouching of Ultra High-Resolution Photo CVPR 2022
Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles ECCV 2022
Motion Sensitive Contrastive Learning for Self-Supervised Video Representation ECCV 2022
OnePose++: Keypoint-Free One-Shot Object Pose Estimation without CAD Models NIPS 2022
UFPMP-Det: Toward Accurate and Efficient Object Detection on Drone Imagery AAAI 2022
Entropy-Based Active Learning for Object Detection With Progressive Diversity Constraint CVPR 2022
CAT-Det: Contrastively Augmented Transformer for Multi-Modal 3D Object Detection CVPR 2022
Towards Scale Balanced 6-DoF Grasp Detection in Cluttered Scenes CORL 2022
Image Inpainting via Conditional Texture and Structure Dual Generation ICCV 2021
PR-GCN: A Deep Graph Convolutional Network With Point Refinement for 6D Pose Estimation ICCV 2021
PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection AAAI 2021
Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels ICML 2020
Improving Object Detection with Selective Self-Supervised Self-Training ECCV 2020
Multi-Scale Positive Sample Refinement for Few-Shot Object Detection ECCV 2020
Fixed-Point Back-Propagation Training CVPR 2020
Beyond 3DMM Space: Towards Fine-grained 3D Face Reconstruction ECCV 2020
Cross-domain Object Detection through Coarse-to-Fine Feature Adaptation CVPR 2020
DWM: A Decomposable Winograd Method for Convolution Acceleration AAAI 2020
Distraction-Aware Feature Learning for Human Attribute Recognition via Coarse-to-Fine Attention Mechanism AAAI 2020
Led3D: A Lightweight and Efficient Deep Approach to Recognizing Low-Quality 3D Faces CVPR 2019
Adaptive NMS: Refining Pedestrian Detection in a Crowd CVPR 2019
Learning Face Age Progression: A Pyramid Architecture of GANs CVPR 2018
Receptive Field Block Net for Accurate and Fast Object Detection ECCV 2018