Gao Huang

114 papers · 2013–2026 · 11 top CS/AI conferences

Achievements


🧭 Keyword Pioneer 🌍 Conference Polyglot (10) 🗺️ Taxonomy Completionist (10) 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (12)

🌟 Keyword Trendsetter Combo (3) 🏠 Conference Loyalist (25) 🤝 Dynamic Duo (49) 🔬 Deep Specialist (18) 🧬 Topic Evolution 🏆 Keyword Champion (5) 👑 Triple Crown 🏆 Grand Slam 🗃️ Keyword Collector (424) 💎 Century Club (111) 🔥 Unstoppable (10) 🚀 Conference Pioneer ❓ The Questioner ⚡ Prolific Year (20)

Conferences

CVPR (32) NIPS (25) ICCV (13) ECCV (12) AAAI (11) ICLR (11) ICML (5) ACL (2) EACL (1) MICCAI (1) NAACL (1)

Top co-authors

Shiji Song (49) Yizeng Han (21) Yulin Wang (21) Yifan Pu (12) Xuran Pan (12) Jiayi Guo (12) Jifeng Dai (11) Zhuofan Xia (11) Yang Yue (11) Xizhou Zhu (11)

Research topics

Core AI (1)

Keywords

vision transformer (9) image classification (8) efficient computing (7) diffusion model (7) object detection (7) model compression (7) representation learning (7) multimodal large language model (5) adaptive inference (5) neural network optimization (5) transfer learning (5) convolutional neural network (5) image generation (5) self-supervised learning (4) dynamic inference (4) inference efficiency (4) feature reuse (4) contrastive learning (4) offline reinforcement learning (4) spatial redundancy (4)

Papers

Vision Transformers Are Circulant Attention Learners AAAI 2026
SpatialActor: Exploring Disentangled Spatial Representations for Robust Robotic Manipulation AAAI 2026
Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers EACL 2026
Dynamic Diffusion Transformer ICLR 2025
GridMix: Exploring Spatial Modulation for Neural Fields in PDE Modeling ICLR 2025
DiveR-CT: Diversity-enhanced Red Teaming Large Language Model Assistants with Relaxing Constraints AAAI 2025
DenseGrounding: Improving Dense Language-Vision Semantics for Ego-centric 3D Visual Grounding ICLR 2025
Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment CVPR 2025
Differential Transformer ICLR 2025
CODA: Repurposing Continuous VAEs for Discrete Tokenization ICCV 2025
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance ICCV 2025
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation CVPR 2025
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding CVPR 2025
CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning CVPR 2025
DTOS: Dynamic Time Object Sensing with Large Multimodal Model CVPR 2025
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models CVPR 2025
ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding CVPR 2025
Model Surgery: Modulating LLM’s Behavior Via Simple Parameter Editing NAACL 2025
EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance CVPR 2025
Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias ICLR 2025
How Far Is Video Generation from World Model: A Physical Law Perspective ICML 2025
Boosting LLM Agents with Recursive Contemplation for Effective Deception Handling ACL 2024
Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering ECCV 2024
Agent Attention: On the Integration of Softmax and Linear Attention ECCV 2024
DyFADet: Dynamic Feature Aggregation for Temporal Action Detection ECCV 2024
Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels ECCV 2024
GRA: Detecting Oriented Objects through Group-wise Rotating and Attention ECCV 2024
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation ECCV 2024
Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators ECCV 2024
SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning ICML 2024
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models CVPR 2024
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models CVPR 2024
Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis CVPR 2024
Mask Grounding for Referring Image Segmentation CVPR 2024
GSVA: Generalized Segmentation via Multimodal Large Language Models CVPR 2024
Learning 1D Causal Visual Representation with De-focus Attention Networks NIPS 2024
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution NIPS 2024
Training an Open-Vocabulary Monocular 3D Detection Model without 3D Data NIPS 2024
Bridging the Divide: Reconsidering Softmax and Linear Attention NIPS 2024
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis NIPS 2024
COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing NIPS 2024
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation NIPS 2024
Demystify Mamba in Vision: A Linear Attention Perspective NIPS 2024
ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process ICLR 2024
LLaVA-UHD: an LMM Perceiving any Aspect Ratio and High-Resolution Images ECCV 2024
Cardiac Copilot: Automatic Probe Guidance for Echocardiography with World Model MICCAI 2024
Exploring Temporal Feature Correlation for Efficient and Stable Video Semantic Segmentation AAAI 2024
ExpeL: LLM Agents Are Experiential Learners AAAI 2024
PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents ACL 2024
Causal Intervention for Human Trajectory Prediction with Cross Attention Mechanism AAAI 2023
Rank-DETR for High Quality Object Detection NIPS 2023
STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning NIPS 2023
Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning NIPS 2023
Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL NIPS 2023
Boosted Dynamic Neural Networks AAAI 2023
Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning AAAI 2023
Towards All-in-One Pre-Training via Maximizing Multi-Modal Mutual Information CVPR 2023
BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision CVPR 2023
Zero-Shot Generative Model Adaptation via Image-Specific Prompt Learning CVPR 2023
Siamese Image Modeling for Self-Supervised Vision Representation Learning CVPR 2023
Slide-Transformer: Hierarchical Vision Transformer With Local Self-Attention CVPR 2023
FLatten Transformer: Vision Transformer using Focused Linear Attention ICCV 2023
Dynamic Perceiver for Efficient Visual Recognition ICCV 2023
Adaptive Rotated Convolution for Rotated Object Detection ICCV 2023
EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones ICCV 2023
Deep Incubation: Training Large Models by Divide-and-Conquering ICCV 2023
Borrowing Knowledge From Pre-trained Language Model: A New Data-efficient Visual Learning Paradigm ICCV 2023
Budgeted Training for Vision Transformer ICLR 2023
Boosting Offline Reinforcement Learning with Action Preference Query ICML 2023
Efficient Knowledge Distillation from Model Checkpoints NIPS 2022
DiSparse: Disentangled Sparsification for Multitask Model Compression CVPR 2022
On the Integration of Self-Attention and Convolution CVPR 2022
Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding CVPR 2022
Provable General Function Class Representation Learning in Multitask Bandits and MDP NIPS 2022
Contrastive Language-Image Pre-Training with Knowledge Graphs NIPS 2022
Assessing a Single Image in Reference-Guided Image Synthesis AAAI 2022
A Mixture Of Surprises for Unsupervised Reinforcement Learning NIPS 2022
Latency-aware Spatial-wise Dynamic Networks NIPS 2022
AdaFocusV3: On Unified Spatial-Temporal Dynamic Video Recognition ECCV 2022
AutoLoss-Zero: Searching Loss Functions From Scratch for Generic Tasks CVPR 2022
Exploring the Equivalence of Siamese Self-Supervised Learning via a Unified Gradient Framework CVPR 2022
AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition CVPR 2022
ActiveNeRF: Learning Where to See with Uncertainty Estimation ECCV 2022
Learning to Weight Samples for Dynamic Early-Exiting Networks ECCV 2022
Vision Transformer With Deformable Attention CVPR 2022
Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation ICLR 2021
3D Object Detection With Pointformer CVPR 2021
Cross-Iteration Batch Normalization CVPR 2021
CondenseNet V2: Sparse Feature Reactivation for Deep Networks CVPR 2021
Adaptive Focus for Efficient Video Recognition ICCV 2021
Towards Learning Spatially Discriminative Feature Representations ICCV 2021
Frequency Domain Image Translation: More Photo-Realistic, Better Identity-Preserving ICCV 2021
Searching Parameterized AP Loss for Object Detection NIPS 2021
Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition NIPS 2021
Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning NIPS 2021
Revisiting Locally Supervised Learning: an Alternative to End-to-end Training ICLR 2021
Evolving Attention with Residual Convolutions ICML 2021
Resolution Adaptive Networks for Efficient Inference CVPR 2020
Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification NIPS 2020
Domain Conditioned Adaptation Network AAAI 2020
Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation ECCV 2020
Rethinking the Value of Network Pruning ICLR 2019
Horizontal Pyramid Matching for Person Re-Identification AAAI 2019
Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning NIPS 2019
Asymmetric Valleys: Beyond Sharp and Flat Local Minima NIPS 2019
Improved Techniques for Training Adaptive Deep Networks ICCV 2019
Implicit Semantic Data Augmentation for Deep Networks NIPS 2019
CondenseNet: An Efficient DenseNet Using Learned Group Convolutions CVPR 2018
Resource Aware Person Re-Identification Across Multiple Resolutions CVPR 2018
Multi-Scale Dense Networks for Resource Efficient Image Classification ICLR 2018
Learning Efficient Convolutional Networks Through Network Slimming ICCV 2017
Densely Connected Convolutional Networks CVPR 2017
Supervised Word Mover's Distance NIPS 2016
Anytime Representation Learning ICML 2013