Gao Huang
114 papers · 2013–2026 · 11 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Conference Polyglot (10)
Taxonomy Completionist (10)
Interdisciplinary Bridge
Academic Marathon (12)
Keyword Trendsetter Combo (3)
Conference Loyalist (25)
Dynamic Duo (49)
Deep Specialist (18)
Topic Evolution
Keyword Champion (5)
Triple Crown
Grand Slam
Keyword Collector (424)
Century Club (111)
Unstoppable (10)
Conference Pioneer
The Questioner
Prolific Year (20)
Conferences
CVPR (32)
NIPS (25)
ICCV (13)
ECCV (12)
AAAI (11)
ICLR (11)
ICML (5)
ACL (2)
EACL (1)
MICCAI (1)
NAACL (1)
Keywords
vision transformer (9)
image classification (8)
efficient computing (7)
diffusion model (7)
object detection (7)
model compression (7)
representation learning (7)
multimodal large language model (5)
adaptive inference (5)
neural network optimization (5)
transfer learning (5)
convolutional neural network (5)
image generation (5)
self-supervised learning (4)
dynamic inference (4)
inference efficiency (4)
feature reuse (4)
contrastive learning (4)
offline reinforcement learning (4)
spatial redundancy (4)
Papers
Vision Transformers Are Circulant Attention Learners
AAAI 2026
SpatialActor: Exploring Disentangled Spatial Representations for Robust Robotic Manipulation
AAAI 2026
Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers
EACL 2026
Dynamic Diffusion Transformer
ICLR 2025
GridMix: Exploring Spatial Modulation for Neural Fields in PDE Modeling
ICLR 2025
DiveR-CT: Diversity-enhanced Red Teaming Large Language Model Assistants with Relaxing Constraints
AAAI 2025
DenseGrounding: Improving Dense Language-Vision Semantics for Ego-centric 3D Visual Grounding
ICLR 2025
Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment
CVPR 2025
Differential Transformer
ICLR 2025
CODA: Repurposing Continuous VAEs for Discrete Tokenization
ICCV 2025
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance
ICCV 2025
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation
CVPR 2025
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
CVPR 2025
CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning
CVPR 2025
DTOS: Dynamic Time Object Sensing with Large Multimodal Model
CVPR 2025
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
CVPR 2025
ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding
CVPR 2025
Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing
NAACL 2025
EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance
CVPR 2025
Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias
ICLR 2025
How Far Is Video Generation from World Model: A Physical Law Perspective
ICML 2025
Boosting LLM Agents with Recursive Contemplation for Effective Deception Handling
ACL 2024
Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering
ECCV 2024
Agent Attention: On the Integration of Softmax and Linear Attention
ECCV 2024
DyFADet: Dynamic Feature Aggregation for Temporal Action Detection
ECCV 2024
Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels
ECCV 2024
GRA: Detecting Oriented Objects through Group-wise Rotating and Attention
ECCV 2024
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
ECCV 2024
Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
ECCV 2024
SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning
ICML 2024
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
CVPR 2024
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models
CVPR 2024
Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis
CVPR 2024
Mask Grounding for Referring Image Segmentation
CVPR 2024
GSVA: Generalized Segmentation via Multimodal Large Language Models
CVPR 2024
Learning 1D Causal Visual Representation with De-focus Attention Networks
NIPS 2024
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution
NIPS 2024
Training an Open-Vocabulary Monocular 3D Detection Model without 3D Data
NIPS 2024
Bridging the Divide: Reconsidering Softmax and Linear Attention
NIPS 2024
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
NIPS 2024
COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing
NIPS 2024
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation
NIPS 2024
Demystify Mamba in Vision: A Linear Attention Perspective
NIPS 2024
ADDP: Learning General Representations for Image Recognition and Generation with Alternating Denoising Diffusion Process
ICLR 2024
LLaVA-UHD: an LMM Perceiving any Aspect Ratio and High-Resolution Images
ECCV 2024
Cardiac Copilot: Automatic Probe Guidance for Echocardiography with World Model
MICCAI 2024
Exploring Temporal Feature Correlation for Efficient and Stable Video Semantic Segmentation
AAAI 2024
ExpeL: LLM Agents Are Experiential Learners
AAAI 2024
PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents
ACL 2024
Causal Intervention for Human Trajectory Prediction with Cross Attention Mechanism
AAAI 2023
Rank-DETR for High Quality Object Detection
NIPS 2023
STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning
NIPS 2023
Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning
NIPS 2023
Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL
NIPS 2023
Boosted Dynamic Neural Networks
AAAI 2023
Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning
AAAI 2023
Towards All-in-One Pre-Training via Maximizing Multi-Modal Mutual Information
CVPR 2023
BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision
CVPR 2023
Zero-Shot Generative Model Adaptation via Image-Specific Prompt Learning
CVPR 2023
Siamese Image Modeling for Self-Supervised Vision Representation Learning
CVPR 2023
Slide-Transformer: Hierarchical Vision Transformer With Local Self-Attention
CVPR 2023
FLatten Transformer: Vision Transformer using Focused Linear Attention
ICCV 2023
Dynamic Perceiver for Efficient Visual Recognition
ICCV 2023
Adaptive Rotated Convolution for Rotated Object Detection
ICCV 2023
EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones
ICCV 2023
Deep Incubation: Training Large Models by Divide-and-Conquering
ICCV 2023
Borrowing Knowledge From Pre-trained Language Model: A New Data-efficient Visual Learning Paradigm
ICCV 2023
Budgeted Training for Vision Transformer
ICLR 2023
Boosting Offline Reinforcement Learning with Action Preference Query
ICML 2023
Efficient Knowledge Distillation from Model Checkpoints
NIPS 2022
DiSparse: Disentangled Sparsification for Multitask Model Compression
CVPR 2022
On the Integration of Self-Attention and Convolution
CVPR 2022
Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
CVPR 2022
Provable General Function Class Representation Learning in Multitask Bandits and MDP
NIPS 2022
Contrastive Language-Image Pre-Training with Knowledge Graphs
NIPS 2022
Assessing a Single Image in Reference-Guided Image Synthesis
AAAI 2022
A Mixture Of Surprises for Unsupervised Reinforcement Learning
NIPS 2022
Latency-aware Spatial-wise Dynamic Networks
NIPS 2022
AdaFocusV3: On Unified Spatial-Temporal Dynamic Video Recognition
ECCV 2022
AutoLoss-Zero: Searching Loss Functions From Scratch for Generic Tasks
CVPR 2022
Exploring the Equivalence of Siamese Self-Supervised Learning via a Unified Gradient Framework
CVPR 2022
AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition
CVPR 2022
ActiveNeRF: Learning Where to See with Uncertainty Estimation
ECCV 2022
Learning to Weight Samples for Dynamic Early-Exiting Networks
ECCV 2022
Vision Transformer With Deformable Attention
CVPR 2022
Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation
ICLR 2021
3D Object Detection With Pointformer
CVPR 2021
Cross-Iteration Batch Normalization
CVPR 2021
CondenseNet V2: Sparse Feature Reactivation for Deep Networks
CVPR 2021
Adaptive Focus for Efficient Video Recognition
ICCV 2021
Towards Learning Spatially Discriminative Feature Representations
ICCV 2021
Frequency Domain Image Translation: More Photo-Realistic, Better Identity-Preserving
ICCV 2021
Searching Parameterized AP Loss for Object Detection
NIPS 2021
Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition
NIPS 2021
Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning
NIPS 2021
Revisiting Locally Supervised Learning: an Alternative to End-to-end Training
ICLR 2021
Evolving Attention with Residual Convolutions
ICML 2021
Resolution Adaptive Networks for Efficient Inference
CVPR 2020
Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification
NIPS 2020
Domain Conditioned Adaptation Network
AAAI 2020
Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation
ECCV 2020
Rethinking the Value of Network Pruning
ICLR 2019
Horizontal Pyramid Matching for Person Re-Identification
AAAI 2019
Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning
NIPS 2019
Asymmetric Valleys: Beyond Sharp and Flat Local Minima
NIPS 2019
Improved Techniques for Training Adaptive Deep Networks
ICCV 2019
Implicit Semantic Data Augmentation for Deep Networks
NIPS 2019
CondenseNet: An Efficient DenseNet Using Learned Group Convolutions
CVPR 2018
Resource Aware Person Re-Identification Across Multiple Resolutions
CVPR 2018
Multi-Scale Dense Networks for Resource Efficient Image Classification
ICLR 2018
Learning Efficient Convolutional Networks Through Network Slimming
ICCV 2017
Densely Connected Convolutional Networks
CVPR 2017
Supervised Word Mover's Distance
NIPS 2016
Anytime Representation Learning
ICML 2013