Han Hu

91 papers · 2013–2026 · across 12 top CS/AI conferences

Achievements

🌍 Conference Polyglot (11) 🏃 Academic Marathon (12) 🐣 Hot Topic Early Bird 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (14) 🏠 Conference Loyalist (28) 🤝 Dynamic Duo (31) 🏆 Grand Slam 👥 Mega-Team (33) 🔬 Deep Specialist (19) 🧬 Topic Evolution 🏆 Keyword Champion (3) 🚀 Conference Pioneer 🗃️ Keyword Collector (306) ❓ The Questioner 💎 Century Club (89) 🔥 Unstoppable (9) ⚡ Prolific Year (28) 📈 Trend Setter

Conferences

CVPR (28) ICCV (17) NIPS (16) ECCV (8) ICLR (5) ICML (4) AAAI (3) EMNLP (3) IJCAI (3) WACV (2) ACL (1) JMLR (1)

Top co-authors

Zheng Zhang (31) Yue Cao (21) Yutong Lin (13) Stephen Lin (12) Fangyun Wei (10) Houwen Peng (9) Yixuan Wei (9) Zhenda Xie (9) Ze Liu (8) Qi Dai (8)

Research topics

Differential Privacy (1) Digital Humanities (1) Core AI (1)

Keywords

object detection (18) vision transformer (11) representation learning (11) semantic segmentation (10) knowledge distillation (7) masked image modeling (7) diffusion model (7) transfer learning (6) contrastive learning (6) image classification (6) self-supervised learning (6) federated learning (5) model compression (5) zero-shot learning (4) video recognition (3) instance segmentation (3) data augmentation (3) semi-supervised learning (3) action recognition (3) attention mechanism (3)

Papers

Beyond Ranking: Fine-Grained Diagnostics and Self-Improvement for MLLMs ACL 2026
From Text to Simulation: A Multi-Agent LLM Workflow for Automated Chemical Process Design AAAI 2026
InterIDEAS: Philosophical Intertextuality via LLMs EMNLP 2025
DeepMIM: Deep Supervision for Masked Image Modeling WACV 2025
FusionBench: A Unified Library and Comprehensive Benchmark for Deep Model Fusion JMLR 2025
Federated Deconfounding and Debiasing Learning for Out-of-Distribution Generalization IJCAI 2025
Targeted Low-rank Refinement: Enhancing Sparse Language Models with Precision ICML 2025
Cross-Silo Feature Space Alignment for Federated Learning on Clients with Imbalanced Data AAAI 2025
RBench: Graduate-level Multi-disciplinary Benchmarks for LLM & MLLM Complex Reasoning Evaluation ICML 2025
Noise-Resistant Video Anomaly Detection via RGB Error-Guided Multiscale Predictive Coding and Dynamic Memory CVPR 2025
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions ICLR 2025
Revisit the Open Nature of Open Vocabulary Semantic Segmentation ICLR 2025
Neural Architecture Search Driven by Locally Guided Diffusion for Personalized Federated Learning ICCV 2025
MotionEditor: Editing Video Motion via Content-Aware Diffusion CVPR 2024
SimDA: Simple Diffusion Adapter for Efficient Video Generation CVPR 2024
POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning CVPR 2024
ScalingFilter: Assessing Data Quality through Inverse Utilization of Scaling Laws EMNLP 2024
Segment and Caption Anything CVPR 2024
Unsupervised Graphic Layout Grouping With Transformers WACV 2024
InstructDiffusion: A Generalist Modeling Interface for Vision Tasks CVPR 2024
Multiple View Geometry Transformers for 3D Human Pose Estimation CVPR 2024
Joint Input and Output Coordination for Class-Incremental Learning IJCAI 2024
V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection ICLR 2024
Parameter-Efficient Multi-Task Model Fusion with Partial Linearization ICLR 2024
GAIA: Zero-shot Talking Avatar Generation ICLR 2024
Data-efficient Large Vision Models through Sequential Autoregression ICML 2024
Implicit Temporal Modeling with Learnable Alignment for Video Recognition ICCV 2023
Revisit the Power of Vanilla Knowledge Distillation: from Small Scale to Large Scale NIPS 2023
Rank-DETR for High Quality Object Detection NIPS 2023
GlyphControl: Glyph Conditional Control for Visual Text Generation NIPS 2023
ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation NIPS 2023
Federated Learning with Manifold Regularization and Normalized Update Reaggregation NIPS 2023
Pairwise GUI Dataset Construction Between Android Phones and Tablets NIPS 2023
One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation NIPS 2023
FedABC: Targeting Fair Competition in Personalized Federated Learning AAAI 2023
ResFormer: Scaling ViTs With Multi-Resolution Training CVPR 2023
TinyMIM: An Empirical Study of Distilling MIM Pre-Trained Models CVPR 2023
SeqTrack: Sequence to Sequence Learning for Visual Object Tracking CVPR 2023
EfficientViT: Memory Efficient Vision Transformer With Cascaded Group Attention CVPR 2023
SVFormer: Semi-Supervised Video Transformer for Action Recognition CVPR 2023
On Data Scaling in Masked Image Modeling CVPR 2023
Side Adapter Network for Open-Vocabulary Semantic Segmentation CVPR 2023
Revealing the Dark Secrets of Masked Image Modeling CVPR 2023
Human Pose As Compositional Tokens CVPR 2023
DETRs With Hybrid Matching CVPR 2023
iCLIP: Bridging Image Classification and Contrastive Language-Image Pre-Training for Visual Recognition CVPR 2023
Mask-Attention-Free Transformer for 3D Instance Segmentation ICCV 2023
TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance ICCV 2023
Efficient Diffusion Training via Min-SNR Weighting Strategy ICCV 2023
Attentive Mask CLIP ICCV 2023
DETR Does Not Need Multi-Scale or Locality Design ICCV 2023
All in Tokens: Unifying Output Space of Visual Tasks via Soft Token ICCV 2023
Improving CLIP Fine-tuning Performance ICCV 2023
Improving Heterogeneous Model Reuse by Density Estimation IJCAI 2023
Graph Hawkes Transformer for Extrapolated Reasoning on Temporal Knowledge Graphs EMNLP 2022
Video Swin Transformer CVPR 2022
SimMIM: A Simple Framework for Masked Image Modeling CVPR 2022
Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation NIPS 2022
Could Giant Pre-trained Image Models Extract Universal Representations? NIPS 2022
Swin Transformer V2: Scaling Up Capacity and Resolution CVPR 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning NIPS 2022
A Simple Approach and Benchmark for 21,000-Category Object Detection ECCV 2022
RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation ECCV 2022
A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-Language Model ECCV 2022
End-to-End Semi-Supervised Object Detection With Soft Teacher ICCV 2021
Aligning Pretraining for Detection via Object-Level Contrastive Learning NIPS 2021
Semi-Supervised Semantic Segmentation via Adaptive Equalization Learning NIPS 2021
Bootstrap Your Object Detector via Mixed Training NIPS 2021
Capsule Network Is Not More Robust Than Convolutional Network CVPR 2021
Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning CVPR 2021
Group-Free 3D Object Detection via Transformers ICCV 2021
Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows ICCV 2021
Disentangled Non-local Neural Networks ECCV 2020
A Closer Look at Local Aggregation Operators in Point Cloud Analysis ECCV 2020
Negative Margin Matters: Understanding Margin in Few-shot Classification ECCV 2020
RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder NIPS 2020
Memory Enhanced Global-Local Aggregation for Video Object Detection CVPR 2020
Dense RepPoints: Representing Visual Objects with Dense Point Sets ECCV 2020
Parametric Instance Classification for Unsupervised Visual Feature Learning NIPS 2020
Scalable Differential Privacy with Certified Robustness in Adversarial Learning ICML 2020
RepPoints v2: Verification Meets Regression for Object Detection NIPS 2020
Local Relation Networks for Image Recognition ICCV 2019
Spatial-Temporal Relation Networks for Multi-Object Tracking ICCV 2019
RepPoints: Point Set Representation for Object Detection ICCV 2019
Deformable ConvNets V2: More Deformable, Better Results CVPR 2019
Learning Region Features for Object Detection ECCV 2018
Relation Networks for Object Detection CVPR 2018
Deformable Convolutional Networks ICCV 2017
WordSup: Exploiting Word Annotations for Character Based Text Detection ICCV 2017
Smooth Representation Clustering CVPR 2014
Pose from Flow and Flow from Pose CVPR 2013