Han Zhang

111 papers · 2016–2026 · 16 top CS/AI conferences

Achievements

🌍 Conference Polyglot (16) 🏃 Academic Marathon (10) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (6)

🧭 Keyword Pioneer 🌈 Renaissance Researcher (12) 🐣 Hot Topic Early Bird 🤝 Dynamic Duo (12) 👑 Triple Crown 🏆 Grand Slam 👥 Mega-Team (20) 🔬 Deep Specialist (14) 🧬 Topic Evolution 🗃️ Keyword Collector (362) 📈 Trend Setter ⚡ Prolific Year (20) 🚀 Conference Pioneer 🔥 Unstoppable (11) 💎 Century Club (104)

Conferences

CVPR (17) ICLR (16) AAAI (13) MICCAI (9) EMNLP (8) IJCAI (8) NIPS (8) ECCV (7) ACL (6) ICML (6) ICCV (5) NAACL (3) WACV (2) CORL (1) RSS (1) SEMEVAL (1)

Top co-authors

Zizhao Zhang (12) Huiwen Chang (9) Zhantao Yang (7) Lu Jiang (6) Ruifeng Xu (6) Dinggang Shen (6) Tomas Pfister (6) Honglak Lee (5) Augustus Odena (5) Hui Wang (5)

Keywords

large language model (11) generative adversarial network (8) image generation (8) diffusion model (6) knowledge distillation (5) generative model (4) continual learning (4) image synthesis (4) video generation (4) text-to-image generation (4) benchmark evaluation (3) convolutional neural network (3) vision-language model (3) catastrophic forgetting (3) reinforcement learning (3) multimodal learning (3) contrastive learning (3) vision transformer (3) feature extraction (3) multi-objective optimization (3)

Papers

FACTGUARD: Event-Centric and Commonsense-Guided Fake News Detection AAAI 2026
Spikingformer: A Key Foundation Model for Spiking Neural Networks AAAI 2026
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models ACL 2026
STELAR-VISION: Self-Topology-Aware Efficient Learning for Aligned Reasoning in Vision AAAI 2026
Out-of-Context Misinformation Detection via Variational Domain-Invariant Learning with Test-Time Training AAAI 2026
Diagnosing Hidden Instabilities in Model Editing via Uncertainty Quantification ACL 2026
Neo-Classic: A Benchmark for Evaluating Linguistic-Aesthetic Reasoning in Classical Chinese Poetry ACL 2026
DICE: Discrete Inversion Enabling Controllable Editing for Masked Generative Models WACV 2026
pEBR: A Probabilistic Approach to Embedding Based Retrieval EMNLP 2025
Accelerating Diffusion Sampling via Exploiting Local Transition Coherence ICCV 2025
Correcting Large Language Model Behavior via Influence Function AAAI 2025
Inheriting Generalized Learngene for Efficient Knowledge Transfer across Multiple Tasks AAAI 2025
BeyondGender: A Multifaceted Bilingual Dataset for Practical Sexism Detection AAAI 2025
COPR: Continual Human Preference Learning via Optimal Policy Regularization ACL 2025
DOGlove: Dexterous Manipulation with a Low-Cost Open-Source Haptic Force Feedback Glove RSS 2025
ALTER: Augmentation for Large-Table-Based Reasoning NAACL 2025
Wavelet-driven Decoupling and Physics-informed Mapping Network for Accelerated Multi-parametric MR Imaging MICCAI 2025
Sparsely Labeled fMRI Data Denoising with Meta-Learning-Based Semi-Supervised Domain Adaptation MICCAI 2025
MAK-GAN: Multi-level Adaptive Convolutional Kernels for Asymmetric Multi-modal PET Reconstruction MICCAI 2025
DexUMI: Using Human Hand as the Universal Manipulation Interface for Dexterous Manipulation CORL 2025
MITracker: Multi-View Integration for Visual Object Tracking CVPR 2025
BACON: Improving Clarity of Image Captions via Bag-of-Concept Graphs CVPR 2025
GeneMorphFormer: Transformer-Driven Cross-Scale Mapping from Gene Expression to Cortical Morphology MICCAI 2025
FluoroSAM: A Language-promptable Foundation Model for Flexible X-ray Image Segmentation MICCAI 2025
DiffOSeg: Omni Medical Image Segmentation via Multi-Expert Collaboration Diffusion Model MICCAI 2025
Towards Comprehensive and Prerequisite-Free Explainer for Graph Neural Networks IJCAI 2025
Epsilon-VAE: Denoising as Visual Decoding ICML 2025
Beyond Human Labels: A Multi-Linguistic Auto-Generated Benchmark for Evaluating Large Language Models on Resume Parsing EMNLP 2025
CHURRO: Making History Readable with an Open-Weight Large Vision-Language Model for High-Accuracy, Low-Cost Historical Text Recognition EMNLP 2025
Protein Large Language Models: A Comprehensive Survey EMNLP 2025
MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement AAAI 2025
MUC: Mixture of Uncalibrated Cameras for Robust 3D Human Body Reconstruction AAAI 2025
A Multi-Level Framework for Accelerating Training Transformer Models ICLR 2024
CPPO: Continual Learning for Reinforcement Learning with Human Feedback ICLR 2024
BatteryML: An Open-source Platform for Machine Learning on Battery Degradation ICLR 2024
Theoretical Study on Multi-objective Heuristic Search IJCAI 2024
DreamClean: Restoring Clean Image Using Deep Diffusion Prior ICLR 2024
CCM: Real-Time Controllable Visual Content Creation Using Text-to-Image Consistency Models ICML 2024
QKFormer: Hierarchical Spiking Transformer using Q-K Attention NIPS 2024
4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on RDBs NIPS 2024
EmpathyEar: An Open-source Avatar Multimodal Empathetic Chatbot ACL 2024
Incremental pre-training from smaller language models ACL 2024
WSSADN: A Weakly Supervised Spherical Age-Disentanglement Network for Detecting Developmental Disorders with Structural MRI MICCAI 2024
Lipschitz Singularities in Diffusion Models ICLR 2024
Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency ICLR 2024
Steering Prototypes With Prompt-Tuning for Rehearsal-Free Continual Learning WACV 2024
NUS-Emo at SemEval-2024 Task 3: Instruction-Tuning LLM for Multimodal Emotion-Cause Analysis in Conversations SEMEVAL 2024
Mixed Integer Linear Programming for Discrete Sampling Scheme Design in Diffusion MRI MICCAI 2024
LoCI-DiffCom: Longitudinal Consistency-Informed Diffusion Model for 3D Infant Brain Image Completion MICCAI 2024
NUS-Emo at SemEval-2024 Task 3: Instruction-Tuning LLM for Multimodal Emotion-Cause Analysis in Conversations NAACL 2024
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation ECCV 2024
Efficient Multi-view Unsupervised Feature Selection with Adaptive Structure Learning and Inference IJCAI 2024
Multi-objective Search via Lazy and Efficient Dominance Checks IJCAI 2023
Decision Tree for Locally Private Estimation with Public Data NIPS 2023
StoryBench: A Multifaceted Benchmark for Continuous Story Visualization NIPS 2023
Diversify Your Vision Datasets with Automatic Diffusion-based Augmentation NIPS 2023
Visual Prompt Tuning for Generative Transfer Learning CVPR 2023
MAGVIT: Masked Generative Video Transformer CVPR 2023
MAGE: MAsked Generative Encoder To Unify Representation Learning and Image Synthesis CVPR 2023
Dimensionality-Varying Diffusion Process CVPR 2023
Enhanced Training of Query-Based Object Detection via Selective Query Recollection CVPR 2023
SVDiff: Compact Parameter Space for Diffusion Fine-Tuning ICCV 2023
FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation ICCV 2023
VQ3D: Learning a 3D-Aware Generative Model on ImageNet ICCV 2023
Phenaki: Variable Length Video Generation from Open Domain Textual Descriptions ICLR 2023
Using Language to Extend to Unseen Domains ICLR 2023
Muse: Text-To-Image Generation via Masked Generative Transformers ICML 2023
Heuristic-Search Approaches for the Multi-Objective Shortest-Path Problem: Progress and Research Opportunities IJCAI 2023
DualPrompt: Complementary Prompting for Rehearsal-Free Continual Learning ECCV 2022
BLT: Bidirectional Layout Transformer for Controllable Layout Generation ECCV 2022
Unitail: Detecting, Reading, and Matching in Retail Scene ECCV 2022
MaskGIT: Masked Generative Image Transformer CVPR 2022
MAXIM: Multi-Axis MLP for Image Processing CVPR 2022
Learning To Prompt for Continual Learning CVPR 2022
Dimension Reduction for Efficient Dense Retrieval via Conditional Autoencoder EMNLP 2022
Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding AAAI 2022
Powering Finetuning in Few-Shot Learning: Domain-Agnostic Bias Reduction with Selected Sampling AAAI 2022
Lane Detection Transformer Based on Multi-Frame Horizontal and Vertical Attention and Visual Transformer Module ECCV 2022
Learning Instance-Specific Adaptation for Cross-Domain Segmentation ECCV 2022
MaxViT: Multi-axis Vision Transformer ECCV 2022
GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization NIPS 2022
ViTGAN: Training GANs with Vision Transformers ICLR 2022
CLLE: A Benchmark for Continual Language Learning Evaluation in Multilingual Machine Translation EMNLP 2022
Vector-quantized Image Modeling with Improved VQGAN ICLR 2022
Givens Coordinate Descent Methods for Rotation Matrix Learning in Trainable Embedding Indexes ICLR 2022
ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding NAACL 2021
Cross-Modal Contrastive Learning for Text-to-Image Generation CVPR 2021
Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction ICLR 2021
Improved Transformer for High-Resolution GANs NIPS 2021
PseudoSeg: Designing Pseudo Labels for Semantic Segmentation ICLR 2021
Improved Consistency Regularization for GANs AAAI 2021
Auto-GAN: Self-Supervised Collaborative Learning for Medical Image Synthesis AAAI 2020
ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring ICLR 2020
Consistency Regularization for Generative Adversarial Networks ICLR 2020
Small-GAN: Speeding up GAN Training using Core-Sets ICML 2020
Approximation Capabilities of Neural ODEs and Invertible Residual Networks ICML 2020
Semi-supervised Clustering via Pairwise Constrained Optimal Graph IJCAI 2020
ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation IJCAI 2020
FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence NIPS 2020
Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models CVPR 2020
Distilling Effective Supervision From Severe Label Noise CVPR 2020
Co-Occurrent Features in Semantic Segmentation CVPR 2019
Multimodal, Multilingual Grapheme-to-Phoneme Conversion for Low-Resource Languages EMNLP 2019
Decoding EEG by Visual-guided Deep Neural Networks IJCAI 2019
Self-Attention Generative Adversarial Networks ICML 2019
A Teacher-Student Framework for Maintainable Dialog Manager EMNLP 2018
Improving GANs Using Optimal Transport ICLR 2018
AttnGAN: Fine-Grained Text to Image Generation With Attentional Generative Adversarial Networks CVPR 2018
StackGAN: Text to Photo-Realistic Image Synthesis With Stacked Generative Adversarial Networks ICCV 2017
Link the Head to the "Beak": Zero Shot Learning From Noisy Text Description at Part Precision CVPR 2017
SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-Grained Recognition CVPR 2016