Humphrey Shi

63 papers · 2021–2026 · 7 top CS/AI conferences

Achievements


🏃 Academic Marathon (5) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🌍 Conference Polyglot (7) 🐝 Cross-Pollinator (10)

🐣 Hot Topic Early Bird 🏠 Conference Loyalist (23) 🤝 Dynamic Duo (15) 👥 Mega-Team (20) 🔬 Deep Specialist (12) 🧬 Topic Evolution ⚡ Prolific Year (19) 💎 Century Club (63) 🔥 Unstoppable (6) 🗃️ Keyword Collector (252)

Conferences

CVPR (23) WACV (12) ICCV (9) ECCV (6) NIPS (5) AAAI (4) ICLR (4)

Top co-authors

Zhangyang Wang (15) Xingqian Xu (13) Shant Navasardyan (12) Gao Huang (9) Yunchao Wei (8) Kai Wang (7) Jiachen Li (6) Yao Zhao (6) Ali Hassani (6) Vidit Goel (5)

Keywords

diffusion model (12) text-to-image generation (6) semantic segmentation (6) image generation (5) domain adaptation (5) model compression (5) attention mechanism (4) multimodal learning (4) object detection (4) image inpainting (4) instance segmentation (4) generative model (4) generative adversarial network (3) vision-language model (3) transformer architecture (3) image editing (3) style transfer (3) efficient computing (3) vision transformer (2) semi-supervised learning (2)

Papers

Beyond Realism: Learning the Art of Expressive Composition with StickerNet WACV 2026
Safe Vision-Language Models via Unsafe Weights Manipulation WACV 2026
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text CVPR 2025
Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment CVPR 2025
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders ICLR 2025
ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance ICLR 2025
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models ICLR 2025
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance ICCV 2025
CLIP-GS: Unifying Vision-Language Representation with 3D Gaussian Splatting ICCV 2025
T2I-Copilot: A Training-Free Multi-Agent Text-to-Image System for Enhanced Prompt Interpretation and Interactive Generation ICCV 2025
HyPiDecoder: Hybrid Pixel Decoder for Efficient Segmentation and Detection ICCV 2025
VCoder: Versatile Vision Encoders for Multimodal Large Language Models CVPR 2024
FineStyle: Fine-grained Controllable Style Personalization for Text-to-image Models NIPS 2024
Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level NIPS 2024
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts NIPS 2024
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models CVPR 2024
Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis CVPR 2024
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models CVPR 2024
PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor CVPR 2024
OpenBias: Open-set Bias Detection in Text-to-Image Generative Models CVPR 2024
Brush2Prompt: Contextual Prompt Generator for Object Inpainting CVPR 2024
Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation ECCV 2024
Benchmarking Object Detectors with COCO: A New Path Forward ECCV 2024
Diffusion for Natural Image Matting ECCV 2024
Social Reward: Evaluating and Enhancing Generative AI through Million-User Feedback from an Online Creative Community ICLR 2024
FarSight: A Physics-Driven Whole-Body Biometric System at Large Distance and Altitude WACV 2024
Video Instance Matting WACV 2024
Continuous Adaptation for Interactive Segmentation Using Teacher-Student Architecture WACV 2024
Towards Better Structured Pruning Saliency by Reorganizing Convolution WACV 2024
VMFormer: End-to-End Video Matting With Transformer WACV 2024
Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of Text-to-Image Diffusion Models To Learn Any Unseen Style CVPR 2023
Neighborhood Attention Transformer CVPR 2023
Zero-Shot Generative Model Adaptation via Image-Specific Prompt Learning CVPR 2023
Sim2RealVS: A New Benchmark for Video Stabilization With a Strong Baseline WACV 2023
Learning Mask-aware CLIP Representations for Zero-Shot Segmentation NIPS 2023
OneFormer: One Transformer To Rule Universal Image Segmentation CVPR 2023
Graph Transformer GANs for Graph-Constrained House Generation CVPR 2023
Automatic High Resolution Wire Segmentation and Removal CVPR 2023
Keys To Better Image Inpainting: Structure and Texture Go Hand in Hand WACV 2023
Image Completion With Heterogeneously Filtered Spectral Hints WACV 2023
MI-GAN: A Simple Baseline for Image Inpainting on Mobile Devices ICCV 2023
Versatile Diffusion: Text, Images and Variations All in One Diffusion Model ICCV 2023
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators ICCV 2023
More Control for Free! Image Synthesis With Semantic Diffusion Guidance WACV 2023
Boosted Dynamic Neural Networks AAAI 2023
Object Localization Under Single Coarse Point Supervision CVPR 2022
VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution CVPR 2022
AdaFocusV3: On Unified Spatial-Temporal Dynamic Video Recognition ECCV 2022
Point-to-Box Network for Accurate Object Detection via Single Point Supervision ECCV 2022
SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image ECCV 2022
DiSparse: Disentangled Sparsification for Multitask Model Compression CVPR 2022
Towards Layer-Wise Image Vectorization CVPR 2022
AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition CVPR 2022
Mask Matching Transformer for Few-Shot Segmentation NIPS 2022
Auto-X3D: Ultra-Efficient Video Understanding via Finer-Grained Neural Architecture Search WACV 2022
Adaptive Consistency Regularization for Semi-Supervised Transfer Learning CVPR 2021
Learning to Track Instances without Video Annotations CVPR 2021
Interpretable Visual Reasoning via Induced Symbolic Space ICCV 2021
Any-Precision Deep Neural Networks AAAI 2021
High-Resolution Deep Image Matting AAAI 2021
A Multi-Mode Modulator for Multi-Domain Few-Shot Classification ICCV 2021
CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation AAAI 2021
Rethinking Text Segmentation: A Novel Dataset and a Text-Specific Refinement Approach CVPR 2021