Jianhua Han
37 papers · 2017–2025 · 10 conferences · across top CS/AI conferences
Achievements
Cross-Pollinator (12)
Keyword Pioneer
Academic Marathon (8)
Conference Polyglot (10)
Renaissance Researcher (5)
Interdisciplinary Bridge
Taxonomy Completionist (48)
Dynamic Duo (36)
Keyword Champion (2)
Mega-Team (30)
Prolific Year (14)
Century Club (37)
Keyword Collector (139)
Conferences
CVPR (9)
ECCV (8)
NIPS (5)
AAAI (4)
ICLR (4)
ICCV (3)
ACL (1)
EMNLP (1)
IJCAI (1)
WACV (1)
Keywords
object detection (6)
multimodal learning (5)
large language model (5)
vision-language model (5)
zero-shot detection (4)
autonomous driving (4)
contrastive learning (4)
semantic segmentation (3)
image generation (3)
zero-shot learning (3)
zero-shot classification (3)
transfer learning (3)
cross-modal alignment (2)
vision language model (2)
multimodal large language model (2)
multi-modal learning (2)
lane detection (2)
video understanding (2)
self-supervised learning (2)
multi-task learning (2)
Papers
DisCo: Discovering Common Affordance from Large Models for Actionable Part Perception
WACV 2025
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models
CVPR 2025
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
CVPR 2025
ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance
ICCV 2025
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
ICLR 2025
CorNav: Autonomous Agent with Self-Corrected Planning for Zero-Shot Vision-and-Language Navigation
ACL 2024
VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation
NIPS 2024
SlowFocus: Enhancing Fine-grained Temporal Understanding in Video LLM
NIPS 2024
Implicit Concept Removal of Diffusion Models
ECCV 2024
Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving
ECCV 2024
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
ECCV 2024
PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion
ECCV 2024
LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model
ECCV 2024
Ins-DetCLIP: Aligning Detection Model to Follow Human-Language Instruction
ICLR 2024
UNIT: Unifying Image and Text Recognition in One Vision Encoder
NIPS 2024
Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models
CVPR 2024
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection
CVPR 2024
Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis
ICLR 2024
Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images
AAAI 2024
DetGPT: Detect What You Need via Reasoning
EMNLP 2023
GrowCLIP: Data-Aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-Training
ICCV 2023
Task-customized Masked Autoencoder via Mixture of Cluster-conditional Experts
ICLR 2023
Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving
CVPR 2023
CLIP2: Contrastive Language-Image-Point Pretraining From Real-World Point Cloud Data
CVPR 2023
CapDet: Unifying Dense Captioning and Open-World Detection Pretraining
CVPR 2023
NLIP: Noise-Robust Language-Image Pre-training
AAAI 2023
DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-Training via Word-Region Alignment
CVPR 2023
DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability
ICCV 2023
CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving
ECCV 2022
Task-Customized Self-Supervised Pre-training with Scalable Dynamic Routing
AAAI 2022
ONCE-3DLanes: Building Monocular 3D Lane Detection
CVPR 2022
DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection
NIPS 2022
Open-World Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding
ECCV 2022
Generative Negative Text Replay for Continual Vision-Language Pretraining
ECCV 2022
Laneformer: Object-Aware Row-Column Transformers for Lane Detection
AAAI 2022
Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving
NIPS 2022
Aggregating Crowd Wisdoms with Label-aware Autoencoders
IJCAI 2017