Hang Hua
11 papers · 2019–2026 · 7 top CS/AI conferences
Achievements
🌉 Interdisciplinary Bridge
🐝 Cross-Pollinator (14)
🌍 Conference Polyglot (7)
🏃 Academic Marathon (6)
🌈 Renaissance Researcher (5)
🧬 Topic Evolution
🗃️ Keyword Collector (58)
💎 Century Club (10)
❓ The Questioner
Conferences
AAAI (3)
CVPR (2)
NIPS (2)
ECCV (1)
EMNLP (1)
ICCV (1)
NAACL (1)
Keywords
vision-language model (3)
multimodal learning (2)
vision language model (2)
image captioning (2)
zero-shot learning (2)
large language model (2)
video captioning (1)
decision making (1)
benchmark evaluation (1)
video understanding (1)
image editing (1)
text-to-image generation (1)
object tracking (1)
instruction following (1)
cross-modal learning (1)
image restoration (1)
diffusion model (1)
image processing (1)
latent representation (1)
instruction tuning (1)
Papers
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting · AAAI 2026
V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning · AAAI 2025
Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding · AAAI 2025
VidComposition: Can MLLMs Analyze Compositions in Compiled Videos? · CVPR 2025
FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity · CVPR 2025
FineMatch: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction · ECCV 2024
BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis · EMNLP 2024
PromptFix: You Prompt and We Fix the Photo · NIPS 2024
PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3 · ICCV 2023
Noise Stability Regularization for Improving BERT Fine-tuning · NAACL 2021
Controllable Unsupervised Text Attribute Transfer via Editing Entangled Latent Representation · NIPS 2019