Tingting Gao

17 papers · 2022–2026 · 7 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (15) 🗺️ Taxonomy Completionist (31) 🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🌍 Conference Polyglot (7)

🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 💎 Century Club (13) ⚡ Prolific Year (9) 🗃️ Keyword Collector (74)

Conferences

CVPR (5) ACL (4) AAAI (3) ICLR (2) ECCV (1) ICCV (1) ICML (1)

Top co-authors

Di Zhang (8) Fan Yang (8) Bin Wen (4) Yan Li (3) Guorui Zhou (3) Chenkai Zhang (2) Yiming Lei (2) Longrong Yang (2) ShaoGuo Liu (2) Chaoxiang Cai (2)

Keywords

multimodal large language model (4) video understanding (3) large language model (2) contrastive learning (2) diffusion model (2) representation learning (2) text-to-image generation (2) video generation (1) chain-of-thought reasoning (1) style transfer (1) in-context learning (1) multimodal learning (1) image synthesis (1) text-to-image synthesis (1) feature extraction (1) narrative understanding (1) cross-modal retrieval (1) self-supervised learning (1) semantic alignment (1) visual storytelling (1)

Papers

TIME: Temporal-Sensitive Multi-Dimensional Instruction Tuning and Robust Benchmarking for Video-LLMs AAAI 2026
Beyond Tokens: Dynamic Latent Reasoning via Semantic Residual Refinement AAAI 2026
Compressing then Matching: An Efficient Pre-training Paradigm for Multimodal Embedding ACL 2026
IGenBench: Benchmarking the Reliability of Text-to-Infographic Generation ACL 2026
GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art ACL 2025
SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding CVPR 2025
Libra-Merging: Importance-redundancy and Pruning-merging Trade-off for Acceleration Plug-in in Large Vision-Language Model CVPR 2025
CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation CVPR 2025
iMOVE: Instance-Motion-Aware Video Understanding ACL 2025
MUSE: Multi-Subject Unified Synthesis via Explicit Layout Semantic Expansion ICCV 2025
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model ICLR 2025
TaskGalaxy: Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types ICLR 2025
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment ICML 2025
Drag Anything: Motion Control for Anything using Entity Representation ECCV 2024
Learning Multi-Dimensional Human Preference for Text-to-Image Generation CVPR 2024
Decouple Content and Motion for Conditional Image-to-Video Generation AAAI 2024
Domain Generalization via Shuffled Style Assembly for Face Anti-Spoofing CVPR 2022