conftrace_

Licheng Yu

37 papers · 2015–2026 · 8 conferences · across top CS/AI conferences

Achievements

🌈 Renaissance Researcher (9) · 🌉 Interdisciplinary Bridge · 🌍 Conference Polyglot (8) · 🏃 Academic Marathon (10) · 🗺️ Taxonomy Completionist (67) · 🧭 Keyword Pioneer · 🐣 Hot Topic Early Bird · 🔬 Deep Specialist (13) · 🧬 Topic Evolution · 💎 Century Club (36) · 📈 Trend Setter · ❓ The Questioner · 🔥 Unstoppable (9) · 🗃️ Keyword Collector (161) · ⚡ Prolific Year (7)

Conferences

CVPR (19) ECCV (6) EMNLP (5) ACL (2) ICCV (2) EACL (1) ICLR (1) NAACL (1)

Papers

AdvancedIF: Rubric-Based Benchmarking and Reinforcement Learning for Advancing LLM Instruction Following · ACL 2026
ROICtrl: Boosting Instance Control for Visual Generation · CVPR 2025
Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction · CVPR 2025
Apollo: An Exploration of Video Understanding in Large Multimodal Models · CVPR 2025
Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs · CVPR 2025
Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis · CVPR 2024
Layout-Agnostic Scene Text Image Synthesis with Diffusion Models · CVPR 2024
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence · CVPR 2024
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis · CVPR 2024
Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression · ECCV 2024
Ameli: Enhancing Multimodal Entity Linking with Fine-Grained Attributes · EACL 2024
AVID: Any-Length Video Inpainting with Diffusion Model · CVPR 2024
Learning Procedure-Aware Video Representation From Instructional Videos and Their Narrations · CVPR 2023
Tell Me What Happened: Unifying Text-Guided Video Completion via Multimodal Masked Video Generation · CVPR 2023
FAME-ViL: Multi-Tasking Vision-Language Model for Heterogeneous Fashion Tasks · CVPR 2023
RoPAWS: Robust Semi-supervised Representation Learning from Uncurated Data · ICLR 2023
CiT: Curation in Training for Effective Vision-Language Data · ICCV 2023
Unsupervised Vision-and-Language Pre-Training via Retrieval-Based Multi-Granular Alignment · CVPR 2022
FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified Retrieval and Captioning · EMNLP 2022
GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval · ECCV 2022
FashionViL: Fashion-Focused Vision-and-Language Representation Learning · ECCV 2022
Connecting What To Say With Where To Look by Modeling Human Attention Traces · CVPR 2021
BachGAN: High-Resolution Image Synthesis From Salient Object Layout · CVPR 2020
TVQA+: Spatio-Temporal Grounding for Video Question Answering · ACL 2020
Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models · ECCV 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval · ECCV 2020
UNITER: UNiversal Image-TExt Representation Learning · ECCV 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training · EMNLP 2020
What is More Likely to Happen Next? Video-and-Language Future Event Prediction · EMNLP 2020
Violin: A Large-Scale Dataset for Video-and-Language Inference · CVPR 2020
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout · NAACL 2019
Multi-Target Embodied Question Answering · CVPR 2019
MAttNet: Modular Attention Network for Referring Expression Comprehension · CVPR 2018
TVQA: Localized, Compositional Video Question Answering · EMNLP 2018
Hierarchically-Attentive RNN for Album Summarization and Storytelling · EMNLP 2017
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions · CVPR 2017
Visual Madlibs: Fill in the Blank Description Generation and Question Answering · ICCV 2015