conftrace_

Wenyi Hong

13 papers · 2021–2026 · 6 conferences · across top CS/AI conferences

Achievements

🗺️ Taxonomy Completionist (18) 🌈 Renaissance Researcher (5) 🌍 Conference Polyglot (5) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer

🐣 Hot Topic Early Bird 🤝 Dynamic Duo (12) 👥 Mega-Team (28) ⚡ Prolific Year (5) 💎 Century Club (12) 🔥 Unstoppable (5)

Conferences

ICLR (5) NIPS (3) CVPR (2) ACL (1) ECCV (1) ICCV (1)

Top co-authors

Jie Tang (13) Ming Ding (11) Yuxiao Dong (8) Weihan Wang (7) Zhuoyi Yang (6) Wendi Zheng (6) Xiaotao Gu (5) Bin Xu (4) Ji Qi (3) Qingsong Lv (3)

Keywords

visual question answering (3) video question answering (2) text-to-image generation (2) multimodal large language model (2) visual language model (2) cross-modal learning (1) image synthesis (1) video understanding (1) visual grounding (1) vector quantization (1) vision language model (1) generative adversarial network (1) multi-modal large language model (1) vision-language model (1) context window (1) long video understanding (1) video benchmark (1) graphical user interface (1) token compression (1) hierarchical transformer (1)

Papers

Glyph: Scaling Context Windows via Visual-Text Compression · ACL 2026
LVBench: An Extreme Long Video Understanding Benchmark · ICCV 2025
MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models · CVPR 2025
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer · ICLR 2025
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents · ICLR 2025
CogCoM: A Visual Language Model with Chain-of-Manipulations Reasoning · ICLR 2025
CogVLM: Visual Expert for Pretrained Language Models · NIPS 2024
Relay Diffusion: Unifying diffusion process across resolutions for image synthesis · ICLR 2024
CogAgent: A Visual Language Model for GUI Agents · CVPR 2024
Inf-DiT: Upsampling any-resolution image with memory-efficient diffusion transformer · ECCV 2024
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers · ICLR 2023
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers · NIPS 2022
CogView: Mastering Text-to-Image Generation via Transformers · NIPS 2021