Ziyang Chen

26 papers · 2017–2026 · 10 top CS/AI conferences

Achievements

🏃 Academic Marathon (8) 🌍 Conference Polyglot (10) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (5)

🌈 Renaissance Researcher (10) 🤝 Dynamic Duo (12) 👥 Mega-Team (53) 🧬 Topic Evolution 💎 Century Club (25) ❓ The Questioner ⚡ Prolific Year (8) 🗃️ Keyword Collector (133) 🔥 Unstoppable (5)

Conferences

CVPR (12) ACL (4) AAAI (2) NIPS (2) CORL (1) ECCV (1) EMNLP (1) ICCV (1) IJCAI (1) MICCAI (1)

Top co-authors

Andrew Owens (12) Yong Xia (5) Yiwen Ye (5) Chao Feng (3) Bryan Russell (2) Yutong Xie (2) Nan Du (2) Xixi Hu (2) Xiaolong Li (2) Jianpeng Zhang (2)

Keywords

self-supervised learning (6) audio-visual learning (5) multimodal learning (4) medical image segmentation (3) diffusion model (2) audio generation (2) large language model (2) continual learning (2) test-time adaptation (2) depth estimation (2) 3d reconstruction (2) sound localization (2) vision-language model (2) multimodal representation (2) image generation (2) temporal reasoning (2) representation learning (2) question answering (2) attention mechanism (2) sound generation (2)

Papers

LiteLong: Resource-Efficient Long-Context Data Synthesis for LLMs · AAAI 2026
GPS as a Control Signal for Image Generation · CVPR 2025
Gradient Alignment Improves Test-Time Adaptation for Medical Image Segmentation · AAAI 2025
Enjoying Information Dividend: Gaze Track-based Medical Weakly Supervised Segmentation · MICCAI 2025
Advancing General Multimodal Capability of Vision-language Models with Pyramid-descent Visual Position Encoding · ACL 2025
ToolExpNet: Optimizing Multi-Tool Selection in LLMs with Similarity and Dependency-Aware Experience Networks · ACL 2025
Supervising Sound Localization by In-the-wild Egomotion · CVPR 2025
Video-Guided Foley Sound Generation with Multimodal Controls · CVPR 2025
Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding · CVPR 2025
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark · CVPR 2024
Images that Sound: Composing Images and Sounds on a Single Canvas · NIPS 2024
Temporal Knowledge Question Answering via Abstract Reasoning Induction · ACL 2024
MoCha-Stereo: Motif Channel Attention Network for Stereo Matching · CVPR 2024
Binding Touch to Everything: Learning Unified Multimodal Tactile Representations · CVPR 2024
Touchstone Benchmark: Are We on the Right Way for Evaluating AI Algorithms for Medical Segmentation? · NIPS 2024
Each Test Image Deserves A Specific Prompt: Continual Test-Time Adaptation for 2D Medical Image Segmentation · CVPR 2024
Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning · CVPR 2024
Conditional Generation of Audio From Video via Foley Analogies · CVPR 2023
Self-Supervised Video Forensics by Audio-Visual Anomaly Detection · CVPR 2023
Large Language Models Meet Harry Potter: A Dataset for Aligning Dialogue Agents with Characters · EMNLP 2023
Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation · ICCV 2023
Multi-granularity Temporal Question Answering over Knowledge Graphs · ACL 2023
Mix and Localize: Localizing Sound Sources in Mixtures · CVPR 2022
Sound Localization by Self-Supervised Time Delay Estimation · ECCV 2022
Structure from Silence: Learning Scene Structure from Ambient Sound · CORL 2021
Modeling Trajectories with Recurrent Neural Networks · IJCAI 2017