Ziyang Chen
26 papers · 2017–2026 · 10 top CS/AI conferences
Achievements
🏃 Academic Marathon (8)
🌍 Conference Polyglot (10)
🌉 Interdisciplinary Bridge
🧭 Keyword Pioneer
🐝 Cross-Pollinator (5)
🌈 Renaissance Researcher (10)
🤝 Dynamic Duo (12)
👥 Mega-Team (53)
🧬 Topic Evolution
💎 Century Club (25)
❓ The Questioner
⚡ Prolific Year (8)
🗃️ Keyword Collector (133)
🔥 Unstoppable (5)
Conferences
CVPR (12)
ACL (4)
AAAI (2)
NIPS (2)
CORL (1)
ECCV (1)
EMNLP (1)
ICCV (1)
IJCAI (1)
MICCAI (1)
Top co-authors
Keywords
self-supervised learning (6)
audio-visual learning (5)
multimodal learning (4)
medical image segmentation (3)
diffusion model (2)
audio generation (2)
large language model (2)
continual learning (2)
test-time adaptation (2)
depth estimation (2)
3d reconstruction (2)
sound localization (2)
vision-language model (2)
multimodal representation (2)
image generation (2)
temporal reasoning (2)
representation learning (2)
question answering (2)
attention mechanism (2)
sound generation (2)
Papers
LiteLong: Resource-Efficient Long-Context Data Synthesis for LLMs
AAAI 2026
GPS as a Control Signal for Image Generation
CVPR 2025
Gradient Alignment Improves Test-Time Adaptation for Medical Image Segmentation
AAAI 2025
Enjoying Information Dividend: Gaze Track-based Medical Weakly Supervised Segmentation
MICCAI 2025
Advancing General Multimodal Capability of Vision-language Models with Pyramid-descent Visual Position Encoding
ACL 2025
ToolExpNet: Optimizing Multi-Tool Selection in LLMs with Similarity and Dependency-Aware Experience Networks
ACL 2025
Supervising Sound Localization by In-the-wild Egomotion
CVPR 2025
Video-Guided Foley Sound Generation with Multimodal Controls
CVPR 2025
Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding
CVPR 2025
Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark
CVPR 2024
Images that Sound: Composing Images and Sounds on a Single Canvas
NIPS 2024
Temporal Knowledge Question Answering via Abstract Reasoning Induction
ACL 2024
MoCha-Stereo: Motif Channel Attention Network for Stereo Matching
CVPR 2024
Binding Touch to Everything: Learning Unified Multimodal Tactile Representations
CVPR 2024
Touchstone Benchmark: Are We on the Right Way for Evaluating AI Algorithms for Medical Segmentation?
NIPS 2024
Each Test Image Deserves A Specific Prompt: Continual Test-Time Adaptation for 2D Medical Image Segmentation
CVPR 2024
Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning
CVPR 2024
Conditional Generation of Audio From Video via Foley Analogies
CVPR 2023
Self-Supervised Video Forensics by Audio-Visual Anomaly Detection
CVPR 2023
Large Language Models Meet Harry Potter: A Dataset for Aligning Dialogue Agents with Characters
EMNLP 2023
Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation
ICCV 2023
Multi-granularity Temporal Question Answering over Knowledge Graphs
ACL 2023
Mix and Localize: Localizing Sound Sources in Mixtures
CVPR 2022
Sound Localization by Self-Supervised Time Delay Estimation
ECCV 2022
Structure from Silence: Learning Scene Structure from Ambient Sound
CORL 2021
Modeling Trajectories with Recurrent Neural Networks
IJCAI 2017