Fengyu Yang

22 papers · 2020–2026 · 8 top CS/AI conferences

Achievements

🌍 Conference Polyglot (8) · 🌉 Interdisciplinary Bridge · 🧭 Keyword Pioneer · 🌈 Renaissance Researcher (5) · 🏃 Academic Marathon (5) · 🏆 Keyword Champion (2) · 🧬 Topic Evolution · 🔥 Unstoppable (6) · 💎 Century Club (21) · ⚡ Prolific Year (9) · 🗃️ Keyword Collector (112)

Conferences

CVPR (7) · INTERSPEECH (4) · AAAI (3) · ECCV (2) · ICCV (2) · NIPS (2) · ACL (1) · WACV (1)

Top co-authors

Daniel Wang (4) · Hyoungseob Park (4) · Andrew Owens (4) · Alex Wong (4) · Ziyao Zeng (3) · Hanbin Zhao (3) · Yujun Wang (3) · Stefano Soatto (3) · Lei Xie (3) · Dong Lao (3)

Keywords

multimodal learning (4) · tactile sensing (4) · diffusion model (3) · image restoration (2) · tactile perception (2) · monocular depth estimation (2) · neural radiance field (2) · text-to-speech synthesis (2) · prosody modeling (2) · metric scale (2) · zero-shot learning (2) · self-supervised learning (1) · image generation (1) · data augmentation (1) · object detection (1) · style transfer (1) · semantic segmentation (1) · contrastive learning (1) · computer vision (1) · test-time adaptation (1)

Papers

VideoSeg-R1: Reasoning Video Object Segmentation via Reinforcement Learning · AAAI 2026
Discretized Gaussian Representation for Tomographic Reconstruction · ICCV 2025
Tri-Ergon: Fine-Grained Video-to-Audio Generation with Multi-Modal Conditions and LUFS Control · AAAI 2025
TextToucher: Fine-Grained Text-to-Touch Generation · AAAI 2025
FreeMan: Towards Benchmarking 3D Human Pose Estimation under Real-World Conditions · CVPR 2024
RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions · NIPS 2024
APISR: Anime Production Inspired Real-World Anime Super-Resolution · CVPR 2024
WorDepth: Variational Language Prior for Monocular Depth Estimation · CVPR 2024
Binding Touch to Everything: Learning Unified Multimodal Tactile Representations · CVPR 2024
Tactile-Augmented Radiance Fields · CVPR 2024
On the Viability of Monocular Depth Pre-training for Semantic Segmentation · ECCV 2024
Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling · INTERSPEECH 2024
VCISR: Blind Single Image Super-Resolution With Video Compression Synthetic Data · WACV 2024
The Xiaomi AI Lab’s Speech Translation Systems for IWSLT 2023 Offline Task, Simultaneous Task and Speech-to-Speech Task · ACL 2023
Improving Bilingual TTS Using Language And Phonology Embedding With Embedding Strength Modulator · INTERSPEECH 2023
Boosting Detection in Crowd Analysis via Underutilized Output Features · CVPR 2023
Generating Visual Scenes from Touch · ICCV 2023
RBC: Rectifying the Biased Context in Continual Semantic Segmentation · ECCV 2022
Sparse and Complete Latent Organization for Geospatial Semantic Segmentation · CVPR 2022
Touch and Go: Learning from Human-Collected Vision and Touch · NIPS 2022
Enriching Source Style Transfer in Recognition-Synthesis Based Non-Parallel Voice Conversion · INTERSPEECH 2021
Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis · INTERSPEECH 2020