Shan Yang

20 papers · 2017–2026 · 9 top CS/AI conferences

Achievements

🏃 Academic Marathon (8) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (9) 🐝 Cross-Pollinator (9) 🧬 Topic Evolution 🔥 Unstoppable (6) 💎 Century Club (19) ⚡ Prolific Year (7) 🗃️ Keyword Collector (96)

Conferences

INTERSPEECH (8) AAAI (3) EMNLP (2) ICCV (2) ACL (1) CVPR (1) ECCV (1) IJCNLP (1) NIPS (1)

Top co-authors

Lei Xie (7) Dan Su (6) Jian Cong (4) Junbang Liang (3) Dong Yu (2) Shiliang Pu (2) Guanglin Niu (2) Yu Lou (2) Yongfei Zhang (2) Na Hu (2)

Keywords

speech synthesis (5) text-to-speech synthesis (4) multimodal learning (3) semantic similarity (2) self-supervised learning (2) zero-shot learning (2) domain adversarial training (2) multimodal fusion (2) relation extraction (2) voice conversion (2) knowledge distillation (2) variational autoencoder (2) attention mechanism (2) flow-based model (2) singing voice synthesis (2) fine-grained classification (1) audio-visual learning (1) few-shot learning (1) acoustic model (1) video generation (1)

Papers

UniCUE: Unified Recognition and Generation Framework for Chinese Cued Speech Video-to-Speech Generation AAAI 2026
ThinkAnswer Loss: Balancing Semantic Similarity and Exact Matching for LLM Reasoning Enhancement EMNLP 2025
TokSing: Singing Voice Synthesis based on Discrete Tokens INTERSPEECH 2024
ViLA: Efficient Video-Language Alignment for Video Question Answering ECCV 2024
Unleashing the Power of Large Language Models in Zero-shot Relation Extraction via Self-Prompting EMNLP 2024
ICAR: Image-Based Complementary Auto Reasoning AAAI 2024
UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis AAAI 2023
Multi-mode Neural Speech Coding Based on Deep Generative Networks INTERSPEECH 2023
Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion INTERSPEECH 2022
Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers INTERSPEECH 2022
Entity Concept-enhanced Few-shot Relation Extraction IJCNLP 2021
Entity Concept-enhanced Few-shot Relation Extraction ACL 2021
GLAVNet: Global-Local Audio-Visual Cues for Fine-Grained Material Recognition CVPR 2021
AI Choreographer: Music Conditioned 3D Dance Generation With AIST++ ICCV 2021
Attention Bottlenecks for Multimodal Fusion NIPS 2021
Glow-WaveGAN: Learning Speech Representations from GAN-Based Variational Auto-Encoder for High Fidelity Flow-Based Speech Synthesis INTERSPEECH 2021
Controllable Context-Aware Conversational Speech Synthesis INTERSPEECH 2021
Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis INTERSPEECH 2020
Data Efficient Voice Cloning from Noisy Samples with Domain Adversarial Training INTERSPEECH 2020
Learning-Based Cloth Material Recovery From Video ICCV 2017