Yingming Gao

13 papers · 2019–2026 · 3 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (3) 🏃 Academic Marathon (6) 🐝 Cross-Pollinator (12)

🗺️ Taxonomy Completionist (28) 🐣 Hot Topic Early Bird 🏆 Keyword Champion (2) 🧬 Topic Evolution 💎 Century Club (12) 🗃️ Keyword Collector (72) 🔥 Unstoppable (5)

Conferences

INTERSPEECH (10) AAAI (2) ACL (1)

Top co-authors

Ya Li (8) Jinsong Zhang (4) Cong Wang (2) Fengping Wang (2) Yayue Deng (2) Yanlu Xie (2) Jinlong Xue (2) Peter Birkholz (2) Dengfeng Ke (2) Puyuan Guo (2)

Keywords

diffusion model (3) large language model (2) singing voice conversion (2) speech synthesis (2) zero-shot learning (2) text-to-speech synthesis (2) audio codec (2) mandarin speech (1) pitch tracking (1) knowledge distillation (1) multitask learning (1) phonetic analysis (1) multi-modal learning (1) u-net architecture (1) attention mechanism (1) convolutional neural network (1) speech processing (1) support vector machine (1) controllable generation (1) motion generation (1)

Papers

HQ-SVC: Towards High-Quality Zero-Shot Singing Voice Conversion in Low-Resource Scenarios · AAAI 2026
Beyond Surface Simplicity: Revealing Hidden Reasoning Attributes for Precise Commonsense Diagnosis · ACL 2025
Controllable 3D Dance Generation Using Diffusion-Based Transformer U-Net · AAAI 2025
Enhancing Modal Fusion by Alignment and Label Matching for Multimodal Emotion Recognition · INTERSPEECH 2024
Improving Audio Codec-based Zero-Shot Text-to-Speech Synthesis with Multi-Modal Context and Large Language Model · INTERSPEECH 2024
Retrieval Augmented Generation in Prompt-based Text-to-Speech Synthesis with Context-Aware Contrastive Language-Audio Pretraining · INTERSPEECH 2024
SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion · INTERSPEECH 2024
FTA-net: A Frequency and Time Attention Network for Speech Depression Detection · INTERSPEECH 2023
Dual Audio Encoders Based Mandarin Prosodic Boundary Prediction by Using Multi-Granularity Prosodic Representations · INTERSPEECH 2023
A study of production error analysis for Mandarin-speaking Children with Hearing Impairment · INTERSPEECH 2022
Formant Tracking Using Dilated Convolutional Networks Through Dense Connection with Gating Mechanism · INTERSPEECH 2020
An Investigation of the Target Approximation Model for Tone Modeling and Recognition in Continuous Mandarin Speech · INTERSPEECH 2020
Articulatory Copy Synthesis Based on a Genetic Algorithm · INTERSPEECH 2019