Xiang Yin

23 papers · 2020–2025 · 8 top CS/AI conferences

Achievements

🌍 Conference Polyglot (8) 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🏃 Academic Marathon (5)

🐝 Cross-Pollinator (13) 🗺️ Taxonomy Completionist (52) 🤝 Dynamic Duo (11) 🏆 Grand Slam ⚡ Prolific Year (11) 🗃️ Keyword Collector (116) 🔥 Unstoppable (6) 💎 Century Club (23)

Conferences

INTERSPEECH (6) ACL (5) AAAI (4) ICLR (2) IJCAI (2) NIPS (2) ICCV (1) ICML (1)

Top co-authors

Yi Ren (11) Zejun Ma (9) Jinglin Liu (6) Zhenhui Ye (6) Zhou Zhao (6) Rongjie Huang (5) Chunfeng Wang (5) Jinzheng He (5) Pengfei Wei (5) Ziyue Jiang (4)

Keywords

prosody modeling (3) speech synthesis (3) emotion recognition (2) large language model (2) contrastive learning (2) voice conversion (2) text-to-speech synthesis (2) domain adaptation (2) conversational speech synthesis (2) style transfer (2) multimodal learning (2) diffusion model (2) speech-to-speech translation (2) data augmentation (1) action recognition (1) talking face generation (1) video generation (1) adversarial learning (1) neural machine translation (1) information bottleneck (1)

Papers

Chain-Talker: Chain Understanding and Rendering for Empathetic Conversational Speech Synthesis · ACL 2025
Argumentative Large Language Models for Explainable and Contestable Claim Verification · AAAI 2025
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis · ICLR 2024
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis · ICLR 2024
Explaining Arguments’ Strength: Unveiling the Role of Attacks and Supports · IJCAI 2024
FedST: Federated Style Transfer Learning for Non-IID Image Segmentation · AAAI 2024
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes · NIPS 2024
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling · AAAI 2024
GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech · INTERSPEECH 2023
Explaining Random Forests Using Bipolar Argumentation and Markov Networks · AAAI 2023
UniLG: A Unified Structure-aware Framework for Lyrics Generation · ACL 2023
AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation · ACL 2023
CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-Training · ACL 2023
Virtual Try-On with Pose-Garment Keypoints Guided Inpainting · ICCV 2023
Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models · ICML 2023
AudioQR: Deep Neural Audio Watermarks For QR Code · IJCAI 2023
StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation · INTERSPEECH 2023
S2CD: Self-heuristic Speaker Content Disentanglement for Any-to-Any Voice Conversion · INTERSPEECH 2023
Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective · NIPS 2023
Towards high-fidelity singing voice conversion with acoustic reference and contrastive predictive coding · INTERSPEECH 2022
An Automatic Soundtracking System for Text-to-Speech Audiobooks · INTERSPEECH 2022
Fine-Grained Prosody Modeling in Neural Speech Synthesis Using ToBI Representation · INTERSPEECH 2021
Xiaomingbot: A Multilingual Robot News Reporter · ACL 2020