Chao Weng

21 papers · 2018–2025 · 4 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (4) 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (12) 🧭 Keyword Pioneer 🏃 Academic Marathon (7)

🐝 Cross-Pollinator (15) 🌈 Renaissance Researcher (6) 👥 Mega-Team (20) 🤝 Dynamic Duo (13) 🧬 Topic Evolution 💎 Century Club (21) 🗃️ Keyword Collector (97) 🔥 Unstoppable (8) ⚡ Prolific Year (5)

Conferences

INTERSPEECH (17) ACL (2) CVPR (1) ICLR (1)

Top co-authors

Dong Yu (13) Jianwei Yu (7) Chengzhu Yu (6) Dan Su (5) Jia Cui (4) Chunlei Zhang (4) Shinji Watanabe (3) Jinchuan Tian (3) Meng Yu (3) Helin Wang (3)

Keywords

speech recognition (3) automatic speech recognition (2) language model (2) expressive speech synthesis (2) diffusion model (2) speech enhancement (2) character error rate (2) end-to-end speech recognition (2) duration informed attention network (2) speech synthesis (2) minimum bayes risk (2) attention mechanism (2) multitask learning (1) speech separation (1) multimodal learning (1) noise suppression (1) zero-shot learning (1) semi-supervised training (1) knowledge distillation (1) bayesian inference (1)

Papers

LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement ACL 2025
Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners ACL 2024
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models CVPR 2024
Bayes Risk Transducer: Transducer with Controllable Alignment Prediction INTERSPEECH 2023
Bayes Risk CTC: Controllable CTC Alignment in Sequence-to-Sequence Tasks ICLR 2023
High Fidelity Speech Enhancement with Band-split RNN INTERSPEECH 2023
Ultra Dual-Path Compression For Joint Echo Cancellation And Noise Suppression INTERSPEECH 2023
NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS INTERSPEECH 2023
Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model INTERSPEECH 2023
Improving Target Sound Extraction with Timestamp Information INTERSPEECH 2022
TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation INTERSPEECH 2021
GigaSpeech: An Evolving, Multi-Domain ASR Corpus with 10,000 Hours of Transcribed Audio INTERSPEECH 2021
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition INTERSPEECH 2021
Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition INTERSPEECH 2020
DurIAN-SC: Duration Informed Attention Network Based Singing Voice Conversion System INTERSPEECH 2020
DurIAN: Duration Informed Attention Network for Speech Synthesis INTERSPEECH 2020
Neural Spatio-Temporal Beamformer for Target Speech Separation INTERSPEECH 2020
Peking Opera Synthesis via Duration Informed Attention Network INTERSPEECH 2020
Large Margin Training for Attention Based End-to-End Speech Recognition INTERSPEECH 2019
Improving Attention Based Sequence-to-Sequence Models for End-to-End English Conversational Speech Recognition INTERSPEECH 2018
A Multistage Training Framework for Acoustic-to-Word Model INTERSPEECH 2018