Kaizhi Qian

18 papers · 2017–2025 · 7 top CS/AI conferences

Achievements

🌍 Conference Polyglot (7) 🏃 Academic Marathon (8) 🧭 Keyword Pioneer 🤝 Dynamic Duo (15) 🧬 Topic Evolution 🗃️ Keyword Collector (78) 💎 Century Club (18) 🔥 Unstoppable (7) 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird

Conferences

ICML (7) INTERSPEECH (5) NIPS (2) AAAI (1) ACL (1) CVPR (1) ICCV (1)

Top co-authors

Yang Zhang (15) Shiyu Chang (11) Mark Hasegawa-Johnson (8) Heting Gao (5) Junrui Ni (5) Chuang Gan (5) David Cox (4) Yonggan Fu (2) Jiaben Chen (2) Xuesong Yang (2)

Research topics

Synthesis (1)

Keywords

automatic speech recognition (4) voice conversion (3) unsupervised learning (3) representation learning (3) zero-shot learning (3) self-supervised learning (3) speech recognition (2) speech processing (2) model pruning (2) speech representation (2) transfer learning (2) speech synthesis (2) speech enhancement (2) multimodal generation (2) knowledge transfer (1) autoregressive transformer (1) style transfer (1) information bottleneck (1) few-shot learning (1) bayesian inference (1)

Papers

PLAY2PROMPT: Zero-shot Tool Instruction Optimization for LLM Agents via Tool Play · ACL 2025
UniMuMo: Unified Text, Music, and Motion Generation · AAAI 2025
RapVerse: Coherent Vocals and Whole-Body Motion Generation from Text · ICCV 2025
Speech Self-Supervised Learning Using Diffusion Model Synthetic Data · ICML 2024
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling · ICML 2024
Physics-Driven Diffusion Models for Impact Sound Synthesis From Videos · CVPR 2023
Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning · ICML 2023
WavPrompt: Towards Few-Shot Spoken Language Understanding with Frozen Language Models · INTERSPEECH 2022
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing · NIPS 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers · ICML 2022
Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition · INTERSPEECH 2022
Zero-Shot Cross-Lingual Phonetic Recognition with External Language Embedding · INTERSPEECH 2021
Speech Denoising with Auditory Models · INTERSPEECH 2021
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition · NIPS 2021
Global Prosody Style Transfer Without Text Transcriptions · ICML 2021
Unsupervised Speech Decomposition via Triple Information Bottleneck · ICML 2020
AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss · ICML 2019
Speech Enhancement Using Bayesian Wavenet · INTERSPEECH 2017