Zhizheng Wu

21 papers · 2016–2026 · 8 top CS/AI conferences

Achievements

🐝 Cross-Pollinator (8) 🧭 Keyword Pioneer 🏃 Academic Marathon (9) 🌍 Conference Polyglot (6) 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🏆 Grand Slam 🧬 Topic Evolution 🏆 Keyword Champion 💎 Century Club (18) 🚀 Conference Pioneer ⚡ Prolific Year (5) 🗃️ Keyword Collector (69)

Conferences

INTERSPEECH (10) ICLR (3) ACL (2) NIPS (2) AAAI (1) AACL (1) ICML (1) IJCNLP (1)

Top co-authors

Yuancheng Wang (6) Xueyao Zhang (5) Haizhou Li (4) Xiaohai Tian (3) Simon King (3) Xu Tan (2) Jinyu Li (2) Chen Chen (2) Mirjam Wester (2) Lei He (2)

Keywords

speech synthesis (5) speaker identity (3) preference alignment (2) data poisoning (2) voice conversion (2) model compression (2) poison pill attack (2) knowledge retrieval (2) direct preference optimization (2) model pruning (2) spoofing detection (1) deep learning (1) instruction following (1) speaker embedding (1) zero-shot learning (1) face recognition (1) connectionist temporal classification (1) multimodal learning (1) signal processing (1) speaker verification (1)

Papers

Closing the Modality Reasoning Gap for Speech Large Language Models · ACL 2026
Multi-Metric Preference Alignment for Generative Speech Restoration · AAAI 2026
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer · ICLR 2025
Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement · ICLR 2025
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models · ICLR 2025
Swallowing the Poison Pills: Insights from Vulnerability Disparity Among LLMs · IJCNLP 2025
Swallowing the Poison Pills: Insights from Vulnerability Disparity Among LLMs · AACL 2025
Advancing Zero-shot Text-to-Speech Intelligibility across Diverse Domains via Preference Alignment · ACL 2025
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models · ICML 2024
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words · NIPS 2024
AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models · NIPS 2023
PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network · INTERSPEECH 2023
Cross-Lingual Voice Conversion with a Cycle Consistency Loss on Linguistic Representation · INTERSPEECH 2021
Building a Mixed-Lingual Neural TTS System with Only Monolingual Data · INTERSPEECH 2019
Siri On-Device Deep Learning-Guided Unit Selection Text-to-Speech System · INTERSPEECH 2017
An Investigation of Spoofing Speech Detection Under Additive Noise and Reverberant Conditions · INTERSPEECH 2016
Analysis of the Voice Conversion Challenge 2016 Evaluation Results · INTERSPEECH 2016
A Template-Based Approach for Speech Synthesis Intonation Generation Using LSTMs · INTERSPEECH 2016
Waveform Generation Based on Signal Reshaping for Statistical Parametric Speech Synthesis · INTERSPEECH 2016
GlottDNN — A Full-Band Glottal Vocoder for Statistical Parametric Speech Synthesis · INTERSPEECH 2016
The Voice Conversion Challenge 2016 · INTERSPEECH 2016