Zhizheng Wu
21 papers · 2016–2026 · 8 top CS/AI conferences
Achievements
🐝 Cross-Pollinator (8)
🧭 Keyword Pioneer
🏃 Academic Marathon (9)
🌍 Conference Polyglot (6)
🌈 Renaissance Researcher (5)
🌉 Interdisciplinary Bridge
🏆 Grand Slam
🧬 Topic Evolution
🏆 Keyword Champion
💎 Century Club (18)
🚀 Conference Pioneer
⚡ Prolific Year (5)
🗃️ Keyword Collector (69)
Conferences
INTERSPEECH (10)
ICLR (3)
ACL (2)
NIPS (2)
AAAI (1)
AACL (1)
ICML (1)
IJCNLP (1)
Keywords
speech synthesis (5)
speaker identity (3)
preference alignment (2)
data poisoning (2)
voice conversion (2)
model compression (2)
poison pill attack (2)
knowledge retrieval (2)
direct preference optimization (2)
model pruning (2)
spoofing detection (1)
deep learning (1)
instruction following (1)
speaker embedding (1)
zero-shot learning (1)
face recognition (1)
connectionist temporal classification (1)
multimodal learning (1)
signal processing (1)
speaker verification (1)
Papers
Closing the Modality Reasoning Gap for Speech Large Language Models
ACL 2026
Multi-Metric Preference Alignment for Generative Speech Restoration
AAAI 2026
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
ICLR 2025
Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement
ICLR 2025
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
ICLR 2025
Swallowing the Poison Pills: Insights from Vulnerability Disparity Among LLMs
IJCNLP 2025
Swallowing the Poison Pills: Insights from Vulnerability Disparity Among LLMs
AACL 2025
Advancing Zero-shot Text-to-Speech Intelligibility across Diverse Domains via Preference Alignment
ACL 2025
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
ICML 2024
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words
NIPS 2024
AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models
NIPS 2023
PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network
INTERSPEECH 2023
Cross-Lingual Voice Conversion with a Cycle Consistency Loss on Linguistic Representation
INTERSPEECH 2021
Building a Mixed-Lingual Neural TTS System with Only Monolingual Data
INTERSPEECH 2019
Siri On-Device Deep Learning-Guided Unit Selection Text-to-Speech System
INTERSPEECH 2017
An Investigation of Spoofing Speech Detection Under Additive Noise and Reverberant Conditions
INTERSPEECH 2016
Analysis of the Voice Conversion Challenge 2016 Evaluation Results
INTERSPEECH 2016
A Template-Based Approach for Speech Synthesis Intonation Generation Using LSTMs
INTERSPEECH 2016
Waveform Generation Based on Signal Reshaping for Statistical Parametric Speech Synthesis
INTERSPEECH 2016
GlottDNN — A Full-Band Glottal Vocoder for Statistical Parametric Speech Synthesis
INTERSPEECH 2016
The Voice Conversion Challenge 2016
INTERSPEECH 2016