Heiga Zen

20 papers · 2016–2024 · across 5 top CS/AI conferences

Achievements


🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (10) 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (5)

🏃 Academic Marathon (8) 🐝 Cross-Pollinator (13) 🏆 Keyword Champion (2) 👥 Mega-Team (22) 🤝 Dynamic Duo (11) 🔬 Deep Specialist (13) 🧬 Topic Evolution 🗃️ Keyword Collector (88) ⚡ Prolific Year (5) 🚀 Conference Pioneer 💎 Century Club (20)

Conferences

INTERSPEECH (14) ICLR (3) CORL (1) ICML (1) NIPS (1)

Top co-authors

Yu Zhang (11) Yonghui Wu (6) Ye Jia (6) Jonathan Shen (4) Ron J. Weiss (4) Michiel Bacchiani (3) Yuma Koizumi (3) Zhifeng Chen (3) Bhuvana Ramabhadran (3) Nanxin Chen (3)

Keywords

speech synthesis (5) text-to-speech synthesis (4) prosody prediction (2) waveform generation (2) large language model (2) long short-term memory (2) multilingual speech (2) recurrent neural network (2) statistical parametric speech synthesis (2) speech restoration (2) speech corpus (2) non-autoregressive model (2) multilingual speech synthesis (2) language model alignment (1) automatic speech recognition (1) self-supervised learning (1) multi-modal learning (1) text representation (1) direct preference optimization (1) semi-supervised learning (1)

Papers

Geometric-Averaged Preference Optimization for Soft Preference Labels · NIPS 2024
FLEURS-R: A Restored Multilingual Speech Corpus for Generation Tasks · INTERSPEECH 2024
SayTap: Language to Quadrupedal Locomotion · CORL 2023
LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus · INTERSPEECH 2023
MAESTRO: Matched Speech Text Representations through Modality Matching · INTERSPEECH 2022
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping · INTERSPEECH 2022
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks · INTERSPEECH 2022
Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling · INTERSPEECH 2021
WaveGrad: Estimating Gradients for Waveform Generation · ICLR 2021
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS · INTERSPEECH 2021
Semi-Supervision in ASR: Sequential MixMatch and Factorized TTS-Based Augmentation · INTERSPEECH 2021
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis · INTERSPEECH 2021
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning · INTERSPEECH 2019
Hierarchical Generative Modeling for Controllable Speech Synthesis · ICLR 2019
Sample Efficient Adaptive Text-to-Speech · ICLR 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech · INTERSPEECH 2019
Sequence-to-sequence Neural Network Model with 2D Attention for Learning Japanese Pitch Accents · INTERSPEECH 2018
Parallel WaveNet: Fast High-Fidelity Speech Synthesis · ICML 2018
Multi-Language Multi-Speaker Acoustic Modeling for LSTM-RNN Based Statistical Parametric Speech Synthesis · INTERSPEECH 2016
Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices · INTERSPEECH 2016