Yonghui Wu

40 papers · 2014–2026 · 11 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (11) 🐣 Hot Topic Early Bird

🧭 Keyword Pioneer 🐣 Hot Topic Early Bird 🏃 Academic Marathon (11) 🤝 Dynamic Duo (12) 🏆 Grand Slam 👥 Mega-Team (27) 🧬 Topic Evolution 🏆 Keyword Champion ⚡ Prolific Year (6) 📈 Trend Setter 💎 Century Club (39) 🔥 Unstoppable (12) 🗃️ Keyword Collector (149)

Conferences

INTERSPEECH (15) EMNLP (5) ACL (4) ICLR (4) NIPS (3) SEMEVAL (3) ICML (2) AAAI (1) COLING (1) CVPR (1) NAACL (1)

Top co-authors

Zhifeng Chen (12) Yu Zhang (11) Chung-Cheng Chiu (9) Ye Jia (9) Ruoming Pang (8) Yuan Cao (7) Jiahui Yu (6) Ron J. Weiss (6) Heiga Zen (6) Jonathan Shen (5)

Keywords

text-to-speech synthesis (4) automatic speech recognition (4) speech recognition (4) word error rate (4) convolutional neural network (3) attention mechanism (3) sequence-to-sequence model (3) neural machine translation (3) neural network (3) speech synthesis (3) dialogue state tracking (3) language model (3) self-supervised learning (2) vision-language model (2) text generation (2) zero-shot learning (2) named entity recognition (2) data augmentation (2) transfer learning (2) machine translation (2)

Papers

MARCH: Multi-Agent Radiology Clinical Hierarchy for CT Report Generation · ACL 2026
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation · ICLR 2025
Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding · COLING 2024
UF-HOBI at “Discharge Me!”: A Hybrid Solution for Discharge Summary Generation Through Prompt-based Tuning of GatorTronGPT Models · ACL 2024
VILA: Learning Image Aesthetics From User Comments With Vision-Language Pretraining · CVPR 2023
AnyTOD: A Programmable Task-Oriented Dialog System · EMNLP 2023
On the Impact of Cross-Domain Data on German Language Models · EMNLP 2023
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks · INTERSPEECH 2022
Show, Don’t Tell: Demonstrations Outperform Descriptions for Schema-Guided Task-Oriented Dialogue · NAACL 2022
SGD-X: A Benchmark for Robust Generalization in Schema-Guided Dialogue Systems · AAAI 2022
Vector-quantized Image Modeling with Improved VQGAN · ICLR 2022
Self-supervised learning with random-projection quantizer for speech recognition · ICML 2022
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts · ICML 2022
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling · ICLR 2021
Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling · INTERSPEECH 2021
Effective Sequence-to-Sequence Dialogue State Tracking · EMNLP 2021
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS · INTERSPEECH 2021
Improved Noisy Student Training for Automatic Speech Recognition · INTERSPEECH 2020
Conformer: Convolution-augmented Transformer for Speech Recognition · INTERSPEECH 2020
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context · INTERSPEECH 2020
Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation · ACL 2020
Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model · INTERSPEECH 2019
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism · NIPS 2019
Hierarchical Generative Modeling for Controllable Speech Synthesis · ICLR 2019
Direct Speech-to-Speech Translation with a Sequence-to-Sequence Model · INTERSPEECH 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech · INTERSPEECH 2019
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning · INTERSPEECH 2019
Two-Pass End-to-End Speech Recognition · INTERSPEECH 2019
Training Deeper Neural Machine Translation Models with Transparent Attention · EMNLP 2018
The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation · ACL 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis · NIPS 2018
Compression of End-to-End Models · INTERSPEECH 2018
Speech Recognition for Medical Conversations · INTERSPEECH 2018
CaLcs: Continuously Approximating Longest Common Subsequence for Sequence Level Optimization · EMNLP 2018
Sequence-to-Sequence Models Can Directly Translate Foreign Speech · INTERSPEECH 2017
Tacotron: Towards End-to-End Speech Synthesis · INTERSPEECH 2017
UTHealth at SemEval-2016 Task 12: an End-to-End System for Temporal Information Extraction from Clinical Notes · SEMEVAL 2016
Reward Augmented Maximum Likelihood for Neural Structured Prediction · NIPS 2016
UTH-CCB: The Participation of the SemEval 2015 Challenge – Task 14 · SEMEVAL 2015
UTH_CCB: A report for SemEval 2014 – Task 7 Analysis of Clinical Text · SEMEVAL 2014