Yonghui Wu
40 papers · 2014–2026 · 11 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer · Renaissance Researcher (5) · Interdisciplinary Bridge · Taxonomy Completionist (11) · Hot Topic Early Bird
Keyword Pioneer
Hot Topic Early Bird
Academic Marathon (11)
Dynamic Duo (12)
Grand Slam
Mega-Team (27)
Topic Evolution
Keyword Champion
Prolific Year (6)
Trend Setter
Century Club (39)
Unstoppable (12)
Keyword Collector (149)
Conferences
INTERSPEECH (15)
EMNLP (5)
ACL (4)
ICLR (4)
NIPS (3)
SEMEVAL (3)
ICML (2)
AAAI (1)
COLING (1)
CVPR (1)
NAACL (1)
Keywords
text-to-speech synthesis (4)
automatic speech recognition (4)
speech recognition (4)
word error rate (4)
convolutional neural network (3)
attention mechanism (3)
sequence-to-sequence model (3)
neural machine translation (3)
neural network (3)
speech synthesis (3)
dialogue state tracking (3)
language model (3)
self-supervised learning (2)
vision-language model (2)
text generation (2)
zero-shot learning (2)
named entity recognition (2)
data augmentation (2)
transfer learning (2)
machine translation (2)
Papers
MARCH: Multi-Agent Radiology Clinical Hierarchy for CT Report Generation
ACL 2026
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
ICLR 2025
Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding
COLING 2024
UF-HOBI at “Discharge Me!”: A Hybrid Solution for Discharge Summary Generation Through Prompt-based Tuning of GatorTronGPT Models
ACL 2024
VILA: Learning Image Aesthetics From User Comments With Vision-Language Pretraining
CVPR 2023
AnyTOD: A Programmable Task-Oriented Dialog System
EMNLP 2023
On the Impact of Cross-Domain Data on German Language Models
EMNLP 2023
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks
INTERSPEECH 2022
Show, Don’t Tell: Demonstrations Outperform Descriptions for Schema-Guided Task-Oriented Dialogue
NAACL 2022
SGD-X: A Benchmark for Robust Generalization in Schema-Guided Dialogue Systems
AAAI 2022
Vector-quantized Image Modeling with Improved VQGAN
ICLR 2022
Self-supervised learning with random-projection quantizer for speech recognition
ICML 2022
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
ICML 2022
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling
ICLR 2021
Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling
INTERSPEECH 2021
Effective Sequence-to-Sequence Dialogue State Tracking
EMNLP 2021
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS
INTERSPEECH 2021
Improved Noisy Student Training for Automatic Speech Recognition
INTERSPEECH 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
INTERSPEECH 2020
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context
INTERSPEECH 2020
Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation
ACL 2020
Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model
INTERSPEECH 2019
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
NIPS 2019
Hierarchical Generative Modeling for Controllable Speech Synthesis
ICLR 2019
Direct Speech-to-Speech Translation with a Sequence-to-Sequence Model
INTERSPEECH 2019
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech
INTERSPEECH 2019
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning
INTERSPEECH 2019
Two-Pass End-to-End Speech Recognition
INTERSPEECH 2019
Training Deeper Neural Machine Translation Models with Transparent Attention
EMNLP 2018
The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation
ACL 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
NIPS 2018
Compression of End-to-End Models
INTERSPEECH 2018
Speech Recognition for Medical Conversations
INTERSPEECH 2018
CaLcs: Continuously Approximating Longest Common Subsequence for Sequence Level Optimization
EMNLP 2018
Sequence-to-Sequence Models Can Directly Translate Foreign Speech
INTERSPEECH 2017
Tacotron: Towards End-to-End Speech Synthesis
INTERSPEECH 2017
UTHealth at SemEval-2016 Task 12: an End-to-End System for Temporal Information Extraction from Clinical Notes
SEMEVAL 2016
Reward Augmented Maximum Likelihood for Neural Structured Prediction
NIPS 2016
UTH-CCB: The Participation of the SemEval 2015 Challenge – Task 14
SEMEVAL 2015
UTH_CCB: A report for SemEval 2014 – Task 7 Analysis of Clinical Text
SEMEVAL 2014