Zhehuai Chen

21 papers · 2016–2026 · 4 top CS/AI conferences

Achievements

🐣 Hot Topic Early Bird 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (15) 🧭 Keyword Pioneer 🌍 Conference Polyglot (4) 🧬 Topic Evolution 🗃️ Keyword Collector (83) 🚀 Conference Pioneer 📈 Trend Setter 💎 Century Club (20) 🔥 Unstoppable (8) ⚡ Prolific Year (5)

Conferences

INTERSPEECH (14) ACL (4) NAACL (2) ICLR (1)

Top co-authors

Boris Ginsburg (7) Bhuvana Ramabhadran (6) Chao-Han Huck Yang (5) Yu Zhang (5) Jagadeesh Balam (5) Andrew Rosenberg (5) Pedro J. Moreno (5) Gary Wang (4) Szu-Wei Fu (3) Kunal Dhawan (3)

Keywords

speech recognition (9) automatic speech recognition (7) large language model (6) data augmentation (4) speech translation (4) multimodal learning (3) viterbi beam search (2) semi-supervised learning (2) connectionist temporal classification (2) self-supervised learning (2) word error rate (2) question answering (2) speech language model (2) text-to-speech synthesis (2) machine translation (2) weighted finite-state transducer (2) contrastive learning (2) named entity recognition (1) multilingual translation (1) knowledge distillation (1)

Papers

Speech-Hands: A Self-Reflection Voice Agentic Approach to Speech Recognition and Audio Reasoning with Omni Perception · ACL 2026
Anticipating Future with Large Language Model for Simultaneous Machine Translation · NAACL 2025
SpeechIQ: Speech-Agentic Intelligence Quotient Across Cognitive Levels in Voice Understanding by Large Language Models · ACL 2025
NeKo: Cross-Modality Post-Recognition Error Correction with Tasks-Guided Mixture-of-Experts Language Model · ACL 2025
Audio Large Language Models Can Be Descriptive Speech Quality Evaluators · ICLR 2025
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning · NAACL 2025
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators · ACL 2024
Less is More: Accurate Speech Recognition & Translation without Web-Scale Data · INTERSPEECH 2024
Instruction Data Generation and Unsupervised Adaptation for Speech Language Models · INTERSPEECH 2024
DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment · INTERSPEECH 2024
Using Text Injection to Improve Recognition of Personal Identifiers in Speech · INTERSPEECH 2023
Unsupervised Data Selection via Discrete Speech Representation for ASR · INTERSPEECH 2022
MAESTRO: Matched Speech Text Representations through Modality Matching · INTERSPEECH 2022
Semi-Supervision in ASR: Sequential MixMatch and Factorized TTS-Based Augmentation · INTERSPEECH 2021
Conformer Parrotron: A Faster and Stronger End-to-End Speech Conversion and Recognition Model for Atypical Speech · INTERSPEECH 2021
SCADA: Stochastic, Consistent and Adversarial Data Augmentation to Improve ASR · INTERSPEECH 2020
Improving Speech Recognition Using GAN-Based Speech Synthesis and Contrastive Unspoken Text Selection · INTERSPEECH 2020
Joint Grapheme and Phoneme Embeddings for Contextual End-to-End ASR · INTERSPEECH 2019
Knowledge Distillation for Sequence Model · INTERSPEECH 2018
A GPU-based WFST Decoder with Exact Lattice Generation · INTERSPEECH 2018
Phone Synchronous Decoding with CTC Lattice · INTERSPEECH 2016