Chao-Han Huck Yang

39 papers · 2020–2026 · across 11 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (17) 🌍 Conference Polyglot (11)

🐣 Hot Topic Early Bird 🧭 Keyword Pioneer 🐝 Cross-Pollinator (12) 🏆 Grand Slam 🏆 Keyword Champion (2) 👥 Mega-Team (76) 🔥 Unstoppable (6) ⚡ Prolific Year (7) 💎 Century Club (37) 🗃️ Keyword Collector (137) ❓ The Questioner

Conferences

INTERSPEECH (9) ICLR (8) ACL (7) EMNLP (6) ICML (2) NIPS (2) AAAI (1) ICCV (1) NAACL (1) UAI (1) WACV (1)

Top co-authors

Pin-Yu Chen (9) Chen Chen (7) Yuchen Hu (7) Shinji Watanabe (7) Sabato Marco Siniscalchi (7) Zhehuai Chen (5) Yu-Chiang Frank Wang (5) Yusuke Hirota (4) Chao Zhang (4) Ryo Hachiuma (4)

Research topics

Differential Privacy (1) Processing (1)

Keywords

automatic speech recognition (6) large language model (5) speech recognition (4) transfer learning (3) gender bias (3) vision-language model (3) machine translation (3) parameter efficiency (2) domain adaptation (2) error correction (2) differential privacy (2) deep reinforcement learning (2) evaluation benchmark (2) optimal transport (2) speech enhancement (2) image captioning (2) multimodal learning (2) representation learning (2) speech processing (2) parameter-efficient learning (2)

Papers

PRiSM: Benchmarking Phone Realization in Speech Models ACL 2026
Speech-Hands: A Self-Reflection Voice Agentic Approach to Speech Recognition and Audio Reasoning with Omni Perception ACL 2026
Towards Neural Scaling Laws for Time Series Foundation Models ICLR 2025
MISP-Meeting: A Real-World Dataset with Multimodal Cues for Long-form Meeting Transcription and Summarization ACL 2025
SpeechIQ: Speech-Agentic Intelligence Quotient Across Cognitive Levels in Voice Understanding by Large Language Models ACL 2025
NeKo: Cross-Modality Post-Recognition Error Correction with Tasks-Guided Mixture-of-Experts Language Model ACL 2025
LOTUS: A Leaderboard for Detailed Image Captioning from Quality to Societal Bias and User Preferences ACL 2025
CoVoGER: A Multilingual Multitask Benchmark for Speech-to-text Generative Error Correction with Large Language Models EMNLP 2025
Extending Automatic Machine Translation Evaluation to Book-Length Documents EMNLP 2025
Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation ICCV 2025
Fugatto 1: Foundational Generative Audio Transformer Opus 1 ICLR 2025
Audio Large Language Models Can Be Descriptive Speech Quality Evaluators ICLR 2025
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation ICLR 2025
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks ICLR 2025
A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning ICLR 2025
OWLS: Scaling Laws for Multilingual Speech Recognition and Translation Models ICML 2025
ESPnet-SpeechLM: An Open Speech Language Model Toolkit NAACL 2025
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators ACL 2024
FastAdaSP: Multitask-Adapted Efficient Inference for Large Speech Language Model EMNLP 2024
Bayesian Example Selection Improves In-Context Learning for Speech, Text and Visual Modalities EMNLP 2024
Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models NIPS 2024
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition ICLR 2024
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition ICLR 2024
From Descriptive Richness to Bias: Unveiling the Dark Side of Generative Image Caption Enrichment EMNLP 2024
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition EMNLP 2023
How to Estimate Model Transferability of Pre-Trained Speech Models? INTERSPEECH 2023
Parameter-Efficient Learning for Text-to-Speech Accent Adaptation INTERSPEECH 2023
HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models NIPS 2023
Pessimistic Model Selection for Offline Deep Reinforcement Learning UAI 2023
Treatment Learning Causal Transformer for Noisy Image Classification WACV 2023
A Neural State-Space Modeling Approach to Efficient Speech Separation INTERSPEECH 2023
Differentially Private Adapters for Parameter Efficient Acoustic Modeling INTERSPEECH 2023
A Parameter-Efficient Learning Approach to Arabic Dialect Identification with Pre-Trained General-Purpose Speech Model INTERSPEECH 2023
A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models INTERSPEECH 2023
Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Recognition INTERSPEECH 2023
Training a Resilient Q-network against Observational Interference AAAI 2022
PATE-AAE: Incorporating Adversarial Autoencoder into Private Aggregation of Teacher Ensembles for Spoken Command Classification INTERSPEECH 2021
Voice2Series: Reprogramming Acoustic Models for Time Series Classification ICML 2021
Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement INTERSPEECH 2020