Xuankai Chang

30 papers · 2016–2024 · 7 top CS/AI conferences

Achievements

🌍 Conference Polyglot (7) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (13) 🏃 Academic Marathon (8)

🐝 Cross-Pollinator (12) 🌈 Renaissance Researcher (5) 🏠 Conference Loyalist (23) 👥 Mega-Team (20) 🧬 Topic Evolution 🌱 Topic Pioneer 🔬 Deep Specialist (14) 🤝 Dynamic Duo (23) 🗃️ Keyword Collector (58) 📈 Trend Setter 🔥 Unstoppable (9) ⚡ Prolific Year (7) 🚀 Conference Pioneer 💎 Century Club (30)

Conferences

INTERSPEECH (23) ACL (2) AAAI (1) EMNLP (1) ICML (1) NAACL (1) NIPS (1)

Top co-authors

Shinji Watanabe (23) Jiatong Shi (12) William Chen (6) Jinchuan Tian (6) Yuya Fujita (6) Yanmin Qian (6) Brian Yan (5) Wangyou Zhang (4) Hung-yi Lee (4) Yifan Peng (3)

Keywords

self-supervised learning (7) automatic speech recognition (7) speech recognition (5) speech separation (5) transfer learning (4) speech representation (4) speech synthesis (4) end-to-end model (4) end-to-end speech recognition (3) speech enhancement (3) speech processing (3) connectionist temporal classification (3) speech translation (2) zero-shot learning (2) singing voice synthesis (2) sequence-to-sequence model (2) spoken language understanding (2) representation learning (2) multilingual speech (2) multi-talker speech recognition (2)

Papers

Towards Robust Speech Representation Learning for Thousands of Languages EMNLP 2024
UniAudio: Towards Universal Audio Generation with Large Language Models ICML 2024
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units INTERSPEECH 2024
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets INTERSPEECH 2024
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer INTERSPEECH 2024
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head AAAI 2024
Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners ACL 2024
TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition INTERSPEECH 2023
Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute INTERSPEECH 2023
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark INTERSPEECH 2023
A New Benchmark of Aphasia Speech Recognition and Detection Based on E-Branchformer and Multi-task Learning INTERSPEECH 2023
Exploration of Efficient End-to-End ASR using Discretized Input from Self-Supervised Learning INTERSPEECH 2023
Muskits: an End-to-end Music Processing Toolkit for Singing Voice Synthesis INTERSPEECH 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities ACL 2022
End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation INTERSPEECH 2022
Two-Pass Low Latency End-to-End Spoken Language Understanding INTERSPEECH 2022
ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding INTERSPEECH 2022
Highland Puebla Nahuatl Speech Translation Corpus for Endangered Language Documentation NAACL 2021
SUPERB: Speech Processing Universal PERformance Benchmark INTERSPEECH 2021
Speech Representation Learning Combining Conformer CPC with Deep Cluster for the ZeroSpeech Challenge 2021 INTERSPEECH 2021
Multi-Speaker ASR Combining Non-Autoregressive Conformer CTC and Conditional Speaker Chain INTERSPEECH 2021
Streaming End-to-End ASR Based on Blockwise Non-Autoregressive Models INTERSPEECH 2021
Insertion-Based Modeling for End-to-End Automatic Speech Recognition INTERSPEECH 2020
End-to-End ASR with Adaptive Span Self-Attention INTERSPEECH 2020
End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming INTERSPEECH 2020
Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals NIPS 2020
Knowledge Distillation for End-to-End Monaural Multi-Talker ASR System INTERSPEECH 2019
Monaural Multi-Talker Speech Recognition with Attention Mechanism and Gated Convolutional Networks INTERSPEECH 2018
Recognizing Multi-Talker Speech with Permutation Invariant Training INTERSPEECH 2017
Unrestricted Vocabulary Keyword Spotting Using LSTM-CTC INTERSPEECH 2016