Research Explorer

Learn and Don't Forget: Adding a New Language to ASR Foundation Models

Mengjie Qian, Siyuan Tang, Rao Ma et al.

2024 INTERSPEECH

LearnerVoice: A Dataset of Non-Native English Learners’ Spontaneous Speech

Haechan Kim, Junho Myung, Seoyoung Kim et al.

2024 INTERSPEECH

Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning

Chung-Ming Chien, Andros Tjandra, Apoorv Vyas et al.

2024 INTERSPEECH

Learning from Back Chunks: Acquiring More Future Knowledge for Streaming ASR Models via Self Distillation

Yuting Yang, Guodong Ma, Yuke Li et al.

2024 INTERSPEECH

Learning from memory-based models

Rhiannon Mogridge, Anton Ragni

2024 INTERSPEECH

Learning from Multiple Annotator Biased Labels in Multimodal Conversation

Kazutoshi Shinoda, Nobukatsu Hojo, Saki Mizuno et al.

2024 INTERSPEECH

Learning Pronunciation from Other Accents via Pronunciation Knowledge Transfer

Siqi Sun, Korin Richmond

2024 INTERSPEECH

Learning Representation of Therapist Empathy in Counseling Conversation Using Siamese Hierarchical Attention Network

Dehua Tao, Tan Lee, Harold Chui et al.

2024 INTERSPEECH

Learnings from curating a trustworthy, well-annotated, and useful dataset of disordered English speech

Pan-Pan Jiang, Jimmy Tobin, Katrin Tomanek et al.

2024 INTERSPEECH

Less is More: Accurate Speech Recognition & Translation without Web-Scale Data

Krishna C. Puvvada, Piotr Żelasko, He Huang et al.

2024 INTERSPEECH

Leveraging Adapter for Parameter-Efficient ASR Encoder

Kyuhong Shim, Jinkyu Lee, Hyunjae Kim

2024 INTERSPEECH

Leveraging Graphic and Convolutional Neural Networks for Auditory Attention Detection with EEG

Saurav Pahuja, Gabriel Ivucic, Pascal Himmelmann et al.

2024 INTERSPEECH

Leveraging Language Model Capabilities for Sound Event Detection

Hualei Wang, Jianguo Mao, Zhifang Guo et al.

2024 INTERSPEECH

Leveraging large language models for post-transcription correction in contact centers

Bramhendra Koilakuntla, Prajesh Rana, Paras Ahuja et al.

2024 INTERSPEECH

Leveraging Large Language Models to Refine Automatic Feedback Generation at Articulatory Level in Computer Aided Pronunciation Training

Huihang Zhong, Yanlu Xie, ZiJin Yao

2024 INTERSPEECH

Leveraging Phonemic Transcription and Whisper toward Clinically Significant Indices for Automatic Child Speech Assessment

Yeh-Sheng Lin, Shu-Chuan Tseng, Jyh-Shing Roger Jang

2024 INTERSPEECH

Leveraging Speech Data Diversity to Document Indigenous Heritage and Culture

Allahsera Tapo, Éric Le Ferrand, Zoey Liu et al.

2024 INTERSPEECH

Leveraging Universal Speech Representations for Detecting and Assessing the Severity of Mild Cognitive Impairment Across Languages

Anna Favaro, Tianyu Cao, Najim Dehak et al.

2024 INTERSPEECH

LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization

Zengrui Jin, Yifan Yang, Mohan Shi et al.

2024 INTERSPEECH

LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

Masaya Kawamura, Ryuichi Yamamoto, Yuma Shirahata et al.

2024 INTERSPEECH

Lifelong Learning MOS Prediction for Synthetic Speech Quality Evaluation

Félix Saget, Meysam Shamsi, Marie Tahon

2024 INTERSPEECH

Lightweight Audio Segmentation for Long-form Speech Translation

Jaesong Lee, Soyoon Kim, Hanbyul Kim et al.

2024 INTERSPEECH

Lightweight Dynamic Sparse Transformer for Monaural Speech Enhancement

Zehua Zhang, Xuyi Zhuang, Yukun Qian et al.

2024 INTERSPEECH

Lightweight Transducer Based on Frame-Level Criterion

Genshun Wan, Mengzhi Wang, Tingzhi Mao et al.

2024 INTERSPEECH

Lightweight Zero-shot Text-to-Speech with Mixture of Adapters

Kenichi Fujita, Takanori Ashihara, Marc Delcroix et al.

2024 INTERSPEECH
