Papers
Enhancing Non-Matching Reference Speech Quality Assessment through Dynamic Weight Adaptation
Bao Thang Ta, Van Hai Do, Huynh Thi Thanh Binh
Enhancing No-Reference Speech Quality Assessment with Pairwise, Triplet Ranking Losses, and ASR Pretraining
Bao Thang Ta, Minh Tu Le, Van Hai Do et al.
Enhancing Out-of-Vocabulary Performance of Indian TTS Systems for Practical Applications through Low-Effort Data Strategies
Srija Anand, Praveen Srinivasa Varadhan, Ashwin Sankar et al.
Enhancing Partially Spoofed Audio Localization with Boundary-aware Attention Mechanism
Jiafeng Zhong, Bin Li, Jiangyan Yi
Enhancing Speech and Music Discrimination Through the Integration of Static and Dynamic Features
Liangwei Chen, Xiren Zhou, Qiang Tu et al.
Enhancing Speech-Driven 3D Facial Animation with Audio-Visual Guidance from Lip Reading Expert
Han EunGi, Oh Hyun-Bin, Kim Sung-Bin et al.
Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech Corpus Release and Customized System Design
Ming Gao, Hang Chen, Jun Du et al.
Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models
Xuenan Xu, Pingyue Zhang, Ming Yan et al.
Enrolment-based personalisation for improving individual-level fairness in speech emotion recognition
Andreas Triantafyllopoulos, Björn Schuller
Entrainment Analysis and Prosody Prediction of Subsequent Interlocutor’s Backchannels in Dialogue
Keiko Ochi, Koji Inoue, Divesh Lala et al.
E-ODN: An Emotion Open Deep Network for Generalised and Adaptive Speech Emotion Recognition
Liuxian Ma, Lin Shen, Ruobing Li et al.
E-Paraformer: A Faster and Better Parallel Transformer for Non-autoregressive End-to-End Mandarin Speech Recognition
Kun Zou, Fengyun Tan, Ziyang Zhuang et al.
ERes2NetV2: Boosting Short-Duration Speaker Verification Performance with Computational Efficiency
Yafeng Chen, Siqi Zheng, Hui Wang et al.
Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition
Yuchun Shu, Bo Hu, Yifeng He et al.
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models
Jee-weon Jung, Wangyou Zhang, Jiatong Shi et al.
Ethnolinguistic Identification of Vietnamese-German Heritage Speech
Thanh Lan Truong, Andrea Weber
Evaluating a 3-factor listener model for prediction of speech intelligibility to hearing-impaired listeners
Mark Huckvale, Gaston Hilkhuysen
Evaluating Italian Vowel Variation with the Recurrent Neural Network Phonet
Austin Jones, Margaret E. L. Renwick
Evaluating Speech Recognition Performance Towards Large Language Model Based Voice Assistants
Zhe Liu, Suyoun Kim, Ozlem Kalinli
Evaluating the Santa Barbara Corpus: Challenges of the Breadth of Conversational Spoken Language
Matthew Maciejewski, Dominik Klement, Ruizhe Huang et al.
Evaluating Transformer-Enhanced Deep Reinforcement Learning for Speech Emotion Recognition
Siddique Latif, Raja Jurdak, Björn W. Schuller
Examining Vocal Tract Coordination in Childhood Apraxia of Speech with Acoustic-to-Articulatory Speech Inversion Feature Sets
Nina R. Benway, Jonathan L. Preston, Carol Espy-Wilson
ExHuBERT: Enhancing HuBERT Through Block Extension and Fine-Tuning on 37 Emotion Datasets
Shahin Amiriparian, Filip Packań, Maurice Gerczuk et al.
Experimental evaluation of MOS, AB and BWS listening test designs
Dan Wells, Andrea Lorena Aldana Blanco, Cassia Valentini et al.
Explainable by-design Audio Segmentation through Non-Negative Matrix Factorization and Probing
Martin Lebourdais, Théo Mariotte, Antonio Almudévar et al.