Papers
Zero-Shot Federated Learning with New Classes for Audio Classification
Gautham Krishna Gudur, Satheesh Kumar Perepu
Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks Using Switching Tokens
Mana Ihori, Naoki Makishima, Tomohiro Tanaka et al.
Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
Chuanxin Tang, Chong Luo, Zhiyuan Zhao et al.
1-D Row-Convolution LSTM: Fast Streaming ASR at Accuracy Parity with LC-BLSTM
Kshitiz Kumar, Chaojun Liu, Yifan Gong et al.
A 43 Language Multilingual Punctuation Prediction Neural Network Model
Xinxing Li, Edward Lin
Abstractive Spoken Document Summarization Using Hierarchical Model with Multi-Stage Attention Diversity Optimization
Potsawee Manakul, Mark J.F. Gales, Linlin Wang
Accurate Detection of Wake Word Start and End Using a CNN
Christin Jose, Yuriy Mishchenko, Thibaud Sénéchal et al.
Achieving Multi-Accent ASR via Unsupervised Acoustic Model Adaptation
M.A. Tuğtekin Turan, Emmanuel Vincent, Denis Jouvet
A Comparative Re-Assessment of Feature Extractors for Deep Speaker Embeddings
Xuechen Liu, Md. Sahidullah, Tomi Kinnunen
A Comparative Study of Speech Anonymization Metrics
Mohamed Maouche, Brij Mohan Lal Srivastava, Nathalie Vauquier et al.
A Comparison of Acoustic and Linguistics Methodologies for Alzheimer’s Dementia Recognition
Nicholas Cummins, Yilin Pan, Zhao Ren et al.
A Comparison of English Rhythm Produced by Native American Speakers and Mandarin ESL Primary School Learners
Hongwei Ding, Binghuai Lin, Liyuan Wang et al.
A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning
Sameer Khurana, Antoine Laurent, Wei-Ning Hsu et al.
Acoustic-Based Articulatory Phenotypes of Amyotrophic Lateral Sclerosis and Parkinson’s Disease: Towards an Interpretable, Hypothesis-Driven Framework of Motor Control
Hannah P. Rowe, Sarah E. Gutz, Marc F. Maffei et al.
Acoustic Feature Extraction with Interpretable Deep Neural Network for Neurodegenerative Related Disorder Classification
Yilin Pan, Bahman Mirheidari, Zehai Tu et al.
Acoustic Properties of Strident Fricatives at the Edges: Implications for Consonant Discrimination
Louis-Marie Lorin, Lorenzo Maselli, Léo Varnet et al.
Acoustic Scene Analysis with Multi-Head Attention Networks
Weimin Wang, Weiran Wang, Ming Sun et al.
Acoustic Scene Classification Using Audio Tagging
Jee-weon Jung, Hye-jin Shim, Ju-ho Kim et al.
Acoustic Signal Enhancement Using Relative Harmonic Coefficients: Spherical Harmonics Domain Approach
Yonggang Hu, Prasanga N. Samarasinghe, Thushara D. Abhayapala
Acoustic-to-Articulatory Inversion with Deep Autoregressive Articulatory-WaveNet
Narjes Bozorg, Michael T. Johnson
A Cross-Channel Attention-Based Wave-U-Net for Multi-Channel Speech Enhancement
Minh Tri Ho, Jinyoung Lee, Bong-Ki Lee et al.
A Cyclical Post-Filtering Approach to Mismatch Refinement of Neural Vocoder for Text-to-Speech Systems
Yi-Chiao Wu, Patrick Lumban Tobing, Kazuki Yasuhara et al.
Adaptive Compressive Onset-Enhancement for Improved Speech Intelligibility in Noise and Reverberation
Felicitas Bederna, Henning Schepker, Christian Rollwage et al.
Adaptive Domain-Aware Representation Learning for Speech Emotion Recognition
Weiquan Fan, Xiangmin Xu, Xiaofen Xing et al.
Adaptive Neural Speech Enhancement with a Denoising Variational Autoencoder
Yoshiaki Bando, Kouhei Sekiguchi, Kazuyoshi Yoshii