Papers
On Learning Interpretable CNNs with Parametric Modulated Kernel-Based Filters
Erfan Loweimi, Peter Bell, Steve Renals
Online Hybrid CTC/Attention Architecture for End-to-End Speech Recognition
Haoran Miao, Gaofeng Cheng, Pengyuan Zhang et al.
Online Speech Processing and Analysis Suite
Wikus Pienaar, Daan Wissing
On Mitigating Acoustic Feedback in Hearing Aids with Frequency Warping by All-Pass Networks
Ching-Hua Lee, Kuan-Lin Chen, Fred Harris et al.
On Nonlinear Spatial Filtering in Multichannel Speech Enhancement
Kristina Tesch, Robert Rehr, Timo Gerkmann
On Robustness of Unsupervised Domain Adaptation for Speaker Recognition
Pierre-Michel Bousquet, Mickael Rouvier
On the Choice of Modeling Unit for Sequence-to-Sequence Speech Recognition
Kazuki Irie, Rohit Prabhavalkar, Anjuli Kannan et al.
On the Contributions of Visual and Textual Supervision in Low-Resource Semantic Speech Retrieval
Ankita Pasad, Bowen Shi, Herman Kamper et al.
On the End-to-End Solution to Mandarin-English Code-Switching Speech Recognition
Zhiping Zeng, Yerbolat Khassanov, Van Tung Pham et al.
On the Importance of Audio-Source Separation for Singer Identification in Polyphonic Music
Bidisha Sharma, Rohan Kumar Das, Haizhou Li
On the Role of Oral Configurations in European Portuguese Nasal Vowels
Conceição Cunha, Samuel Silva, António Teixeira et al.
On the Role of Style in Parsing Speech with Neural Models
Trang Tran, Jiahong Yuan, Yang Liu et al.
On the Suitability of the Riesz Spectro-Temporal Envelope for WaveNet Based Speech Synthesis
Jitendra Kumar Dhiman, Nagaraj Adiga, Chandra Sekhar Seelamantula
On the Usage of Phonetic Information for Text-Independent Speaker Embedding Extraction
Shuai Wang, Johan Rohdin, Lukáš Burget et al.
On the Use/Misuse of the Term ‘Phoneme’
Roger K. Moore, Lucy Skidmore
On the Use of Pitch Features for Disordered Speech Recognition
Shansong Liu, Shoukang Hu, Xunying Liu et al.
Open-Vocabulary Keyword Spotting with Audio and Text Embeddings
Niccolò Sacchi, Alexandre Nanchen, Martin Jaggi et al.
Optimization of False Acceptance/Rejection Rates and Decision Threshold for End-to-End Text-Dependent Speaker Verification Systems
Victoria Mingote, Antonio Miguel, Dayana Ribas et al.
Optimizing a Speaker Embedding Extractor Through Backend-Driven Regularization
Luciana Ferrer, Mitchell McLaren
Optimizing Speech-Input Length for Speaker-Independent Depression Classification
Tomasz Rutowski, Amir Harati, Yang Lu et al.
Optimizing Voice Activity Detection for Noisy Conditions
Ruixi Lin, Charles Costello, Charles Jankowski et al.
Ordinal Triplet Loss: Investigating Sleepiness Detection from Speech
Peter Wu, SaiKrishna Rallabandi, Alan W. Black et al.
Parallel vs. Non-Parallel Voice Conversion for Esophageal Speech
Luis Serrano, Sneha Raman, David Tavarez et al.
Parameter Enhancement for MELP Speech Codec in Noisy Communication Environment
Min-Jae Hwang, Hong-Goo Kang