Papers
Adversarial Voice Conversion Against Neural Spoofing Detectors
Yi-Yang Ding, Li-Juan Liu, Yu Hu et al.
A Fast Discrete Two-Step Learning Hashing for Scalable Cross-Modal Retrieval
Huan Zhao, Kaili Ma
Affect Recognition Through Scalogram and Multi-Resolution Cochleagram Features
Fasih Haider, Saturnino Luz
Age Estimation with Speech-Age Model for Heterogeneous Speech Datasets
Ryu Takeda, Kazunori Komatani
Age-Invariant Training for End-to-End Child Speech Recognition Using Adversarial Multi-Task Learning
Lars Rumberg, Hanna Ehlert, Ulrike Lüdtke et al.
A Generative Model for Duration-Dependent Score Calibration
Sandro Cumani, Salvatore Sarni
A Hands-On Comparison of DNNs for Dialog Separation Using Transfer Learning from Music Source Separation
Martin Strauss, Jouni Paulus, Matteo Torcoli et al.
A Hybrid Seq-2-Seq ASR Design for On-Device and Server Applications
Cyril Allauzen, Ehsan Variani, Michael Riley et al.
AISHELL-3: A Multi-Speaker Mandarin TTS Corpus
Yao Shi, Hui Bu, Xin Xu et al.
AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario
Yihui Fu, Luyao Cheng, Shubo Lv et al.
A Learned Conditional Prior for the VAE Acoustic Space of a TTS System
Penny Karanasou, Sri Karlapati, Alexis Moinet et al.
A Light-Weight Contextual Spelling Correction Model for Customizing Transducer-Based Speech Recognition Systems
Xiaoqiang Wang, Yanqing Liu, Sheng Zhao et al.
A Lightweight Framework for Online Voice Activity Detection in the Wild
Xuenan Xu, Heinrich Dinkel, Mengyue Wu et al.
Align-Denoise: Single-Pass Non-Autoregressive Speech Recognition
Nanxin Chen, Piotr Żelasko, Laureano Moro-Velázquez et al.
Aligned Contrastive Predictive Coding
Jan Chorowski, Grzegorz Ciesielski, Jarosław Dzikowski et al.
AlloST: Low-Resource Speech Translation Without Source Transcription
Yao-Fei Cheng, Hung-Shin Lee, Hsin-Min Wang
Alpha-Stable Autoregressive Fast Multichannel Nonnegative Matrix Factorization for Joint Speech Enhancement and Dereverberation
Mathieu Fontaine, Kouhei Sekiguchi, Aditya Arie Nugraha et al.
Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input
Brooke Stephenson, Thomas Hueber, Laurent Girin et al.
Alzheimer Disease Recognition Using Speech-Based Embeddings From Pre-Trained Models
Lara Gauder, Leonardo Pepino, Luciana Ferrer et al.
Alzheimer’s Dementia Recognition Using Acoustic, Lexical, Disfluency and Speech Pause Features Robust to Noisy Inputs
Morteza Rohanian, Julian Hough, Matthew Purver
Alzheimer’s Disease Detection from Spontaneous Speech Through Combining Linguistic Complexity and (Dis)Fluency Features with Pretrained Language Models
Yu Qiao, Xuefeng Yin, Daniel Wiechmann et al.
A Maximum Likelihood Approach to SNR-Progressive Learning Using Generalized Gaussian Distribution for LSTM-Based Speech Enhancement
Xiao-Qi Zhang, Jun Du, Li Chai et al.
A Meta-Learning Approach for User-Defined Spoken Term Classification with Varying Classes and Examples
Yangbin Chen, Tom Ko, Jianping Wang
Amortized Neural Networks for Low-Latency Speech Recognition
Jonathan Macoskey, Grant P. Strimel, Jinru Su et al.
A Multi-Branch Deep Learning Network for Automated Detection of COVID-19
Ahmed Fakhry, Xinyi Jiang, Jaclyn Xiao et al.