Papers
Radically Old Way of Computing Spectra: Applications in End-to-End ASR
Samik Sadhu, Hynek Hermansky
Rapid Speaker Adaptation for Conformer Transducer: Attention and Bias Are All You Need
Yan Huang, Guoli Ye, Jinyu Li et al.
RaSSpeR: Radar-Based Silent Speech Recognition
David Ferreira, Samuel Silva, Francisco Curado et al.
Raw Speech-to-Articulatory Inversion by Temporal Filtering and Decimation
Abdolreza Sabzi Shahrebabaki, Sabato Marco Siniscalchi, Torbjørn Svendsen
Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition
Max W.Y. Lam, Jun Wang, Chao Weng et al.
Real-Time End-to-End Monaural Multi-Speaker Speech Recognition
Song Li, Beibei Ouyang, Fuchuan Tong et al.
Real-Time Independent Vector Analysis Using Semi-Supervised Nonnegative Matrix Factorization as a Source Model
Taihui Wang, Feiran Yang, Rui Zhu et al.
Real-Time Multi-Channel Speech Enhancement Based on Neural Network Masking with Attention Model
Cheng Xue, Weilong Huang, Weiguang Chen et al.
Real-Time Speaker Counting in a Cocktail Party Scenario Using Attention-Guided Convolutional Neural Network
Midia Yousefi, John H.L. Hansen
Recognising Covid-19 from Coughing Using Ensembles of SVMs and LSTMs with Handcrafted and Deep Audio Features
Vincent Karas, Björn W. Schuller
Reduce and Reconstruct: ASR for Low-Resource Phonetic Languages
Anuj Diwan, Preethi Jyothi
Reducing Exposure Bias in Training Recurrent Neural Network Transducers
Xiaodong Cui, Brian Kingsbury, George Saon et al.
Reducing Streaming ASR Model Delay with Self Alignment
Jaeyoung Kim, Han Lu, Anshuman Tripathi et al.
Reformulating DOVER-Lap Label Mapping as a Graph Partitioning Problem
Desh Raj, Sanjeev Khudanpur
Regularizing Word Segmentation by Creating Misspellings
Hainan Xu, Kartik Audhkhasi, Yinghui Huang et al.
Reinforce-Aligner: Reinforcement Alignment Search for Robust End-to-End Text-to-Speech
Hyunseung Chung, Sang-Hoon Lee, Seong-Whan Lee
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability
Rui Liu, Berrak Sisman, Haizhou Li
Relational Data Selection for Data Augmentation of Speaker-Dependent Multi-Band MelGAN Vocoder
Yi-Chiao Wu, Cheng-Hung Hu, Hung-Shin Lee et al.
Relationships Between Perceptual Distinctiveness, Articulatory Complexity and Functional Load in Speech Communication
Yuqing Zhang, Zhu Li, Bin Wu et al.
Relaxing the Conditional Independence Assumption of CTC-Based ASR by Conditioning on Intermediate Predictions
Jumon Nozaki, Tatsuya Komatsu
Reliable Estimates of Interpretable Cue Effects with Active Learning in Psycholinguistic Research
Marieke Einfeldt, Rita Sevastjanova, Katharina Zahner-Ritter et al.
Reliable Intensity Vector Selection for Multi-Source Direction-of-Arrival Estimation Using a Single Acoustic Vector Sensor
Jianhua Geng, Sifan Wang, Juan Li et al.
Remote Smartphone-Based Speech Collection: Acceptance and Barriers in Individuals with Major Depressive Disorder
Judith Dineley, Grace Lavelle, Daniel Leightley et al.
Representation Learning to Classify and Detect Adversarial Attacks Against Speaker and Speech Recognition Systems
Jesús Villalba, Sonal Joshi, Piotr Żelasko et al.