Papers
Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding
Hongning Zhu, Kong Aik Lee, Haizhou Li
Shallow Convolution-Augmented Transformer with Differentiable Neural Computer for Low-Complexity Classification of Variable-Length Acoustic Scene
Soonshin Seo, Donghyun Lee, Ji-Hwan Kim
Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition
Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix et al.
Siamese Network with wav2vec Feature for Spoofing Speech Detection
Yang Xie, Zhenchuan Zhang, Yingchun Yang
Silent versus Modal Multi-Speaker Speech Recognition from Ultrasound and Video
Manuel Sam Ribeiro, Aciel Eshky, Korin Richmond et al.
Simulating Reading Mistakes for Child Speech Transformer-Based Phone Recognition
Lucile Gelin, Thomas Pellegrini, Julien Pinquier et al.
Single-Channel Speech Enhancement Using Learnable Loss Mixup
Oscar Chang, Dung N. Tran, Kazuhito Koishida
slimIPL: Language-Model-Free Iterative Pseudo-Labeling
Tatiana Likhomanenko, Qiantong Xu, Jacob Kahn et al.
SmallER: Scaling Neural Entity Resolution for Edge Devices
Ross McGowan, Jinru Su, Vince DiCocco et al.
Sound Change in Spontaneous Bilingual Speech: A Corpus Study on the Cantonese n-l Merger in Cantonese-English Bilinguals
Rachel Soo, Khia A. Johnson, Molly Babel
Sound Source Localization with Majorization Minimization
Masahito Togami, Robin Scheibler
Source and Vocal Tract Cues for Speech-Based Classification of Patients with Parkinson’s Disease and Healthy Subjects
Tanuka Bhattacharjee, Jhansi Mallela, Yamini Belur et al.
Speaker Anonymisation Using the McAdams Coefficient
Jose Patino, Natalia Tomashenko, Massimiliano Todisco et al.
Speaker Attentive Speech Emotion Recognition
Clément Le Moine, Nicolas Obin, Axel Roebel
Speaker-Conversation Factorial Designs for Diarization Error Analysis
Scott Seyfarth, Sundararajan Srinivasan, Katrin Kirchhoff
Speaker Diarization Using Two-Pass Leave-One-Out Gaussian PLDA Clustering of DNN Embeddings
Kiran Karra, Alan McCree
Speaker Embeddings by Modeling Channel-Wise Correlations
Themos Stafylakis, Johan Rohdin, Lukáš Burget
Speaker Normalization Using Joint Variational Autoencoder
Shashi Kumar, Shakti P. Rath, Abhishek Pandey
SpeakerStew: Scaling to Many Languages with a Triaged Multilingual Text-Dependent and Text-Independent Speaker Verification System
Roza Chojnacka, Jason Pelecanos, Quan Wang et al.
Speaker Transition Patterns in Three-Party Conversation: Evidence from English, Estonian and Swedish
Marcin Włodarczak, Emer Gilmartin
Speaker Verification-Based Evaluation of Single-Channel Speech Separation
Matthew Maciejewski, Shinji Watanabe, Sanjeev Khudanpur
Speaking Corona? Human and Machine Recognition of COVID-19 from Voice
Pascal Hecker, Florian B. Pokorny, Katrin D. Bartl-Pokorny et al.
Speaking with a KN95 Face Mask: ASR Performance and Speaker Compensation
Sarah E. Gutz, Hannah P. Rowe, Jordan R. Green
Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs
Sujeong Cha, Wangrui Hou, Hyun Jung et al.