Papers
Online Continual Learning of End-to-End Speech Recognition Models
Muqiao Yang, Ian Lane, Shinji Watanabe
Online Learning of Open-set Speaker Identification by Active User-registration
Eunkyung Yoo, Hyeonseop Song, Taehyeong Kim et al.
Online Speaker Diarization with Core Samples Selection
Yanyan Yue, Jun Du, Mao-Kui He et al.
Online Target Speaker Voice Activity Detection for Speaker Diarization
Weiqing Wang, Ming Li, Qingjian Lin
On Metric Learning for Audio-Text Cross-Modal Retrieval
Xinhao Mei, Xubo Liu, Jianyuan Sun et al.
On monoaural speech enhancement for automatic recognition of real noisy speech using mixture invariant training
Jisi Zhang, Catalin Zorila, Rama Doddipatla et al.
On-the-fly ASR Corrections with Audio Exemplars
Golan Pundak, Tsendsuren Munkhdalai, Khe Chai Sim
On the Prediction Network Architecture in RNN-T for ASR
Dario Albesano, Jesús Andrés-Ferrer, Nicola Ferri et al.
On the Role of Spatial, Spectral, and Temporal Processing for DNN-based Non-linear Multi-channel Speech Enhancement
Kristina Tesch, Nils-Hendrik Mohrmann, Timo Gerkmann
On the Use of Deep Mask Estimation Module for Neural Source Separation Systems
Kai Li, Xiaolin Hu, Yi Luo
OpenASR21: The Second Open Challenge for Automatic Speech Recognition of Low-Resource Languages
Kay Peterson, Audrey Tong, Yan Yu
Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for Singing Voice Synthesis
Yu Wang, Xinsheng Wang, Pengcheng Zhu et al.
Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational (RAMC) Speech Dataset
Zehui Yang, Yifan Chen, Lei Luo et al.
Optimal thyroplasty implant shape and stiffness for treatment of acute unilateral vocal fold paralysis: Evidence from a canine in vivo phonation model
Neha Reddy, Yoonjeong Lee, Zhaoyan Zhang et al.
Optimization of Deep Neural Network (DNN) Speech Coder Using a Multi Time Scale Perceptual Loss Function
Joon Byun, Seungmin Shin, Jongmo Sung et al.
ORCA-WHISPER: An Automatic Killer Whale Sound Type Generation Toolkit Using Deep Learning
Christian Bergler, Alexander Barnhill, Dominik Perrin et al.
Oriental Language Recognition (OLR) 2021: Summary and Analysis
Binling Wang, Feng Wang, Wenxuan Hu et al.
Orofacial somatosensory inputs in speech perceptual training modulate speech production
Monica Ashokumar, Jean-Luc Schwartz, Takayuki Ito
OSSEM: one-shot speaker adaptive speech enhancement using meta learning
Cheng Yu, Szu-wei Fu, Tsun-An Hsieh et al.
Overlapped Frequency-Distributed Network: Frequency-Aware Voice Spoofing Countermeasure
Sunmook Choi, Il-Youp Kwak, Seungsang Oh
Overlapped speech and gender detection with WavLM pre-trained features
Martin Lebourdais, Marie Tahon, Antoine Laurent et al.
Overlapped Speech Detection in Broadcast Streams Using X-vectors
Lukas Mateju, Frantisek Kynych, Petr Cerva et al.
Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition
Zhifu Gao, ShiLiang Zhang, Ian McLoughlin et al.
Paraguayan Guarani: Tritonal pitch accent and Accentual Phrase
Sun-Ah Jun, Maria Luisa Zubizarreta
Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition
Ye Bai, Jie Li, Wenjing Han et al.