Papers
Multi-Stream Gated and Pyramidal Temporal Convolutional Neural Networks for Audio-Visual Speech Separation in Multi-Talker Environments
Yiyu Luo, Jing Wang, Liang Xu et al.
Multitask Adaptation with Lattice-Free MMI for Multi-Genre Speech Recognition of Low Resource Languages
Srikanth Madikeri, Petr Motlicek, Hervé Bourlard
Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction
David Qiu, Yanzhang He, Qiujia Li et al.
Multi-Task Neural Network for Robust Multiple Speaker Embedding Extraction
Weipeng He, Petr Motlicek, Jean-Marc Odobez
Multitask Training with Text Data for End-to-End Speech Recognition
Peidong Wang, Tara N. Sainath, Ron J. Weiss
Mutual Information Enhanced Training for Speaker Embedding
Youzhi Tu, Man-Wai Mak
NeMo (Inverse) Text Normalization: From Development to Production
Yang Zhang, Evelina Bakhturina, Boris Ginsburg
NeMo Inverse Text Normalization: From Development to Production
Yang Zhang, Evelina Bakhturina, Kyle Gorman et al.
Neural Speaker Embeddings for Ultrasound-Based Silent Speech Interfaces
Amin Honarmandi Shandiz, László Tóth, Gábor Gosztolya et al.
Neural Speaker Extraction with Speaker-Speech Cross-Attention Network
Wupeng Wang, Chenglin Xu, Meng Ge et al.
Neural Spoken-Response Generation Using Prosodic and Linguistic Context for Conversational Systems
Yoshihiro Yamazaki, Yuya Chiba, Takashi Nose et al.
Neural Text Denormalization for Speech Transcripts
Benjamin Suter, Josef Novak
NISQA: A Deep CNN-Self-Attention Model for Multidimensional Speech Quality Prediction with Crowdsourced Datasets
Gabriel Mittag, Babak Naderi, Assmaa Chehadi et al.
N-MTTL SI Model: Non-Intrusive Multi-Task Transfer Learning-Based Speech Intelligibility Prediction Model with Scenery Classification
Ľuboš Marcinek, Michael Stone, Rebecca Millman et al.
Noise Robust Acoustic Modeling for Single-Channel Speech Recognition Based on a Stream-Wise Transformer Architecture
Masakiyo Fujimoto, Hisashi Kawai
Noise Robust Pitch Stylization Using Minimum Mean Absolute Error Criterion
Chiranjeevi Yarra, Prasanta Kumar Ghosh
Noisy Student-Teacher Training for Robust Keyword Spotting
Hyun-Jin Park, Pai Zhu, Ignacio Lopez Moreno et al.
Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies
Alexander H. Liu, Yu-An Chung, James Glass
Non-Intrusive Speech Quality Assessment with Transfer Learning and Subject-Specific Scaling
Natalia Nessler, Milos Cernak, Paolo Prandoni et al.
Nonlinear Acoustic Echo Cancellation with Deep Learning
Amir Ivry, Israel Cohen, Baruch Berdugo
Non-Parallel Any-to-Many Voice Conversion by Replacing Speaker Statistics
Yufei Liu, Chengzhu Yu, Wang Shuai et al.
Non-Verbal Vocalisation and Laughter Detection Using Sequence-to-Sequence Models and Multi-Label Training
Scott Condron, Georgia Clarke, Anita Klementiev et al.
Normalization Driven Zero-Shot Multi-Speaker Speech Synthesis
Neeraj Kumar, Srishti Goel, Ankur Narang et al.
N-Singer: A Non-Autoregressive Korean Singing Voice Synthesis System for Pronunciation Enhancement
Gyeong-Hoon Lee, Tae-Woo Kim, Hanbin Bae et al.