Papers
Low-Delay Speech Enhancement Using Perceptually Motivated Target and Loss
Xu Zhang, Xinlei Ren, Xiguang Zheng et al.
Low Resource ASR: The Surprising Effectiveness of High Resource Transliteration
Shreya Khare, Ashish Mittal, Anuj Diwan et al.
Low Resource German ASR with Untranscribed Data Spoken by Non-Native Children — INTERSPEECH 2021 Shared Task SPAPL System
Jinhan Wang, Yunzheng Zhu, Ruchao Fan et al.
LT-LM: A Novel Non-Autoregressive Language Model for Single-Shot Lattice Rescoring
Anton Mitrofanov, Mariya Korenevskaya, Ivan Podluzhny et al.
M3: MultiModal Masking Applied to Sentiment Analysis
Efthymios Georgiou, Georgios Paraskevopoulos, Alexandros Potamianos
Manifold-Aware Deep Clustering: Maximizing Angles Between Embedding Vectors Based on Regular Simplex
Keitaro Tanaka, Ryosuke Sawata, Shusuke Takahashi
Many-Speakers Single Channel Speech Separation with Optimal Permutation Training
Shaked Dovrat, Eliya Nachmani, Lior Wolf
Many-to-Many Voice Conversion Based Feature Disentanglement Using Variational Autoencoder
Manh Luong, Viet Anh Tran
Masked Proxy Loss for Text-Independent Speaker Verification
Jiachen Lian, Aiswarya Vinod Kumar, Hira Dhamyal et al.
Measuring Voice Quality Parameters After Speaker Pseudonymization
Rob J.J.H. van Son
MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement
Szu-Wei Fu, Cheng Yu, Tsun-An Hsieh et al.
Metric Learning Based Feature Representation with Gated Fusion Model for Speech Emotion Recognition
Yuan Gao, Jiaxing Liu, Longbiao Wang et al.
MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment
Meng Yu, Chunlei Zhang, Yong Xu et al.
Microphone Array Generalization for Multichannel Narrowband Deep Speech Enhancement
Siyuan Zhang, Xiaofei Li
MIMO Self-Attentive RNN Beamformer for Multi-Speaker Speech Separation
Xiyun Li, Yong Xu, Meng Yu et al.
Minimum-Norm Differential Beamforming for Linear Array with Directional Microphones
Weilong Huang, Jinwei Feng
Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition
Zhong Meng, Yu Wu, Naoyuki Kanda et al.
Mixture Model Attention: Flexible Streaming and Non-Streaming Automatic Speech Recognition
Kartik Audhkhasi, Tongzhou Chen, Bhuvana Ramabhadran et al.
Mixture of Orthogonal Sequences Made from Extended Time-Stretched Pulses Enables Measurement of Involuntary Voice Fundamental Frequency Response to Pitch Perturbation
Hideki Kawahara, Toshie Matsui, Kohei Yatabe et al.
Model-Agnostic Fast Adaptive Multi-Objective Balancing Algorithm for Multilingual Automatic Speech Recognition Model Training
Jiabin Xue, Tieran Zheng, Jiqing Han
Model-Based Exploration of Linking Between Vowel Articulatory Space and Acoustic Space
Anqi Xu, Daniel van Niekerk, Branislav Gerazov et al.
Modeling and Training Strategies for Language Recognition Systems
Raphaël Duroselle, Md. Sahidullah, Denis Jouvet et al.
Modeling Dialectal Variation for Swiss German Automatic Speech Recognition
Abbas Khosravani, Philip N. Garner, Alexandros Lazaridis
Modeling Dysphonia Severity as a Function of Roughness and Breathiness Ratings in the GRBAS Scale
Carlos A. Ferrer, Efren Aragón, María E. Hdez-Díaz et al.