Papers
Multi-Modal Multi-Correlation Learning for Audio-Visual Speech Separation
Xiaoyu Wang, Xiangyu Kong, Xiulian Peng et al.
Multimodal Persuasive Dialogue Corpus using Teleoperated Android
Seiya Kawano, Muteki Arioka, Akishige Yuguchi et al.
Multi-Path GMM-MobileNet Based on Attack Algorithms and Codecs for Synthetic Speech and Deepfake Detection
Yan Wen, Zhenchun Lei, Yingen Yang et al.
Multiple Enhancements to LSTM for Learning Emotion-Salient Features in Speech Emotion Recognition
Desheng Hu, Xinhui Hu, Xinkang Xu
Multiple-hypothesis RNN-T Loss for Unsupervised Fine-tuning and Self-training of Neural Transducer
Cong-Thanh Do, Mohan Li, Rama Doddipatla
Multi-scale Speaker Diarization with Dynamic Scale Weighting
Tae Jin Park, Nithin Rao Koluguri, Jagadeesh Balam et al.
Multi-source wideband DOA estimation method by frequency focusing and error weighting
Jing Zhou, Changchun Bao
Multi-stage Progressive Compression of Conformer Transducer for On-device Speech Recognition
Jash Rathod, Nauman Dawalatabad, Shatrughan Singh et al.
Multi-Task End-to-End Model for Telugu Dialect and Speech Recognition
Aditya Yadavalli, Ganesh Mirishkar, Anil Kumar Vuppala
Multitask Learning for Low Resource Spoken Language Understanding
Quentin Meeus, Marie Francine Moens, Hugo Van hamme
Multi-Type Outer Product-Based Fusion of Respiratory Sounds for Detecting COVID-19
Adria Mallol-Ragolta, Helena Cuesta, Emilia Gomez et al.
Multi-View Attention Transfer for Efficient Speech Enhancement
Wooseok Shin, Hyun Joon Park, Jin Sob Kim et al.
MusicNet: Compact Convolutional Neural Network for Real-time Background Music Detection
Chandan Reddy, Vishak Gopal, Harishchandra Dubey et al.
Muskits: an End-to-end Music Processing Toolkit for Singing Voice Synthesis
Jiatong Shi, Shuai Guo, Tao Qian et al.
Nasal Coda Loss in the Chengdu Dialect of Mandarin: Evidence from RT-MRI
Sishi Liao, Phil Hoole, Conceição Cunha et al.
NAS-SCAE: Searching Compact Attention-based Encoders For End-to-end Automatic Speech Recognition
Yukun Liu, Ta Li, Pengyuan Zhang et al.
NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling
Chi-Chang Lee, Cheng-Hung Hu, Yu-Chen Lin et al.
NAS-VAD: Neural Architecture Search for Voice Activity Detection
Daniel Rho, Jinhyeok Park, Jong Hwan Ko
Native phonotactic interference in L2 vowel processing: Mouse-tracking reveals cognitive conflicts during identification
Yizhou Wang, Rikke Bundgaard-Nielsen, Brett Baker et al.
Negative Guided Abstractive Dialogue Summarization
Junpeng Liu, Yanyan Zou, Yuxuan Xi et al.
NeMo Open Source Speaker Diarization System
Tae Jin Park, Nithin Rao Koluguri, Fei Jia et al.
NESC: Robust Neural End-2-End Speech Coding with GANs
Nicola Pia, Kishan Gupta, Srikanth Korse et al.
Neural correlates of acoustic and semantic cues during speech segmentation in French
Maria del Mar Cordero, Ambre Denis-Noël, Elsa Spinelli et al.
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge
Mutian He, Jingzhou Yang, Lei He et al.
Neural Network-augmented Kalman Filtering for Robust Online Speech Dereverberation in Noisy Reverberant Environments
Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning et al.