Papers
MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra
Ye-Xin Lu, Yang Ai, Zhen-Hua Ling
MSAF: A Multiple Self-Attention Field Method for Speech Enhancement
Minghang Chu, Jing Wang, Yaoyao Ma et al.
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets
Ziyang Ma, Zhisheng Zheng, Changli Tang et al.
MTANet: Multi-band Time-frequency Attention Network for Singing Melody Extraction from Polyphonic Music
Yuan Gao, Ying Hu, Liusong Wang et al.
MT-SLVR: Multi-Task Self-Supervised Learning for Transformation In(Variant) Representations
Calum Heggan, Tim Hospedales, Sam Budgett et al.
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
Mohamed Anwar, Bowen Shi, Vedanuj Goswami et al.
Multi-channel multi-speaker transformer for speech recognition
Guo Yifan, Tian Yao, Suo Hongbin et al.
Multi-channel separation of dynamic speech and sound events
Takuya Fujimura, Robin Scheibler
Multi-Channel Speech Separation with Cross-Attention and Beamforming
Ladislav Mosner, Oldřich Plchot, Junyi Peng et al.
Multi-class Detection of Pathological Speech with Latent Features: How does it perform on unseen data?
Dominik Wagner, Ilja Baumann, Franziska Braun et al.
Multi-Dataset Co-Training with Sharpness-Aware Optimization for Audio Anti-spoofing
Hye-jin Shim, Jee-weon Jung, Tomi Kinnunen
Multi-Head State Space Model for Speech Recognition
Yassir Fathullah, Chunyang Wu, Yuan Shangguan et al.
Multi-input Multi-output Complex Spectral Mapping for Speaker Separation
Hassan Taherian, Ashutosh Pandey, Daniel Wong et al.
Multi-Level Knowledge Distillation for Speech Emotion Recognition in Noisy Conditions
Yang Liu, Haoqin Sun, Geng Chen et al.
Multilingual context-based pronunciation learning for Text-to-Speech
Giulia Comini, Sam Ribeiro, Fan Yang et al.
Multilingual Contextual Adapters To Improve Custom Word Recognition In Low-resource Languages
Devang Kulshreshtha, Saket Dingliwal, Brady Houston et al.
Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration
Rustem Yeshpanov, Saida Mussakhojayeva, Yerbolat Khassanov
Multi-microphone Automatic Speech Segmentation in Meetings Based on Circular Harmonics Features
Théo Mariotte, Anthony Larcher, Silvio Montrésor et al.
Multimodal Assessment of Bulbar Amyotrophic Lateral Sclerosis (ALS) Using a Novel Remote Speech Assessment App
Leif Simmatis, Timothy Pommée, Yana Yunusova
Multimodal Locally Enhanced Transformer for Continuous Sign Language Recognition
Katerina Papadimitriou, Gerasimos Potamianos
Multimodal Personality Traits Assessment (MuPTA) Corpus: The Impact of Spontaneous and Read Speech
Elena Ryumina, Dmitry Ryumin, Maxim Markitantov et al.
Multimodal Speech Recognition for Language-Guided Embodied Agents
Allen Chang, Xiaoyuan Zhu, Aarav Monga et al.
Multimodal Turn-Taking Model Using Visual Cues for End-of-Utterance Prediction in Spoken Dialogue Systems
Fuma Kurata, Mao Saeki, Shinya Fujie et al.
Multi-mode Neural Speech Coding Based on Deep Generative Networks
Wei Xiao, Wenzhe Liu, Meng Wang et al.
Multi-pass Training and Cross-information Fusion for Low-resource End-to-end Accented Speech Recognition
Xuefei Wang, Yanhua Long, Yijie Li et al.