Papers
MFT-CRN: Multi-scale Fourier Transform for Monaural Speech Enhancement
Yulong Wang, Xueliang Zhang
miniStreamer: Enhancing Small Conformer with Chunked-Context Masking for Streaming ASR Applications on the Edge
Haris Gulzar, Monikka Roslianna Busto, Takeharu Eda et al.
Mispronunciation detection and diagnosis model for tonal language, applied to Vietnamese
Tuong Tu Huu, Viet Thanh Pham, Thi Thu Trang Nguyen et al.
Mitigating Catastrophic Forgetting for Few-Shot Spoken Word Classification Through Meta-Learning
Ruan van der Merwe, Herman Kamper
Mitigating the Exposure Bias in Sentence-Level Grapheme-to-Phoneme (G2P) Transduction
Eunseop Yoon, Hee Suk Yoon, Dhananjaya Gowda et al.
Mix before Align: Towards Zero-shot Cross-lingual Sentiment Analysis via Soft-Mix and Multi-View Learning
Zhihong Zhu, Xuxin Cheng, Dongsheng Chen et al.
MixRep: Hidden Representation Mixup for Low-Resource Speech Recognition
Jiamin Xie, John H. L. Hansen
Mixture Encoder for Joint Speech Separation and Recognition
Simon Berger, Peter Vieting, Christoph Boeddeker et al.
Mixture-of-Expert Conformer for Streaming Multilingual ASR
Ke Hu, Bo Li, Tara Sainath et al.
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Jiatong Shi, Dan Berrebbi, William Chen et al.
MMER: Multimodal Multi-task Learning for Speech Emotion Recognition
Sreyan Ghosh, Utkarsh Tyagi, S Ramaneswaran et al.
MMLung: Moving Closer to Practical Lung Health Estimation using Smartphones
Mohammed Mosuily, Lindsay Welch, Jagmohan Chauhan
MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition
Xiaohuan Zhou, Jiaming Wang, Zeyu Cui et al.
MOCKS 1.0: Multilingual Open Custom Keyword Spotting Testset
Mikołaj Pudo, Mateusz Wosik, Adam Cieślak et al.
Modality Confidence Aware Training for Robust End-to-End Spoken Language Understanding
Suyoun Kim, Akshat Shrivastava, Duc Le et al.
Model-assisted Lexical Tone Evaluation of three-year-old Chinese-speaking Children by also Considering Segment Production
Shu-Chuan Tseng, Yi-Fen Liu, Xiang-Li Lu
Model Compression for DNN-based Speaker Verification Using Weight Quantization
Jingyu Li, Wei Liu, Zhaoyang Zhang et al.
Modeling Dependent Structure for Utterances in ASR Evaluation
Zhe Liu, Fuchun Peng
Model-Internal Slot-triggered Biasing for Domain Expansion in Neural Transducer ASR Models
Yiting Lu, Philip Harding, Kanthashree Mysore Sathyendra et al.
Modular Domain Adaptation for Conformer-Based Streaming ASR
Qiujia Li, Bo Li, Dongseong Hwang et al.
Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer
Paul-Ambroise Duquenne, Holger Schwenk, Benoît Sagot
Monaural Speech Separation Method Based on Recurrent Attention with Parallel Branches
Xue Yang, Changchun Bao, Xu Zhang et al.
MOS vs. AB: Evaluating Text-to-Speech Systems Reliably Using Clustered Standard Errors
Joshua Camp, Tom Kenter, Lev Finkelstein et al.
Motor Control Similarity Between Speakers Saying “A Souk” Using Inverse Atlas Tongue Modeling
Ursa Maity, Fangxu Xing, Jerry Prince et al.