Research Explorer

Modelling Lexical Characteristics of the Healthy Aging Population: A Corpus-Based Study

Han Kunmei

2024 INTERSPEECH

Jiahao Li, Miao Liu, Shu Yang et al.

2024 INTERSPEECH

MR-RawNet: Speaker verification system with multiple temporal resolutions for variable duration utterances using raw waveforms

Seung-bin Kim, Chan-yeong Lim, Jungwoo Heo et al.

2024 INTERSPEECH

MSA-DPCRN: A Multi-Scale Asymmetric Dual-Path Convolution Recurrent Network with Attentional Feature Fusion for Acoustic Echo Cancellation

Ye Ni, Cong Pang, Chengwei Huang et al.

2024 INTERSPEECH

MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis

Qian Yang, Jialong Zuo, Zhe Su et al.

2024 INTERSPEECH

MSDET: Multitask Speaker Separation and Direction-of-Arrival Estimation Training

Roland Hartanto, Sakriani Sakti, Koichi Shinoda

2024 INTERSPEECH

MS-HuBERT: Mitigating Pre-training and Inference Mismatch in Masked Language Modelling methods for learning Speech Representations

Hemant Yadav, Sunayana Sitaram, Rajiv Ratn Shah

2024 INTERSPEECH

MSR-86K: An Evolving, Multilingual Corpus with 86,300 Hours of Transcribed Audio for Speech Recognition Research

Song Li, Yongbin You, Xuezhi Wang et al.

2024 INTERSPEECH

MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization

Adriana Fernandez-Lopez, Honglie Chen, Pingchuan Ma et al.

2024 INTERSPEECH

Multi-Channel Extension of Pre-trained Models for Speaker Verification

Ladislav Mošner, Romain Serizel, Lukáš Burget et al.

2024 INTERSPEECH

Multi-Channel Multi-Speaker ASR Using Target Speaker’s Solo Segment

Yiwen Shao, Shi-Xiong Zhang, Yong Xu et al.

2024 INTERSPEECH

MULTI-CONVFORMER: Extending Conformer with Multiple Convolution Kernels

Darshan Prabhu, Yifan Peng, Preethi Jyothi et al.

2024 INTERSPEECH

Multi-label Bird Species Classification from Field Recordings using Mel_Graph-GCN Framework

Noumida A, Rajeev Rajan

2024 INTERSPEECH

Multi-latency look-ahead for streaming speaker segmentation

Bilal Rahou, Hervé Bredin

2024 INTERSPEECH

Multilingual Speech and Language Analysis for the Assessment of Mild Cognitive Impairment: Outcomes from the Taukadial Challenge

Paula Andrea Pérez-Toro, Tomas Arias-Vergara, Philipp Klumpp et al.

2024 INTERSPEECH

Multi-mic Echo Cancellation Coalesced with Beamforming for Real World Adverse Acoustic Conditions

Premanand Nayak, Kamini Sabu, M. Ali Basha Shaik

2024 INTERSPEECH

Multi-modal Adversarial Training for Zero-Shot Voice Cloning

John Janiczek, Dading Chong, Dongyang Dai et al.

2024 INTERSPEECH

Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of Speech-Silence and Word-Punctuation

Jinzuomu Zhong, Yang Li, Hui Huang et al.

2024 INTERSPEECH

Multimodal Belief Prediction

John Murzaku, Adil Soubki, Owen Rambow

2024 INTERSPEECH

Multimodal Continuous Fingerspelling Recognition via Visual Alignment Learning

Katerina Papadimitriou, Gerasimos Potamianos

2024 INTERSPEECH

Multimodal Digital Biomarkers for Longitudinal Tracking of Speech Impairment Severity in ALS: An Investigation of Clinically Important Differences

Michael Neumann, Hardik Kothare, Jackson Liscombe et al.

2024 INTERSPEECH

Multimodal Fusion for Vocal Biomarkers Using Vector Cross-Attention

Vladimir Despotovic, Abir Elbéji, Petr V. Nazarov et al.

2024 INTERSPEECH

Multimodal Fusion of Music Theory-Inspired and Self-Supervised Representations for Improved Emotion Recognition

Xiaohan Shi, Xingfeng Li, Tomoki Toda

2024 INTERSPEECH

Multimodal Large Language Models with Fusion Low Rank Adaptation for Device Directed Speech Detection

Shruti Palaskar, Ognjen Rudovic, Sameer Dharur et al.

2024 INTERSPEECH

Multimodal Representation Loss Between Timed Text and Audio for Regularized Speech Separation

Tsun-An Hsieh, Heeyoul Choi, Minje Kim

2024 INTERSPEECH
