Research Explorer

Voice quality in telephone speech: Comparing acoustic measures between VoIP telephone and high-quality recordings

Chenzi Xu, Jessica Wormald, Paul Foulkes et al.

2024 INTERSPEECH

Voice Quality Variation in AAE: An Additional Challenge for Addressing Bias in ASR Models?

Li-Fang Lai, Nicole Holliday

2024 INTERSPEECH

VoiceTailor: Lightweight Plug-In Adapter for Diffusion-Based Personalized Text-to-Speech

Heeseung Kim, Sang-gil Lee, Jiheum Yeom et al.

2024 INTERSPEECH

VoiCor: A Residual Iterative Voice Correction Framework for Monaural Speech Enhancement

Rui Cao, Tianrui Wang, Meng Ge et al.

2024 INTERSPEECH

VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark

Yuke Lin, Ming Cheng, Fulin Zhang et al.

2024 INTERSPEECH

VoxFlow AI: wearable voice converter for atypical speech

Grzegorz P. Mika, Konrad Zieliński, Paweł Cyrta et al.

2024 INTERSPEECH

VoxMed: one-step respiratory disease classifier using digital stethoscope sounds

Paridhi Mundra, Manik Sharma, Yashwardhan Chaudhuri et al.

2024 INTERSPEECH

VoxSim: A perceptual voice similarity dataset

Junseok Ahn, Youkyum Kim, Yeunju Choi et al.

2024 INTERSPEECH

VSASV: a Vietnamese Dataset for Spoofing-Aware Speaker Verification

Vu Hoang, Viet Thanh Pham, Hoa Nguyen Xuan et al.

2024 INTERSPEECH

Wav2vec 2.0 Embeddings Are No Swiss Army Knife -- A Case Study for Multiple Sclerosis

Gábor Gosztolya, Mercedes Vetráb, Veronika Svindt et al.

2024 INTERSPEECH

Wave to Interlingua: Analyzing Representations of Multilingual Speech Transformers for Spoken Language Translation

Badr M. Abdullah, Mohammed Maqsood Shaik, Dietrich Klakow

2024 INTERSPEECH

Weighted Cross-entropy for Low-Resource Languages in Multilingual Speech Recognition

Andrés Piñeiro-Martín, Carmen García-Mateo, Laura Docio-Fernandez et al.

2024 INTERSPEECH

Well, what can you do with messy data? Exploring the prosody and pragmatic function of the discourse marker "well" with found data and speech synthesis

Johannah O'Mahony, Catherine Lai, Éva Székely

2024 INTERSPEECH

WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark

Linhan Ma, Dake Guo, Kun Song et al.

2024 INTERSPEECH

WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction

Shuai Wang, Ke Zhang, Shaoxiong Lin et al.

2024 INTERSPEECH

W-GVKT: Within-Global-View Knowledge Transfer for Speaker Verification

Zezhong Jin, Youzhi Tu, Man-Wai Mak

2024 INTERSPEECH

What Does it Take to Generalize SER Model Across Datasets? A Comprehensive Benchmark

Adham Ibrahim, Shady Shehata, Ajinkya Kulkarni et al.

2024 INTERSPEECH

What do people hear? Listeners’ Perception of Conversational Speech

Adaeze Adigwe, Sarenne Wallbridge, Simon King

2024 INTERSPEECH

What happens in continued pre-training? Analysis of self-supervised speech models with continued pre-training for colloquial Finnish ASR

Yaroslav Getman, Tamas Grosz, Mikko Kurimo

2024 INTERSPEECH

What if HAL breathed? Enhancing Empathy in Human-AI Interactions with Breathing Speech Synthesis

Nicolò Loddo, Francisca Pessanha, Almila Akdag

2024 INTERSPEECH

When Whisper Listens to Aphasia: Advancing Robust Post-Stroke Speech Recognition

Giulia Sanguedolce, Sophie Brook, Dragos C. Gruia et al.

2024 INTERSPEECH

WHiSER: White House Tapes Speech Emotion Recognition Corpus

Abinay Reddy Naini, Lucas Goncalves, Mary A. Kohler et al.

2024 INTERSPEECH

Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation

Andrew Rouditchenko, Yuan Gong, Samuel Thomas et al.

2024 INTERSPEECH

Whispering in Norwegian: Navigating Orthographic and Dialectic Challenges

Per E Kummervold, Javier de la Rosa, Freddy Wetjen et al.

2024 INTERSPEECH

Whisper Multilingual Downstream Task Tuning Using Task Vectors

Ji-Hun Kang, Jae-Hong Lee, Mun-Hak Lee et al.

2024 INTERSPEECH
