Research Explorer

Using Text Injection to Improve Recognition of Personal Identifiers in Speech

Yochai Blau, Rohan Agrawal, Lior Madmony et al.

2023 INTERSPEECH

Utility-Preserving Privacy-Enabled Speech Embeddings for Emotion Detection

Chandrashekhar Lavania, Sanjiv Das, Xin Huang et al.

2023 INTERSPEECH

Validation of a Task-Independent Cepstral Peak Prominence Measure with Voice Activity Detection

Olivia M. Murton, Abigail E. Haenssler, Marc F. Maffei et al.

2023 INTERSPEECH

Variance-Preserving-Based Interpolation Diffusion Models for Speech Enhancement

Zilu Guo, Jun Du, Chin-Hui Lee et al.

2023 INTERSPEECH

Variational Classifier for Unsupervised Anomalous Sound Detection under Domain Generalization

Antonio Almudévar, Alfonso Ortega, Luis Vicente et al.

2023 INTERSPEECH

VC-T: Streaming Voice Conversion Based on Neural Transducer

Hiroki Kanagawa, Takafumi Moriya, Yusuke Ijima

2023 INTERSPEECH

Verbal and nonverbal feedback signals in response to increasing levels of miscommunication

Maeva Garnier, Eric Le Ferrand, Fabien Ringeval

2023 INTERSPEECH

Video Multimodal Emotion Recognition System for Real World Applications

Sun-Kyung Lee, Jong-Hwan Kim

2023 INTERSPEECH

Video Summarization Leveraging Multimodal Information for Presentations

Hanchao Liu, Dapeng Chen, Rongjun Li et al.

2023 INTERSPEECH

Vietnam-Celeb: a large-scale dataset for Vietnamese speaker recognition

Viet Thanh Pham, Xuan Thai Hoa Nguyen, Vu Hoang et al.

2023 INTERSPEECH

VISinger2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer

Yongmao Zhang, Heyang Xue, Hanzhao Li et al.

2023 INTERSPEECH

Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR

Kaushal Bhogale, Sai Sundaresan, Abhigyan Raman et al.

2023 INTERSPEECH

Visualizing Data Augmentation in Deep Speaker Recognition

Pengqi Li, Lantian Li, Askar Hamdulla et al.

2023 INTERSPEECH

Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention

Xubo Liu, Qiushi Huang, Xinhao Mei et al.

2023 INTERSPEECH

Visually grounded few-shot word acquisition with fewer shots

Leanne Nortje, Benjamin van Niekerk, Herman Kamper

2023 INTERSPEECH

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Jungil Kong, Jihoon Park, Beomjeong Kim et al.

2023 INTERSPEECH

Vocoder drift in x-vector–based speaker anonymization

Michele Panariello, Massimiliano Todisco, Nicholas Evans

2023 INTERSPEECH

Voice Conversion With Just Nearest Neighbors

Matthew Baas, Benjamin van Niekerk, Herman Kamper

2023 INTERSPEECH

Voice Passing : a Non-Binary Voice Gender Prediction System for evaluating Transgender voice transition

David Doukhan, Simon Devauchelle, Lucile Girard-Monneron et al.

2023 INTERSPEECH

Voice Twins: Discovering Extremely Similar-sounding, Unrelated Speakers

Linda Gerlach, Kirsty McDougall, Finnian Kelly et al.

2023 INTERSPEECH

Vowel Normalisation in Latent Space for Sociolinguistics

James Burridge

2023 INTERSPEECH

Vowel reduction by Greek-speaking children: The effect of stress and word length

Polychronia Christodoulidou, Katerina Nicolaidis, Dimitrios Stamovlasis

2023 INTERSPEECH

VoxTube: a multilingual speaker recognition dataset

Ivan Yakovlev, Anton Okhotnikov, Nikita Torgashov et al.

2023 INTERSPEECH

Wav2ToBI: a new approach to automatic ToBI transcription

Wanyue Zhai, Mark Hasegawa-Johnson

2023 INTERSPEECH

wav2vec 2.0 ASR for Cantonese-Speaking Older Adults in a Clinical Setting

Ranzo Huang, Brian Mak

2023 INTERSPEECH
