Research Explorer

Unsupervised Improved MVDR Beamforming for Sound Enhancement

Jacob Kealey, John R. Hershey, François Grondin

2024 INTERSPEECH

Unsupervised Online Continual Learning for Automatic Speech Recognition

Steven Vander Eeckt, Hugo Van hamme

2024 INTERSPEECH

Unveiling Biases while Embracing Sustainability: Assessing the Dual Challenges of Automatic Speech Recognition Systems

Ajinkya Kulkarni, Atharva Kulkarni, Miguel Couceiro et al.

2024 INTERSPEECH

Urdu Alternative Questions: A Hat Pattern

Benazir Mumtaz, Miriam Butt

2024 INTERSPEECH

URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement

Wangyou Zhang, Robin Scheibler, Kohei Saijo et al.

2024 INTERSPEECH

USD-AC: Unsupervised Speech Disentanglement for Accent Conversion

Jen-Hung Huang, Wei-Tsung Lee, Chung-Hsien Wu

2024 INTERSPEECH

Using articulated speech EEG signals for imagined speech decoding

Chris Bras, Tanvina Patel, Odette Scharenborg

2024 INTERSPEECH

Using Large Language Model for End-to-End Chinese ASR and NER

Yuang Li, Jiawei Yu, Min Zhang et al.

2024 INTERSPEECH

Using wav2vec 2.0 for phonetic classification tasks: methodological aspects

Lila Kim, Cédric Gendrot

2024 INTERSPEECH

USM RNN-T model weights binarization

Oleg Rybakov, Dmitriy Serdyuk, Chengjian Zheng

2024 INTERSPEECH

Utilization of Text Data for Response Timing Detection in Attentive Listening

Yu Watanabe, Koichiro Ito, Shigeki Matsubara

2024 INTERSPEECH

Utilizing Adaptive Global Response Normalization and Cluster-Based Pseudo Labels for Zero-Shot Voice Conversion

Ji Sub Um, Hoirin Kim

2024 INTERSPEECH

UY/CH-CHILD -- A Public Chinese L2 Speech Database of Uyghur Children

Mewlude Nijat, Chen Chen, Dong Wang et al.

2024 INTERSPEECH

VAE-based Phoneme Alignment Using Gradient Annealing and SSL Acoustic Features

Tomoki Koriyama

2024 INTERSPEECH

Variability of speech timing features across repeated recordings: a comparison of open-source extraction techniques

Judith Dineley, Ewan Carr, Lauren L. White et al.

2024 INTERSPEECH

Variable Segment Length and Domain-Adapted Feature Optimization for Speaker Diarization

Chenyuan Zhang, Linkai Luo, Hong Peng et al.

2024 INTERSPEECH

VECL-TTS: Voice identity and Emotional style controllable Cross-Lingual Text-to-Speech

Ashishkumar Gudmalwar, Nirmesh Shah, Sai Akarsh et al.

2024 INTERSPEECH

Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy

Linhan Ma, Xinfa Zhu, Yuanjun Lv et al.

2024 INTERSPEECH

Vision Transformer Segmentation for Visual Bird Sound Denoising

Sahil Kumar, Jialu Li, Youshan Zhang

2024 INTERSPEECH

Visualization for improving foreign language pronunciation

Charlotte Yoder, Karrie Karahalios, Mark Hasegawa-Johnson et al.

2024 INTERSPEECH

Visual scene display application for augmentative and alternative communication

Karthik Venkat Sridaran, Raja Praveen, Reuben T Varghese et al.

2024 INTERSPEECH

VN-SLU: A Vietnamese Spoken Language Understanding Dataset

Tuyen Tran, Khanh Le, Ngoc Dang Nguyen et al.

2024 INTERSPEECH

Voiced and voiceless laterals in Angami

Viyazonuo Terhiija, Priyankoo Sarmah

2024 INTERSPEECH

VoiceDefense: Protecting Automatic Speaker Verification Models Against Black-box Adversarial Attacks

Yip Keng Kan, Ke Xu, Hao Li et al.

2024 INTERSPEECH

Voice Disorder Analysis: a Transformer-based Approach

Alkis Koudounas, Gabriele Ciravegna, Marco Fantini et al.

2024 INTERSPEECH
