Wavelet Scattering Transform for Improving Generalization in Low-Resourced Spoken Language Identification

Spandan Dey, Premjeet Singh, Goutam Saha

2023 INTERSPEECH

Wave to Syntax: Probing spoken language models for syntax

Gaofei Shen, Afra Alishahi, Arianna Bisazza et al.

2023 INTERSPEECH

Weakly-supervised forced alignment of disfluent speech using phoneme-level modeling

Theodoros Kouzelis, Georgios Paraskevopoulos, Athanasios Katsamanis et al.

2023 INTERSPEECH

Weakly supervised glottis segmentation in high-speed videoendoscopy using bounding box labels

Varun Belagali, Achuth Rao, Prasanta Kumar Ghosh

2023 INTERSPEECH

Weakly-Supervised Speech Pre-training: A Case Study on Target Speech Recognition

Wangyou Zhang, Yanmin Qian

2023 INTERSPEECH

Weighted Von Mises Distribution-based Loss Function for Real-time STFT Phase Reconstruction Using DNN

Nguyen Binh Thien, Yukoh Wakabayashi, Yuting Geng et al.

2023 INTERSPEECH

What are differences? Comparing DNN and Human by Their Performance and Characteristics in Speaker Age Estimation

Yuki Kitagishi, Naohiro Tawara, Atsunori Ogawa et al.

2023 INTERSPEECH

What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model

Mu Yang, Ram C. M. C. Shekar, Okim Kang et al.

2023 INTERSPEECH

What do self-supervised speech representations encode? An analysis of languages, varieties, speaking styles and speakers

Julian Linke, Mate Kadar, Gergely Dosinszky et al.

2023 INTERSPEECH

What influences the foreign accent strength? Phonological and grammatical errors in the perception of accentedness

Sarah Wesołek, Piotr Gulgowski, Joanna Błaszczak et al.

2023 INTERSPEECH

What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions

Hanyu Meng, Vidhyasaharan Sethu, Eliathamby Ambikairajah

2023 INTERSPEECH

What questions are my customers asking?: Towards Actionable Insights from Customer Questions in Contact Center Calls

Varun Nathan, Devashish Deshpande, Ayush Kumar et al.

2023 INTERSPEECH

What’s in a Rise? The Relevance of Intonation for Attention Orienting

Martine Grice

2023 INTERSPEECH

When Words Speak Just as Loudly as Actions: Virtual Agent Based Remote Health Assessment Integrating What Patients Say with What They Do

Vikram Ramanarayanan, David Pautler, Lakshmi Arbatti et al.

2023 INTERSPEECH

Which aspects of motor speech disorder are captured by Mel Frequency Cepstral Coefficients? Evidence from the change in STN-DBS conditions in Parkinson’s disease

Vojtěch Illner, Petr Krýže, Jan Švihlík et al.

2023 INTERSPEECH

WhiSLU: End-to-End Spoken Language Understanding with Whisper

Minghan Wang, Yinglu Li, Jiaxin Guo et al.

2023 INTERSPEECH

Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers

Yuan Gong, Sameer Khurana, Leonid Karlinsky et al.

2023 INTERSPEECH

Whisper Encoder features for Infant Cry Classification

Monil Charola, Aastha Kachhi, Hemant A. Patil

2023 INTERSPEECH

Whisper Features for Dysarthric Severity-Level Classification

Siddharth Rathod, Monil Charola, Akshat Vora et al.

2023 INTERSPEECH

WhisperX: Time-Accurate Speech Transcription of Long-Form Audio

Max Bain, Jaesung Huh, Tengda Han et al.

2023 INTERSPEECH

Whistle-to-text: Automatic recognition of the Silbo Gomero whistled language

Agata Jakubiak

2023 INTERSPEECH

Why We Should Report the Details in Subjective Evaluation of TTS More Rigorously

Cheng-Han Chiang, Wei-Ping Huang, Hung-yi Lee

2023 INTERSPEECH

Word-level Confidence Estimation for CTC Models

Burin Naowarat, Thananchai Kongthaworn, Ekapol Chuangsuwanich

2023 INTERSPEECH

Xiaoicesing 2: A High-Fidelity Singing Voice Synthesizer Based on Generative Adversarial Network

Wang Chunhui, Chang Zeng, Xing He

2023 INTERSPEECH

XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech

Linh The Nguyen, Thinh Pham, Dat Quoc Nguyen

2023 INTERSPEECH
