INTERSPEECH 2023

Whistle-to-text: Automatic recognition of the Silbo Gomero whistled language

Abstract

Automatic speech recognition (ASR) is a rapidly developing field of study. However, ASR for types of speech other than ordinary spoken speech (for example, whispering or shouting) remains difficult, as it requires models trained specifically to recognise these types of speech. A lesser-known type is whistled speech, in which speech is transformed into whistling. In this paper, I describe how I created the first-ever ASR model designed to recognise a whistled language. It was trained, using the HMM-GMM approach to ASR, to recognise Silbo Gomero, the whistled dialect of Spanish. The model learned to recognise Silbo Gomero, though its performance was somewhat worse than that of spoken speech recognition models trained on data sets of similar size. It appears that the methods used to create spoken language ASR models can also be used to create whistled language ASR models, with only small changes, which are explained in this paper.

Authors