INTERSPEECH 2023

Fast Enrollable Streaming Keyword Spotting System: Training and Inference using a Web Browser

Abstract

When a keyword-spotting system is deployed on heavily personalized platforms such as digital humans, several issues arise: 1) a lack of training data when registering user-defined keywords, 2) the need to reduce computation and minimize latency, and 3) the inability to train and test the keyword-spotting model immediately. We address these issues with 1) a keyword-spotting system based on a speech embedding model, 2) a streamable system with duplicate computations removed, and 3) real-time inference in a web browser using WebAssembly.
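
To illustrate what removing duplicate computations can mean in a streaming setting, here is a minimal sketch. It is not the system described above: it assumes features can be computed independently per audio chunk, and `chunk_feature`, `NaiveWindow`, `CachedWindow`, and `run_demo` are hypothetical names introduced only for this example. The naive variant recomputes features for every chunk in the sliding window at each step, while the cached variant computes each chunk's feature once and reuses it.

```python
from collections import deque


def chunk_feature(chunk):
    """Hypothetical stand-in for an expensive per-chunk feature (mean energy)."""
    return sum(x * x for x in chunk) / len(chunk)


class NaiveWindow:
    """Recomputes features for every chunk in the window on each push."""

    def __init__(self, window_chunks):
        self.chunks = deque(maxlen=window_chunks)  # raw audio chunks
        self.feature_calls = 0

    def push(self, chunk):
        self.chunks.append(chunk)
        feats = []
        for c in self.chunks:
            feats.append(chunk_feature(c))
            self.feature_calls += 1
        return feats


class CachedWindow:
    """Computes one feature per incoming chunk and keeps the results cached."""

    def __init__(self, window_chunks):
        self.feats = deque(maxlen=window_chunks)  # cached features, not audio
        self.feature_calls = 0

    def push(self, chunk):
        self.feats.append(chunk_feature(chunk))
        self.feature_calls += 1
        return list(self.feats)


def run_demo(num_chunks=10, window_chunks=4):
    """Feeds the same synthetic stream to both variants and compares them."""
    stream = [[float(i + j) for j in range(4)] for i in range(num_chunks)]
    naive = NaiveWindow(window_chunks)
    cached = CachedWindow(window_chunks)
    same = all(naive.push(c) == cached.push(c) for c in stream)
    return same, naive.feature_calls, cached.feature_calls


if __name__ == "__main__":
    same, naive_calls, cached_calls = run_demo()
    print(same, naive_calls, cached_calls)
```

Both variants return the same window contents at every step; the difference is only in how many times the feature function runs. In a real model the cached state would more likely be intermediate layer activations than a scalar per chunk, but the bookkeeping follows the same pattern.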
