speech recognition

1223 papers

Also known as

STT, WER, HSR, SRS, ASR, SR

Co-occurring keywords

automatic speech recognition (1764), word error rate (406), acoustic model (277), speech translation (413), multimodal learning (4622), language model (4573), self-supervised learning (3751), machine translation (2472), deep neural network (1801), neural network (6616)

Papers

ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion INTERSPEECH 2023

GhostT5: Generate More Features with Cheap Operations to Improve Textless Spoken Question Answering INTERSPEECH 2023

Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language ICML 2023

Evaluating Multilingual Speech Translation under Realistic Conditions with Resegmentation and Terminology ACL 2023

Modality Confidence Aware Training for Robust End-to-End Spoken Language Understanding INTERSPEECH 2023

Improving Code-Switching and Name Entity Recognition in ASR with Speech Editing based Data Augmentation INTERSPEECH 2023

Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding INTERSPEECH 2023

Record Deduplication for Entity Distribution Modeling in ASR Transcripts INTERSPEECH 2023

Matesub: The Translated Subtitling Tool at the IWSLT2023 Subtitling Task ACL 2023

Personalization for BERT-based Discriminative Speech Recognition Rescoring INTERSPEECH 2023

Iteratively Improving Speech Recognition and Voice Conversion INTERSPEECH 2023

Acoustic Word Embeddings for Untranscribed Target Languages with Continued Pretraining and Learned Pooling INTERSPEECH 2023

Guiding Students to Investigate What Google Speech Recognition Knows about Language AAAI 2023

Zambezi Voice: A Multilingual Speech Corpus for Zambian Languages INTERSPEECH 2023

Perception and Semantic Aware Regularization for Sequential Confidence Calibration CVPR 2023

MOCKS 1.0: Multilingual Open Custom Keyword Spotting Testset INTERSPEECH 2023

BeAts: Bengali Speech Acts Recognition using Multimodal Attention Fusion INTERSPEECH 2023

TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition INTERSPEECH 2023

HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation INTERSPEECH 2023

CoMFLP: Correlation Measure Based Fast Search on ASR Layer Pruning INTERSPEECH 2023

Multilingual self-supervised speech representations improve the speech recognition of low-resource African languages with codeswitching EMNLP 2023

Don’t Stop Self-Supervision: Accent Adaptation of Speech Representations via Residual Adapters INTERSPEECH 2023

Using Random Forests to classify language as a function of syllable timing in two groups: children with cochlear implants and with normal hearing INTERSPEECH 2023

SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy Minimization INTERSPEECH 2023

An Equitable Framework for Automatically Assessing Children's Oral Narrative Language Abilities INTERSPEECH 2023