speech recognition

1223 papers

Also known as

STT, WER, HSR, SRS, ASR, SR

Co-occurring keywords

automatic speech recognition (1764), word error rate (406), acoustic model (277), speech translation (413), multimodal learning (4622), language model (4573), self-supervised learning (3751), machine translation (2472), deep neural network (1801), neural network (6616)

Papers

Spoken-to-written text conversion with Large Language Model INTERSPEECH 2024

Adapter pre-training for improved speech recognition in unseen domains using low resource adapter tuning of self-supervised models INTERSPEECH 2024

CaptainA self-study mobile app for practising speaking: task completion assessment and feedback with generative AI INTERSPEECH 2024

M3AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset ACL 2024

The Influence of Automatic Speech Recognition on Linguistic Features and Automatic Alzheimer’s Disease Detection from Spontaneous Speech COLING 2024

When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP ACL 2024

Prompting Large Language Models with Mispronunciation Detection and Diagnosis Abilities INTERSPEECH 2024

Automatic Partitioning of a Code-Switched Speech Corpus Using Mixed-Integer Programming COLING 2024

Lightweight Transducer Based on Frame-Level Criterion INTERSPEECH 2024

ES3: Evolving Self-Supervised Learning of Robust Audio-Visual Speech Representations CVPR 2024

A Multitask Training Approach to Enhance Whisper with Open-Vocabulary Keyword Spotting INTERSPEECH 2024

On the Evaluation of Speech Foundation Models for Spoken Language Understanding ACL 2024

Multimodal Belief Prediction INTERSPEECH 2024

Contextual Biasing Speech Recognition in Speech-enhanced Large Language Model INTERSPEECH 2024

Wav2Gloss: Generating Interlinear Glossed Text from Speech ACL 2024

Multi-Dialect Vietnamese: Task, Dataset, Baseline Models and Challenges EMNLP 2024

Handling the Alignment for Wake Word Detection: A Comparison Between Alignment-Based, Alignment-Free and Hybrid Approaches INTERSPEECH 2023

Improving Joint Speech-Text Representations Without Alignment INTERSPEECH 2023

Leveraging Cross-Utterance Context For ASR Decoding INTERSPEECH 2023

Exploring Energy-based Language Models with Different Architectures and Training Methods for Speech Recognition INTERSPEECH 2023

Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech INTERSPEECH 2023

Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data INTERSPEECH 2023

Listen, Decipher and Sign: Toward Unsupervised Speech-to-Sign Language Recognition ACL 2023

Lenient Evaluation of Japanese Speech Recognition: Modeling Naturally Occurring Spelling Inconsistency ACL 2023

Blank Collapse: Compressing CTC Emission for the Faster Decoding INTERSPEECH 2023