speech recognition

1223 papers

Also known as

STT, WER, HSR, SRS, ASR, SR

Co-occurring keywords

automatic speech recognition (1764), word error rate (406), acoustic model (277), speech translation (413), multimodal learning (4622), language model (4573), self-supervised learning (3751), machine translation (2472), deep neural network (1801), neural network (6616)

Papers

BeAts: Bengali Speech Acts Recognition using Multimodal Attention Fusion INTERSPEECH 2023

TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition INTERSPEECH 2023

Re-investigating the Efficient Transfer Learning of Speech Foundation Model using Feature Fusion Methods INTERSPEECH 2023

Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization INTERSPEECH 2023

An Equitable Framework for Automatically Assessing Children's Oral Narrative Language Abilities INTERSPEECH 2023

Don’t Stop Self-Supervision: Accent Adaptation of Speech Representations via Residual Adapters INTERSPEECH 2023

Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations INTERSPEECH 2023

Towards Noise-Tolerant Speech-Referring Video Object Segmentation: Bridging Speech and Text EMNLP 2023

SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy Minimization INTERSPEECH 2023

Let's Give a Voice to Conversational Agents in Virtual Reality INTERSPEECH 2023

Classifying Dementia in the Presence of Depression: A Cross-Corpus Study INTERSPEECH 2023

PronScribe: Highly Accurate Multimodal Phonemic Transcription From Speech and Text INTERSPEECH 2023

Phonetic-assisted Multi-Target Units Modeling for Improving Conformer-Transducer ASR system INTERSPEECH 2023

Human Transcription Quality Improvement INTERSPEECH 2023

Cross-lingual/Cross-channel Intent Detection in Contact-Center Conversations INTERSPEECH 2023

Fast and Efficient Multilingual Self-Supervised Pre-training for Low-Resource Speech Recognition INTERSPEECH 2023

Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration IJCAI 2023

Dysarthric Speech Recognition, Detection and Classification using Raw Phase and Magnitude Spectra INTERSPEECH 2023

UniSplice: Universal Cross-Lingual Data Splicing for Low-Resource ASR INTERSPEECH 2023

Using Random Forests to classify language as a function of syllable timing in two groups: children with cochlear implants and with normal hearing INTERSPEECH 2023

Perception and Semantic Aware Regularization for Sequential Confidence Calibration CVPR 2023

Hierarchical Fusion for Online Multimodal Dialog Act Classification EMNLP 2023

Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations INTERSPEECH 2023

Improving the response timing estimation for spoken dialogue systems by reducing the effect of speech recognition delay INTERSPEECH 2023

Exploiting Cross-Domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition INTERSPEECH 2023