Speech Recognition

1480 directly classified papers

Papers

Strategies for Improving Low Resource Speech to Text Translation Relying on Pre-trained ASR Models INTERSPEECH 2023

Record Deduplication for Entity Distribution Modeling in ASR Transcripts INTERSPEECH 2023

MAVD: The First Open Large-Scale Mandarin Audio-Visual Dataset with Depth Information INTERSPEECH 2023

NoRefER: a Referenceless Quality Metric for Automatic Speech Recognition via Semi-Supervised Language Model Fine-Tuning with Contrastive Learning INTERSPEECH 2023

Towards Zero-shot Learning for End-to-end Cross-modal Translation Models EMNLP 2023

Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer INTERSPEECH 2023

Perception and Semantic Aware Regularization for Sequential Confidence Calibration CVPR 2023

What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model INTERSPEECH 2023

On-the-Fly Feature Based Rapid Speaker Adaptation for Dysarthric and Elderly Speech Recognition INTERSPEECH 2023

Consistency is Key: On Data-Efficient Modality Transfer in Speech Translation EMNLP 2023

Scaling Laws for Discriminative Speech Recognition Rescoring Models INTERSPEECH 2023

Dysarthric Speech Recognition, Detection and Classification using Raw Phase and Magnitude Spectra INTERSPEECH 2023

Semantic Enrichment Towards Efficient Speech Representations INTERSPEECH 2023

Embedding Articulatory Constraints for Low-resource Speech Recognition Based on Large Pre-trained Model INTERSPEECH 2023

Personalization for BERT-based Discriminative Speech Recognition Rescoring INTERSPEECH 2023

SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision CVPR 2023

End-to-End Word-Level Pronunciation Assessment with MASK Pre-training INTERSPEECH 2023

Investigating wav2vec2 context representations and the effects of fine-tuning, a case-study of a Finnish model INTERSPEECH 2023

C²A-SLU: Cross and Contrastive Attention for Improving ASR Robustness in Spoken Language Understanding INTERSPEECH 2023

Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition INTERSPEECH 2023

Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition INTERSPEECH 2023

Dialect Speech Recognition Modeling using Corpus of Japanese Dialects and Self-Supervised Learning-based Model XLSR INTERSPEECH 2023

A Neural Time Alignment Module for End-to-End Automatic Speech Recognition INTERSPEECH 2023

Integration of Frame- and Label-synchronous Beam Search for Streaming Encoder-decoder Speech Recognition INTERSPEECH 2023

Distillation Strategies for Discriminative Speech Recognition Rescoring INTERSPEECH 2023