Speech Recognition

1480 directly classified papers

Papers

Improving End-to-End Speech Translation by Imitation-Based Knowledge Distillation with Synthetic Transcripts ACL 2023

UniSplice: Universal Cross-Lingual Data Splicing for Low-Resource ASR INTERSPEECH 2023

Allophant: Cross-lingual Phoneme Recognition with Articulatory Attributes INTERSPEECH 2023

Robust Keyword Spotting for Noisy Environments by Leveraging Speech Enhancement and Speech Presence Probability INTERSPEECH 2023

A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks INTERSPEECH 2023

The MASCFLICHT Corpus: Face Mask Type and Coverage Area Recognition from Speech INTERSPEECH 2023

Matching Latent Encoding for Audio-Text based Keyword Spotting INTERSPEECH 2023

Improving Small Footprint Few-shot Keyword Spotting with Supervision on Auxiliary Data INTERSPEECH 2023

Semantic Enrichment Towards Efficient Speech Representations INTERSPEECH 2023

Description and analysis of the KPT system for NIST Language Recognition Evaluation 2022 INTERSPEECH 2023

Sequence-Level Knowledge Distillation for Class-Incremental End-to-End Spoken Language Understanding INTERSPEECH 2023

Arabic Dysarthric Speech Recognition Using Adversarial and Signal-Based Augmentation INTERSPEECH 2023

Streaming Audio-Visual Speech Recognition with Alignment Regularization INTERSPEECH 2023

A Binary Keyword Spotting System with Error-Diffusion Based Feature Binarization INTERSPEECH 2023

Investigating Pre-trained Audio Encoders in the Low-Resource Condition INTERSPEECH 2023

SparseVSR: Lightweight and Noise Robust Visual Speech Recognition INTERSPEECH 2023

Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition ACL 2023

Intuitive Multilingual Audio-Visual Speech Recognition with a Single-Trained Model EMNLP 2023

Exploring the Impact of Pretrained Models and Web-Scraped Data for the 2022 NIST Language Recognition Evaluation INTERSPEECH 2023

GhostT5: Generate More Features with Cheap Operations to Improve Textless Spoken Question Answering INTERSPEECH 2023

BIG-C: a Multimodal Multi-Purpose Dataset for Bemba ACL 2023

MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition ACL 2023

STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions ACL 2023

Multimodal Speech Recognition for Language-Guided Embodied Agents INTERSPEECH 2023

CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning INTERSPEECH 2023