speech processing

478 papers

Co-occurring keywords

multimodal learning (4622), automatic speech recognition (1764), speech recognition (1223), self-supervised learning (3751), representation learning (6174), large language model (12755), acoustic feature (265), neural network (6616), speech analysis (363), feature extraction (1578)

Papers

Adapting an Unadaptable ASR System INTERSPEECH 2023

Speech-Text Pre-training for Spoken Dialog Understanding with Explicit Cross-Modal Alignment ACL 2023

Exploring the Impact of Back-End Network on Wav2vec 2.0 for Dialect Identification INTERSPEECH 2023

Yet Another Model for Arabic Dialect Identification EMNLP 2023

Dialogue Act-Aided Backchannel Prediction Using Multi-Task Learning EMNLP 2023

MCR-Data2vec 2.0: Improving Self-supervised Speech Pre-training via Model-level Consistency Regularization INTERSPEECH 2023

DISCO: A Large Scale Human Annotated Corpus for Disfluency Correction in Indo-European Languages EMNLP 2023

Pre-trained Speech Processing Models Contain Human-Like Biases that Propagate to Speech Emotion Recognition EMNLP 2023

PoCaPNet: A Novel Approach for Surgical Phase Recognition Using Speech and X-Ray Images INTERSPEECH 2023

SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations EMNLP 2023

Head movements in two- and four-person interactive conversational tasks in noisy and moderately reverberant conditions INTERSPEECH 2023

ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs INTERSPEECH 2023

Improving End-to-End Speech Translation by Leveraging Auxiliary Speech and Text Data AAAI 2023

Investigating the cortical tracking of speech and music with sung speech INTERSPEECH 2023

Investigating Reproducibility at Interspeech Conferences: A Longitudinal and Comparative Perspective INTERSPEECH 2023

Towards Multi-Lingual Audio Question Answering INTERSPEECH 2023

Exploration on HuBERT with Multiple Resolution INTERSPEECH 2023

Biased Self-supervised Learning for ASR INTERSPEECH 2023

Understanding Spoken Language Development of Children with ASD Using Pre-trained Speech Embeddings INTERSPEECH 2023

Dual-Memory Multi-Modal Learning for Continual Spoken Keyword Spotting with Confidence Selection and Diversity Enhancement INTERSPEECH 2023

Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute INTERSPEECH 2023

Probing Self-supervised Speech Models for Phonetic and Phonemic Information: A Case Study in Aspiration INTERSPEECH 2023

Improving grapheme-to-phoneme conversion by learning pronunciations from speech recordings INTERSPEECH 2023

Automatic Prediction of Language Learners' Listenability Using Speech and Text Features Extracted from Listening Drills INTERSPEECH 2023

NeMo Forced Aligner and its application to word alignment for subtitle generation INTERSPEECH 2023