Papers
DCTX-Conformer: Dynamic context carry-over for low latency unified streaming and non-streaming Conformer
Goeric Huybrechts, Srikanth Ronanki, Xilai Li et al.
Debiased Automatic Speech Recognition for Dysarthric Speech via Sample Reweighting with Sample Affinity Test
Eungbeom Kim, Yunkee Chae, Jaeheon Sim et al.
DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes
Xilin Jiang, Yinghao Aaron Li, Nima Mesgarani
Decoupling Segmental and Prosodic Cues of Non-native Speech through Vector Quantization
Waris Quamer, Anurag Das, Ricardo Gutierrez-Osuna
DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement
Hendrik Schröter, Alberto N. Escalante-B., Tobias Rosenkranz et al.
Deeply Supervised Curriculum Learning for Deep Neural Network-based Sound Source Localization
Min-Sang Baek, Joon-Young Yang, Joon-Hyuk Chang
DeePMOS: Deep Posterior Mean-Opinion-Score of Speech
Xinyu Liang, Fredrik Cumlin, Christian Schüldt et al.
Deep Multi-Frame Filtering for Hearing Aids
Hendrik Schröter, Tobias Rosenkranz, Alberto N. Escalante-B. et al.
Deep Speech Synthesis from MRI-Based Articulatory Representations
Peter Wu, Tingle Li, Yijing Lu et al.
DeepVQE: Real Time Deep Voice Quality Enhancement for Joint Acoustic Echo Cancellation, Noise Suppression and Dereverberation
Nicolae Catalin Ristea, Evgenii Indenbom, Ando Saabas et al.
Defense Against Adversarial Attacks on Audio DeepFake Detection
Piotr Kawa, Marcin Plata, Piotr Syga
DeFT-AN RT: Real-time Multichannel Speech Enhancement using Dense Frequency-Time Attentive Network and Non-overlapping Synthesis Window
Dongheon Lee, Dayun Choi, Jung-Woo Choi
Delay-penalized CTC Implemented Based on Finite State Transducer
Zengwei Yao, Wei Kang, Fangjun Kuang et al.
Describing the phonetics in the underlying speech attributes for deep and interpretable speaker recognition
Imen Ben-Amor, Jean-François Bonastre, Benjamin O'Brien et al.
Description and Analysis of ABC Submission to NIST LRE 2022
Pavel Matejka, Anna Silnova, Josef Slavíček et al.
Description and analysis of the KPT system for NIST Language Recognition Evaluation 2022
Salvatore Sarni, Sandro Cumani, Sabato Marco Siniscalchi et al.
Detecting Manifest Huntington's Disease Using Vocal Data
Vinod Subramanian, Namhee Kwon, Raymond Brueckner et al.
Detection of Cross-Dataset Fake Audio Based on Prosodic and Pronunciation Features
Chenglong Wang, Jiangyan Yi, Jianhua Tao et al.
Detection of Emotional Hotspots in Meetings Using a Cross-Corpus Approach
Georg Stemmer, Paulo Lopez Meyer, Juan Del Hoyo Ontiveros et al.
Detection of Laughter and Screaming Using the Attention and CTC Models
Takuto Matsuda, Yoshiko Arimoto
Developing Speech Processing Pipelines for Police Accountability
Anjalie Field, Prateek Verma, Nay San et al.
Developmental Articulatory and Acoustic Features for Six to Ten Year Old Children
Vishwas M. Shetty, Steven M. Lulich, Abeer Alwan
DFSNet: A Steerable Neural Beamformer Invariant to Microphone Array Configuration for Real-Time, Low-Latency Speech Enhancement
Anton Kovalyov, Kashyap Patel, Issa Panahi
Diacritic Recognition Performance in Arabic ASR
Hanan Aldarmaki, Ahmad Ghannam