Papers
Dialogue Situation Recognition for Everyday Conversation Using Multimodal Information
Yuya Chiba, Ryuichiro Higashinaka
DiCOVA Challenge: Dataset, Task, and Baseline System for COVID-19 Diagnosis Using Acoustics
Ananya Muguli, Lancelot Pinto, Nirmala R et al.
Difference in Perceived Speech Signal Quality Assessment Among Monolingual and Bilingual Teenage Students
Przemyslaw Falkowski-Gilski
Differentiable Allophone Graphs for Language-Universal Speech Recognition
Brian Yan, Siddharth Dalmia, David R. Mortensen et al.
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech
Myeonghun Jeong, Hyeongju Kim, Sung Jun Cheon et al.
Digital Einstein Experience: Fast Text-to-Speech for Conversational AI
Joanna Rownicka, Kilian Sprenkamp, Antonio Tripiana et al.
Direct Multimodal Few-Shot Learning of Speech and Images
Leanne Nortje, Herman Kamper
Discriminative Self-Training for Punctuation Prediction
Qian Chen, Wen Wang, Mengzhe Chen et al.
Disfluency Detection with Unlabeled Data and Small BERT Models
Johann C. Rocholl, Vicky Zayats, Daniel D. Walker et al.
Disordered Speech Data Collection: Lessons Learned at 1 Million Utterances from Project Euphonia
Robert L. MacDonald, Pan-Pan Jiang, Julie Cattiau et al.
Dissecting the Aero-Acoustic Parameters of Open Articulatory Transitions
Mark Gibson, Oihane Muxika, Marianne Pouplier
Dissecting User-Perceived Latency of On-Device E2E Speech Recognition
Yuan Shangguan, Rohit Prabhavalkar, Hang Su et al.
Distortion of Voiced Obstruents for Differential Diagnosis Between Parkinson’s Disease and Multiple System Atrophy
Khalid Daoudi, Biswajit Das, Solange Milhé de Saint Victor et al.
Do Acoustic Word Embeddings Capture Phonological Similarity? An Empirical Study
Badr M. Abdullah, Marius Mosbach, Iuliia Zaitova et al.
Domain-Aware Self-Attention for Multi-Domain Neural Machine Translation
Shiqi Zhang, Yan Liu, Deyi Xiong et al.
Domain-Initial Strengthening in Turkish: Acoustic Cues to Prosodic Hierarchy in Stop Consonants
Kubra Bodur, Sweeney Branje, Morgane Peirolo et al.
Domain-Specific Multi-Agent Dialog Policy Learning in Multi-Domain Task-Oriented Scenarios
Li Tang, Yuke Si, Longbiao Wang et al.
Do Sound Event Representations Generalize to Other Audio Tasks? A Case Study in Audio Transfer Learning
Anurag Kumar, Yun Wang, Vamsi Krishna Ithapu et al.
DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement
Xiaohuai Le, Hongsheng Chen, Kai Chen et al.
Dropout Regularization for Self-Supervised Learning of Transformer Encoder Speech Representation
Jian Luo, Jianzong Wang, Ning Cheng et al.
Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition
Jason Pelecanos, Quan Wang, Ignacio Lopez Moreno
Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition
Niko Moritz, Takaaki Hori, Jonathan Le Roux
Dual-Path Filter Network: Speaker-Aware Modeling for Speech Separation
Fan-Lin Wang, Yu-Huai Peng, Hung-Shin Lee et al.
Dual Script E2E Framework for Multilingual and Code-Switching ASR
Mari Ganesh Kumar, Jom Kuriakose, Anand Thyagachandran et al.