Papers
Discriminative Adversarial Learning for Speaker Independent Emotion Recognition
Chamara Kasun, Chung Soo Ahn, Jagath Rajapakse et al.
Discriminative Feature Representation Based on Cascaded Attention Network with Adversarial Joint Loss for Speech Emotion Recognition
Yang Liu, Haoqin Sun, Wenbo Guan et al.
Disentangled Latent Speech Representation for Automatic Pathological Intelligibility Assessment
Tobias Weise, Philipp Klumpp, Andreas Maier et al.
Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion
Zongyang Du, Berrak Sisman, Kun Zhou et al.
Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks
Fan-Lin Wang, Hung-Shin Lee, Yu Tsao et al.
Distance-Based Sound Separation
Katharine Patterson, Kevin Wilson, Scott Wisdom et al.
Distilling a Pretrained Language Model to a Multilingual ASR Model
Kwanghee Choi, Hyung-Min Park
Distinguishing between pre- and post-treatment in the speech of patients with chronic obstructive pulmonary disease
Andreas Triantafyllopoulos, Markus Fendler, Anton Batliner et al.
DocLayoutTTS: Dataset and Baselines for Layout-informed Document-level Neural Speech Synthesis
Puneet Mathur, Franck Dernoncourt, Quan Hung Tran et al.
Does Audio Deepfake Detection Generalize?
Nicolas Müller, Pavel Czempin, Franziska Diekmann et al.
Does Utterance entails Intent?: Evaluating Natural Language Inference Based Setup for Few-Shot Intent Detection
Ayush Kumar, Vijit Malik, Jithendra Vepa
Domain Adversarial Self-Supervised Speech Representation Learning for Improving Unknown Domain Downstream Tasks
Tomohiro Tanaka, Ryo Masumura, Hiroshi Sato et al.
Domain Agnostic Few-shot Learning for Speaker Verification
Seunghan Yang, Debasmit Das, Janghoon Cho et al.
Domain-aware Intermediate Pretraining for Dementia Detection with Limited Data
Youxiang Zhu, Xiaohui Liang, John A. Batsis et al.
Domain Generalization with Relaxed Instance Frequency-wise Normalization for Multi-device Acoustic Scene Classification
Byeonggeun Kim, Seunghan Yang, Jangho Kim et al.
Domain Prompts: Towards memory and compute efficient domain adaptation of ASR systems
Saket Dingliwal, Ashish Shenoy, Sravan Bodapati et al.
DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning
Takaaki Saeki, Kentaro Tachibana, Ryuichi Yamamoto
DUAL: Discrete Spoken Unit Adaptive Learning for Textless Spoken Question Answering
Guan-Ting Lin, Yung-Sung Chuang, Ho-Lam Chung et al.
Dual Path Embedding Learning for Speaker Verification with Triplet Attention
Bei Liu, Zhengyang Chen, Yanmin Qian
Dummy Prototypical Networks for Few-Shot Open-Set Keyword Spotting
Byeonggeun Kim, Seunghan Yang, Inseop Chung et al.
Durational Patterning at Discourse Boundaries in Relation to Therapist Empathy in Psychotherapy
Jonathan Him Nok Lee, Dehua Tao, Harold Chui et al.
Dyadic Interaction Assessment from Free-living Audio for Depression Severity Assessment
Bishal Lamichhane, Nidal Moukaddam, Ankit B. Patel et al.
DyConvMixer: Dynamic Convolution Mixer Architecture for Open-Vocabulary Keyword Spotting
Waseem Gharbieh, Jinmiao Huang, Qianhui Wan et al.
Dynamic Vertical Larynx Actions Under Prosodic Focus
Miran Oh, Yoonjeong Lee