Research Explorer

Elucidating Clock-drift Using Real-world Audios In Wireless Mode For Time-offset Insensitive End-to-End Asynchronous Acoustic Echo Cancellation

Premanand Nayak, M. Ali Basha Shaik

2024 INTERSPEECH

Embedding Learning for Preference-based Speech Quality Assessment

ChengHung Hu, Yusuke Yasuda, Tomoki Toda

2024 INTERSPEECH

Emo-bias: A Large Scale Evaluation of Social Bias on Speech Emotion Recognition

Yi-Cheng Lin, Haibin Wu, Huang-Cheng Chou et al.

2024 INTERSPEECH

EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark

Ziyang Ma, Mingjie Chen, Hezhao Zhang et al.

2024 INTERSPEECH

EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech

Deok-Hyeon Cho, Hyung-Seok Oh, Seung-Bin Kim et al.

2024 INTERSPEECH

Emotional Cues Extraction and Fusion for Multi-modal Emotion Prediction and Recognition in Conversation

Haoxiang Shi, Ziqi Liang, Jun Yu

2024 INTERSPEECH

Emotion Arithmetic: Emotional Speech Synthesis via Weight Space Interpolation

Pavan Kalyan, Preeti Rao, Preethi Jyothi et al.

2024 INTERSPEECH

Emotion-Aware Speech Self-Supervised Representation Learning with Intensity Knowledge

Rui Liu, Zening Ma

2024 INTERSPEECH

Empowering Low-Resource Language ASR via Large-Scale Pseudo Labeling

Kaushal Santosh Bhogale, Deovrat Mehendale, Niharika Parasa et al.

2024 INTERSPEECH

Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System

Lingwei Meng, Jiawen Kang, Yuejiao Wang et al.

2024 INTERSPEECH

Enabling Conversational Speech Synthesis using Noisy Spontaneous Data

Liisa Rätsep, Rasmus Lellep, Mark Fishel

2024 INTERSPEECH

Enhanced ASR Robustness to Packet Loss with a Front-End Adaptation Network

Yehoshua Dissen, Shiry Yonash, Israel Cohen et al.

2024 INTERSPEECH

Enhanced Deep Speech Separation in Clustered Ad Hoc Distributed Microphone Environments

Jihyun Kim, Stijn Kindt, Nilesh Madhu et al.

2024 INTERSPEECH

Enhanced Feature Learning with Normalized Knowledge Distillation for Audio Tagging

Yuwu Tang, Ziang Ma, Haitao Zhang

2024 INTERSPEECH

Enhanced Reverberation as Supervision for Unsupervised Speech Separation

Kohei Saijo, Gordon Wichern, François G. Germain et al.

2024 INTERSPEECH

Enhancing Automated Audio Captioning via Large Language Models with Optimized Audio Encoding

Jizhong Liu, Gang Li, Junbo Zhang et al.

2024 INTERSPEECH

Enhancing Child Vocalization Classification with Phonetically-Tuned Embeddings for Assisting Autism Diagnosis

Jialu Li, Mark Hasegawa-Johnson, Karrie Karahalios

2024 INTERSPEECH

Enhancing CTC-based speech recognition with diverse modeling units

Shiyi Han, Mingbin Xu, Zhihong Lei et al.

2024 INTERSPEECH

Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation

Shiyao Wang, Shiwan Zhao, Jiaming Zhou et al.

2024 INTERSPEECH

Enhancing ECAPA-TDNN with Feature Processing Module and Attention Mechanism for Speaker Verification

Shiu-Hsiang Liou, Po-Cheng Chan, Chia-Ping Chen et al.

2024 INTERSPEECH

Enhancing Japanese Text-to-Speech Accuracy with a Novel Combination Transformer-BERT-based G2P: Integrating Pronunciation Dictionaries and Accent Sandhi

Kiyoshi Kurihara, Masanori Sano

2024 INTERSPEECH

Enhancing Modal Fusion by Alignment and Label Matching for Multimodal Emotion Recognition

Qifei Li, Yingming Gao, Yuhua Wen et al.

2024 INTERSPEECH

Enhancing Multilingual Voice Toxicity Detection with Speech-Text Alignment

Joseph Liu, Mahesh Kumar Nandwana, Janne Pylkkönen et al.

2024 INTERSPEECH

Enhancing Multimodal Emotion Recognition through ASR Error Compensation and LLM Fine-Tuning

Jehyun Kyung, Serin Heo, Joon-Hyuk Chang

2024 INTERSPEECH

Enhancing Neural Transducer for Multilingual ASR with Synchronized Language Diarization

Amir Hussein, Desh Raj, Matthew Wiesner et al.

2024 INTERSPEECH
