Papers
Elucidating Clock-drift Using Real-world Audios In Wireless Mode For Time-offset Insensitive End-to-End Asynchronous Acoustic Echo Cancellation
Premanand Nayak, M. Ali Basha Shaik
Embedding Learning for Preference-based Speech Quality Assessment
ChengHung Hu, Yusuke Yasuda, Tomoki Toda
Emo-bias: A Large Scale Evaluation of Social Bias on Speech Emotion Recognition
Yi-Cheng Lin, Haibin Wu, Huang-Cheng Chou et al.
EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark
Ziyang Ma, Mingjie Chen, Hezhao Zhang et al.
EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech
Deok-Hyeon Cho, Hyung-Seok Oh, Seung-Bin Kim et al.
Emotional Cues Extraction and Fusion for Multi-modal Emotion Prediction and Recognition in Conversation
Haoxiang Shi, Ziqi Liang, Jun Yu
Emotion Arithmetic: Emotional Speech Synthesis via Weight Space Interpolation
Pavan Kalyan, Preeti Rao, Preethi Jyothi et al.
Empowering Low-Resource Language ASR via Large-Scale Pseudo Labeling
Kaushal Santosh Bhogale, Deovrat Mehendale, Niharika Parasa et al.
Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System
Lingwei Meng, Jiawen Kang, Yuejiao Wang et al.
Enabling Conversational Speech Synthesis using Noisy Spontaneous Data
Liisa Rätsep, Rasmus Lellep, Mark Fishel
Enhanced ASR Robustness to Packet Loss with a Front-End Adaptation Network
Yehoshua Dissen, Shiry Yonash, Israel Cohen et al.
Enhanced Deep Speech Separation in Clustered Ad Hoc Distributed Microphone Environments
Jihyun Kim, Stijn Kindt, Nilesh Madhu et al.
Enhanced Feature Learning with Normalized Knowledge Distillation for Audio Tagging
Yuwu Tang, Ziang Ma, Haitao Zhang
Enhanced Reverberation as Supervision for Unsupervised Speech Separation
Kohei Saijo, Gordon Wichern, François G. Germain et al.
Enhancing Automated Audio Captioning via Large Language Models with Optimized Audio Encoding
Jizhong Liu, Gang Li, Junbo Zhang et al.
Enhancing Child Vocalization Classification with Phonetically-Tuned Embeddings for Assisting Autism Diagnosis
Jialu Li, Mark Hasegawa-Johnson, Karrie Karahalios
Enhancing CTC-based speech recognition with diverse modeling units
Shiyi Han, Mingbin Xu, Zhihong Lei et al.
Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation
Shiyao Wang, Shiwan Zhao, Jiaming Zhou et al.
Enhancing ECAPA-TDNN with Feature Processing Module and Attention Mechanism for Speaker Verification
Shiu-Hsiang Liou, Po-Cheng Chan, Chia-Ping Chen et al.
Enhancing Modal Fusion by Alignment and Label Matching for Multimodal Emotion Recognition
Qifei Li, Yingming Gao, Yuhua Wen et al.
Enhancing Multilingual Voice Toxicity Detection with Speech-Text Alignment
Joseph Liu, Mahesh Kumar Nandwana, Janne Pylkkönen et al.
Enhancing Multimodal Emotion Recognition through ASR Error Compensation and LLM Fine-Tuning
Jehyun Kyung, Serin Heo, Joon-Hyuk Chang
Enhancing Neural Transducer for Multilingual ASR with Synchronized Language Diarization
Amir Hussein, Desh Raj, Matthew Wiesner et al.