Papers
Enc-Dec RNN Acoustic Word Embeddings learned via Pairwise Prediction
Adhiraj Banerjee, Vipul Arora
Encoder-decoder Multimodal Speaker Change Detection
Jee-weon Jung, Soonshin Seo, Hee-Soo Heo et al.
End-to-End Joint Target and Non-Target Speakers ASR
Ryo Masumura, Naoki Makishima, Taiga Yamane et al.
End-to-End Neural Speaker Diarization with Absolute Speaker Loss
Chao Wang, Jie Li, Xiang Fang et al.
End to End Spoken Language Diarization with Wav2vec Embeddings
Jagabandhu Mishra, Jayadev N Patil, Amartya Chowdhury et al.
End-to-End Word-Level Pronunciation Assessment with MASK Pre-training
Yukang Liang, Kaitao Song, Shaoguang Mao et al.
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
Wonjune Kang, Mark Hasegawa-Johnson, Deb Roy
Enhance Temporal Relations in Audio Captioning with Sound Event Detection
Zeyu Xie, Xuenan Xu, Mengyue Wu et al.
Enhancing New Intent Discovery via Robust Neighbor-based Contrastive Learning
Zhenhe Wu, Xiaoguang Yu, Meng Chen et al.
Enhancing Speech Articulation Analysis Using A Geometric Transformation of the X-ray Microbeam Dataset
Ahmed Adel Attia, Mark Tiede, Carol Espy-Wilson
Enhancing the EEG Speech Match Mismatch Tasks With Word Boundaries
Akshara Soman, Vidhi Sinha, Sriram Ganapathy
Enhancing the Unified Streaming and Non-streaming Model with Contrastive Learning
Yuting Yang, Yuke Li, Binbin Du
Enhancing Visual Question Answering via Deconstructing Questions and Explicating Answers
Feilong Chen, Minglun Han, Jing Shi et al.
Episodic Memory For Domain-Adaptable, Robust Speech Emotion Recognition
James Tavernor, Matthew Perez, Emily Mower Provost
Epoch-Based Spectrum Estimation for Speech
Jón Guðnason, Guolin Fang, Mike Brookes
eSTImate: A Real-time Speech Transmission Index Estimator With Speech Enhancement Auxiliary Task Using Self-Attention Feature Pyramid Network
Bajian Xiang, Hongkun Liu, Zedong Wu et al.
Estimating virtual targets for lingual stop consonants using general Tau theory
Benjamin Elie, Alice Turk
Estimation of Listening Response Timing by Generative Model and Parameter Control of Response Substantialness Using Dynamic-Prompt-Tune
Toshiki Muromachi, Yoshinobu Kano
Evaluating and reducing the distance between synthetic and real speech distributions
Christoph Minixhofer, Ondřej Klejch, Peter Bell
Evaluating context-invariance in unsupervised speech representations
Mark Hallap, Emmanuel Dupoux, Ewan Dunbar
Evaluation of a Forensic Automatic Speaker Recognition System with Emotional Speech Recordings
Robert Essery, Philip Harrison, Vincent Hughes
Evaluation of delexicalization methods for research on emotional speech
Nicolas Audibert, Francesca Carbone, Maud Champagne-Lavau et al.
Everyone has an accent
Nina Markl, Catherine Lai
Experimenting with Additive Margins for Contrastive Self-Supervised Speaker Verification
Theo Lepage, Reda Dehak
Explicit Intensity Control for Accented Text-to-speech
Rui Liu, Haolin Zuo, De Hu et al.