Papers
Zambezi Voice: A Multilingual Speech Corpus for Zambian Languages
Claytone Sikasote, Kalinda Siaminwe, Stanly Mwape et al.
ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs
Xingchen Song, Di Wu, Binbin Zhang et al.
Zero-Shot Accent Conversion using Pseudo Siamese Disentanglement Network
Dongya Jia, Qiao Tian, Kainan Peng et al.
Zero-Shot Automatic Pronunciation Assessment
Hongfu Liu, Mingqian Shi, Ye Wang
ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models
Minki Kang, Wooseok Han, Sung Ju Hwang et al.
Zoneformer: On-device Neural Beamformer For In-car Multi-zone Speech Separation, Enhancement and Echo Cancellation
Yong Xu, Vinay Kothapally, Meng Yu et al.
4-bit Conformer with Native Quantization Aware Training for Speech Recognition
Shaojin Ding, Phoenix Meadowlark, Yanzhang He et al.
A BERT-based Language Modeling Framework
Chin-Yueh Chien, Kuan-Yu Chen
A blueprint for using deepfakes in sociolinguistic matched-guise experiments
Nathan Joel Young, David Britain, Adrian Leemann
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization
Andrea Fasoli, Chia-Yu Chen, Mauricio Serrano et al.
Accent Conversion using Pre-trained Model and Synthesized Data from Voice Conversion
Tuan Nam Nguyen, Ngoc-Quan Pham, Alexander Waibel
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning
Rui Liu, Berrak Sisman, Björn Schuller et al.
ACNN-VC: Utilizing Adaptive Convolution Neural Network for One-Shot Voice Conversion
Ji Sub Um, Yeunju Choi, Hoi Rin Kim
A compact transformer-based GAN vocoder
Chenfeng Miao, Ting Chen, Minchuan Chen et al.
A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings
Fan Yu, Zhihao Du, ShiLiang Zhang et al.
A comparative study on vowel articulation in Parkinson's disease and multiple system atrophy
Khalid Daoudi, Biswajit Das, Solange Milhé de Saint Victor et al.
A Complementary Joint Training Approach Using Unpaired Speech and Text
Yeqian Du, Jie Zhang, Qiu-shi Zhu et al.
A Conformer-based Waveform-domain Neural Acoustic Echo Canceller Optimized for ASR Accuracy
Sankaran Panchapagesan, Arun Narayanan, Turaj Zakizadeh Shabestary et al.
Acoustic Feature Shuffling Network for Text-independent Speaker Verification
Jin Li, Xin Fang, Fan Chu et al.
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History
Yuto Nishimura, Yuki Saito, Shinnosuke Takamichi et al.
Acoustic Representation Learning on Breathing and Speech Signals for COVID-19 Detection
Debottam Dutta, Debarpan Bhattacharya, Sriram Ganapathy et al.
Acoustic Stress Detection in Isolated English Words for Computer-Assisted Pronunciation Training
Vera Bernhard, Sandra Schwab, Jean-Philippe Goldman
Acoustic To Articulatory Speech Inversion Using Multi-Resolution Spectro-Temporal Representations Of Speech Signals
Rahil Parikh, Nadee Seneviratne, Ganesh Sivaraman et al.
Acoustic-to-articulatory Speech Inversion with Multi-task Learning
Yashish M. Siriwardena, Ganesh Sivaraman, Carol Espy-Wilson