Papers
Creating New Voices using Normalizing Flows
Piotr Bilinski, Thomas Merritt, Abdelhamid Ezzerg et al.
Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings
Xiaoyi Qin, Na Li, Weng Chao et al.
Cross-Cultural Comparison of Gradient Emotion Perception: Human vs. Alexa TTS Voices
Iona Gessinger, Michelle Cohn, Georgia Zellou et al.
Cross-dialect lexicon optimisation for an endangered language ASR system: the case of Irish
Liam Lonergan, Mengjie Qian, Neasa Ní Chiaráin et al.
Cross-Layer Similarity Knowledge Distillation for Speech Enhancement
Jiaming Cheng, Ruiyu Liang, Yue Xie et al.
Cross-lingual articulatory feature information transfer for speech recognition using recurrent progressive neural networks
Mahir Morshed, Mark Hasegawa-Johnson
Cross-lingual Self-Supervised Speech Representations for Improved Dysarthric Speech Recognition
Abner Hernandez, Paula Andrea Pérez-Toro, Elmar Noeth et al.
Cross-lingual Style Transfer with Conditional Prior VAE and Style Loss
Dino Ratcliffe, You Wang, Alex Mansbridge et al.
Cross-Lingual Transfer Learning Approach to Phoneme Error Detection via Latent Phonetic Representation
Jovan M. Dalhouse, Katunobu Itou
Cross-Modal Decision Regularization for Simultaneous Speech Translation
Mohd Abbas Zaidi, Beomseok Lee, Sangha Kim et al.
Cross-modal Transfer Learning via Multi-grained Alignment for End-to-End Spoken Language Understanding
Yi Zhu, Zexun Wang, Hang Liu et al.
Cross-Scale Vector Quantization for Scalable Neural Speech Coding
Xue Jiang, Xiulian Peng, Huaying Xue et al.
Cross-speaker Emotion Transfer Based On Prosody Compensation for End-to-End Speech Synthesis
Tao Li, Xinsheng Wang, Qicong Xie et al.
Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation
Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song et al.
CS-CTCSCONV1D: Small footprint speaker verification with channel split time-channel-time separable 1-dimensional convolution
Linjun Cai, Yuhong Yang, Xufeng Chen et al.
CTA-RNN: Channel and Temporal-wise Attention RNN leveraging Pre-trained ASR Embeddings for Speech Emotion Recognition
Chengxin Chen, Pengyuan Zhang
CTC Variations Through New WFST Topologies
Aleksandr Laptev, Somshubra Majumdar, Boris Ginsburg
CTFALite: Lightweight Channel-specific Temporal and Frequency Attention Mechanism for Enhancing the Speaker Embedding Extractor
Yuheng Wei, Junzhao Du, Hui Liu et al.
CTRL: Continual Representation Learning to Transfer Information of Pre-trained for WAV2VEC 2.0
Jae-Hong Lee, Chae-Won Lee, Jin-Seong Choi et al.
CT-SAT: Contextual Transformer for Sequential Audio Tagging
Yuanbo Hou, Zhaoyi Liu, Bo Kang et al.
CUSIDE: Chunking, Simulating Future Context and Decoding for Streaming ASR
Keyu An, Huahuan Zheng, Zhijian Ou et al.
CycleGAN-based Unpaired Speech Dereverberation
Hannah Muckenhirn, Aleksandr Safin, Hakan Erdogan et al.
CyclicAugment: Speech Data Random Augmentation with Cosine Annealing Scheduler for Automatic Speech Recognition
Zhihan Wang, Feng Hou, Yuanhang Qiu et al.
Daft-Exprt: Cross-Speaker Prosody Transfer on Any Text for Expressive Speech Synthesis
Julian Zaïdi, Hugo Seuté, Benjamin van Niekerk et al.
Data Augmentation for Dementia Detection in Spoken Language
Dominika Woszczyk, Anna Hlédiková, Alican Akman et al.