Papers
Transducer-based language embedding for spoken language identification
Peng Shen, Xugang Lu, Hisashi Kawai
Transfer Learning for Robust Low-Resource Children's Speech ASR with Transformers and Source-Filter Warping
Jenthe Thienpondt, Kris Demuynck
Transfer Learning Framework for Low-Resource Text-to-Speech using a Large-Scale Unlabeled Speech Corpus
Minchan Kim, Myeonghun Jeong, Byoung Jin Choi et al.
Transfer Learning from Multi-Lingual Speech Translation Benefits Low-Resource Speech Recognition
Geoffroy Vanderreydt, François Remy, Kris Demuynck
Transformer-Based Automatic Speech Recognition with Auxiliary Input of Source Language Text Toward Transcribing Simultaneous Interpretation
Shuta Taniguchi, Tsuneo Kato, Akihiro Tamura et al.
Transformer-based quality assessment model for generalized user-generated multimedia audio content
Deebha Mumtaz, Ajit Jena, Vinit Jakhetiya et al.
Transformer-Based Video Front-Ends for Audio-Visual Speech Recognition for Single and Multi-Person Video
Dmitriy Serdyuk, Otavio Braga, Olivier Siohan
Transformer Networks for Non-Intrusive Speech Quality Prediction
M K Jayesh, Mukesh Sharma, Praneeth Vonteddu et al.
Transplantation of Conversational Speaking Style with Interjections in Sequence-to-Sequence Speech Synthesis
Raul Fernandez, David Haws, Guy Lorberbom et al.
Transport-Oriented Feature Aggregation for Speaker Embedding Learning
Yusheng Tian, Jingyu Li, Tan Lee
Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition
Guangzhi Sun, Chao Zhang, Phil Woodland
TRILLsson: Distilled Universal Paralinguistic Speech Representations
Joel Shor, Subhashini Venugopalan
TriniTTS: Pitch-controllable End-to-end TTS without External Aligner
Yooncheol Ju, Ilhwan Kim, Hongsun Yang et al.
TRUNet: Transformer-Recurrent-U Network for Multi-channel Reverberant Sound Source Separation
Ali Aroudi, Stefan Uhlich, Marc Ferras Font
TTS-by-TTS 2: Data-Selective Augmentation for Neural Speech Synthesis Using Ranking Support Vector Machine with Variational Autoencoder
Eunwoo Song, Ryuichi Yamamoto, Ohsung Kwon et al.
Turn-Taking Prediction for Natural Conversational Speech
Shuo-Yiin Chang, Bo Li, Tara Sainath et al.
Two Methods for Spoofing-Aware Speaker Verification: Multi-Layer Perceptron Score Fusion Model and Integrated Embedding Projector
Jungwoo Heo, Ju-Ho Kim, Hyun-seo Shin
Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems
Mingyu Cui, Jiajun Deng, Shoukang Hu et al.
Two-Pass Low Latency End-to-End Spoken Language Understanding
Siddhant Arora, Siddharth Dalmia, Xuankai Chang et al.
Ultra-Low-Bitrate Speech Coding with Pretrained Transformers
Ali Siahkoohi, Michael Chinen, Tom Denton et al.
Uncertainty Calibration for Deep Audio Classifiers
Tong Ye, Shijing Si, Jianzong Wang et al.
UNet-DenseNet for Robust Far-Field Speaker Verification
Zhenke Gao, Manwai Mak, Weiwei Lin
Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation
Reo Yoneyama, Yi-Chiao Wu, Tomoki Toda
Unify and Conquer: How Phonetic Feature Representation Affects Polyglot Text-To-Speech (TTS)
Ariadna Sanchez, Alessio Falai, Ziyao Zhang et al.
Unifying Cosine and PLDA Back-ends for Speaker Verification
Zhiyuan Peng, Xuanji He, Ke Ding et al.