Papers
Toward Corpus Size Requirements for Training and Evaluating Depression Risk Models Using Spoken Language
Tomasz Rutowski, Amir Harati, Elizabeth Shriberg et al.
Toward Fairness in Speech Recognition: Discovery and mitigation of performance disparities
Pranav Dheram, Murugesan Ramakrishnan, Anirudh Raju et al.
Toward Low-Cost End-to-End Spoken Language Understanding
Marco Dinarelli, Marco Naguib, François Portet
Towards Automated Counselling Decision-Making: Remarks on Therapist Action Forecasting on the AnnoMI Dataset
Zixiu Wu, Rim Helaoui, Diego Reforgiato Recupero et al.
Towards Automated Dialog Personalization using MBTI Personality Indicators
Daniel Fernau, Stefan Hillmann, Nils Feldhus et al.
Towards Cross-speaker Reading Style Transfer on Audiobook Dataset
Xiang Li, Changhe Song, Xianhao Wei et al.
Towards Disentangled Speech Representations
Cal Peyser, W. Ronny Huang, Andrew Rosenberg et al.
Towards Efficiently Learning Monotonic Alignments for Attention-based End-to-End Speech Recognition
Chenfeng Miao, Kun Zou, Ziyang Zhuang et al.
Towards End-to-End Private Automatic Speaker Recognition
Francisco Teixeira, Alberto Abad, Bhiksha Raj et al.
Towards Error-Resilient Neural Speech Coding
Huaying Xue, Xiulian Peng, Xue Jiang et al.
Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus
Junhao Xu, Shoukang Hu, Xunying Liu et al.
Towards high-fidelity singing voice conversion with acoustic reference and contrastive predictive coding
Chao Wang, Zhonghao Li, Benlai Tang et al.
Towards Improved Zero-shot Voice Conversion with Conditional DSVAE
Jiachen Lian, Chunlei Zhang, Gopala Krishna Anumanchipalli et al.
Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information
Shaohuan Zhou, Shun Lei, Weiya You et al.
Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis
Shun Lei, Yixuan Zhou, Liyang Chen et al.
Toward Zero Oracle Word Error Rate on the Switchboard Benchmark
Arlo Faria, Adam Janin, Sidhi Adkoli et al.
tPLCnet: Real-time Deep Packet Loss Concealment in the Time Domain Using a Short Temporal Context
Nils L. Westhausen, Bernd T. Meyer
Training and typological bias in ASR performance for world Englishes
May Pik Yu Chan, June Choe, Aini Li et al.
Training Data Generation with DOA-based Selecting and Remixing for Unsupervised Training of Deep Separation Models
Hokuto Munakata, Ryu Takeda, Kazunori Komatani
Training speaker embedding extractors using multi-speaker audio with unknown speaker boundaries
Themos Stafylakis, Ladislav Mosner, Oldrich Plchot et al.
Training speaker recognition systems with limited data
Nik Vaessen, David van Leeuwen
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks
Lev Finkelstein, Heiga Zen, Norman Casagrande et al.
Trajectories predicted by optimal speech motor control using LSTM networks
Tsiky Rakotomalala, Pierre Baraduc, Pascal Perrier
Transducer-based language embedding for spoken language identification
Peng Shen, Xugang Lu, Hisashi Kawai
Transfer Learning for Robust Low-Resource Children's Speech ASR with Transformers and Source-Filter Warping
Jenthe Thienpondt, Kris Demuynck