Papers
Intra-Sentential Speaking Rate Control in Neural Text-To-Speech for Automatic Dubbing
Mayank Sharma, Yogesh Virkar, Marcello Federico et al.
Introducing a Central African Primate Vocalisation Dataset for Automated Species Classification
Joeri A. Zwerts, Jelle Treep, Casper S. Kaandorp et al.
Investigating Contributions of Speech and Facial Landmarks for Talking Head Generation
Ege Kesim, Engin Erzin
Investigating Deep Neural Structures and their Interpretability in the Domain of Voice Conversion
Samuel J. Broughton, Md. Asif Jalal, Roger K. Moore
Investigating Feature Selection and Explainability for COVID-19 Diagnostics from Cough Sounds
Flavio Avila, Amir H. Poorjam, Deepak Mittal et al.
Investigating Methods to Improve Language Model Integration for Attention-Based Encoder-Decoder ASR Models
Mohammad Zeineldeen, Aleksandr Glushko, Wilfried Michel et al.
Investigating Speech Reconstruction for Laryngectomees for Silent Speech Interfaces
Beiming Cao, Nordine Sebkhi, Arpan Bhavsar et al.
Investigating the Impact of Spectral and Temporal Degradation on End-to-End Automatic Speech Recognition Performance
Takanori Ashihara, Takafumi Moriya, Makio Kashino
Investigating the Interplay Between Affective, Phonatory and Motoric Subsystems in Autism Spectrum Disorder Using a Multimodal Dialogue Agent
Hardik Kothare, Vikram Ramanarayanan, Oliver Roesler et al.
Investigating the Utility of Multimodal Conversational Technology and Audiovisual Analytic Measures for the Assessment and Monitoring of Amyotrophic Lateral Sclerosis at Scale
Michael Neumann, Oliver Roesler, Jackson Liscombe et al.
Investigating Voice Function Characteristics of Greek Speakers with Hearing Loss Using Automatic Glottal Source Feature Extraction
Anna Sfakianaki, George P. Kafentzis
Investigation of IMU&Elevoc Submission for the Short-Duration Speaker Verification Challenge 2021
Peng Zhang, Peng Hu, Xueliang Zhang
Investigation of Practical Aspects of Single Channel Speech Separation for ASR
Jian Wu, Zhuo Chen, Sanyuan Chen et al.
Investigation of Spatial-Acoustic Features for Overlapping Speech Detection in Multiparty Meetings
Shiliang Zhang, Siqi Zheng, Weilong Huang et al.
IR-GAN: Room Impulse Response Generator for Far-Field Speech Recognition
Anton Ratnarajah, Zhenyu Tang, Dinesh Manocha
It’s Not What You Said, it’s How You Said it: Discriminative Perception of Speech as a Multichannel Communication System
Sarenne Wallbridge, Peter Bell, Catherine Lai
Joint Feature Enhancement and Speaker Recognition with Multi-Objective Task-Oriented Network
Yibo Wu, Longbiao Wang, Kong Aik Lee et al.
Joint Online Multichannel Acoustic Echo Cancellation, Speech Dereverberation and Source Separation
Yueyue Na, Ziteng Wang, Zhang Liu et al.
Joint Retrieval-Extraction Training for Evidence-Aware Dialog Response Selection
Hongyin Luo, James Glass, Garima Lalwani et al.
KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset
Saida Mussakhojayeva, Aigerim Janaliyeva, Almas Mirzakhmetov et al.
Keyword Transformer: A Self-Attention Model for Keyword Spotting
Axel Berg, Mark O’Connor, Miguel Tairum Cruz
Knowledge Distillation Based Training of Universal ASR Source Models for Cross-Lingual Transfer
Takashi Fukuda, Samuel Thomas
Knowledge Distillation for Singing Voice Detection
Soumava Paul, Gurunath Reddy M, K. Sreenivasa Rao et al.
Knowledge Distillation for Streaming Transformer–Transducer
Atsushi Kojima
Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification
Yidi Jiang, Bidisha Sharma, Maulik Madhavi et al.