Papers
Bootstrapping a Text Normalization System for an Inflected Language. Numbers as a Test Case
Anna Björk Nikulásdóttir, Jón Guðnason
Bridging the Gap Between Monaural Speech Enhancement and Recognition with Distortion-Independent Acoustic Modeling
Peidong Wang, Ke Tan, DeLiang Wang
Building a Mixed-Lingual Neural TTS System with Only Monolingual Data
Liumeng Xue, Wei Song, Guanghui Xu et al.
Building Large-Vocabulary ASR Systems for Languages Without Any Audio Training Data
Manasa Prasad, Daan van Esch, Sandy Ritchie et al.
Building the Singapore English National Speech Corpus
Jia Xin Koh, Aqilah Mislan, Kevin Khoo et al.
Calibrating DNN Posterior Probability Estimates of HMM/DNN Models to Improve Social Signal Detection from Audio Data
Gábor Gosztolya, László Tóth
CaptionAI: A Real-Time Multilingual Captioning Application
Nagendra Kumar Goel, Mousmita Sarma, Saikiran Valluri et al.
Capturing L1 Influence on L2 Pronunciation by Simulating Perceptual Space Using Acoustic Features
Shuju Shi, Chilin Shih, Jinsong Zhang
Cascaded Cross-Module Residual Learning Towards Lightweight End-to-End Speech Coding
Kai Zhen, Jongmo Sung, Mi Suk Lee et al.
Challenging the Boundaries of Speech Recognition: The MALACH Corpus
Michael Picheny, Zoltán Tüske, Brian Kingsbury et al.
Character-Aware Sub-Word Level Language Modeling for Uyghur and Turkish ASR
Chang Liu, Zhen Zhang, Pengyuan Zhang et al.
Char+CV-CTC: Combining Graphemes and Consonant/Vowel Units for CTC-Based ASR Using Multitask Learning
Abdelwahab Heba, Thomas Pellegrini, Jean-Pierre Lorré et al.
Child Speech Disorder Detection with Siamese Recurrent Network Using Speech Attribute Features
Jiarui Wang, Ying Qin, Zhiyuan Peng et al.
Class-Wise Centroid Distance Metric Learning for Acoustic Event Detection
Xugang Lu, Peng Shen, Sheng Li et al.
CNN-Based Phoneme Classifier from Vocal Tract MRI Learns Embedding Consistent with Articulatory Topology
K.G. van Leeuwen, P. Bos, S. Trebeschi et al.
CNN-BLSTM Based Question Detection from Dialogs Considering Phase and Context Information
Yuke Si, Longbiao Wang, Jianwu Dang et al.
CNN-LSTM Models for Multi-Speaker Source Separation Using Bayesian Hyper Parameter Optimization
Jeroen Zegers, Hugo Van hamme
Coarse-to-Fine Optimization for Speech Enhancement
Jian Yao, Ahmad Al-Dahle
Code-Switching Detection Using ASR-Generated Language Posteriors
Qinyi Wang, Emre Yılmaz, Adem Derinel et al.
Code-Switching Sentence Generation by Bert and Generative Adversarial Networks
Yingying Gao, Junlan Feng, Ying Liu et al.
Code-Switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation
Ching-Ting Chang, Shun-Po Chuang, Hung-Yi Lee
Cognitive Factors in Thai-Naïve Mandarin Speakers’ Imitation of Thai Lexical Tones
Juqiang Chen, Catherine T. Best, Mark Antoniou
Combining Adversarial Training and Disentangled Speech Representation for Robust Zero-Resource Subword Modeling
Siyuan Feng, Tan Lee, Zhiyuan Peng
Combining Speaker Recognition and Metric Learning for Speaker-Dependent Representation Learning
João Monteiro, Jahangir Alam, Tiago H. Falk
Comparative Analysis of Prosodic Characteristics Using WaveNet Embeddings
Antti Suni, Marcin Włodarczak, Martti Vainio et al.