Papers
Automatic Pronunciation Assessment using Self-Supervised Speech Representation Learning
Eesung Kim, Jae-Jin Jeon, Hyeji Seo et al.
Automatic Prosody Annotation with Pre-Trained Text-Speech Model
Ziqian Dai, Jianwei Yu, Yan Wang et al.
Automatic Prosody Evaluation of L2 English Read Speech in Reference to Accent Dictionary with Transformer Encoder
Yu Suzuki, Tsuneo Kato, Akihiro Tamura
Automatic Selection of Discriminative Features for Dementia Detection in Cantonese-Speaking People
Xiaoquan Ke, Man-Wai Mak, Helen M. Meng
Automatic Speaker Verification System for Dysarthria Patients
Shinimol Salim, Syed Shahnawazuddin, Waquar Ahmad
Autoregressive Co-Training for Learning Discrete Speech Representation
Sung-Lin Yeh, Hao Tang
AVATAR: Unconstrained Audiovisual Speech Recognition
Valentin Gabeur, Paul Hongsuck Seo, Arsha Nagrani et al.
A Vietnamese-English Neural Machine Translation System
Tuan-Duy H. Nguyen, Duy Phung, Duy Tran-Cong Nguyen et al.
Avoid Overfitting User Specific Information in Federated Keyword Spotting
Xin-Chun Li, Jin-Lin Tang, Shaoming Song et al.
A VR Interactive 3D Mandarin Pronunciation Teaching Model
Yujia Jin, Yanlu Xie, Jinsong Zhang
Backend Ensemble for Speaker Verification and Spoofing Countermeasure
Li Zhang, Yue Li, Huan Zhao et al.
Back to the Future: Extending the Blizzard Challenge 2013
Sébastien Le Maguer, Simon King, Naomi Harte
Barlow Twins self-supervised learning for robust speaker recognition
Mohammad Mohammadamini, Driss Matrouf, Jean-François Bonastre et al.
BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
Nguyen Luong Tran, Duong Le, Dat Quoc Nguyen
Bayesian Recurrent Units and the Forward-Backward Algorithm
Alexandre Bittar, Philip N. Garner
Bayesian Transformer Using Disentangled Mask Attention
Jen-Tzung Chien, Yu-Han Huang
Beam-Guided TasNet: An Iterative Speech Separation Framework with Multi-Channel Output
Hangting Chen, Yi Yang, Feng Dang et al.
Benchmarking Transformers-based models on French Spoken Language Understanding tasks
Oralie Cattan, Sahar Ghannay, Christophe Servan et al.
Bending the string: intonation contour length as a correlate of macro-rhythm
Constantijn Kaland
BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model
Brooke Stephenson, Laurent Besacier, Laurent Girin et al.
Better Intermediates Improve CTC Inference
Tatsuya Komatsu, Yusuke Fujita, Jaesong Lee et al.
BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus
Josh Meyer, David Adelani, Edresson Casanova et al.
BiCAPT: Bidirectional Computer-Assisted Pronunciation Training with Normalizing Flows
Zhan Zhang, Yuehai Wang, Jianyi Yang
Bifurcation and Reunion: A Loss-Guided Two-Stage Approach for Monaural Speech Dereverberation
Xiaoxue Luo, Chengshi Zheng, Andong Li et al.