Papers
Adapting Multi-Lingual ASR Models for Handling Multiple Talkers
Chenda Li, Yao Qian, Zhuo Chen et al.
Adaptive Contextual Biasing for Transducer Based Streaming Speech Recognition
Tianyi Xu, Zhanheng Yang, Kaixun Huang et al.
Adaptive Neural Network Quantization For Lightweight Speaker Verification
Haoyu Wang, Bei Liu, Yifei Wu et al.
Addressing Cold Start Problem for End-to-end Automatic Speech Scoring
Jungbae Park, Seungtaek Choi
A Dual Attention-based Modality-Collaborative Fusion Network for Emotion Recognition
Xiaoheng Zhang, Yang Li
Advanced RawNet2 with Attention-based Channel Masking for Synthetic Speech Detection
Jing Li, Yanhua Long, Yijie Li et al.
Advances in Language Recognition in Low Resource African Languages: The JHU-MIT Submission for NIST LRE22
Jesús Villalba, Jonas Borgstrom, Maliha Jahan et al.
Adversarial Diffusion Probability Model For Cross-domain Speaker Verification Integrating Contrastive Loss
Xinmei Su, Xiang Xie, Fengrun Zhang et al.
Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech
Hyungchan Yoon, Seyun Um, Changhwan Kim et al.
Affective attributes of French caregivers' professional speech
Jean-Luc Rouas, Yaru Wu, Takaaki Shochi
AfriNames: Most ASR Models "Butcher" African Names
Tobi Olatunji, Tejumade Afonja, Bonaventure F. P. Dossou et al.
A Generative Framework for Conversational Laughter: Its 'Language Model' and Laughter Sound Synthesis
Hiroki Mori, Shunya Kimura
A Hierarchical Context-aware Modeling Approach for Multi-aspect and Multi-granular Pronunciation Assessment
Fu-An Chao, Tien-Hong Lo, Tzu-I Wu et al.
A Joint Model for Pronunciation Assessment and Mispronunciation Detection and Diagnosis with Multi-task Learning
Hyungshin Ryu, Sunhee Kim, Minhwa Chung
A Lexical-aware Non-autoregressive Transformer-based ASR Model
Chong-En Lin, Kuan-Yu Chen
AlignAtt: Using Attention-based Audio-Translation Alignments as a Guide for Simultaneous Speech Translation
Sara Papi, Marco Turchi, Matteo Negri
Aligning Speech Enhancement for Improving Downstream Classification Performance
Yan Xiong, Visar Berisha, Chaitali Chakrabarti
Alignment of Beat Gestures and Prosodic Prominence in German
Sophie Repp, Lara Muhtz, Johannes Heim
Allophant: Cross-lingual Phoneme Recognition with Articulatory Attributes
Kevin Glocker, Aaricia Herygers, Munir Georges
ALO-VC: Any-to-any Low-latency One-shot Voice Conversion
Bohan Wang, Damien Ronssin, Milos Cernak
A Low-Resource Pipeline for Text-to-Speech from Found Data With Application to Scottish Gaelic
Dan Wells, Korin Richmond, William Lamb
Alzheimer Disease Classification through ASR-based Transcriptions: Exploring the Impact of Punctuation and Pauses
Lucía Gómez-Zaragozá, Simone Wills, Cristian Tejedor-Garcia et al.
A Mask Free Neural Network for Monaural Speech Enhancement
Liang Liu, Haixin Guan, Jinlong Ma et al.