Papers
Learn and Don't Forget: Adding a New Language to ASR Foundation Models
Mengjie Qian, Siyuan Tang, Rao Ma et al.
LearnerVoice: A Dataset of Non-Native English Learners’ Spontaneous Speech
Haechan Kim, Junho Myung, Seoyoung Kim et al.
Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning
Chung-Ming Chien, Andros Tjandra, Apoorv Vyas et al.
Learning from Back Chunks: Acquiring More Future Knowledge for Streaming ASR Models via Self Distillation
Yuting Yang, Guodong Ma, Yuke Li et al.
Learning from memory-based models
Rhiannon Mogridge, Anton Ragni
Learning from Multiple Annotator Biased Labels in Multimodal Conversation
Kazutoshi Shinoda, Nobukatsu Hojo, Saki Mizuno et al.
Learning Pronunciation from Other Accents via Pronunciation Knowledge Transfer
Siqi Sun, Korin Richmond
Learning Representation of Therapist Empathy in Counseling Conversation Using Siamese Hierarchical Attention Network
Dehua Tao, Tan Lee, Harold Chui et al.
Learnings from curating a trustworthy, well-annotated, and useful dataset of disordered English speech
Pan-Pan Jiang, Jimmy Tobin, Katrin Tomanek et al.
Less is More: Accurate Speech Recognition & Translation without Web-Scale Data
Krishna C. Puvvada, Piotr Żelasko, He Huang et al.
Leveraging Adapter for Parameter-Efficient ASR Encoder
Kyuhong Shim, Jinkyu Lee, Hyunjae Kim
Leveraging Graphic and Convolutional Neural Networks for Auditory Attention Detection with EEG
Saurav Pahuja, Gabriel Ivucic, Pascal Himmelmann et al.
Leveraging Language Model Capabilities for Sound Event Detection
Hualei Wang, Jianguo Mao, Zhifang Guo et al.
Leveraging large language models for post-transcription correction in contact centers
Bramhendra Koilakuntla, Prajesh Rana, Paras Ahuja et al.
Leveraging Large Language Models to Refine Automatic Feedback Generation at Articulatory Level in Computer Aided Pronunciation Training
Huihang Zhong, Yanlu Xie, ZiJin Yao
Leveraging Phonemic Transcription and Whisper toward Clinically Significant Indices for Automatic Child Speech Assessment
Yeh-Sheng Lin, Shu-Chuan Tseng, Jyh-Shing Roger Jang
Leveraging Speech Data Diversity to Document Indigenous Heritage and Culture
Allahsera Tapo, Éric Le Ferrand, Zoey Liu et al.
Leveraging Universal Speech Representations for Detecting and Assessing the Severity of Mild Cognitive Impairment Across Languages
Anna Favaro, Tianyu Cao, Najim Dehak et al.
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization
Zengrui Jin, Yifan Yang, Mohan Shi et al.
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
Masaya Kawamura, Ryuichi Yamamoto, Yuma Shirahata et al.
Lifelong Learning MOS Prediction for Synthetic Speech Quality Evaluation
Félix Saget, Meysam Shamsi, Marie Tahon
Lightweight Audio Segmentation for Long-form Speech Translation
Jaesong Lee, Soyoon Kim, Hanbyul Kim et al.
Lightweight Dynamic Sparse Transformer for Monaural Speech Enhancement
Zehua Zhang, Xuyi Zhuang, Yukun Qian et al.
Lightweight Transducer Based on Frame-Level Criterion
Genshun Wan, Mengzhi Wang, Tingzhi Mao et al.
Lightweight Zero-shot Text-to-Speech with Mixture of Adapters
Kenichi Fujita, Takanori Ashihara, Marc Delcroix et al.