Papers
HybridVC: Efficient Voice Style Conversion with Text and Audio Prompts
Xinlei Niu, Jing Zhang, Charles Patrick Martin
HypR: A comprehensive study for ASR hypothesis revising with a reference corpus
Yi-Wei Wang, Ke-Han Lu, Kuan-Yu Chen
Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models
Minh Nguyen, Franck Dernoncourt, Seunghyun Yoon et al.
IIITH Ucchar e-Sudharak: an automatic English pronunciation corrector for school-going children with a teacher in the loop
Meenakshi Sirigiraju, Arjun Rajasekar, Abhishikth Meejuri et al.
Improved Factorized Neural Transducer Model For Text-only Domain Adaptation
Junzhe Liu, Jianwei Yu, Xie Chen
Improvement Speaker Similarity for Zero-Shot Any-to-Any Voice Conversion of Whispered and Regular Speech
Aleksei Gusev, Anastasia Avdeeva
Improving Audio Classification with Low-Sampled Microphone Input: An Empirical Study Using Model Self-Distillation
Dawei Liang, Alice Zhang, David Harwath et al.
Improving Audio Codec-based Zero-Shot Text-to-Speech Synthesis with Multi-Modal Context and Large Language Model
Jinlong Xue, Yayue Deng, Yicheng Han et al.
Improving child speech recognition with augmented child-like speech
Yuanyuan Zhang, Zhengjun Yue, Tanvina Patel et al.
Improving Copy-Synthesis Anti-Spoofing Training Method with Rhythm and Speaker Perturbation
Jingze Lu, Yuxiang Zhang, Zhuo Li et al.
Improving Domain-Specific ASR with LLM-Generated Contextual Descriptions
Jiwon Suh, Injae Na, Woohwan Jung
Improving Generalization of Speech Separation in Real-World Scenarios: Strategies in Simulation, Optimization, and Evaluation
Ke Chen, Jiaqi Su, Taylor Berg-Kirkpatrick et al.
Improving Multilingual ASR Robustness to Errors in Language Input
Brady Houston, Omid Sadjadi, Zejiang Hou et al.
Improving Multilingual Text-to-Speech with Mixture-of-Language-Experts and Accent Disentanglement
Jing Wu, Ting Chen, Minchuan Chen et al.
Improving Neural Biasing for Contextual Speech Recognition by Early Context Injection and Text Perturbation
Ruizhe Huang, Mahsa Yarmohammadi, Sanjeev Khudanpur et al.
Improving Noise Robustness in Self-supervised Pre-trained Model for Speaker Verification
Chan-yeong Lim, Hyun-seo Shin, Ju-ho Kim et al.
Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment
Paarth Neekhara, Shehzeen Hussain, Subhankar Ghosh et al.
Improving Self-supervised Pre-training using Accent-Specific Codebooks
Darshan Prabhu, Abhishek Gupta, Omkar Nitsure et al.
Improving Speech-Based Dysarthria Detection using Multi-task Learning with Gradient Projection
Yan Xiong, Visar Berisha, Julie Liss et al.
Improving Speech Enhancement by Integrating Inter-Channel and Band Features with Dual-branch Conformer
Jizhen Li, Xinmeng Xu, Weiping Tu et al.
Improving Speech Recognition with Prompt-based Contextualized ASR and LLM-based Re-predictor
Nguyen Manh Tien Anh, Thach Ho Sy
Improving Whisper's Recognition Performance for Under-Represented Language Kazakh Leveraging Unpaired Speech and Text
Jinpeng Li, Yu Pu, Qi Sun et al.