Papers
Controlling formant frequencies with neural text-to-speech for the manipulation of perceived speaker age
Ziya Khan, Lovisa Wihlborg, Cassia Valentini-Botinhao et al.
Controlling Multi-Class Human Vocalization Generation via a Simple Segment-based Labeling Scheme
Hieu-Thi Luong, Junichi Yamagishi
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
Meiying Chen, Zhiyao Duan
ConvKT: Conversation-Level Knowledge Transfer for Context Aware End-to-End Spoken Language Understanding
Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas et al.
COnVoy: A Contact Center Operated Pipeline for Voice of Customer Discovery
Rishabh Tripathi, Digvijay Anil Ingle, Ayush Kumar et al.
CQNV: A Combination of Coarsely Quantized Bitstream and Neural Vocoder for Low Rate Speech Coding
Youqiang Zheng, Li Xiao, Weiping Tu et al.
Creak Prevalence and Prosodic Context in Australian English
Hannah White, Joshua Penney, Andy Gibson et al.
Creating Personalized Synthetic Voices from Post-Glossectomy Speech with Guided Diffusion Models
Yusheng Tian, Guangyan Zhang, Tan Lee
Cross-Lingual Cross-Age Adaptation for Low-Resource Elderly Speech Emotion Recognition
Samuel Cahyawijaya, Holy Lovenia, Willy Chung et al.
Cross-lingual/Cross-channel Intent Detection in Contact-Center Conversations
Suraj Agrawal, Aashraya Sachdeva, Soumya Jain et al.
Cross-Lingual Features for Alzheimer’s Dementia Detection from Speech
Thomas Melistas, Lefteris Kapelonis, Nikos Antoniou et al.
Cross-lingual Prosody Transfer for Expressive Machine Dubbing
Jakub Swiatkowski, Duo Wang, Mikolaj Babianski et al.
Cross-Lingual Transfer Learning for Phrase Break Prediction with Multilingual Language Model
Hoyeon Lee, Hyun-Wook Yoon, Jong-Hwan Kim et al.
Cross-linguistic Emotion Perception in Human and TTS Voices
Iona Gessinger, Michelle Cohn, Benjamin R. Cowan et al.
Cross-Modal Semantic Alignment before Fusion for Two-Pass End-to-End Spoken Language Understanding
Lingyan Huang, Tao Li, Haodong Zhou et al.
Cross-utterance Conditioned Coherent Speech Editing
Cheng Yu, Yang Li, Weiqin Zu et al.
Crowdsource-based Validation of the Audio Cocktail as a Sound Browsing Tool
Per Fallgren, Jens Edlund
Crowdsourced Data Validation for ASR Training
Wannaphong Phatthiyaphaibun, Chompakorn Chaksangchaichot, Thanawin Rakthammanon et al.
Cues to next-speaker projection in conversational Swedish: Evidence from reaction times
Kathrin Feindt, Martina Rossi, Ghazaleh Esfandiari-Baiat et al.
Curriculum Learning for Self-supervised Speaker Verification
Hee-Soo Heo, Jee-weon Jung, Jingu Kang et al.
CVTE-Poly: A New Benchmark for Chinese Polyphone Disambiguation
Siheng Zhang, Xingjun Tan, Yanqiang Lei et al.
Data augmentation for children ASR and child-adult speaker classification using voice conversion methods
Shuyang Zhao, Mittul Singh, Abraham Woubie et al.
Data Augmentation for Diverse Voice Conversion in Noisy Environments
Avani Tanna, Michael Saxon, Amr El Abbadi et al.
DC CoMix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer
Yerin Choi, Myoung-Wan Koo
DCCRN-KWS: An Audio Bias Based Model for Noise Robust Small-Footprint Keyword Spotting
Shubo Lv, Xiong Wang, Sining Sun et al.