conftrace_

Papers

Audio-conditioned phonemic and prosodic annotation for building text-to-speech models from unlabeled speech data, INTERSPEECH 2024
Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment, INTERSPEECH 2024
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning, INTERSPEECH 2024
CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection, INTERSPEECH 2024
SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark, INTERSPEECH 2024
Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation, INTERSPEECH 2022
DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning, INTERSPEECH 2022
A Unified Accent Estimation Method Based on Multi-Task Learning for Japanese Text-to-Speech, INTERSPEECH 2022
TTS-by-TTS 2: Data-Selective Augmentation for Neural Speech Synthesis Using Ranking Support Vector Machine with Variational Autoencoder, INTERSPEECH 2022
Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems, INTERSPEECH 2022
Phrase Break Prediction with Bidirectional Encoder Representations in Japanese Text-to-Speech Synthesis, INTERSPEECH 2021
High-Fidelity Parallel WaveGAN with Multi-Band Harmonic-Plus-Noise Model, INTERSPEECH 2021
Neural Text-to-Speech with a Modeling-by-Generation Excitation Vocoder, INTERSPEECH 2020
Probability Density Distillation with Generative Adversarial Networks for High-Quality Parallel Waveform Generation, INTERSPEECH 2019