speech synthesis

753 papers

Also known as

SSS, SS, TTS

Co-occurring keywords

neural vocoder (126), voice conversion (259), text-to-speech synthesis (293), speech recognition (1223), deep neural network (1801), speech generation (97), low-resource language (2234), automatic speech recognition (1764), generative adversarial network (1939), neural network (6616)

Papers

Grapheme-to-Phoneme Conversion for Thai using Neural Regression Models NAACL 2022

When Is TTS Augmentation Through a Pivot Language Useful? INTERSPEECH 2022

Empathic Machines: Using Intermediate Features as Levers to Emulate Emotions in Text-To-Speech Systems NAACL 2022

AdaVocoder: Adaptive Vocoder for Custom Voice INTERSPEECH 2022

Textless Speech Emotion Conversion using Discrete & Decomposed Representations EMNLP 2022

A Framework for Automatic Generation of Spoken Question-Answering Data EMNLP 2022

L2-GEN: A Neural Phoneme Paraphrasing Approach to L2 Speech Synthesis for Mispronunciation Diagnosis INTERSPEECH 2022

Data Augmentation Using McAdams-Coefficient-Based Speaker Anonymization for Fake Audio Detection INTERSPEECH 2022

Karaoker: Alignment-free singing voice synthesis with speech training data INTERSPEECH 2022

NatiQ: An End-to-end Text-to-Speech System for Arabic EMNLP 2022

Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for Singing Voice Synthesis INTERSPEECH 2022

DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores INTERSPEECH 2022

Data-augmented cross-lingual synthesis in a teacher-student framework INTERSPEECH 2022

SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy INTERSPEECH 2022

MSR-NV: Neural Vocoder Using Multiple Sampling Rates INTERSPEECH 2022

TTS-by-TTS 2: Data-Selective Augmentation for Neural Speech Synthesis Using Ranking Support Vector Machine with Variational Autoencoder INTERSPEECH 2022

Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data INTERSPEECH 2022

Synthesizing Near Native-accented Speech for a Non-native Speaker by Imitating the Pronunciation and Prosody of a Native Speaker INTERSPEECH 2022

Self supervised learning for robust voice cloning INTERSPEECH 2022

Speaker Anonymization with Phonetic Intermediate Representations INTERSPEECH 2022

Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation INTERSPEECH 2022

SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis INTERSPEECH 2022

Back to the Future: Extending the Blizzard Challenge 2013 INTERSPEECH 2022

REYD – The First Yiddish Text-to-Speech Dataset and System INTERSPEECH 2022

Automatic Evaluation of Speaker Similarity INTERSPEECH 2022