speech synthesis

753 papers

Also known as

SSS, SS, TTS

Co-occurring keywords

neural vocoder (126), voice conversion (259), text-to-speech synthesis (293), speech recognition (1223), deep neural network (1801), speech generation (97), low-resource language (2234), automatic speech recognition (1764), generative adversarial network (1939), neural network (6616)

Papers

FluentTTS: Text-dependent Fine-grained Style Control for Multi-style TTS INTERSPEECH 2022

Production characteristics of obstruents in WaveNet and older TTS systems INTERSPEECH 2022

Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition INTERSPEECH 2022

SATTS: Speaker Attractor Text to Speech, Learning to Speak by Learning to Separate INTERSPEECH 2022

SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping INTERSPEECH 2022

Relationship between the acoustic time intervals and tongue movements of German diphthongs INTERSPEECH 2022

A Framework for Automatic Generation of Spoken Question-Answering Data EMNLP 2022

Fine-grained Noise Control for Multispeaker Speech Synthesis INTERSPEECH 2022

NatiQ: An End-to-end Text-to-Speech System for Arabic EMNLP 2022

SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing ACL 2022

Enhancement of Pitch Controllability using Timbre-Preserving Pitch Augmentation in FastPitch INTERSPEECH 2022

Automatic Song Translation for Tonal Languages ACL 2022

Gi2Pi: Rule-based, index-preserving grapheme-to-phoneme transformations ACL 2022

Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for Singing Voice Synthesis INTERSPEECH 2022

Building African Voices INTERSPEECH 2022

Towards Efficiently Learning Monotonic Alignments for Attention-based End-to-End Speech Recognition INTERSPEECH 2022

SiD-WaveFlow: A Low-Resource Vocoder Independent of Prior Knowledge INTERSPEECH 2022

Autoencoder-Based Tongue Shape Estimation During Continuous Speech INTERSPEECH 2022

Unsupervised Inference of Physiologically Meaningful Articulatory Trajectories with VocalTractLab INTERSPEECH 2022

Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis INTERSPEECH 2022

Evoc-Learn — High quality simulation of early vocal learning INTERSPEECH 2022

Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis CVPR 2022

V2C: Visual Voice Cloning CVPR 2022

Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech ACL 2022

More Than Words: In-the-Wild Visually-Driven Prosody for Text-to-Speech CVPR 2022