speech synthesis

753 papers

Also known as

SSS, SS, TTS

Co-occurring keywords

neural vocoder (126), voice conversion (259), text-to-speech synthesis (293), speech recognition (1223), deep neural network (1801), speech generation (97), low-resource language (2234), automatic speech recognition (1764), generative adversarial network (1939), neural network (6616)

Papers

FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models ACL 2023

CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-Training ACL 2023

HABLA: A Dataset of Latin American Spanish Accents for Voice Anti-spoofing INTERSPEECH 2023

ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer EMNLP 2023

Decoupling Segmental and Prosodic Cues of Non-native Speech through Vector Quantization INTERSPEECH 2023

Speech Synthesis with Self-Supervisedly Learnt Prosodic Representations INTERSPEECH 2023

SALTTS: Leveraging Self-Supervised Speech Representations for improved Text-to-Speech Synthesis INTERSPEECH 2023

Deep Speech Synthesis from MRI-Based Articulatory Representations INTERSPEECH 2023

EE-TTS: Emphatic Expressive TTS with Linguistic Information INTERSPEECH 2023

UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units ACL 2023

FastFit: Towards Real-Time Iterative Neural Vocoder by Replacing U-Net Encoder With Multiple STFTs INTERSPEECH 2023

J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis INTERSPEECH 2022

Voicing decision based on phonemes classification and spectral moments for whisper-to-speech conversion INTERSPEECH 2022

Flow-Based Unconstrained Lip to Speech Generation AAAI 2022

Generating iso-accented stimuli for second language research: methodology and a dataset for Spanish-accented English INTERSPEECH 2022

RefineGAN: Universally Generating Waveform Better than Ground Truth with Highly Accurate Pitch and Intensity Responses INTERSPEECH 2022

Accent Conversion using Pre-trained Model and Synthesized Data from Voice Conversion INTERSPEECH 2022

A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing ICML 2022

FlowVocoder: A small Footprint Neural Vocoder based Normalizing Flow for Speech Synthesis INTERSPEECH 2022

SiD-WaveFlow: A Low-Resource Vocoder Independent of Prior Knowledge INTERSPEECH 2022

Visualising Model Training via Vowel Space for Text-To-Speech Systems INTERSPEECH 2022

Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech ACL 2022

Fast Bilingual Grapheme-To-Phoneme Conversion NAACL 2022

Evoc-Learn — High quality simulation of early vocal learning INTERSPEECH 2022

Unsupervised Inference of Physiologically Meaningful Articulatory Trajectories with VocalTractLab INTERSPEECH 2022