speech synthesis

753 papers

Also known as

SSS, SS, TTS

Co-occurring keywords

neural vocoder (126), voice conversion (259), text-to-speech synthesis (293), speech recognition (1223), deep neural network (1801), speech generation (97), low-resource language (2234), automatic speech recognition (1764), generative adversarial network (1939), neural network (6616)

Papers

Improving WaveRNN with Heuristic Dynamic Blending for Fast and High-Quality GPU Vocoding INTERSPEECH 2023

Combining language corpora in a Japanese electromagnetic articulography database for acoustic-to-articulatory inversion INTERSPEECH 2023

Using speech synthesis to explain automatic speaker recognition: a new application of synthetic speech INTERSPEECH 2023

Prosody-controllable Gender-ambiguous Speech Synthesis: A Tool for Investigating Implicit Bias in Speech Perception INTERSPEECH 2023

Prior-free Guided TTS: An Improved and Efficient Diffusion-based Text-Guided Speech Synthesis INTERSPEECH 2023

The HW-TSC’s Simultaneous Speech-to-Speech Translation System for IWSLT 2023 Evaluation ACL 2023

Japanese-to-English Simultaneous Dubbing Prototype ACL 2023

FastDiff 2: Revisiting and Incorporating GANs and Diffusion Models in High-Fidelity Speech Synthesis ACL 2023

FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models ACL 2023

StyleTalk: One-Shot Talking Head Generation with Controllable Speaking Styles AAAI 2023

Avocodo: Generative Adversarial Network for Artifact-Free Vocoder AAAI 2023

Evaluating and reducing the distance between synthetic and real speech distributions INTERSPEECH 2023

DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding INTERSPEECH 2023

RWEN-TTS: Relation-Aware Word Encoding Network for Natural Text-to-Speech Synthesis AAAI 2023

Simple and Effective Unsupervised Speech Translation ACL 2023

CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-Training ACL 2023

Towards Robust FastSpeech 2 by Modelling Residual Multimodality INTERSPEECH 2023

Evaluation of delexicalization methods for research on emotional speech INTERSPEECH 2023

TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition INTERSPEECH 2023

EE-TTS: Emphatic Expressive TTS with Linguistic Information INTERSPEECH 2023

MaskedSpeech: Context-aware Speech Synthesis with Masking Strategy INTERSPEECH 2023

P-Flow: A Fast and Data-Efficient Zero-Shot TTS through Speech Prompting NeurIPS 2023

SALTTS: Leveraging Self-Supervised Speech Representations for improved Text-to-Speech Synthesis INTERSPEECH 2023

VC-T: Streaming Voice Conversion Based on Neural Transducer INTERSPEECH 2023

Decoupling Segmental and Prosodic Cues of Non-native Speech through Vector Quantization INTERSPEECH 2023