voice conversion

259 papers

Also known as

SVC, VC, EVC

Co-occurring keywords

speech synthesis (753), zero-shot learning (3637), variational autoencoder (1282), speaker identity (74), generative adversarial network (1939), self-supervised learning (3751), speaker verification (577), speaker similarity (35), automatic speech recognition (1764), speaker embedding (350)

Papers

Requirements and Motivations of Low-Resource Speech Synthesis for Language Revitalization ACL 2022

Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion INTERSPEECH 2022

SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing ACL 2022

MISRNet: Lightweight Neural Vocoder Using Multi-Input Single Shared Residual Blocks INTERSPEECH 2022

GlowVC: Mel-spectrogram space disentangling model for language-independent text-free voice conversion INTERSPEECH 2022

Transplantation of Conversational Speaking Style with Interjections in Sequence-to-Sequence Speech Synthesis INTERSPEECH 2022

Voice Conversion Can Improve ASR in Very Low-Resource Settings INTERSPEECH 2022

WavThruVec: Latent speech representation as intermediate features for neural speech synthesis INTERSPEECH 2022

The Effectiveness of Time Stretching for Enhancing Dysarthric Speech for Improved Dysarthric Speech Recognition INTERSPEECH 2022

Non-Parallel Voice Conversion for ASR Augmentation INTERSPEECH 2022

An Evaluation of Three-Stage Voice Conversion Framework for Noisy and Reverberant Conditions INTERSPEECH 2022

A Unified System for Voice Cloning and Voice Conversion through Diffusion Probabilistic Modeling INTERSPEECH 2022

Mitigating bias against non-native accents INTERSPEECH 2022

Leveraging Symmetrical Convolutional Transformer Networks for Speech to Singing Voice Style Transfer INTERSPEECH 2022

Towards Improved Zero-shot Voice Conversion with Conditional DSVAE INTERSPEECH 2022

Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion INTERSPEECH 2022

Low-data? No problem: low-resource, language-agnostic conversational text-to-speech via F0-conditioned data augmentation INTERSPEECH 2022

Are disentangled representations all you need to build speaker anonymization systems? INTERSPEECH 2022

Investigation into Target Speaking Rate Adaptation for Voice Conversion INTERSPEECH 2022

Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers INTERSPEECH 2022

FlowCPCVC: A Contrastive Predictive Coding Supervised Flow Framework for Any-to-Any Voice Conversion INTERSPEECH 2022

Textless Speech Emotion Conversion using Discrete & Decomposed Representations EMNLP 2022

Learning Explicit Prosody Models and Deep Speaker Embeddings for Atypical Voice Conversion INTERSPEECH 2021

Improving Robustness of One-Shot Voice Conversion with Deep Discriminative Speaker Encoder INTERSPEECH 2021

CVC: Contrastive Learning for Non-Parallel Voice Conversion INTERSPEECH 2021