voice conversion

259 papers

Also known as

SVC, VC, EVC

Co-occurring keywords

speech synthesis (753), zero-shot learning (3637), variational autoencoder (1282), speaker identity (74), generative adversarial network (1939), self-supervised learning (3751), speaker verification (577), speaker similarity (35), automatic speech recognition (1764), speaker embedding (350)

Papers

Fake the Real: Backdoor Attack on Deep Speech Classification via Voice Conversion INTERSPEECH 2023

Data augmentation for children ASR and child-adult speaker classification using voice conversion methods INTERSPEECH 2023

A Compressed Synthetic Speech Detection Method with Compression Feature Embedding INTERSPEECH 2023

Emo-StarGAN: A Semi-Supervised Any-to-Many Non-Parallel Emotion-Preserving Voice Conversion INTERSPEECH 2023

Interpretable Latent Space Using Space-Filling Curves for Phonetic Analysis in Voice Conversion INTERSPEECH 2023

Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis INTERSPEECH 2023

Improving Generalization Ability of Countermeasures for New Mismatch Scenario by Combining Multiple Advanced Regularization Terms INTERSPEECH 2023

Audio-Visual Mandarin Electrolaryngeal Speech Voice Conversion INTERSPEECH 2023

STE-GAN: Speech-to-Electromyography Signal Conversion using Generative Adversarial Networks INTERSPEECH 2023

Zero-Shot Face-Based Voice Conversion: Bottleneck-Free Speech Disentanglement in the Real-World Scenario AAAI 2023

The VoiceMOS Challenge 2022 INTERSPEECH 2022

FlowCPCVC: A Contrastive Predictive Coding Supervised Flow Framework for Any-to-Any Voice Conversion INTERSPEECH 2022

Creating New Voices using Normalizing Flows INTERSPEECH 2022

Are disentangled representations all you need to build speaker anonymization systems? INTERSPEECH 2022

Voice Puppetry with FastPitch INTERSPEECH 2022

Speak Like a Professional: Increasing Speech Intelligibility by Mimicking Professional Announcer Voice with Voice Conversion INTERSPEECH 2022

Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion INTERSPEECH 2022

Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck INTERSPEECH 2022

DeID-VC: Speaker De-identification via Zero-shot Pseudo Voice Conversion INTERSPEECH 2022

Speech Audio Corrector: using speech from non-target speakers for one-off correction of mispronunciations in grapheme-input text-to-speech INTERSPEECH 2022

Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion INTERSPEECH 2022

MISRNet: Lightweight Neural Vocoder Using Multi-Input Single Shared Residual Blocks INTERSPEECH 2022

GlowVC: Mel-spectrogram space disentangling model for language-independent text-free voice conversion INTERSPEECH 2022

Transplantation of Conversational Speaking Style with Interjections in Sequence-to-Sequence Speech Synthesis INTERSPEECH 2022

Voice Conversion Can Improve ASR in Very Low-Resource Settings INTERSPEECH 2022