voice conversion

259 papers

Also known as

SVC, VC, EVC

Co-occurring keywords

speech synthesis (753), zero-shot learning (3637), variational autoencoder (1282), speaker identity (74), generative adversarial network (1939), self-supervised learning (3751), speaker verification (577), speaker similarity (35), automatic speech recognition (1764), speaker embedding (350)

Papers

CFVC: Conditional Filtering for Controllable Voice Conversion INTERSPEECH 2023

Reverberation-Controllable Voice Conversion Using Reverberation Time Estimator INTERSPEECH 2023

UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data INTERSPEECH 2023

E2E-S2S-VC: End-To-End Sequence-To-Sequence Voice Conversion INTERSPEECH 2023

Voice Conversion With Just Nearest Neighbors INTERSPEECH 2023

Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation INTERSPEECH 2023

Speaking Style Conversion in the Waveform Domain Using Discrete Self-Supervised Units EMNLP 2023

A Compressed Synthetic Speech Detection Method with Compression Feature Embedding INTERSPEECH 2023

Robust Feature Decoupling in Voice Conversion by Using Locality-Based Instance Normalization INTERSPEECH 2023

Zero-Shot Face-Based Voice Conversion: Bottleneck-Free Speech Disentanglement in the Real-World Scenario AAAI 2023

VC-T: Streaming Voice Conversion Based on Neural Transducer INTERSPEECH 2023

DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion Probabilistic Model INTERSPEECH 2023

Mandarin Electrolaryngeal Speech Voice Conversion using Cross-domain Features INTERSPEECH 2023

ALO-VC: Any-to-any Low-latency One-shot Voice Conversion INTERSPEECH 2023

ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion INTERSPEECH 2023

Iteratively Improving Speech Recognition and Voice Conversion INTERSPEECH 2023

Generalizable Zero-Shot Speaker Adaptive Speech Synthesis with Disentangled Representations INTERSPEECH 2023

Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial Attack in Speaker Identification INTERSPEECH 2023

Automatic Speech Disentanglement for Voice Conversion using Rank Module and Speech Augmentation INTERSPEECH 2023

Audio-Visual Mandarin Electrolaryngeal Speech Voice Conversion INTERSPEECH 2023

S2CD: Self-heuristic Speaker Content Disentanglement for Any-to-Any Voice Conversion INTERSPEECH 2023

DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding INTERSPEECH 2023

Data Augmentation for Diverse Voice Conversion in Noisy Environments INTERSPEECH 2023

Flow-VAE VC: End-to-End Flow Framework with Contrastive Loss for Zero-shot Voice Conversion INTERSPEECH 2023

SASPEECH: A Hebrew Single Speaker Dataset for Text To Speech and Voice Conversion INTERSPEECH 2023