speech synthesis

753 papers

Also known as

SSS, SS, TTS

Co-occurring keywords

neural vocoder (126), voice conversion (259), text-to-speech synthesis (293), speech recognition (1223), deep neural network (1801), speech generation (97), low-resource language (2234), automatic speech recognition (1764), generative adversarial network (1939), neural network (6616)

Papers

Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment INTERSPEECH 2024

Articulatory synthesis using representations learnt through phonetic label-aware contrastive loss INTERSPEECH 2024

Phoneme Hallucinator: One-Shot Voice Conversion via Set Expansion AAAI 2024

Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners ACL 2024

Enhancing Out-of-Vocabulary Performance of Indian TTS Systems for Practical Applications through Low-Effort Data Strategies INTERSPEECH 2024

Highly Intelligible Speaker-Independent Articulatory Synthesis INTERSPEECH 2024

MunTTS: A Text-to-Speech System for Mundari EACL 2024

Deepfake Defense: Constructing and Evaluating a Specialized Urdu Deepfake Audio Dataset ACL 2024

Uni-Dubbing: Zero-Shot Speech Synthesis from Visual Articulation ACL 2024

FakeSound: Deepfake General Audio Detection INTERSPEECH 2024

Speechworthy Instruction-tuned Language Models EMNLP 2024

MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech ACL 2024

StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing ACL 2024

QGAN: Low Footprint Quaternion Neural Vocoder for Speech Synthesis INTERSPEECH 2024

BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation INTERSPEECH 2024

Towards Naturalistic Voice Conversion: NaturalVoices Dataset with an Automatic Processing Pipeline INTERSPEECH 2024

Towards EMG-to-Speech with Necklace Form Factor INTERSPEECH 2024

GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks NeurIPS 2024

1000 African Voices: Advancing inclusive multi-speaker multi-accent speech synthesis INTERSPEECH 2024

JenGAN: Stacked Shifted Filters in GAN-Based Speech Synthesis INTERSPEECH 2024

Examining Prosody in Spoken Navigation Instructions for People with Disabilities NAACL 2024

Acoustic barycenters as exemplar production targets NAACL 2024

Small-E: Small Language Model with Linear Attention for Efficient Speech Synthesis INTERSPEECH 2024

Stress transfer in speech-to-speech machine translation INTERSPEECH 2024

Enhancing Polyglot Voices by Leveraging Cross-Lingual Fine-Tuning in Any-to-One Voice Conversion EMNLP 2024