Papers
Multimodal Fine-grained Context Interaction Graph Modeling for Conversational Speech Synthesis
EMNLP 2025
End-to-End Multilingual Automatic Dubbing via Duration-based Translation with Large Language Models
EMNLP 2025
Advancing Zero-shot Text-to-Speech Intelligibility across Diverse Domains via Preference Alignment
ACL 2025
LLaMA-Omni 2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis
ACL 2025
TokSing: Singing Voice Synthesis based on Discrete Tokens
INTERSPEECH 2024
TunArTTS: Tunisian Arabic Text-To-Speech Corpus
COLING 2024
Contextual Interactive Evaluation of TTS Models in Dialogue Systems
INTERSPEECH 2024
Probing the Feasibility of Multilingual Speaker Anonymization
INTERSPEECH 2024
Pre-training Neural Transducer-based Streaming Voice Conversion for Faster Convergence and Alignment-free Training
INTERSPEECH 2024
Evaluating Text-to-Speech Synthesis from a Large Discrete Token-based Speech Language Model
COLING 2024
TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers
INTERSPEECH 2024