Ziyue Jiang
27 papers · 2020–2025 · 9 top CS/AI conferences
Achievements
Taxonomy Completionist (11)
Keyword Pioneer
Renaissance Researcher (6)
Interdisciplinary Bridge
Conference Polyglot (9)
Hot Topic Early Bird
Grand Slam
Triple Crown
Keyword Champion (2)
Dynamic Duo (24)
Deep Specialist (11)
Topic Evolution
Keyword Collector (105)
Prolific Year (11)
Unstoppable (6)
Century Club (27)
Conferences
ACL (10)
ICLR (4)
EMNLP (3)
INTERSPEECH (3)
NIPS (3)
AAAI (1)
COLING (1)
ICML (1)
IJCAI (1)
Keywords
speech synthesis (12)
zero-shot learning (6)
flow matching (3)
singing voice synthesis (3)
vector quantization (3)
prosody modeling (3)
style transfer (3)
multimodal learning (2)
speaker cloning (2)
style control (2)
speech generation (2)
speech recognition (2)
contrastive learning (2)
diffusion model (2)
voice conversion (1)
benchmark evaluation (1)
talking face generation (1)
attention mechanism (1)
neural decoding (1)
self-supervised learning (1)
Papers
BrainLoc: Brain Signal-Based Object Detection with Multi-modal Alignment
EMNLP 2025
ControlSpeech: Towards Simultaneous and Independent Zero-shot Speaker Cloning and Zero-shot Language Style Control
ACL 2025
Language-Codec: Bridging Discrete Codec Representations and Speech Language Models
ACL 2025
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching
ACL 2025
TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis
ACL 2025
VoxpopuliTTS: a large-scale multilingual TTS corpus for zero-shot speech generation
COLING 2025
Versatile Framework for Song Generation with Prompt-based Control
EMNLP 2025
Speech Watermarking with Discrete Intermediate Representations
AAAI 2025
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
ICLR 2025
FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency
INTERSPEECH 2024
GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks
NIPS 2024
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes
NIPS 2024
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension
ACL 2024
Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners
ACL 2024
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech
ACL 2024
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
EMNLP 2024
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis
ICLR 2024
Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis
ICLR 2024
InstructSpeech: Following Speech Editing Instructions via Large Language Models
ICML 2024
MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis
INTERSPEECH 2024
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models
ACL 2023
CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-Training
ACL 2023
FastDiff 2: Revisiting and Incorporating GANs and Diffusion Models in High-Fidelity Speech Synthesis
ACL 2023
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis
ICLR 2023
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech
NIPS 2022
FedSpeech: Federated Text-to-Speech with Continual Learning
IJCAI 2021
Self-Supervised Spoofing Audio Detection Scheme
INTERSPEECH 2020