Hiroshi Saruwatari
30 papers · 2016–2024 · 2 conferences · across top CS/AI conferences
Achievements
Keyword Pioneer
Taxonomy Completionist (14)
Renaissance Researcher (5)
Interdisciplinary Bridge
Conference Polyglot (2)
Academic Marathon (8)
Conference Loyalist (29)
Dynamic Duo (24)
Deep Specialist (11)
Keyword Champion (2)
Unstoppable (5)
Conference Pioneer
Prolific Year (6)
Keyword Collector (131)
Century Club (30)
Trend Setter
Conferences
INTERSPEECH (29)
IJCAI (1)
Keywords
speech synthesis (9)
voice conversion (5)
self-supervised learning (4)
speech quality (4)
domain adaptation (3)
dialogue system (3)
empathetic dialogue (3)
speech enhancement (3)
cross-lingual synthesis (2)
speaker adaptation (2)
speech corpus (2)
speaker embedding (2)
language model (2)
sequence-to-sequence learning (2)
speaker individuality (2)
deep neural network (2)
deep gaussian process (2)
multilingual processing (1)
emotion recognition (1)
ensemble learning (1)
Papers
Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals
INTERSPEECH 2024
SpeechBERTScore: Reference-Aware Automatic Evaluation of Speech Generation Leveraging NLP Evaluation Metrics
INTERSPEECH 2024
Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment
INTERSPEECH 2024
SaSLaW: Dialogue Speech Corpus with Audio-visual Egocentric Information Toward Environment-adaptive Dialogue Speech Synthesis
INTERSPEECH 2024
SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark
INTERSPEECH 2024
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining
IJCAI 2023
Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus
INTERSPEECH 2023
How Generative Spoken Language Modeling Encodes Noisy Speech: Investigation from Phonetics to Syntactics
INTERSPEECH 2023
ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings
INTERSPEECH 2023
HumanDiffusion: diffusion model using perceptual gradients
INTERSPEECH 2023
CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center
INTERSPEECH 2023
Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis
INTERSPEECH 2022
STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent
INTERSPEECH 2022
J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis
INTERSPEECH 2022
Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS
INTERSPEECH 2022
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History
INTERSPEECH 2022
SelfRemaster: Self-Supervised Speech Restoration with Analysis-by-Synthesis Approach Using Channel Modeling
INTERSPEECH 2022
UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022
INTERSPEECH 2022
Cross-Lingual Speaker Adaptation Using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis
INTERSPEECH 2021
Harmonic WaveGAN: GAN-Based Speech Waveform Generation Model with Harmonic Structure Discriminator
INTERSPEECH 2021
Sequence-to-Sequence Learning for Deep Gaussian Process Based Speech Synthesis Using Self-Attention GP Layer
INTERSPEECH 2021
Real-Time, Full-Band, Online DNN-Based Voice Conversion System Using a Single CPU
INTERSPEECH 2020
Investigating Effective Additional Contextual Factors in DNN-Based Spontaneous Speech Synthesis
INTERSPEECH 2020
End-to-End Text-to-Speech Synthesis with Unaligned Multiple Language Units Based on Attention
INTERSPEECH 2020
Harmonic Lowering for Accelerating Harmonic Convolution for Audio Signals
INTERSPEECH 2020
Multi-Speaker Text-to-Speech Synthesis Using Deep Gaussian Processes
INTERSPEECH 2020
Cross-Lingual Text-To-Speech Synthesis via Domain Adaptation and Perceptual Similarity Regression in Speaker Space
INTERSPEECH 2020
Sampling-Based Speech Parameter Generation Using Moment-Matching Networks
INTERSPEECH 2017
Voice Conversion Using Sequence-to-Sequence Learning of Context Posterior Probabilities
INTERSPEECH 2017
Semi-Supervised Joint Enhancement of Spectral and Cepstral Sequences of Noisy Speech
INTERSPEECH 2016