conftrace_

Tomoki Toda

59 papers · 2014–2024 · 6 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (17) 🌈 Renaissance Researcher (6) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (6) 🏃 Academic Marathon (10) 🏠 Conference Loyalist (50) 🏆 Keyword Champion (3) 🔬 Deep Specialist (12) 🤝 Dynamic Duo (13) 🚀 Conference Pioneer 📈 Trend Setter 🗃️ Keyword Collector (195) ⚡ Prolific Year (5) 💎 Century Club (59) 🔥 Unstoppable (11)

Conferences

INTERSPEECH (50) ACL (3) COLING (2) IJCNLP (2) EACL (1) NAACL (1)

Top co-authors

Satoshi Nakamura (13) Tomoki Hayashi (12) Kazuhiro Kobayashi (11) Yi-Chiao Wu (11) Sakriani Sakti (10) Graham Neubig (10) Patrick Lumban Tobing (8) Yusuke Yasuda (5) Kazuya Takeda (5) Wen-Chin Huang (5)

Research topics

Learning Types (1) Probability (1)

Keywords

speech synthesis (18) voice conversion (16) neural vocoder (11) speech enhancement (7) variational autoencoder (6) mean opinion score (5) gaussian mixture model (5) acoustic feature (4) waveform generation (4) fundamental frequency (4) sequence-to-sequence model (3) text-to-speech synthesis (3) speaker identity (3) wavenet vocoder (3) pitch control (3) waveform modification (3) speech quality assessment (3) hidden markov model (3) self-supervised learning (2) preference learning (2)

Papers

Multimodal Fusion of Music Theory-Inspired and Self-Supervised Representations for Improved Emotion Recognition INTERSPEECH 2024
Quantifying the effect of speech pathology on automatic and human speaker verification INTERSPEECH 2024
Embedding Learning for Preference-based Speech Quality Assessment INTERSPEECH 2024
Challenge of Singing Voice Synthesis Using Only Text-To-Speech Corpus With FIRNet Source-Filter Neural Vocoder INTERSPEECH 2024
2DP-2MRC: 2-Dimensional Pointer-based Machine Reading Comprehension Method for Multimodal Moment Retrieval INTERSPEECH 2024
CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection INTERSPEECH 2024
Exploring the Robustness of Text-to-Speech Synthesis Based on Diffusion Probabilistic Models to Heavily Noisy Transcriptions INTERSPEECH 2024
QHM-GAN: Neural Vocoder based on Quasi-Harmonic Modeling INTERSPEECH 2024
Preference-based training framework for automatic speech quality assessment using deep neural network INTERSPEECH 2023
Analysis of Mean Opinion Scores in Subjective Evaluation of Synthetic Speech Based on Tail Probabilities INTERSPEECH 2023
Reverberation-Controllable Voice Conversion Using Reverberation Time Estimator INTERSPEECH 2023
E2E-S2S-VC: End-To-End Sequence-To-Sequence Voice Conversion INTERSPEECH 2023
Emotion Awareness in Multi-utterance Turn for Improving Emotion Prediction in Multi-Speaker Conversation INTERSPEECH 2023
Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation INTERSPEECH 2022
An Evaluation of Three-Stage Voice Conversion Framework for Noisy and Reverberant Conditions INTERSPEECH 2022
Spoken-Text-Style Transfer with Conditional Variational Autoencoder and Content Word Storage INTERSPEECH 2022
The VoiceMOS Challenge 2022 INTERSPEECH 2022
Investigating Self-supervised Pretraining Frameworks for Pathological Speech Recognition INTERSPEECH 2022
A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion INTERSPEECH 2021
Relational Data Selection for Data Augmentation of Speaker-Dependent Multi-Band MelGAN Vocoder INTERSPEECH 2021
High-Fidelity and Low-Latency Universal Neural Vocoder Based on Multiband WaveRNN with Data-Driven Linear Prediction for Discrete Waveform Modeling INTERSPEECH 2021
Unified Source-Filter GAN: Unified Source-Filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN INTERSPEECH 2021
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-Autoregressive Pitch-Dependent Dilated Convolution Model for Parametric Speech Generation INTERSPEECH 2020
A Cyclical Post-Filtering Approach to Mismatch Refinement of Neural Vocoder for Text-to-Speech Systems INTERSPEECH 2020
Semi-Supervised Self-Produced Speech Enhancement and Suppression Based on Joint Source Modeling of Air- and Body-Conducted Signals Using Variational Autoencoder INTERSPEECH 2020
Intelligibility Enhancement Based on Speech Waveform Modification Using Hearing Impairment INTERSPEECH 2020
Cyclic Spectral Modeling for Unsupervised Unit Discovery into Voice Conversion with Excitation and Waveform Modeling INTERSPEECH 2020
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining INTERSPEECH 2020
Pre-Trained Text Embeddings for Enhanced Text-to-Speech Synthesis INTERSPEECH 2019
Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation INTERSPEECH 2019
Non-Parallel Voice Conversion with Cyclic Variational Autoencoder INTERSPEECH 2019
Robustness of Statistical Voice Conversion Based on Direct Waveform Modification Against Background Sounds INTERSPEECH 2019
Investigation of F0 Conditioning and Fully Convolutional Networks in Variational Autoencoder Based Voice Conversion INTERSPEECH 2019
Real-Time Neural Text-to-Speech with Sequence-to-Sequence Acoustic Model and WaveGlow or Single Gaussian WaveRNN Vocoders INTERSPEECH 2019
Designing a Pneumatic Bionic Voice Prosthesis - A Statistical Approach for Source Excitation Generation INTERSPEECH 2018
Audio-visual Voice Conversion Using Deep Canonical Correlation Analysis for Deep Bottleneck Features INTERSPEECH 2018
Frequency Domain Variants of Velvet Noise and Their Application to Speech Processing and Synthesis INTERSPEECH 2018
Collapsed Speech Segment Detection and Suppression for WaveNet Vocoder INTERSPEECH 2018
Multi-Head Decoder for End-to-End Speech Recognition INTERSPEECH 2018
A Modulation Property of Time-Frequency Derivatives of Filtered Phase and its Application to Aperiodicity and fo Estimation INTERSPEECH 2017
Physically Constrained Statistical F0 Prediction for Electrolaryngeal Speech Enhancement INTERSPEECH 2017
Speech Enhancement Using Non-Negative Spectrogram Models with Mel-Generalized Cepstral Regularization INTERSPEECH 2017
A New Cosine Series Antialiasing Function and its Application to Aliasing-Free Glottal Source Models for Speech and Singing Synthesis INTERSPEECH 2017
Statistical Voice Conversion with WaveNet-Based Waveform Generation INTERSPEECH 2017
Speaker-Dependent WaveNet Vocoder INTERSPEECH 2017
The Voice Conversion Challenge 2016 INTERSPEECH 2016
A Hybrid System for Continuous Word-Level Emphasis Modeling Based on HMM State Clustering and Adaptive Training INTERSPEECH 2016
Model Integration for HMM- and DNN-Based Speech Synthesis Using Product-of-Experts Framework INTERSPEECH 2016
The NU-NAIST Voice Conversion System for the Voice Conversion Challenge 2016 INTERSPEECH 2016
Acoustic-to-Articulatory Inversion Mapping Based on Latent Trajectory Gaussian Mixture Model INTERSPEECH 2016
Ckylark: A More Robust PCFG-LA Parser NAACL 2015
Improving Pivot Translation by Remembering the Pivot IJCNLP 2015
Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic Constituents IJCNLP 2015
Improving Pivot Translation by Remembering the Pivot ACL 2015
Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic Constituents ACL 2015
Discriminative Language Models as a Tool for Machine Translation Error Analysis COLING 2014
Optimizing Segmentation Strategies for Simultaneous Speech Translation ACL 2014
Acquiring a Dictionary of Emotion-Provoking Events EACL 2014
Reinforcement Learning of Cooperative Persuasive Dialogue Policies using Framing COLING 2014