Daisuke Saito

25 papers · 2016–2024 · 2 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (2) 🏃 Academic Marathon (8) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐝 Cross-Pollinator (6)

🌈 Renaissance Researcher (7) 🗺️ Taxonomy Completionist (39) 🏠 Conference Loyalist (24) 🏆 Keyword Champion (4) 🤝 Dynamic Duo (22) 🗃️ Keyword Collector (123) 🔥 Unstoppable (9) 💎 Century Club (25) 🚀 Conference Pioneer ❓ The Questioner ⚡ Prolific Year (5)

Conferences

INTERSPEECH (24) COLING (1)

Top co-authors

Nobuaki Minematsu (22) Noriko Nakanishi (4) Hidetsugu Uchida (4) Yutaka Yamauchi (3) Shohei Toyama (3) Junwei Yue (2) Jaehyun Choi (2) Yingxiang Gao (2) Shintaro Ando (2) Mirjam Wester (1)

Research topics

Education (1)

Keywords

speaker identity (4) voice conversion (4) deep neural network (3) speech shadowing (3) speech comprehensibility (2) speech synthesis (2) latent variable (2) acoustic feature (2) pronunciation assessment (2) bidirectional long short-term memory (2) acoustic-to-articulatory mapping (2) gaussian mixture model (2) shadowing speech (2) listening disfluency (2) dynamic time warping (2) language learning (2) goodness of pronunciation (2) question generation (1) automatic speech recognition (1) language model adaptation (1)

Papers

A ChatGPT-based oral Q&A practice system for first-time student participants in international conferences · INTERSPEECH 2024
Analysis and Visualization of Directional Diversity in Listening Fluency of World Englishes Speakers in the Framework of Mutual Shadowing · INTERSPEECH 2024
A Pilot Study of GSLM-based Simulation of Foreign Accentuation Only Using Native Speech Corpora · INTERSPEECH 2024
Acceleration of Posteriorgram-based DTW by Distilling the Class-to-class Distances Encoded in the Classifier Used to Calculate Posteriors · INTERSPEECH 2024
Automatic Prediction of Language Learners' Listenability Using Speech and Text Features Extracted from Listening Drills · INTERSPEECH 2023
Can We Train a Language Model Inside an End-to-End ASR Model? - Investigating Effective Implicit Language Modeling · COLING 2022
Text-to-speech synthesis using spectral modeling based on non-negative autoencoder · INTERSPEECH 2022
Detection of Learners' Listening Breakdown with Oral Dictation and Its Use to Model Listening Skill Improvement Exclusively Through Shadowing · INTERSPEECH 2022
Lexical Density Analysis of Word Productions in Japanese English Using Acoustic Word Embeddings · INTERSPEECH 2021
Nonparallel Training of Exemplar-Based Voice Conversion System Using INCA-Based Alignment Technique · INTERSPEECH 2020
Attention-Based Speaker Embeddings for One-Shot Voice Conversion · INTERSPEECH 2020
Shadowability Annotation with Fine Granularity on L2 Utterances and its Improvement with Native Listeners’ Script-Shadowing · INTERSPEECH 2020
Discriminative Method to Extract Coarse Prosodic Structure and its Application for Statistical Phrase/Accent Command Estimation · INTERSPEECH 2020
Analysis of Native Listeners’ Facial Microexpressions While Shadowing Non-Native Speech — Potential of Shadowers’ Facial Expressions for Comprehensibility Prediction · INTERSPEECH 2019
A Comparative Study of Statistical Conversion of Face to Voice Based on Their Subjective Impressions · INTERSPEECH 2018
A Study of Objective Measurement of Comprehensibility through Native Speakers' Shadowing of Learners' Utterances · INTERSPEECH 2018
Acoustic-to-Articulatory Mapping Based on Mixture of Probabilistic Canonical Correlation Analysis · INTERSPEECH 2017
Automatic Scoring of Shadowing Speech Based on DNN Posteriors and Their DTW · INTERSPEECH 2017
Use of Global and Acoustic Features Associated with Contextual Factors to Adapt Language Models for Spontaneous Speech Recognition · INTERSPEECH 2017
Parallel-Data-Free Many-to-Many Voice Conversion Based on DNN Integrated with Eigenspace Using a Non-Parallel Speech Corpus · INTERSPEECH 2017
Automatic Assessment and Error Detection of Shadowing Speech: Case of English Spoken by Japanese Learners · INTERSPEECH 2016
Speaker Representations for Speaker Adaptation in Multiple Speakers’ BLSTM-RNN-Based Speech Synthesis · INTERSPEECH 2016
The Voice Conversion Challenge 2016 · INTERSPEECH 2016
Prediction of the Articulatory Movements of Unseen Phonemes of a Speaker Using the Speech Structure of Another Speaker · INTERSPEECH 2016
Voice Conversion Based on Matrix Variate Gaussian Mixture Model Using Multiple Frame Features · INTERSPEECH 2016