PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural Language Descriptions

Guanghou Liu, Yongmao Zhang, Yi Lei et al.

2023 INTERSPEECH

PronScribe: Highly Accurate Multimodal Phonemic Transcription From Speech and Text

Yang Yu, Matthew Perez, Ankur Bapna et al.

2023 INTERSPEECH

ProsAudit, a prosodic benchmark for self-supervised speech models

Maureen de Seyssel, Marvin Lavechin, Hadrien Titeux et al.

2023 INTERSPEECH

Prosody-controllable Gender-ambiguous Speech Synthesis: A Tool for Investigating Implicit Bias in Speech Perception

Éva Székely, Joakim Gustafson, Ilaria Torre

2023 INTERSPEECH

Prosody Modeling with 3D Visual Information for Expressive Video Dubbing

Zhihan Yang, Shansong Liu, Xu Li et al.

2023 INTERSPEECH

Prospective Validation of Motor-Based Intervention with Automated Mispronunciation Detection of Rhotics in Residual Speech Sound Disorders

Nina R Benway, Jonathan L Preston

2023 INTERSPEECH

Providing Interpretable Insights for Neurological Speech and Cognitive Disorders from Interactive Serious Games

Mario Zusag, Laurin Wagner

2023 INTERSPEECH

Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech

Hyungchan Yoon, Changhwan Kim, Eunwoo Song et al.

2023 INTERSPEECH

Pseudo-Siamese Network based Timbre-reserved Black-box Adversarial Attack in Speaker Identification

Qing Wang, Jixun Yao, Ziqian Wang et al.

2023 INTERSPEECH

PunCantonese: A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts

Yunxiang Li, Pengfei Liu, Xixin Wu et al.

2023 INTERSPEECH

Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation

Ziyang Ma, Zhisheng Zheng, Guanrou Yang et al.

2023 INTERSPEECH

P-vectors: A Parallel-coupled TDNN/Transformer Network for Speaker Verification

Xiyuan Wang, Fangyuan Wang, Bo Xu et al.

2023 INTERSPEECH

pyannote.audio 2.1 speaker diarization pipeline: principle, benchmark, and recipe

Hervé Bredin

2023 INTERSPEECH

Quantifying Informational Masking due to Masker Intelligibility in Same-talker Speech-in-speech Perception

Mingyue Huo, Yinglun Sun, Dan Fogerty et al.

2023 INTERSPEECH

Quantifying the perceptual value of lexical and non-lexical channels in speech

Sarenne Wallbridge, Peter Bell, Catherine Lai

2023 INTERSPEECH

Quantization-aware and Tensor-compressed Training of Transformers for Natural Language Understanding

Zi Yang, Samridhi Choudhary, Siegfried Kunzmann et al.

2023 INTERSPEECH

Queer Events, Relationships, and Sports: Does Topic Influence Speakers’ Acoustic Expression of Sexual Orientation?

Sven Kachel, Manuel Pöhlmann, Christine Nussbaum

2023 INTERSPEECH

Query Based Acoustic Summarization for Podcasts

Samantha Kotey, Rozenn Dahyot, Naomi Harte

2023 INTERSPEECH

Question-Context Alignment and Answer-Context Dependencies for Effective Answer Sentence Selection

Minh Van Nguyen, Kishan KC, Toan Nguyen et al.

2023 INTERSPEECH

QVoice: Arabic Speech Pronunciation Learning Application

Yassine El Kheir, Fouad Khnaisser, Shammur Absar Chowdhury et al.

2023 INTERSPEECH

RAD-MMM: Multilingual Multiaccented Multispeaker Text To Speech

Rohan Badlani, Rafael Valle, Kevin J. Shih et al.

2023 INTERSPEECH

RAMP: Retrieval-Augmented MOS Prediction via Confidence-based Dynamic Weighting

Hui Wang, Shiwan Zhao, Xiguang Zheng et al.

2023 INTERSPEECH

Random Forest Classification of Breathing Phases from Audio Signals Recorded using Mobile Devices

Vitória S. Fahed, Emer P Doheny, Madeleine M Lowery

2023 INTERSPEECH

Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition

Yist Y. Lin, Tao Han, Haihua Xu et al.

2023 INTERSPEECH

Range-Based Equal Error Rate for Spoof Localization

Lin Zhang, Xin Wang, Erica Cooper et al.

2023 INTERSPEECH
