Research Explorer

Phonetic-assisted Multi-Target Units Modeling for Improving Conformer-Transducer ASR system

Li Li, Dongxing Xu, Haoran Wei et al.

2023 INTERSPEECH

PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords

Yong-Hyeok Lee, Namhyun Cho

2023 INTERSPEECH

PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network

Qinghua Liu, Meng Ge, Zhizheng Wu et al.

2023 INTERSPEECH

Pitch Accent Variation and the Interpretation of Rising and Falling Intonation in American English

Thomas Sostarics, Jennifer Cole

2023 INTERSPEECH

Pitch distributions in a very large corpus of spontaneous Finnish speech

Mietta Lennes, Minnaleena Toivola

2023 INTERSPEECH

PLCMOS – A Data-driven Non-intrusive Metric for The Evaluation of Packet Loss Concealment Algorithms

Lorenz Diener, Marju Purin, Sten Sootla et al.

2023 INTERSPEECH

PoCaPNet: A Novel Approach for Surgical Phase Recognition Using Speech and X-Ray Images

Kubilay Can Demir, Tobias Weise, Matthias May et al.

2023 INTERSPEECH

Point to the Hidden: Exposing Speech Audio Splicing via Signal Pointer Nets

Denise Moussa, Germans Hirsch, Sebastian Wankerl et al.

2023 INTERSPEECH

Powerset multi-class cross entropy loss for neural speaker diarization

Alexis Plaquet, Hervé Bredin

2023 INTERSPEECH

Pragmatic Pertinence: A Learnable Confidence Metric to Assess the Subjective Quality of LM-Generated Text

Jerome R. Bellegarda

2023 INTERSPEECH

Predicting Perceptual Centers Located at Vowel Onset in German Speech Using Long Short-Term Memory Networks

Felicia Schulz, Mirella De Sisto, M. Paula Roncaglia-Denissen et al.

2023 INTERSPEECH

Prediction of the Gender-based Violence Victim Condition using Speech: What do Machine Learning Models rely on?

Emma Reyner-Fuentes, Esther Rituerto-González, Isabel Trancoso et al.

2023 INTERSPEECH

Preference-based training framework for automatic speech quality assessment using deep neural network

Cheng-Hung Hu, Yusuke Yasuda, Tomoki Toda

2023 INTERSPEECH

Preference Learning Labels by Anchoring on Consecutive Annotations

Abinay Reddy Naini, Ali N. Salman, Carlos Busso

2023 INTERSPEECH

Pre-Finetuning for Few-Shot Emotional Speech Recognition

Maximillian Chen, Zhou Yu

2023 INTERSPEECH

Prefix Search Decoding for RNN Transducers

Kiran Praveen, Advait Vinay Dhopeshwarkar, Abhishek Pandey et al.

2023 INTERSPEECH

Prior-free Guided TTS: An Improved and Efficient Diffusion-based Text-Guided Speech Synthesis

Won-Gook Choi, So-Jeong Kim, TaeHo Kim et al.

2023 INTERSPEECH

Privacy-preserving Representation Learning for Speech Understanding

Minh Tran, Mohammad Soleymani

2023 INTERSPEECH

Privacy Risks in Speech Emotion Recognition: A Systematic Study on Gender Inference Attack

Basmah Alsenani, Tanaya Guha, Alessandro Vinciarelli

2023 INTERSPEECH

Probing Self-supervised Speech Models for Phonetic and Phonemic Information: A Case Study in Aspiration

Kinan Martin, Jon Gauthier, Canaan Breiss et al.

2023 INTERSPEECH

Probing Speech Quality Information in ASR Systems

Bao Thang Ta, Minh Tu Le, Nhat Minh Le et al.

2023 INTERSPEECH

Progress and Prospects for Spoken Language Technology: Results from Five Sexennial Surveys

Roger K. Moore, Ricard Marxer

2023 INTERSPEECH

Promoting Mental Self-Disclosure in a Spoken Dialogue System

Mahdin Rohmatillah, Bobbi Aditya, Li-Jen Yang et al.

2023 INTERSPEECH

Prompt Guided Copy Mechanism for Conversational Question Answering

Yong Zhang, Zhitao Li, Jianzong Wang et al.

2023 INTERSPEECH

Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization

Puyuan Peng, Brian Yan, Shinji Watanabe et al.

2023 INTERSPEECH
