Papers
Phonetic Enhanced Language Modeling for Text-to-Speech Synthesis
Kun Zhou, Shengkui Zhao, Yukun Ma et al.
PhoneViz: exploring alignments at a glance
Margot Masson, Erfan A. Shams, Iona Gessinger et al.
Phonological Feature Detection for US English using the Phonet Library
Harsha Veena Tadavarthy, Austin Jones, Margaret E. L. Renwick
Phonological-Level Mispronunciation Detection and Diagnosis
Mostafa Shahin, Beena Ahmed
Phonological Symmetry Does Not Predict Generalization of Perceptual Adaptation to Vowels
Zuheyra Tokac, Jennifer Cole
Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models
Zhiyuan Tang, Dong Wang, Shen Huang et al.
Pitch-Aware RNN-T for Mandarin Chinese Mispronunciation Detection and Diagnosis
Xintong Wang, Mingqian Shi, Ye Wang
Pitch-driven adjustments in tongue positions: Insights from ultrasound imaging
May Pik Yu Chan, Jianjing Kuang
PitchFlow: adding pitch control to a Flow-matching based TTS model
Tasnima Sadekova, Mikhail Kudinov, Vadim Popov et al.
PLDNet: PLD-Guided Lightweight Deep Network Boosted by Efficient Attention for Handheld Dual-Microphone Speech Enhancement
Nan Zhou, Youhai Jiang, Jialin Tan et al.
PL-TTS: A Generalizable Prompt-based Diffusion TTS Augmented by Large Language Model
Shuhua Li, Qirong Mao, Jiatong Shi
Positional Description for Numerical Normalization
Deepanshu Gupta, Javier Latorre
Post-Net: A linguistically inspired sequence-dependent transformed neural architecture for automatic syllable stress detection
Sai Harshitha Aluru, Jhansi Mallela, Chiranjeevi Yarra
PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation
Shuchen Shi, Ruibo Fu, Zhengqi Wen et al.
Pragmatically similar utterance finder demonstration
Nigel G. Ward, Andres Segura
Predefined Prototypes for Intra-Class Separation and Disentanglement
Antonio Almudévar, Théo Mariotte, Alfonso Ortega et al.
Predicting Acute Pain Levels Implicitly from Vocal Features
Jennifer Williams, Eike Schneiders, Henry Card et al.
Predicting Heart Activity from Speech using Data-driven and Knowledge-based features
Gasser Elbanna, Zohreh Mostaani, Mathew Magimai.-Doss
Preliminary Investigation of Psychometric Properties of a Novel Multimodal Dialog Based Affect Production Task in Children and Adolescents with Autism
Carly Demopoulos, Linnea Lampinen, Cristian Preciado et al.
Preprocessing for acoustic-to-articulatory inversion using real-time MRI movies of Japanese speech
Anna Oura, Hideaki Kikuchi, Tetsunori Kobayashi
Preservation, conservation and phonetic study of the voices of Italian poets: A study on the seven years of the VIP archive
Federico Lo Iacono, Valentina Colonna, Antonio Romano
Pre-trained Feature Fusion and Matching for Mild Cognitive Impairment Detection
Junwen Duan, Fangyuan Wei, Hong-Dong Li et al.
Pretraining End-to-End Keyword Search with Automatically Discovered Acoustic Units
Bolaji Yusuf, Jan Honza Cernocky, Murat Saraçlar
Pre-training Feature Guided Diffusion Model for Speech Enhancement
Yiyuan Yang, Niki Trigoni, Andrew Markham
Pre-training Neural Transducer-based Streaming Voice Conversion for Faster Convergence and Alignment-free Training
Hiroki Kanagawa, Takafumi Moriya, Yusuke Ijima