Papers
8,761 papers found
Privacy PORCUPINE: Anonymization of Speaker Attributes Using Occurrence Normalization for Space-Filling Vector Quantization
Mohammad Hassan Vali, Tom Bäckström
Probing the Feasibility of Multilingual Speaker Anonymization
Sarina Meyer, Florian Lux, Ngoc Thang Vu
Production of fricative consonants in French-speaking children with cochlear implants and typical hearing: acoustic and phonological analyses
Sophie Fagniart, Brigitte Charlier, Véronique Delvaux et al.
Production of phrases by mechanical models of the human vocal tract
Takayuki Arai, Ryohei Suzuki, Chandler Earp et al.
Prompting Large Language Models with Audio for General-Purpose Speech Summarization
Wonjune Kang, Deb Roy
Prompting Large Language Models with Mispronunciation Detection and Diagnosis Abilities
Minglin Wu, Jing Xu, Xixin Wu et al.
Prompting Whisper for QA-driven Zero-shot End-to-end Spoken Language Understanding
Mohan Li, Simon Keizer, Rama Doddipatla
Prompt Link Multimodal Fusion in Multimodal Sentiment Analysis
Kang Zhu, Cunhang Fan, Jianhua Tao et al.
Prompt Tuning for Audio Deepfake Detection: Computationally Efficient Test-time Domain Adaptation with Limited Target Dataset
Hideyuki Oiso, Yuto Matsunaga, Kazuya Kakizaki et al.
Prompt Tuning for Speech Recognition on Unknown Spoken Name Entities
Xizi Wei, Stephen McGregor
Prosodic marking of syntactic boundaries in Khoekhoe
Kira Tulchynska, Sylvanus Job, Alena Witzlack-Makarevich et al.
Prosody-Driven Privacy-Preserving Dementia Detection
Dominika Woszczyk, Ranya Aloufi, Soteris Demetriou
Prosody of speech production in latent post-stroke aphasia
Cong Zhang, Tong Li, Gayle DeDe et al.
PRVAE-VC2: Non-Parallel Voice Conversion by Distillation of Speech Representations
Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko et al.
QGAN: Low Footprint Quaternion Neural Vocoder for Speech Synthesis
Aryan Chaudhary, Vinayak Abrol
QHM-GAN: Neural Vocoder based on Quasi-Harmonic Modeling
Shaowen Chen, Tomoki Toda
Qifusion-Net: Layer-adapted Stream/Non-stream Model for End-to-End Multi-Accent Speech Recognition
Jinming Chen, Jingyi Fang, Yuanzhong Zheng et al.
QMixCAT: Unsupervised Speech Enhancement Using Quality-guided Signal Mixing and Competitive Alternating Model Training
Shilin Wang, Haixin Guan, Yanhua Long
Quantification of stylistic differences in human- and ASR-produced transcripts of African American English
Annika Heuser, Tyler Kendall, Miguel del Rio et al.
Quantifying the effect of speech pathology on automatic and human speaker verification
Bence Mark Halpern, Thomas Tienkamp, Wen-Chin Huang et al.
Quantifying the Role of Textual Predictability in Automatic Speech Recognition
Sean Robertson, Gerald Penn, Ewan Dunbar
Quantifying Unintended Memorization in BEST-RQ ASR Encoders
Virat Shejwalkar, Om Thakkar, Arun Narayanan
Quantity-sensitivity affects recall performance of word stress
Constantijn Kaland, Maria Lialiou
Query-by-Example Keyword Spotting Using Spectral-Temporal Graph Attentive Pooling and Multi-Task Learning
Zhenyu Wang, Shuyu Kong, Li Wan et al.
RaD-Net 2: A causal two-stage repairing and denoising speech enhancement network with knowledge distillation and complex axial self-attention
Mingshuai Liu, Zhuangqi Chen, Xiaopeng Yan et al.