Papers
SpeechFormer: A Hierarchical Efficient Framework Incorporating the Characteristics of Speech
Weidong Chen, Xiaofen Xing, Xiangmin Xu et al.
Speech imitation skills predict automatic phonetic convergence: a GMM-UBM study on L2
Dorina de Jong, Aldo Pastore, Noël Nguyen et al.
Speech intelligibility of simulated hearing loss sounds and its prediction using the Gammachirp Envelope Similarity Index (GESI)
Toshio Irino, Honoka Tamaru, Ayako Yamamoto
Speech Intelligibility Prediction for Hearing-Impaired Listeners with the LEAP Model
Jana Roßbach, Rainer Huber, Saskia Röttges et al.
Speech Modification for Intelligibility in Cochlear Implant Listeners: Individual Effects of Vowel- and Consonant-Boosting
Juliana N. Saba, John H.L. Hansen
SpeechPainter: Text-conditioned Speech Inpainting
Zalan Borsos, Matthew Sharifi, Marco Tagliasacchi
Speech Pre-training with Acoustic Piece
Shuo Ren, Shujie Liu, Yu Wu et al.
Speech Quality Assessment through MOS using Non-Matching References
Pranay Manocha, Anurag Kumar
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion
SiCheng Yang, Methawee Tantrawenith, Haolin Zhuang et al.
Speech Segmentation Optimization using Segmented Bilingual Speech Corpus for End-to-end Speech Translation
Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
Speech Separation for an Unknown Number of Speakers Using Transformers With Encoder-Decoder Attractors
Srikanth Raj Chetupalli, Emanuël Habets
Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning
Robin Algayres, Adel Nabli, Benoît Sagot et al.
SPLICEOUT: A Simple and Efficient Audio Augmentation Method
Arjit Jain, Pranay Reddy Samala, Deepak Mittal et al.
Spoken Dialogue System for Call Centers with Expressive Speech Synthesis
Davis Nicmanis, Askars Salimbajevs
Spoken-Text-Style Transfer with Conditional Variational Autoencoder and Content Word Storage
Daiki Yoshioka, Yusuke Yasuda, Noriyuki Matsunaga et al.
Spoofed speech from the perspective of a forensic phonetician
Christin Kirchhübel, Georgina Brown
Spoofing-Aware Attention based ASV Back-end with Multiple Enrollment Utterances and a Sampling Strategy for the SASV Challenge 2022
Chang Zeng, Lin Zhang, Meng Liu et al.
Spoofing-Aware Speaker Verification by Multi-Level Fusion
Haibin Wu, Lingwei Meng, Jiawen Kang et al.
Squashed Weight Distribution for Low Bit Quantization of Deep Models
Nikko Strom, Haidar Khan, Wael Hamza
State & Trait Measurement from Nonverbal Vocalizations: A Multi-Task Joint Learning Approach
Alice Baird, Panagiotis Tzirakis, Jeff Brooks et al.
Statistical and clinical utility of multimodal dialogue-based speech and facial metrics for Parkinson's disease assessment
Hardik Kothare, Michael Neumann, Jackson Liscombe et al.
Steering vector correction in MVDR beamformer for speech enhancement
Suliang Bu, Yunxin Zhao, Tuo Zhao
Strategies for developing a Conversational Speech Dataset for Text-To-Speech Synthesis
Adaeze O. Adigwe, Esther Klabbers
Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations
Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix et al.
Streamable Speech Representation Disentanglement and Multi-Level Prosody Modeling for Live One-Shot Voice Conversion
Haoquan Yang, Liqun Deng, Yu Ting Yeung et al.