Papers
Fake the Real: Backdoor Attack on Deep Speech Classification via Voice Conversion
Zhe Ye, Terui Mao, Li Dong et al.
Fast and Efficient Multilingual Self-Supervised Pre-training for Low-Resource Speech Recognition
Zhilong Zhang, Wei Wang, Yanmin Qian
Fast Enrollable Streaming Keyword Spotting System: Training and Inference using a Web Browser
Namhyun Cho, Sunmin Kim, Yoseb Kang et al.
FastFit: Towards Real-Time Iterative Neural Vocoder by Replacing U-Net Encoder With Multiple STFTs
Won Jang, Dan Lim, Heayoung Park
FC-MTLF: A Fine- and Coarse-grained Multi-Task Learning Framework for Cross-Lingual Spoken Language Understanding
Xuxin Cheng, Wanshi Xu, Ziyu Yao et al.
Feature Normalization for Fine-tuning Self-Supervised Models in Speech Enhancement
Hejung Yang, Hong-Goo Kang
Federated Learning for Secure Development of AI Models for Parkinson’s Disease Detection Using Speech from Different Languages
Soroosh Tayebi Arasteh, Cristian David Ríos-Urrego, Elmar Nöth et al.
Federated Learning Toolkit with Voice-based User Verification Demo
Prathamesh Mandke, Rachel Oberst, Matthias Reisser et al.
Few-shot Class-incremental Audio Classification Using Adaptively-refined Prototypes
Wei Xie, Yanxiong Li, Qianhua He et al.
Few-shot Class-incremental Audio Classification Using Stochastic Classifier
Yanxiong Li, Wenchang Cao, Jialong Li et al.
Few-shot Dysarthric Speech Recognition with Text-to-Speech Data Augmentation
Enno Hermann, Mathew Magimai.-Doss
Few-Shot Open-Set Learning for On-Device Customization of KeyWord Spotting Systems
Manuele Rusci, Tinne Tuytelaars
Filling the population statistics gap: Swiss German reference data on F0 and speech tempo for forensic contexts
Hannah Hedegard, Andrea Fröhlich, Fabian Tomaschek et al.
Fine-tuned RoBERTa Model with a CNN-LSTM Network for Conversational Emotion Recognition
Jiachen Luo, Huy Phan, Joshua Reiss
Fine-tuning Audio Spectrogram Transformer with Task-aware Adapters for Sound Event Detection
Kang Li, Yan Song, Ian McLoughlin et al.
FlexiAST: Flexibility is What AST Needs
Jiu Feng, Mehmet Hamza Erol, Joon Son Chung et al.
Flow-VAE VC: End-to-End Flow Framework with Contrastive Loss for Zero-shot Voice Conversion
Le Xu, Rongxiu Zhong, Ying Liu et al.
FN-SSL: Full-Band and Narrow-Band Fusion for Sound Source Localization
Yabo Wang, Bing Yang, Xiaofei Li
Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information
Jiuxin Lin, Peng Wang, Heinrich Dinkel et al.
FOOCTTS: Generating Arabic Speech with Acoustic Environment for Football Commentator
Massa Baali, Ahmed M. Ali
Fooling Speaker Identification Systems with Adversarial Background Music
Chu-Xiao Zuo, Jia-Yi Leng, Wu-Jun Li
FRA-RIR: Fast Random Approximation of the Image-source Method
Yi Luo, Jianwei Yu
Frequency Patterns of Individual Speaker Characteristics at Higher and Lower Spectral Ranges
Zhao Zhang, Ju Zhang, Ziyu Zhu et al.