Papers
EZTalking: English assessment platform for teachers and students
Yu-Sheng Tsao, Yung-Chang Hsu, Jiun-Ting Li et al.
Factor-Conditioned Speaking-Style Captioning
Atsushi Ando, Takafumi Moriya, Shota Horiguchi et al.
FA-GAN: Artifacts-free and Phase-aware High-fidelity GAN-based Vocoder
Rubing Shen, Yanzhen Ren, Zongkun Sun
FakeSound: Deepfake General Audio Detection
Zeyu Xie, Baihan Li, Xuenan Xu et al.
Familiar and Unfamiliar Speaker Identification in Speech and Singing
Katelyn Taylor, Amelia Gully, Helena Daffern
FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation
Swarup Ranjan Behera, Abhishek Dhiman, Karthik Gowda et al.
Fast Context-Biasing for CTC and Transducer ASR models with CTC-based Word Spotter
Andrei Andrusenko, Aleksandr Laptev, Vladimir Bataev et al.
Faster Vocoder: a multi threading approach to achieve low latency during TTS Inference
Vishal Gourav, Ankit Tyagi, Phanindra Mankale
FastLips: an End-to-End Audiovisual Text-to-Speech System with Lip Features Prediction for Virtual Avatars
Martin Lenglet, Olivier Perrotin, Gerard Bailly
FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka et al.
Few-Shot Keyword-Incremental Learning with Total Calibration
Ilseok Kim, Ju-Seok Seong, Joon-Hyuk Chang
Few-Shot Keyword Spotting from Mixed Speech
Junming Yuan, Ying Shi, LanTian Li et al.
Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model
Hayato Futami, Siddhant Arora, Yosuke Kashiwagi et al.
Fine-Grained and Interpretable Neural Speech Editing
Max Morrison, Cameron Churchwell, Nathan Pruyne et al.
Fine-tune Pre-Trained Models with Multi-Level Feature Fusion for Speaker Verification
Shengyu Peng, Wu Guo, Haochen Wu et al.
Fine-Tuning Automatic Speech Recognition for People with Parkinson's: An Effective Strategy for Enhancing Speech Technology Accessibility
Xiuwen Zheng, Bornali Phukon, Mark Hasegawa-Johnson
Fine-tuning of Pre-trained Models for Classification of Vocal Intensity Category from Speech Signals
Manila Kodali, Sudarsana Reddy Kadiri, Paavo Alku
Fine-Tuning Strategies for Dutch Dysarthric Speech Recognition: Evaluating the Impact of Healthy, Disease-Specific, and Speaker-Specific Data
Spyretta Leivaditi, Tatsunari Matsushima, Matt Coler et al.
FLEURS-R: A Restored Multilingual Speech Corpus for Generation Tasks
Min Ma, Yuma Koizumi, Shigeki Karita et al.
FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching
Chaeyoung Jung, Suyeon Lee, Ji-Hoon Kim et al.
FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency
Rui Liu, Jiatian Xi, Ziyue Jiang et al.
FLY-TTS: Fast, Lightweight and High-Quality End-to-End Text-to-Speech Synthesis
Yinlin Guo, Yening Lv, Jinqiao Dou et al.
Form and Function in Prosodic Representation: In the Case of 'ma' in Tianjin Mandarin
Tianqi Geng, Hui Feng
FoVNet: Configurable Field-of-View Speech Enhancement with Low Computation and Distortion for Smart Glasses
Zhongweiyang Xu, Ali Aroudi, Ke Tan et al.
Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-to-Speech
Dong Yang, Tomoki Koriyama, Yuki Saito