Research Explorer

EZTalking: English assessment platform for teachers and students

Yu-Sheng Tsao, Yung-Chang Hsu, Jiun-Ting Li et al.

2024 INTERSPEECH

Factor-Conditioned Speaking-Style Captioning

Atsushi Ando, Takafumi Moriya, Shota Horiguchi et al.

2024 INTERSPEECH

FA-GAN: Artifacts-free and Phase-aware High-fidelity GAN-based Vocoder

Rubing Shen, Yanzhen Ren, Zongkun Sun

2024 INTERSPEECH

FakeSound: Deepfake General Audio Detection

Zeyu Xie, Baihan Li, Xuenan Xu et al.

2024 INTERSPEECH

Familiar and Unfamiliar Speaker Identification in Speech and Singing

Katelyn Taylor, Amelia Gully, Helena Daffern

2024 INTERSPEECH

FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation

Swarup Ranjan Behera, Abhishek Dhiman, Karthik Gowda et al.

2024 INTERSPEECH

Fast Context-Biasing for CTC and Transducer ASR models with CTC-based Word Spotter

Andrei Andrusenko, Aleksandr Laptev, Vladimir Bataev et al.

2024 INTERSPEECH

Faster Vocoder: a multi threading approach to achieve low latency during TTS Inference

Vishal Gourav, Ankit Tyagi, Phanindra Mankale

2024 INTERSPEECH

FastLips: an End-to-End Audiovisual Text-to-Speech System with Lip Features Prediction for Virtual Avatars

Martin Lenglet, Olivier Perrotin, Gerard Bailly

2024 INTERSPEECH

FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation

Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka et al.

2024 INTERSPEECH

Few-Shot Keyword-Incremental Learning with Total Calibration

Ilseok Kim, Ju-Seok Seong, Joon-Hyuk Chang

2024 INTERSPEECH

Few-Shot Keyword Spotting from Mixed Speech

Junming Yuan, Ying Shi, LanTian Li et al.

2024 INTERSPEECH

Finding Task-specific Subnetworks in Multi-task Spoken Language Understanding Model

Hayato Futami, Siddhant Arora, Yosuke Kashiwagi et al.

2024 INTERSPEECH

Fine-Grained and Interpretable Neural Speech Editing

Max Morrison, Cameron Churchwell, Nathan Pruyne et al.

2024 INTERSPEECH

Fine-tune Pre-Trained Models with Multi-Level Feature Fusion for Speaker Verification

Shengyu Peng, Wu Guo, Haochen Wu et al.

2024 INTERSPEECH

Fine-Tuning Automatic Speech Recognition for People with Parkinson's: An Effective Strategy for Enhancing Speech Technology Accessibility

Xiuwen Zheng, Bornali Phukon, Mark Hasegawa-Johnson

2024 INTERSPEECH

Fine-tuning of Pre-trained Models for Classification of Vocal Intensity Category from Speech Signals

Manila Kodali, Sudarsana Reddy Kadiri, Paavo Alku

2024 INTERSPEECH

Fine-Tuning Strategies for Dutch Dysarthric Speech Recognition: Evaluating the Impact of Healthy, Disease-Specific, and Speaker-Specific Data

Spyretta Leivaditi, Tatsunari Matsushima, Matt Coler et al.

2024 INTERSPEECH

FLEURS-R: A Restored Multilingual Speech Corpus for Generation Tasks

Min Ma, Yuma Koizumi, Shigeki Karita et al.

2024 INTERSPEECH

FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching

Chaeyoung Jung, Suyeon Lee, Ji-Hoon Kim et al.

2024 INTERSPEECH

FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency

Rui Liu, Jiatian Xi, Ziyue Jiang et al.

2024 INTERSPEECH

FLY-TTS: Fast, Lightweight and High-Quality End-to-End Text-to-Speech Synthesis

Yinlin Guo, Yening Lv, Jinqiao Dou et al.

2024 INTERSPEECH

Form and Function in Prosodic Representation: In the Case of 'ma' in Tianjin Mandarin

Tianqi Geng, Hui Feng

2024 INTERSPEECH

FoVNet: Configurable Field-of-View Speech Enhancement with Low Computation and Distortion for Smart Glasses

Zhongweiyang Xu, Ali Aroudi, Ke Tan et al.

2024 INTERSPEECH

Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-to-Speech

Dong Yang, Tomoki Koriyama, Yuki Saito

2024 INTERSPEECH
