Research Explorer

Towards Supervised Performance on Speaker Verification with Self-Supervised Learning by Leveraging Large-Scale ASR Models

Victor Miara, Theo Lepage, Reda Dehak

2024 INTERSPEECH

To what extent can ASV systems naturally defend against spoofing attacks?

Jee-weon Jung, Xin Wang, Nicholas Evans et al.

2024 INTERSPEECH

TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking

Junzuo Zhou, Jiangyan Yi, Tao Wang et al.

2024 INTERSPEECH

Tradition or Innovation: A Comparison of Modern ASR Methods for Forced Alignment

Rotem Rousso, Eyal Cohen, Joseph Keshet et al.

2024 INTERSPEECH

Training Data Augmentation for Dysarthric Automatic Speech Recognition by Text-to-Dysarthric-Speech Synthesis

Wing-Zin Leung, Mattias Cross, Anton Ragni et al.

2024 INTERSPEECH

Training speech-breathing coordination in computer-assisted reading

Delphine Charuau, Andrea Briglia, Erika Godde et al.

2024 INTERSPEECH

Transcription-Free Fine-Tuning of Speech Separation Models for Noisy and Reverberant Multi-Speaker Automatic Speech Recognition

William Ravenscroft, George Close, Stefan Goetze et al.

2024 INTERSPEECH

Transfer Learning from Whisper for Microscopic Intelligibility Prediction

Paul Best, Santiago Cuervo, Ricard Marxer

2024 INTERSPEECH

Transformer-based Model for ASR N-Best Rescoring and Rewriting

Iwen E Kang, Christophe Van Gysel, Man-Hung Siu

2024 INTERSPEECH

Translating speech with just images

Dan Oneata, Herman Kamper

2024 INTERSPEECH

Translingual Language Markers for Cognitive Assessment from Spontaneous Speech

Bao Hoang, Yijiang Pang, Hiroko Dodge et al.

2024 INTERSPEECH

Transmitted and Aggregated Self-Attention for Automatic Speech Recognition

Tian-Hao Zhang, Xinyuan Qian, Feng Chen et al.

2024 INTERSPEECH

TSE-PI: Target Sound Extraction under Reverberant Environments with Pitch Information

Yiwen Wang, Xihong Wu

2024 INTERSPEECH

TSP-TTS: Text-based Style Predictor with Residual Vector Quantization for Expressive Text-to-Speech

Donghyun Seong, Hoyoung Lee, Joon-Hyuk Chang

2024 INTERSPEECH

Uh, um and mh: Are filled pauses prone to conversational convergence?

Mathilde Hutin, Junfei Hu, Liesbeth Degand

2024 INTERSPEECH

Uncertainty-Aware Mean Opinion Score Prediction

Hui Wang, Shiwan Zhao, Jiaming Zhou et al.

2024 INTERSPEECH

Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models

Chun-Yi Kuan, Wei-Ping Huang, Hung-yi Lee

2024 INTERSPEECH

Understanding “understanding”: presenting a richly annotated multimodal corpus of dyadic interaction

Leonie Schade, Nico Dallmann, Olcay Türk et al.

2024 INTERSPEECH

Unified Audio Visual Cues for Target Speaker Extraction

Tianci Wu, Shulin He, Jiahui Pan et al.

2024 INTERSPEECH

Unified Framework for Spoken Language Understanding and Summarization in Task-Based Human Dialog processing

Eunice Akani, Frederic Bechet, Benoît Favre et al.

2024 INTERSPEECH

Unified Multi-Talker ASR with and without Target-speaker Enrollment

Ryo Masumura, Naoki Makishima, Tomohiro Tanaka et al.

2024 INTERSPEECH

UNIQUE: Unsupervised Network for Integrated Speech Quality Evaluation

Juhwan Yoon, WooSeok Ko, Seyun Um et al.

2024 INTERSPEECH

Universal Score-based Speech Enhancement with High Content Preservation

Robin Scheibler, Yusuke Fujita, Yuma Shirahata et al.

2024 INTERSPEECH

Unmasking Neural Codecs: Forensic Identification of AI-compressed Speech

Denise Moussa, Sandra Bergmann, Christian Riess

2024 INTERSPEECH

Unsupervised Domain Adaptation for Speech Emotion Recognition using K-Nearest Neighbors Voice Conversion

Pravin Mote, Berrak Sisman, Carlos Busso

2024 INTERSPEECH
