
Sign Value Constraint Decomposition for Efficient 1-Bit Quantization of Speech Translation Tasks

Nan Chen, Yonghe Wang, Feilong Bao

2024 INTERSPEECH

SilentCipher: Deep Audio Watermarking

Mayank Kumar Singh, Naoya Takahashi, Weihsiang Liao et al.

2024 INTERSPEECH

SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models

Dongchao Yang, Dingdong Wang, Haohan Guo et al.

2024 INTERSPEECH

Simulating articulatory trajectories with phonological feature interpolation

Angelo Ortiz Tandazo, Thomas Schatz, Thomas Hueber et al.

2024 INTERSPEECH

Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection

Haoyu Wang, Guoqiang Hu, Guodong Lin et al.

2024 INTERSPEECH

SimuSOE: A Simulated Snoring Dataset for Obstructive Sleep Apnea-Hypopnea Syndrome Evaluation during Wakefulness

Jie Lin, Xiuping Yang, Li Xiao et al.

2024 INTERSPEECH

Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and ACE-KiSing

Jiatong Shi, Yueqian Lin, Xinyi Bai et al.

2024 INTERSPEECH

Singing Voice Graph Modeling for SingFake Detection

Xuanjun Chen, Haibin Wu, Roger Jang et al.

2024 INTERSPEECH

Single-Codec: Single-Codebook Speech Codec towards High-Performance Speech Generation

Hanzhao Li, Liumeng Xue, Haohan Guo et al.

2024 INTERSPEECH

SingOMD: Singing Oriented Multi-resolution Discrete Representation Construction from Speech Models

Yuxun Tang, Yuning Wu, Jiatong Shi et al.

2024 INTERSPEECH

Small-E: Small Language Model with Linear Attention for Efficient Speech Synthesis

Théodor Lemerle, Nicolas Obin, Axel Roebel

2024 INTERSPEECH

Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation

Peidong Wang, Jian Xue, Jinyu Li et al.

2024 INTERSPEECH

SOMSRED: Sequential Output Modeling for Joint Multi-talker Overlapped Speech Recognition and Speaker Diarization

Naoki Makishima, Naotaka Kawata, Mana Ihori et al.

2024 INTERSPEECH

“So . . . my child . . . ” – How Child ADHD Influences the Way Parents Talk

Anika A. Spiesberger, Andreas Triantafyllopoulos, Alexander Kathan et al.

2024 INTERSPEECH

Song Data Cleansing for End-to-End Neural Singer Diarization Using Neural Analysis and Synthesis Framework

Hokuto Munakata, Ryo Terashima, Yusuke Fujita

2024 INTERSPEECH

SOT Triggered Neural Clustering for Speaker Attributed ASR

Xianrui Zheng, Guangzhi Sun, Chao Zhang et al.

2024 INTERSPEECH

Sound Event Bounding Boxes

Janek Ebbers, François G. Germain, Gordon Wichern et al.

2024 INTERSPEECH

Sound of Traffic: A Dataset for Acoustic Traffic Identification and Counting

Shabnam Ghaffarzadegan, Luca Bondi, Wei-Chang Lin et al.

2024 INTERSPEECH

Sound of Vision: Audio Generation from Visual Text Embedding through Training Domain Discriminator

Jaewon Kim, Won-Gook Choi, Seyun Ahn et al.

2024 INTERSPEECH

Source Tracing of Audio Deepfake Systems

Nicholas Klein, Tianxiang Chen, Hemlata Tak et al.

2024 INTERSPEECH

Sparse Binarization for Fast Keyword Spotting

Jonathan Svirsky, Uri Shaham, Ofir Lindenbaum

2024 INTERSPEECH

SparseWAV: Fast and Accurate One-Shot Unstructured Pruning for Large Speech Foundation Models

Tianteng Gu, Bei Liu, Hang Shao et al.

2024 INTERSPEECH

SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion

Bingsong Bai, Fengping Wang, Yingming Gao et al.

2024 INTERSPEECH

Spatial Acoustic Enhancement Using Unbiased Relative Harmonic Coefficients

Liang Tao, Maoshen Jia, Yonggang Hu et al.

2024 INTERSPEECH

Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals

Kentaro Seki, Shinnosuke Takamichi, Norihiro Takamune et al.

2024 INTERSPEECH
