Research Explorer

Karaoker: Alignment-free singing voice synthesis with speech training data

Panagiotis Kakoulidis, Nikolaos Ellinas, Georgios Vamvoukakis et al.

2022 INTERSPEECH

KaraTuner: Towards End-to-End Natural Pitch Correction for Singing Voice in Karaoke

Xiaobin Zhuang, Huiran Yu, Weifeng Zhao et al.

2022 INTERSPEECH

Keyword Spotting with Synthetic Data using Heterogeneous Knowledge Distillation

Yuna Lee, Seung Jun Baek

2022 INTERSPEECH

kidsTALC: A Corpus of 3- to 11-year-old German Children’s Connected Natural Speech

Lars Rumberg, Christopher Gebauer, Hanna Ehlert et al.

2022 INTERSPEECH

Knowledge Distillation For CTC-based Speech Recognition Via Consistent Acoustic Representation Learning

Sanli Tian, Keqi Deng, Zehan Li et al.

2022 INTERSPEECH

Knowledge distillation for In-memory keyword spotting model

Zeyang Song, Qi Liu, Qu Yang et al.

2022 INTERSPEECH

Knowledge Distillation via Module Replacing for Automatic Speech Recognition with Recurrent Neural Network Transducer

Kaiqi Zhao, Hieu Nguyen, Animesh Jain et al.

2022 INTERSPEECH

Knowledge of accent differences can be used to predict speech recognition

Tuende Szalay, Mostafa Shahin, Beena Ahmed et al.

2022 INTERSPEECH

Knowledge Transfer and Distillation from Autoregressive to Non-Autoregessive Speech Recognition

Xun Gong, Zhikai Zhou, Yanmin Qian

2022 INTERSPEECH

KSC2: An Industrial-Scale Open-Source Kazakh Speech Corpus

Saida Mussakhojayeva, Yerbolat Khassanov, Huseyin Atakan Varol

2022 INTERSPEECH

K-Wav2vec 2.0: Automatic Speech Recognition based on Joint Decoding of Graphemes and Syllables

Jounghee Kim, Pilsung Kang

2022 INTERSPEECH

L2-GEN: A Neural Phoneme Paraphrasing Approach to L2 Speech Synthesis for Mispronunciation Diagnosis

Daniel Zhang, Ashwinkumar Ganesan, Sarah Campbell et al.

2022 INTERSPEECH

Label-Efficient Self-Supervised Speaker Verification With Information Maximization and Contrastive Learning

Theo Lepage, Reda Dehak

2022 INTERSPEECH

LAE: Language-Aware Encoder for Monolingual and Multilingual ASR

Jinchuan Tian, Jianwei Yu, Chunlei Zhang et al.

2022 INTERSPEECH

Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems

Hyun-Wook Yoon, Ohsung Kwon, Hoyeon Lee et al.

2022 INTERSPEECH

Language-specific Characteristic Assistance for Code-switching Speech Recognition

Tongtong Song, Qiang Xu, Meng Ge et al.

2022 INTERSPEECH

Language-specific interactions of vowel discrimination in noise

Mark Gibson, Marcel Schlechtweg, Beatriz Blecua Falgueras et al.

2022 INTERSPEECH

Large-Scale Streaming End-to-End Speech Translation with Neural Transducers

Jian Xue, Peidong Wang, Jinyu Li et al.

2022 INTERSPEECH

Latency Control for Keyword Spotting

Christin Jose, Joe Wang, Grant Strimel et al.

2022 INTERSPEECH

LCSM: A Lightweight Complex Spectral Mapping Framework for Stereophonic Acoustic Echo Cancellation

Chenggang Zhang, JinJiang Liu, Xueliang Zhang

2022 INTERSPEECH

Learn2Sing 2.0: Diffusion and Mutual Information-Based Target Speaker SVS by Learning from Singing Teacher

Heyang Xue, Xinsheng Wang, Yongmao Zhang et al.

2022 INTERSPEECH

Learnable Sparse Filterbank for Speaker Verification

Junyi Peng, Rongzhi Gu, Ladislav Mošner et al.

2022 INTERSPEECH

Learning Audio-Text Agreement for Open-vocabulary Keyword Spotting

Hyeon-Kyeong Shin, Hyewon Han, Doyeon Kim et al.

2022 INTERSPEECH

Learning from human perception to improve automatic speaker verification in style-mismatched conditions

Amber Afshan, Abeer Alwan

2022 INTERSPEECH

Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT

Bowen Shi, Abdelrahman Mohamed, Wei-Ning Hsu

2022 INTERSPEECH
