Papers
SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification
Helin Wang, Yuexian Zou, Wenwu Wang
SpecMix: A Mixed Sample Data Augmentation Method for Training with Time-Frequency Domain Features
Gwantae Kim, David K. Han, Hanseok Ko
SpecRec: An Alternative Solution for Improving End-to-End Speech-to-Text Translation via Spectrogram Reconstruction
Junkun Chen, Mingbo Ma, Renjie Zheng et al.
Spectral and Latent Speech Representation Distortion for TTS Evaluation
Thananchai Kongthaworn, Burin Naowarat, Ekapol Chuangsuwanich
Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition
Mengzhe Geng, Shansong Liu, Jianwei Yu et al.
Speech2Video: Cross-Modal Distillation for Speech to Video Generation
Shijing Si, Jianzong Wang, Xiaoyang Qu et al.
Speech Acoustic Modelling Using Raw Source and Filter Components
Erfan Loweimi, Zoran Cvetkovic, Peter Bell et al.
Speech Activity Detection Based on Multilingual Speech Recognition System
Seyyed Saeed Sarfjoo, Srikanth Madikeri, Petr Motlicek
SpeechAdjuster: A Tool for Investigating Listener Preferences and Speech Intelligibility
Olympia Simantiraki, Martin Cooke
Speech Based Depression Severity Level Classification Using a Multi-Stage Dilated CNN-LSTM Model
Nadee Seneviratne, Carol Espy-Wilson
Speech Decomposition Based on a Hybrid Speech Model and Optimal Segmentation
Alfredo Esquivel Jaramillo, Jesper Kjær Nielsen, Mads Græsbøll Christensen
Speech Denoising with Auditory Models
Mark R. Saddler, Andrew Francl, Jenelle Feather et al.
Speech Denoising Without Clean Training Data: A Noise2Noise Approach
Madhav Mahesh Kashyap, Anuj Tambwekar, Krishnamoorthy Manohara et al.
Speech Disorder Classification Using Extended Factorized Hierarchical Variational Auto-Encoders
Jinzi Qi, Hugo Van hamme
Speech Emotion Recognition Based on Attention Weight Correction Using Word-Level Confidence Measure
Jennifer Santoso, Takeshi Yamada, Shoji Makino et al.
Speech Emotion Recognition via Multi-Level Cross-Modal Distillation
Ruichen Li, Jinming Zhao, Qin Jin
Speech Emotion Recognition with Multi-Task Learning
Xingyu Cai, Jiahong Yuan, Renjie Zheng et al.
Speech Enhancement with Topology-Enhanced Generative Adversarial Networks (GANs)
Xudong Zhang, Liang Zhao, Feng Gu
Speech Enhancement with Weakly Labelled Data from AudioSet
Qiuqiang Kong, Haohe Liu, Xingjian Du et al.
Speech Intelligibility of Dysarthric Speech: Human Scores and Acoustic-Phonetic Features
Wei Xue, Roeland van Hout, Fleur Boogmans et al.
SpeechMoE: Scaling to Large Acoustic Models with Dynamic Routing Mixture of Experts
Zhao You, Shulin Feng, Dan Su et al.
speechocean762: An Open-Source Non-Native English Speech Corpus for Pronunciation Assessment
Junbo Zhang, Zhiwen Zhang, Yongqing Wang et al.
Speech Perception and Loanword Adaptations: The Case of Copy-Vowel Epenthesis
Adriana Guevara-Rukoz, Shi Yu, Sharon Peperkamp
Speech Representation Learning Combining Conformer CPC with Deep Cluster for the ZeroSpeech Challenge 2021
Takashi Maekaku, Xuankai Chang, Yuya Fujita et al.
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
Adam Polyak, Yossi Adi, Jade Copet et al.