Papers
Alzheimer's Detection from English to Spanish Using Acoustic and Linguistic Embeddings
Paula Andrea Pérez-Toro, Philipp Klumpp, Abner Hernandez et al.
A Multi-grained based Attention Network for Semi-supervised Sound Event Detection
Ying Hu, Xiujuan Zhu, Yunlong Li et al.
A Multi-level Acoustic Feature Extraction Framework for Transformer Based End-to-End Speech Recognition
Jin Li, Rongfeng Su, Xurong Xie et al.
A Multimodal Strategy for Singing Language Identification
Wo Jae Lee, Emanuele Coviello
A Multi-Scale Time-Frequency Spectrogram Discriminator for GAN-based Non-Autoregressive TTS
Haohan Guo, Hui Lu, Xixin Wu et al.
A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS
Haohan Guo, Feng-Long Xie, Frank Soong et al.
A Multi-Task BERT Model for Schema-Guided Dialogue State Tracking
Eleftherios Kapelonis, Efthymios Georgiou, Alexandros Potamianos
An Alignment Method Leveraging Articulatory Features for Mispronunciation Detection and Diagnosis in L2 English
Qi Chen, BingHuai Lin, YanLu Xie
Analysis of expressivity transfer in non-autoregressive end-to-end multispeaker TTS systems
Ajinkya Kulkarni, Vincent Colotte, Denis Jouvet
Analysis of praising skills focusing on utterance contents
Asahi Ogushi, Toshiki Onishi, Yohei Tahara et al.
Analysis of Self-Attention Head Diversity for Conformer-based Automatic Speech Recognition
Kartik Audhkhasi, Yinghui Huang, Bhuvana Ramabhadran et al.
Analysis of Self-Supervised Learning and Dimensionality Reduction Methods in Clustering-Based Active Learning for Speech Emotion Recognition
Einari Vaaras, Manu Airaksinen, Okko Räsänen
Analyzing Language-Independent Speaker Anonymization Framework under Unseen Conditions
Xiaoxiao Miao, Xin Wang, Erica Cooper et al.
Analyzing the impact of SARS-CoV-2 variants on respiratory sound signals
Debarpan Bhattacharya, Debottam Dutta, Neeraj Sharma et al.
An Anchor-Free Detector for Continuous Speech Keyword Spotting
Zhiyuan Zhao, Chuanxin Tang, Chengdong Yao et al.
An Attention-Based Method for Guiding Attribute-Aligned Speech Representation Learning
Yu-Lin Huang, Bo-Hao Su, Y.-W. Peter Hong et al.
An Automated Mood Diary for Older User’s using Ambient Assisted Living Recorded Speech
Fasih Haider, Saturnino Luz
An Automatic Soundtracking System for Text-to-Speech Audiobooks
Zikai Chen, Lin Wu, Junjie Pan et al.
An Efficient and High Fidelity Vietnamese Streaming End-to-End Speech Synthesis
Tho Nguyen Duc Tran, The Chuong Chu, Vu Hoang et al.
An Empirical Analysis on the Vulnerabilities of End-to-End Speech Segregation Models
Rahil Parikh, Gaspar Rochette, Carol Espy-Wilson et al.
An Empirical Study of Language Model Integration for Transducer based Speech Recognition
Huahuan Zheng, Keyu An, Zhijian Ou et al.
An End-to-End Macaque Voiceprint Verification Method Based on Channel Fusion Mechanism
Peng Liu, Songbin Li, Jigang Tang
An Evaluation of Three-Stage Voice Conversion Framework for Noisy and Reverberant Conditions
Yeonjong Choi, Chao Xie, Tomoki Toda
An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks
Kai-Wei Chang, Wei-Cheng Tseng, Shang-Wen Li et al.