Papers
A Study of Gender Impact in Self-supervised Models for Speech-to-Text Systems
Marcely Zanon Boito, Laurent Besacier, Natalia Tomashenko et al.
A Study of Modeling Rising Intonation in Cantonese Neural Speech Synthesis
Qibing Bai, Tom Ko, Yu Zhang
A study of production error analysis for Mandarin-speaking Children with Hearing Impairment
Jingwen Cheng, Yuchen Yan, Yingming Gao et al.
A study on constraining Connectionist Temporal Classification for temporal audio alignment
Yann Teytaut, Baptiste Bouvier, Axel Roebel
A Study on the Phonetic Inventory Development of Children with Cochlear Implants for 5 Years after Implantation
Seonwoo Lee, Sunhee Kim, Minhwa Chung
A Subnetwork Approach for Spoofing Aware Speaker Verification
Alexander Alenin, Nikita Torgashov, Anton Okhotnikov et al.
Asymmetric Proxy Loss for Multi-View Acoustic Word Embeddings
Myunghun Jung, Hoi Rin Kim
A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement
Or Tal, Moshe Mandel, Felix Kreuk et al.
A Temporal Extension of Latent Dirichlet Allocation for Unsupervised Acoustic Unit Discovery
Werner van der Merwe, Herman Kamper, Johan Adam du Preez
A Transfer and Multi-Task Learning based Approach for MOS Prediction
Xiaohai Tian, Kaiqi Fu, Shaojun Gao et al.
ATST: Audio Representation Learning with Teacher-Student Transformer
Xian Li, Xiaofei Li
Attack Agnostic Dataset: Towards Generalization and Stabilization of Audio DeepFake Detection
Piotr Kawa, Marcin Plata, Piotr Syga
Attacker Attribution of Audio Deepfakes
Nicolas Müller, Franziska Diekmann, Jennifer Williams
Attention-based conditioning methods using variable frame rate for style-robust speaker verification
Amber Afshan, Abeer Alwan
Attention Weight Smoothing Using Prior Distributions for Transformer-Based End-to-End ASR
Takashi Maekaku, Yuya Fujita, Yifan Peng et al.
Attentive Feature Fusion for Robust Speaker Verification
Bei Liu, Zhengyang Chen, Yanmin Qian
Attentive Recurrent Network for Low-Latency Active Noise Control
Hao Zhang, Ashutosh Pandey, DeLiang Wang
Attentive Training: A New Training Framework for Talker-independent Speaker Extraction
Ashutosh Pandey, DeLiang Wang
Audio Pyramid Transformer with Domain Adaption for Weakly Supervised Sound Event Detection and Audio Classification
Yifei Xin, Dongchao Yang, Yuexian Zou
Audio Similarity is Unreliable as a Proxy for Audio Quality
Pranay Manocha, Zeyu Jin, Adam Finkelstein
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification
Juncheng Li, Shuhui Qu, Po-Yao Huang et al.
Audio-Visual Domain Adaptation Feature Fusion for Speech Emotion Recognition
Jie Wei, Guanyu Hu, Xinyu Yang et al.
Audio-Visual Generalized Few-Shot Learning with Prototype-Based Co-Adaptation
Yi-Kai Zhang, Da-Wei Zhou, Han-Jia Ye et al.