Papers
A Deep Learning-Based Kalman Filter for Speech Enhancement
Sujan Kumar Roy, Aaron Nicolson, Kuldip K. Paliwal
A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences
Pranay Manocha, Adam Finkelstein, Richard Zhang et al.
A DNN-HMM-DNN Hybrid Model for Discovering Word-Like Units from Spoken Captions and Image Regions
Liming Wang, Mark Hasegawa-Johnson
Advancing Multiple Instance Learning with Attention Modeling for Categorical Speech Emotion Recognition
Shuiyang Mao, P.C. Ching, C.-C. Jay Kuo et al.
Adventitious Respiratory Classification Using Attentive Residual Neural Networks
Zijiang Yang, Shuo Liu, Meishu Song et al.
Adversarial Audio: A New Information Hiding Method
Yehao Kong, Jiliang Zhang
Adversarial Dictionary Learning for Monaural Speech Enhancement
Yunyun Ji, Longting Xu, Wei-Ping Zhu
Adversarial Domain Adaptation for Speaker Verification Using Partially Shared Network
Zhengyang Chen, Shuai Wang, Yanmin Qian
Adversarial Latent Representation Learning for Speech Enhancement
Yuanhang Qiu, Ruili Wang
Adversarial Separation Network for Speaker Recognition
Hanyi Zhang, Longbiao Wang, Yunchun Zhang et al.
A Dynamic 3D Pronunciation Teaching Model Based on Pronunciation Attributes and Anatomy
Xiaoli Feng, Yanlu Xie, Yayue Deng et al.
A Federated Approach in Training Acoustic Models
Dimitrios Dimitriadis, Kenichi Kumatani, Robert Gmyr et al.
Affective Conditioning on Hierarchical Attention Networks Applied to Depression Detection from Transcribed Clinical Interviews
Danai Xezonaki, Georgios Paraskevopoulos, Alexandros Potamianos et al.
Age-Related Differences of Tone Perception in Mandarin-Speaking Seniors
Yan Feng, Gang Peng, William Shi-Yuan Wang
A Hybrid HMM-Waveglow Based Text-to-Speech Synthesizer Using Histogram Equalization for Low Resource Indian Languages
Mano Ranjith Kumar M., Sudhanshu Srivastava, Anusha Prakash et al.
Air-Tissue Boundary Segmentation in Real Time Magnetic Resonance Imaging Video Using 3-D Convolutional Neural Network
Renuka Mannem, Navaneetha Gaddam, Prasanta Kumar Ghosh
A Joint Framework for Audio Tagging and Weakly Supervised Acoustic Event Detection Using DenseNet with Global Average Pooling
Chieh-Chi Kao, Bowen Shi, Ming Sun et al.
A Lightweight Model Based on Separable Convolution for Speech Emotion Recognition
Ying Zhong, Ying Hu, Hao Huang et al.
All-in-One Transformer: Unifying Speech Recognition, Audio Tagging, and Event Detection
Niko Moritz, Gordon Wichern, Takaaki Hori et al.
A Low Latency ASR-Free End to End Spoken Language Understanding System
Mohamed Mhiri, Samuel Myer, Vikrant Singh Tomar
Alzheimer’s Dementia Recognition Through Spontaneous Speech: The ADReSS Challenge
Saturnino Luz, Fasih Haider, Sofia de la Fuente et al.
A Machine of Few Words: Interactive Speaker Recognition with Reinforcement Learning
Mathieu Seurin, Florian Strub, Philippe Preux et al.
A Mandarin L2 Learning APP with Mispronunciation Detection and Feedback
Yanlu Xie, Xiaoli Feng, Boxue Li et al.