Papers
Multi-Microphone Adaptive Noise Cancellation for Robust Hotword Detection
Yiteng Huang, Turaj Z. Shabestary, Alexander Gruenstein et al.
Multimodal Articulation-Based Pronunciation Error Detection with Spectrogram and Acoustic Features
Sabrina Jenne, Ngoc Thang Vu
Multimodal Dialog with the MALACH Audiovisual Archive
Adam Chýlek, Luboš Šmídl, Jan Švec
Multi-Modal Learning for Speech Emotion Recognition: An Analysis and Comparison of ASR Outputs with Ground Truth Transcription
Saurabh Sahu, Vikramjit Mitra, Nadee Seneviratne et al.
Multimodal Response Obligation Detection with Unsupervised Online Domain Adaptation
Shota Horiguchi, Naoyuki Kanda, Kenji Nagamatsu
Multi-Modal Sentiment Analysis Using Deep Canonical Correlation Analysis
Zhongkai Sun, Prathusha K. Sarma, William Sethares et al.
Multimodal SpeakerBeam: Single Channel Target Speech Extraction with Audio-Visual Speaker Clues
Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita et al.
Multimodal Word Discovery and Retrieval with Phone Sequence and Image Concepts
Liming Wang, Mark A. Hasegawa-Johnson
Multi-PLDA Diarization on Children’s Speech
Jiamin Xie, Leibny Paola García-Perera, Daniel Povey et al.
Multiple Sound Source Localization with SVD-PHAT
François Grondin, James Glass
Multi-Scale Time-Frequency Attention for Acoustic Event Detection
Jingyang Zhang, Wenhao Ding, Jintao Kang et al.
Multi-Span Acoustic Modelling Using Raw Waveform Signals
P. von Platen, Chao Zhang, P.C. Woodland
Multi-Stream Network with Temporal Attention for Environmental Sound Classification
Xinyu Li, Venkata Chebiyyam, Katrin Kirchhoff
Multi-Stride Self-Attention for Speech Recognition
Kyu J. Han, Jing Huang, Yun Tang et al.
Multi-Task CTC Training with Auxiliary Feature Reconstruction for End-to-End Speech Recognition
Gakuto Kurata, Kartik Audhkhasi
Multi-Task Discriminative Training of Hybrid DNN-TVM Model for Speaker Verification with Noisy and Far-Field Speech
Arindam Jati, Raghuveer Peri, Monisankha Pal et al.
Multi-Task Learning with High-Order Statistics for x-Vector Based Text-Independent Speaker Verification
Lanhua You, Wu Guo, Li-Rong Dai et al.
Multi-Task Multi-Network Joint-Learning of Deep Residual Networks and Cycle-Consistency Generative Adversarial Networks for Robust Speech Recognition
Shengkui Zhao, Chongjia Ni, Rong Tong et al.
Multi-Task Multi-Resolution Char-to-BPE Cross-Attention Decoder for End-to-End Speech Recognition
Dhananjaya Gowda, Abhinav Garg, Kwangyoun Kim et al.
Multiview Shared Subspace Learning Across Speakers and Speech Commands
Krishna Somandepalli, Naveen Kumar, Arindam Jati et al.
Music Genre Classification Using Duplicated Convolutional Layers in Neural Networks
Hansi Yang, Wei-Qiang Zhang
My Lips Are Concealed: Audio-Visual Speech Enhancement Through Obstructions
Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman
Nasal Air Emission in Sibilant Fricatives of Cleft Lip and Palate Speech
Sishir Kalita, Protima Nomo Sudro, S.R. Mahadeva Prasanna et al.
Nasal Consonant Discrimination in Infant- and Adult-Directed Speech
Bogdan Ludusan, Annett Jorschick, Reiko Mazuka
Neural Machine Translation for Multilingual Grapheme-to-Phoneme Conversion
Alex Sokolov, Tracy Rohlin, Ariya Rastrow