Papers
Factors Affecting the Intelligibility of Sine-Wave Speech
Fei Chen, Daniel Fogerty
Far-Field ASR Without Parallel Data
Vijayaditya Peddinti, Vimal Manohar, Yiming Wang et al.
Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices
Heiga Zen, Yannis Agiomyrgiannakis, Niels Egberts et al.
Feature Learning and Automatic Segmentation for Dolphin Communication Analysis
Daniel Kohlsdorf, Denise Herzing, Thad Starner
Feature Learning with Raw-Waveform CLDNNs for Voice Activity Detection
Ruben Zazo, Tara N. Sainath, Gabor Simko et al.
First Step Towards End-to-End Parametric TTS Synthesis: Generating Spectral Parameters with Neural Attention
Wenfu Wang, Shuang Xu, Bo Xu
Flexible, Rapid Authoring of Goal-Orientated, Multi-Turn Dialogues Using the Task Completion Platform
Alex Marin, Paul Crook, Omar Zia Khan et al.
Formant Estimation and Tracking Using Deep Learning
Yehoshua Dissen, Joseph Keshet
Frequency Estimation from Waveforms Using Multi-Layered Neural Networks
Prateek Verma, Ronald W. Schafer
Fusing Acoustic Feature Representations for Computational Paralinguistics Tasks
Heysem Kaya, Alexey A. Karpov
Fusion Strategies for Robust Speech Recognition and Keyword Spotting for Channel- and Noise-Degraded Speech
Vikramjit Mitra, Julien VanHout, Wen Wang et al.
Future Context Attention for Unidirectional LSTM Based Acoustic Model
Jian Tang, Shiliang Zhang, Si Wei et al.
Gating Recurrent Enhanced Memory Neural Networks on Language Identification
Wang Geng, Yuanyuan Zhao, Wenfu Wang et al.
Generalized Discriminant Analysis (GDA) for Improved i-Vector Based Speaker Recognition
Fahimeh Bahmaninezhad, John H.L. Hansen
Generalizing Steady State Suppression for Enhanced Intelligibility Under Reverberation
Petko N. Petkov, Yannis Stylianou
Generating Complementary Acoustic Model Spaces in DNN-Based Sequence-to-Frame DTW Scheme for Out-of-Vocabulary Spoken Term Detection
Shi-wook Lee, Kazuyo Tanaka, Yoshiaki Itoh
Generating Gestural Scores from Acoustics Through a Sparse Anchor-Based Representation of Speech
Christopher Liberatore, Ricardo Gutierrez-Osuna
Generating Natural Video Descriptions via Multimodal Processing
Qin Jin, Junwei Liang, Xiaozhu Lin
Generation and Pruning of Pronunciation Variants to Improve ASR Accuracy
Zhenhao Ge, Aravind Ganapathiraju, Ananth N. Iyer et al.
Generation of Emotion Control Vector Using MDS-Based Space Transformation for Expressive Speech Synthesis
Yan-You Chen, Chung-Hsien Wu, Yu-Fong Huang
Generative Acoustic-Phonemic-Speaker Model Based on Three-Way Restricted Boltzmann Machine
Toru Nakashika, Yasuhiro Minami
Glimpse-Based Metrics for Predicting Speech Intelligibility in Additive Noise Conditions
Yan Tang, Martin Cooke
Glottal Squeaks in VC Sequences
Míša Hejná, Pertti Palo, Scott Moisik