Papers
Vector-Quantized Autoregressive Predictive Coding
Yu-An Chung, Hao Tang, James Glass
Vector-Quantized Neural Networks for Acoustic Unit Discovery in the ZeroSpeech 2020 Challenge
Benjamin van Niekerk, Leanne Nortje, Herman Kamper
Vector Quantized Temporally-Aware Correspondence Sparse Autoencoders for Zero-Resource Acoustic Unit Discovery
Batuhan Gundogdu, Bolaji Yusuf, Mansur Yesilbursa et al.
Virtual Acoustic Channel Expansion Based on Neural Networks for Weighted Prediction Error-Based Speech Dereverberation
Joon-Young Yang, Joon-Hyuk Chang
Visual Speech In Real Noisy Environments (VISION): A Novel Benchmark Dataset and Deep Learning-Based Baseline System
Mandar Gogate, Kia Dashtipour, Amir Hussain
Vocal Markers from Sustained Phonation in Huntington’s Disease
Rachid Riad, Hadrien Titeux, Laurie Lemoine et al.
VocGAN: A High-Fidelity Real-Time Vocoder with a Hierarchically-Nested Adversarial Network
Jinhyeok Yang, Junmo Lee, Youngik Kim et al.
Vocoder-Based Speech Synthesis from Silent Videos
Daniel Michelsanti, Olga Slizovskaia, Gloria Haro et al.
Voice Activity Detection in the Wild via Weakly Supervised Sound Event Detection
Yefei Chen, Heinrich Dinkel, Mengyue Wu et al.
Voice Conversion Based Data Augmentation to Improve Children’s Speech Recognition in Limited Data Scenario
S. Shahnawazuddin, Nagaraj Adiga, Kunal Kumar et al.
Voice Conversion Using Speech-to-Speech Neuro-Style Transfer
Ehab A. AlBadawy, Siwei Lyu
VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition
Quan Wang, Ignacio Lopez Moreno, Mert Saglam et al.
VoiceID on the Fly: A Speaker Recognition System that Learns from Scratch
Baihan Lin, Xinxin Zhang
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining
Wen-Chin Huang, Tomoki Hayashi, Yi-Chiao Wu et al.
Voicing Distinction of Obstruents in the Hangzhou Wu Chinese Dialect
Yang Yue, Fang Hu
VOP Detection in Variable Speech Rate Condition
Ayush Agarwal, Jagabandhu Mishra, S.R. Mahadeva Prasanna
VQVC+: One-Shot Voice Conversion by Vector Quantization and U-Net Architecture
Da-Yi Wu, Yen-Hao Chen, Hung-yi Lee
Wake Word Detection with Alignment-Free Lattice-Free MMI
Yiming Wang, Hang Lv, Daniel Povey et al.
Wav2Spk: A Simple DNN Architecture for Learning Speaker Embeddings from Waveforms
Weiwei Lin, Man-Wai Mak
Weak-Attention Suppression for Transformer Based Speech Recognition
Yangyang Shi, Yongqiang Wang, Chunyang Wu et al.
Weakly Supervised Training of Hierarchical Attention Networks for Speaker Identification
Yanpei Shi, Qiang Huang, Thomas Hain
WG-WaveNet: Real-Time High-Fidelity Speech Synthesis Without GPU
Po-chun Hsu, Hung-yi Lee
What Does an End-to-End Dialect Identification Model Learn About Non-Dialectal Information?
Shammur A. Chowdhury, Ahmed Ali, Suwon Shon et al.
What the Future Brings: Investigating the Impact of Lookahead for Incremental Neural TTS
Brooke Stephenson, Laurent Besacier, Laurent Girin et al.