Papers
Learning Speech Models from Multi-Modal Data
Karen Livescu
Learning Speech Structure to Improve Time-Frequency Masks
Suliang Bu, Yunxin Zhao, Shaojun Wang et al.
Learning to Rank Microphones for Distant Speech Recognition
Samuele Cornell, Alessio Brutti, Marco Matassoni et al.
LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech
Solène Evain, Ha Nguyen, Hang Le et al.
Leveraging ASR N-Best in Deep Entity Retrieval
Haoyu Wang, John Chen, Majid Laali et al.
Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur Speech Recognition
Guodong Ma, Pengfei Hu, Jian Kang et al.
Leveraging Pre-Trained Language Model for Speech Sentiment Analysis
Suwon Shon, Pablo Brusco, Jing Pan et al.
Leveraging Real-Time MRI for Illuminating Linguistic Velum Action
Miran Oh, Dani Byrd, Shrikanth S. Narayanan
Leveraging Speaker Attribute Information Using Multi Task Learning for Speaker Verification and Diarization
Chau Luu, Peter Bell, Steve Renals
Lexical Density Analysis of Word Productions in Japanese English Using Acoustic Word Embeddings
Shintaro Ando, Nobuaki Minematsu, Daisuke Saito
Lexical Entrainment and Intra-Speaker Variability in Cooperative Dialogues
Alla Menshikova, Daniil Kocharov, Tatiana Kachkovskaia
Lexical Modeling of ASR Errors for Robust Speech Translation
Giuseppe Martucci, Mauro Cettolo, Matteo Negri et al.
Librispeech Transducer Model with Internal Language Model Prior Correction
Albert Zeyer, André Merboldt, Wilfried Michel et al.
Lightweight Causal Transformer with Local Self-Attention for Real-Time Speech Enhancement
Koen Oostermeijer, Qing Wang, Jun Du
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-Stage Sequence-to-Sequence Training
Kun Zhou, Berrak Sisman, Haizhou Li
LinearSpeech: Parallel Text-to-Speech with Linear Complexity
Haozhe Zhang, Zhihua Huang, Zengqiang Shang et al.
LiRA: Learning Visual Speech Representations from Audio Through Self-Supervision
Pingchuan Ma, Rodrigo Mira, Stavros Petridis et al.
Listen with Intent: Improving Speech Recognition with Audio-to-Intent Front-End
Swayambhu Nath Ray, Minhua Wu, Anirudh Raju et al.
LiteTTS: A Lightweight Mel-Spectrogram-Free Text-to-Wave Synthesizer Based on Generative Adversarial Networks
Huu-Kim Nguyen, Kihyuk Jeong, Seyun Um et al.
Live Subtitling for BigBlueButton with Open-Source Software
Robert Geislinger, Benjamin Milde, Timo Baumann et al.
Live TV Subtitling Through Respeaking
Aleš Pražák, Zdeněk Loose, Josef V. Psutka et al.
Log-Likelihood-Ratio Cost Function as Objective Loss for Speaker Verification Systems
Victoria Mingote, Antonio Miguel, Alfonso Ortega et al.
Lookup-Table Recurrent Language Models for Long Tail Speech Recognition
W. Ronny Huang, Tara N. Sainath, Cal Peyser et al.