Papers
Joint Speech Translation and Named Entity Recognition
Marco Gaido, Sara Papi, Matteo Negri et al.
Joint Time and Frequency Transformer for Chinese Opera Classification
Qiang Li, Beibei Hu
Knowledge Distillation Approach for Efficient Internal Language Model Estimation
Zhipeng Chen, Haihua Xu, Yerbolat Khassanov et al.
Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data
Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai et al.
Knowledge Distillation from Non-streaming to Streaming ASR Encoder using Auxiliary Non-streaming Layer
Kyuhong Shim, Jinkyu Lee, Simyoung Chang et al.
Knowledge Distillation on Joint Task End-to-End Speech Translation
Khandokar Md. Nayem, Ran Xue, Ching-Yun Chang et al.
Knowledge-Retrieval Task-Oriented Dialog Systems with Semi-Supervision
Yucheng Cai, Hong Liu, Zhijian Ou et al.
Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation
Minglun Han, Feilong Chen, Jing Shi et al.
L2-Mandarin regional accent variability during Mandarin tone-word training facilitates English listeners’ subsequent tone categorizations
Yanping Li, Michael D. Tyler, Denis Burnham et al.
Label Aware Speech Representation Learning For Language Identification
Shikhar Vashishth, Shikhar Bharadwaj, Sriram Ganapathy et al.
LAMASSU: A Streaming Language-Agnostic Multilingual Speech Recognition and Translation Model Using Neural Transducers
Peidong Wang, Eric Sun, Jian Xue et al.
Language Agnostic Data-Driven Inverse Text Normalization
Szu-Jui Chen, Debjyoti Paul, Yutong Pang et al.
Language Identification Networks for Multilingual Everyday Recordings
Kiran Praveen, Balaji Radhakrishnan, Kamini Sabu et al.
Language Model Personalization for Improved Touchscreen Typing
Jiban Adhikary, Keith Vertanen
Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition
Wenxuan Wang, Guodong Ma, Yuke Li et al.
Language-specific Boundary Learning for Improving Mandarin-English Code-switching Speech Recognition
Zhiyun Fan, Linhao Dong, Chen Shen et al.
Language-universal Phonetic Encoder for Low-resource Speech Recognition
Siyuan Feng, Ming Tu, Rui Xia et al.
Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition
Siyuan Feng, Ming Tu, Rui Xia et al.
LanSER: Language-Model Supported Speech Emotion Recognition
Taesik Gong, Josh Belanich, Krishna Somandepalli et al.
Large Dataset Generation of Synchronized Music Audio and Lyrics at Scale using Teacher-Student Paradigm
Cristian Chivriga, Rinita Roy
Large-Scale Automatic Audiobook Creation
Brendan Walsh, Mark Hamilton, Greg Newby et al.
Latent Phrase Matching for Dysarthric Speech
Dianna Yee, Colin Lea, Jaya Narain et al.
Laughter in task-based settings: whom we talk to affects how, when, and how often we laugh
Catarina Branco, Isabel Trancoso, Paulo Infante et al.