Papers
Improved Contextualized Speech Representations for Tonal Analysis
Jiahong Yuan, Xingyu Cai, Kenneth Church
Improved DeepFake Detection Using Whisper Features
Piotr Kawa, Marcin Plata, Michał Czuba et al.
Improved Training for End-to-End Streaming Automatic Speech Recognition Model with Punctuation
Hanbyul Kim, Seunghyun Seo, Lukas Lee et al.
Improving Bilingual TTS Using Language And Phonology Embedding With Embedding Strength Modulator
Fengyu Yang, Jian Luan, Meng Meng et al.
Improving Code-Switching and Name Entity Recognition in ASR with Speech Editing based Data Augmentation
Zheng Liang, Zheshu Song, Ziyang Ma et al.
Improving End-to-End Modeling For Mandarin-English Code-Switching Using Lightweight Switch-Routing Mixture-of-Experts
Fengyun Tan, Chaofeng Feng, Tao Wei et al.
Improving End-to-End Neural Diarization Using Conversational Summary Representations
Samuel J. Broughton, Lahiru Samarakoon
Improving Frame-level Classifier for Word Timings with Non-peaky CTC in End-to-End Automatic Speech Recognition
Xianzhao Chen, Yist Y. Lin, Kang Wang et al.
Improving Generalization Ability of Countermeasures for New Mismatch Scenario by Combining Multiple Advanced Regularization Terms
Chang Zeng, Xin Wang, Xiaoxiao Miao et al.
Improving grapheme-to-phoneme conversion by learning pronunciations from speech recordings
Sam Ribeiro, Giulia Comini, Jaime Lorenzo-Trueba
Improving Isochronous Machine Translation with Target Factors and Auxiliary Counters
Proyag Pal, Brian Thompson, Yogesh Virkar et al.
Improving Joint Speech and Emotion Recognition Using Global Style Tokens
Jehyun Kyung, Ju-Seok Seong, Jeong-Hwan Choi et al.
Improving Joint Speech-Text Representations Without Alignment
Cal Peyser, Zhong Meng, Rohit Prabhavalkar et al.
Improving Label Assignments Learning by Dynamic Sample Dropout Combined with Layer-wise Optimization in Speech Separation
Chenyang Gao, Yue Gu, Ivan Marsic
Improving RNN Transducer Acoustic Models for English Conversational Speech Recognition
Xiaodong Cui, George Saon, Brian Kingsbury
Improving RNN-Transducers with Acoustic LookAhead
Vinit S. Unni, Ashish Mittal, Preethi Jyothi et al.
Improving Small Footprint Few-shot Keyword Spotting with Supervision on Auxiliary Data
Seunghan Yang, Byeonggeun Kim, Kyuhong Shim et al.
Improving Speaker Verification with Self-Pretrained Transformer Models
Junyi Peng, Oldřich Plchot, Themos Stafylakis et al.
Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target
Guan-Wei Wu, Guan-Ting Lin, Shang-Wen Li et al.
Improving the Gap in Visual Speech Recognition Between Normal and Silent Speech Based on Metric Learning
Sara Kashiwagi, Keitaro Tanaka, Qi Feng et al.
Improving the response timing estimation for spoken dialogue systems by reducing the effect of speech recognition delay
Jin Sakuma, Shinya Fujie, Huaibo Zhao et al.
Improving training datasets for resource-constrained speaker recognition neural networks
Pierre-Michel Bousquet, Mickael Rouvier
Improving Under-Resourced Code-Switched Speech Recognition: Large Pre-trained Models or Architectural Interventions
Joshua Jansen van Vüren, Thomas Niesler