Papers
Seamless Language Expansion: Enhancing Multilingual Mastery in Self-Supervised Models
Jing Xu, Minglin Wu, Xixin Wu et al.
Searching for Structure: Appraising the Organisation of Speech Features in wav2vec 2.0 Embeddings
Patrick Cormac English, John D. Kelleher, Julie Carson-Berndsen
SE/BN Adapter: Parametric Efficient Domain Adaptation for Speaker Recognition
Tianhao Wang, Lantian Li, Dong Wang
SecureSpectra: Safeguarding Digital Identity from Deep Fake Threats via Intelligent Signatures
Oguzhan Baser, Kaan Kale, Sandeep P. Chinchali
Segmental and Suprasegmental Speech Foundation Models for Classifying Cognitive Risk Factors: Evaluating Out-of-the-Box Performance
Si-Ioi Ng, Lingfeng Xu, Kimberly D. Mueller et al.
Self-Supervised Embeddings for Detecting Individual Symptoms of Depression
Sri Harsha Dumpala, Katerina Dikaios, Abraham Nunes et al.
Self-Supervised Learning for ASR Pre-Training with Uniquely Determined Target Labels and Controlling Cepstrum Truncation for Speech Augmentation
Akihiro Kato, Hiroyuki Nagano, Kohei Chike et al.
Self-Supervised Learning with Multi-Head Multi-Mode Knowledge Distillation for Speaker Verification
Zezhong Jin, Youzhi Tu, Man-Wai Mak
Self-Supervised Models for Phoneme Recognition: Applications in Children's Speech for Reading Learning
Lucas Block Medin, Thomas Pellegrini, Lucile Gelin
Self-Supervised Speaker Verification with Mini-Batch Prediction Correction
Junxu Wang, Zhihua Fang, Liang He
Self-supervised speaker verification with relational mask prediction
Ju-ho Kim, Hee-Soo Heo, Bong-Jin Lee et al.
Self-Supervised Speech Representations are More Phonetic than Semantic
Kwanghee Choi, Ankita Pasad, Tomohiko Nakamura et al.
Self-supervised Speech Representations Still Struggle with African American Vernacular English
Kalvin Chang, Yi-Hui Chou, Jiatong Shi et al.
Self-Train Before You Transcribe
Robert Flynn, Anton Ragni
Self-training ASR Guided by Unsupervised ASR Teacher
Hyung Yong Kim, Byeong-Yeol Kim, Yunkyu Lim et al.
SELM: Enhancing Speech Emotion Recognition for Out-of-Domain Scenarios
Hazim Bukhari, Soham Deshmukh, Hira Dhamyal et al.
SeMaScore: A new evaluation metric for automatic speech recognition tasks
Zitha Sasindran, Harsha Yelchuri, T. V. Prabhakar
Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation
Kohei Matsuura, Takanori Ashihara, Takafumi Moriya et al.
SEQ-former: A context-enhanced and efficient automatic speech recognition framework
Qinglin Meng, Min Liu, Kaixun Huang et al.
Sequential Editing for Lifelong Training of Speech Recognition Models
Devang Kulshreshtha, Nikolaos Pappas, Brady Houston et al.
SER Evals: In-domain and Out-of-domain benchmarking for speech emotion recognition
Mohamed Osman, Daniel Z. Kaplan, Tamer Nadeem
Serialized Output Training by Learned Dominance
Ying Shi, Lantian Li, Shi Yin et al.
Should you use a probabilistic duration model in TTS? Probably! Especially for spontaneous speech
Shivam Mehta, Harm Lameris, Rajiv Punmiya et al.
Signal processing algorithm effective for sound quality of hearing loss simulators
Toshio Irino, Shintaro Doan, Minami Ishikawa