Papers
Scoring of Large-Margin Embeddings for Speaker Verification: Cosine or PLDA?
Qiongqiong Wang, Kong Aik Lee, Tianchi Liu
ScoutWav: Two-Step Fine-Tuning on Self-Supervised Automatic Speech Recognition for Low-Resource Environments
Kavan Fatehi, Mercedes Torres Torres, Ayse Kucukyilmaz
Selective Pseudo-labeling and Class-wise Discriminative Fusion for Sound Event Detection
Yunhao Liang, Yanhua Long, Yijie Li et al.
Self-Distillation Based on High-level Information Supervision for Compressing End-to-End ASR Model
Qiang Xu, Tongtong Song, Longbiao Wang et al.
Self-Normalized Importance Sampling for Neural Language Modeling
Zijian Yang, Yingbo Gao, Alexander Gerstenberger et al.
Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition
Mohan Li, Rama Sanand Doddipatla, Catalin Zorila
SelfRemaster: Self-Supervised Speech Restoration with Analysis-by-Synthesis Approach Using Channel Modeling
Takaaki Saeki, Shinnosuke Takamichi, Tomohiko Nakamura et al.
Self-supervised Context-aware Style Representation for Expressive Speech Synthesis
Yihan Wu, Xi Wang, Shaofei Zhang et al.
Self supervised learning for robust voice cloning
Konstantinos Klapsas, Nikolaos Ellinas, Karolos Nikitaras et al.
Self-Supervised Learning with Multi-Target Contrastive Coding for Non-Native Acoustic Modeling of Mispronunciation Verification
Longfei Yang, Jinsong Zhang, Takahiro Shinozaki
Self-supervised Representation Fusion for Speech and Wearable Based Emotion Recognition
Vipula Dissanayake, Sachith Seneviratne, Hussel Suriyaarachchi et al.
Self-supervised Speaker Diarization
Yehoshua Dissen, Felix Kreuk, Joseph Keshet
Self-Supervised Speaker Verification Using Dynamic Loss-Gate and Label Correction
Bing Han, Zhengyang Chen, Yanmin Qian
Self-supervised speech unit discovery from articulatory and acoustic features using VQ-VAE
Marc-Antoine Georges, Jean-Luc Schwartz, Thomas Hueber
Semantically Meaningful Metrics for Norwegian ASR Systems
Janine Rugayan, Torbjørn Svendsen, Giampiero Salvi
Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling
Tiantian Feng, Shrikanth Narayanan
Semi-supervised Acoustic and Language Modeling for Hindi ASR
Tarun Sai Bandarupalli, Shakti Rath, Nirmesh Shah et al.
Sentence-Select: Large-Scale Language Model Data Selection for Rare-Word Speech Recognition
W. Ronny Huang, Cal Peyser, Tara Sainath et al.
Separate What You Describe: Language-Queried Audio Source Separation
Xubo Liu, Haohe Liu, Qiuqiang Kong et al.
Separating Long-Form Speech with Group-wise Permutation Invariant Training
Wangyou Zhang, Zhuo Chen, Naoyuki Kanda et al.
Separator-Transducer-Segmenter: Streaming Recognition and Segmentation of Multi-party Speech
Ilya Sklyar, Anna Piunova, Christian Osendorfer
SepIt: Approaching a Single Channel Speech Separation Bound
Shahar Lutati, Eliya Nachmani, Lior Wolf
SepTr: Separable Transformer for Audio Spectrogram Processing
Nicolae Catalin Ristea, Radu Tudor Ionescu, Fahad Shahbaz Khan
Seq-2-Seq based Refinement of ASR Output for Spoken Name Capture
Karan Singla, Shahab Jalalvand, Yeon-Jun Kim et al.
SF-DST: Few-Shot Self-Feeding Reading Comprehension Dialogue State Tracking with Auxiliary Task
Jihyun Lee, Gary Geunbae Lee