Papers
Speech quality evaluation of neural audio codecs
Thomas Muller, Stephane Ragot, Laetitia Gros et al.
Speech ReaLLM – Real-time Speech Recognition with Multimodal Language Models by Teaching the Flow of Time
Frank Seide, Yangyang Shi, Morrie Doulaty et al.
Speech Recognition for Greek Dialects: A Challenging Benchmark
Socrates Vakirtzian, Chara Tsoukala, Stavros Bompolas et al.
Speech Recognition Models are Strong Lip-readers
K R Prajwal, Triantafyllos Afouras, Andrew Zisserman
Speech Topic Classification Based on Multi-Scale and Graph Attention Networks
Fangjing Niu, Xiaozhe Qi, Xinya Chen et al.
Speed of Light Exact Greedy Decoding for RNN-T Speech Recognition Models on GPU
Daniel Galvez, Vladimir Bataev, Hainan Xu et al.
Spoken-Term Discovery using Discrete Speech Units
Benjamin van Niekerk, Julian Zaïdi, Marc-André Carbonneau et al.
Spoken-to-written text conversion with Large Language Model
HyunJung Choi, Muyeol Choi, Yohan Lim et al.
Spoken Word2Vec: Learning Skipgram Embeddings from Speech
Mohammad Amaan Sayeed, Hanan Aldarmaki
Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models
Ziyun Cui, Chang Lei, Wen Wu et al.
Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models
Weiqin Li, Peiji Yang, Yicheng Zhong et al.
Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio
Lin Zhang, Xin Wang, Erica Cooper et al.
Spoofed Speech Detection with a Focus on Speaker Embedding
Hoan My Tran, David Guennec, Philippe Martin et al.
Spoofing Speech Detection by Modeling Local Spectro-Temporal and Long-term Dependency
Haochen Wu, Wu Guo, Zhentao Zhang et al.
SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark
Yuki Saito, Takuto Igarashi, Kentaro Seki et al.
State-of-the-art speech production MRI protocol for new 0.55 Tesla scanners
Prakash Kumar, Ye Tian, Yongwan Lim et al.
STraDa: A Singer Traits Dataset
Yuexuan Kong, Viet-Anh Tran, Romain Hennequin
Stream-based Active Learning for Anomalous Sound Detection in Machine Condition Monitoring
Tuan Vu Ho, Kota Dohi, Yohei Kawaguchi
Streaming Audio Transformers for Online Audio Tagging
Heinrich Dinkel, Zhiyong Yan, Yongqing Wang et al.
Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study
Peikun Chen, Sining Sun, Changhao Shan et al.
Streamlining Speech Enhancement DNNs: an Automated Pruning Method Based on Dependency Graph with Advanced Regularized Loss Strategies
Zugang Zhao, Jinghong Zhang, Yonghui Liu et al.
Stress transfer in speech-to-speech machine translation
Sai Akarsh, Vamshiraghusimha Narasinga, Anil Kumar Vuppala
Study Selectively: An Adaptive Knowledge Distillation based on a Voting Network for Heart Sound Classification
Xihang Qiu, Lixian Zhu, Zikai Song et al.
Sub-PNWR: Speech Enhancement Based on Signal Sub-Band Splitting and Pseudo Noisy Waveform Reconstruction Loss
Yuewei Zhang, Huanbin Zou, Jie Zhu
SummaryMixing: A Linear-Complexity Alternative to Self-Attention for Speech Recognition and Understanding
Titouan Parcollet, Rogier van Dalen, Shucong Zhang et al.