Papers
Sustained Vowels for Pre- vs Post-Treatment COPD Classification
Andreas Triantafyllopoulos, Anton Batliner, Wolfgang Mayr et al.
SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models
Chun Yin, Tai-Shih Chi, Yu Tsao et al.
SWAN: SubWord Alignment Network for HMM-free word timing estimation in end-to-end automatic speech recognition
Woo Hyun Kang, Srikanth Vishnubhotla, Rudolf Braun et al.
SWiBE: A Parameterized Stochastic Diffusion Process for Noise-Robust Bandwidth Expansion
Yin-Tse Lin, Shreya G. Upadhyay, Bo-Hao Su et al.
Switching Tongues, Sharing Hearts: Identifying the Relationship between Empathy and Code-switching in Speech
Debasmita Bhattacharya, Eleanor Lin, Run Chen et al.
SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization
Young Jin Ahn, Jungwoo Park, Sangha Park et al.
Synthesizing Long-Form Speech merely from Sentence-Level Corpus with Content Extrapolation and LLM Contextual Enrichment
Shijie Lai, Minglu He, Zijing Zhao et al.
Tackling Missing Modalities in Audio-Visual Representation Learning Using Masked Autoencoders
Georgios Chochlakis, Chandrashekhar Lavania, Prashant Mathur et al.
TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers
Yakun Song, Zhuo Chen, Xiaofei Wang et al.
TalTech-IRIT-LIS Speaker and Language Diarization Systems for DISPLACE 2024
Joonas Kalda, Tanel Alumae, Martin Lebourdais et al.
Target conversation extraction: Source separation using turn-taking dynamics
Tuochao Chen, Qirui Wang, Bohan Wu et al.
Target Speaker Extraction with Curriculum Learning
Yun Liu, Xuechen Liu, Xiaoxiao Miao et al.
TD-PLC: A Semantic-Aware Speech Encoding for Improved Packet Loss Concealment
Jinghong Zhang, Zugang Zhao, Yonghui Liu et al.
TEEMI: a speaking practice tool for L2 English learners
Szu-Yu Chen, Tien-Hong Lo, Yao-Ting Sung et al.
Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detection
Duc-Tuan Truong, Ruijie Tao, Tuan Nguyen et al.
Temporal Co-Registration of Simultaneous Electromagnetic Articulography and Electroencephalography for Precise Articulatory and Neural Data Alignment
Daniel Friedrichs, Monica Lancheros, Sam Kirkham et al.
Text-aware and Context-aware Expressive Audiobook Speech Synthesis
Dake Guo, Xinfa Zhu, Liumeng Xue et al.
Text-aware Speech Separation for Multi-talker Keyword Spotting
Haoyu Li, Baochen Yang, Yu Xi et al.
Text Injection for Neural Contextual Biasing
Zhong Meng, Zelin Wu, Rohit Prabhavalkar et al.
Textless Dependency Parsing by Labeled Sequence Prediction
Shunsuke Kando, Yusuke Miyao, Jason Naradowsky et al.
Text-only Domain Adaptation for CTC-based Speech Recognition through Substitution of Implicit Linguistic Information in the Search Space
Tatsunari Takagi, Yukoh Wakabayashi, Atsunori Ogawa et al.
Textual-Driven Adversarial Purification for Speaker Verification
Sizhou Chen, Yibo Bai, Jiadi Yao et al.
TfCleanformer: A streaming, array-agnostic, full- and sub-band modeling front-end for robust ASR
Jens Heitkaemper, Joe Caroselli, Arun Narayanan et al.
The Difficulty and Importance of Estimating the Lower and Upper Bounds of Infant Speech Exposure
Joseph Coffey, Okko Räsänen, Camila Scaff et al.
The Greek podcast corpus: Competitive speech models for low-resourced languages with weakly supervised data
Georgios Paraskevopoulos, Chara Tsoukala, Athanasios Katsamanis et al.