Papers
Iterative Prototype Refinement for Ambiguous Speech Emotion Recognition
Haoqin Sun, Shiwan Zhao, Xiangyu Kong et al.
It’s Time to Take Action: Acoustic Modeling of Motor Verbs to Detect Parkinson’s Disease
Daniel Escobar-Grisales, Cristian David Ríos-Urrego, Ilja Baumann et al.
JenGAN: Stacked Shifted Filters in GAN-Based Speech Synthesis
Hyunjae Cho, Junhyeok Lee, Wonbin Jung
Joint Learning of Context and Feedback Embeddings in Spoken Dialogue
Livia Qian, Gabriel Skantze
Joint prediction of subjective listening effort and speech intelligibility based on end-to-end learning
Dirk Eike Hoffner, Jana Roßbach, Bernd T. Meyer
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition
Guinan Li, Jiajun Deng, Youjun Chen et al.
Joint vs Sequential Speaker-Role Detection and Automatic Speech Recognition for Air-traffic Control
Alexander Blatt, Aravind Krishnan, Dietrich Klakow
Just Because We Camp, Doesn't Mean We Should: The Ethics of Modelling Queer Voices
Atli Sigurgeirsson, Eddie L. Ungless
Keep, Delete, or Substitute: Frame Selection Strategy for Noise-Robust Speech Emotion Recognition
Seong-Gyun Leem, Daniel Fulford, Jukka-Pekka Onnela et al.
Key-Element-Informed sLLM Tuning for Document Summarization
Sangwon Ryu, Heejin Do, Yunsu Kim et al.
Keyword-Guided Adaptation of Automatic Speech Recognition
Aviv Shamsian, Aviv Navon, Neta Glazer et al.
K-means and hierarchical clustering of f0 contours
Constantijn Kaland, Jeremy Steffman, Jennifer Cole
Knowledge boosting during low-latency inference
Vidya Srinivas, Malek Itani, Tuochao Chen et al.
Knowledge Distillation for Tiny Speech Enhancement with Latent Feature Augmentation
Behnam Gholami, Mostafa El-Khamy, KeeBong Song
Knowledge-Preserving Pluggable Modules for Multilingual Speech Translation Tasks
Nan Chen, Yonghe Wang, Feilong Bao
LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation
Wenhao Guan, Kaidi Wang, Wangjin Zhou et al.
LAHAJA: A Robust Multi-accent Benchmark for Evaluating Hindi ASR Systems
Tahir Javed, Janki Nawale, Sakshi Joshi et al.
Language-Universal Speech Attributes Modeling for Zero-Shot Multilingual Spoken Keyword Recognition
Hao Yen, Pin-Jui Ku, Sabato Marco Siniscalchi et al.
Large Language Model-based FMRI Encoding of Language Functions for Subjects with Neurocognitive Disorder
Yuejiao Wang, Xianmin Gong, Lingwei Meng et al.
Large Language Models for Dysfluency Detection in Stuttered Speech
Dominik Wagner, Sebastian P. Bayerl, Ilja Baumann et al.
LASER: Learning by Aligning Self-supervised Representations of Speech for Improving Content-related Tasks
Amit Meghanani, Thomas Hain
LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance
Shihao Chen, Yu Gu, Jie Zhang et al.
Learnable Layer Selection and Model Fusion for Speech Self-Supervised Learning Models
Sheng-Chieh Chiu, Chia-Hua Wu, Jih-Kang Hsieh et al.