Papers
Sequence-to-Sequence Multi-Modal Speech In-Painting
Mahsa Kadkhodaei Elyaderani, Shahram Shirani
Severity Classification of Parkinson's Disease from Speech using Single Frequency Filtering-based Features
Sudarsana Reddy Kadiri, Manila Kodali, Paavo Alku
SGEM: Test-Time Adaptation for Automatic Speech Recognition via Sequential-Level Generalized Entropy Minimization
Changhun Kim, Joonhyung Park, Hajin Shim et al.
Short-term Extrapolation of Speech Signals Using Recursive Neural Networks in the STFT Domain
Maurice Oberhag, Daniel Neudek, Rainer Martin et al.
Show & Tell: Voice Activity Projection and Turn-taking
Erik Ekstedt, Gabriel Skantze
Silent Speech Recognition with Articulator Positions Estimated from Tongue Ultrasound and Lip Video
Rachel Beeson, Korin Richmond
Similar Hierarchical Representation of Speech and Other Complex Sounds In the Brain and Deep Residual Networks: An MEG Study
Tzu-Han Zoe Cheng, Kuan-Lin Chen, Juliane Schubert et al.
SlothSpeech: Denial-of-service Attack Against Speech Recognition Models
Mirazul Haque, Rutvij Shah, Simin Chen et al.
Small Footprint Multi-channel Network for Keyword Spotting with Centroid Based Awareness
Dianwen Ng, Yang Xiao, Jia Qi Yip et al.
Sociodemographic and Attitudinal Effects on Dialect Speakers’ Articulation of the Standard Language: Evidence from German-Speaking Switzerland
Carina Steiner, Dieter Studer-Joho, Corinne Lanthemann et al.
Some Voices are Too Common: Building Fair Speech Recognition Systems Using the CommonVoice Dataset
Lucas Maison, Yannick Estève
So-to-Speak: An Exploratory Platform for Investigating the Interplay between Style and Prosody in TTS
Éva Székely, Siyang Wang, Joakim Gustafson
SOT: Self-supervised Learning-Assisted Optimal Transport for Unsupervised Adaptive Speech Emotion Recognition
Ruiteng Zhang, Jianguo Wei, Xugang Lu et al.
Sp1NY: A Quick and Flexible Speech Visualisation Tool in Python
Sébastien Le Maguer, Mark Anderson, Naomi Harte
Spanish Phone Confusion Analysis for EMG-Based Silent Speech Interfaces
Inge Salomons, Eder del Blanco, Eva Navas et al.
SparseVSR: Lightweight and Noise Robust Visual Speech Recognition
Adriana Fernandez-Lopez, Honglie Chen, Pingchuan Ma et al.
Spatialization Quality Metric for Binaural Speech
Pranay Manocha, Israel Dejene Gebru, Anurag Kumar et al.
Spatial LibriSpeech: An Augmented Dataset for Spatial Audio Learning
Miguel Sarabia, Elena Menyaylenko, Alessandro Toso et al.
Speaker-Aware Anti-spoofing
Xuechen Liu, Md Sahidullah, Kong Aik Lee et al.
Speaker-aware Cross-modal Fusion Architecture for Conversational Emotion Recognition
Huan Zhao, Bo Li, Zixing Zhang
Speaker Diarization for ASR Output with T-vectors: A Sequence Classification Approach
Midia Yousefi, Naoyuki Kanda, Dongmei Wang et al.
Speaker Embeddings as Individuality Proxy for Voice Stress Detection
Zihan Wu, Neil Scheidwasser-Clow, Karl El Hajal et al.
Speaker Extraction with Detection of Presence and Absence of Target Speakers
Ke Zhang, Marvin Borsdorf, Zexu Pan et al.
Speaker-independent Neural Formant Synthesis
Pablo Pérez Zarazaga, Zofia Malisz, Gustav Eje Henter et al.
Speaker-independent Speech Inversion for Estimation of Nasalance
Yashish M Siriwardena, Carol Espy-Wilson, Suzanne Boyce et al.