Papers
The influence of L2 accent strength and different error types on personality trait ratings
Sarah Wesolek, Piotr Gulgowski, Joanna Blaszczak et al.
The Interspeech 2024 Challenge on Speech Processing Using Discrete Units
Xuankai Chang, Jiatong Shi, Jinchuan Tian et al.
The Interspeech 2024 TAUKADIAL Challenge: Multilingual Mild Cognitive Impairment Detection with Multimodal Approach
Benjamin Barrera-Altuna, Daeun Lee, Zaima Zarnaz et al.
The MARRYS helmet: A new device for researching and training “jaw dancing”
Vidar Freyr Gudmundsson, Keve Márton Gönczi, Malin Svensson Lundmark et al.
The PESQetarian: On the Relevance of Goodhart's Law for Speech Enhancement
Danilo de Oliveira, Simon Welker, Julius Richter et al.
The Processing of Stress in End-to-End Automatic Speech Recognition Models
Martijn Bentum, Louis ten Bosch, Tom Lentz
The Production of Contrastive Focus by 7 to 13-year-olds Learning Mandarin Chinese
Zimeng Li, Zhongxuan Mao, Shengting Shen et al.
The prosody of the verbal prefix ge-: historical and experimental evidence
Chiara Riegger, Tina Bögel, George Walkden
The reasonable effectiveness of speaker embeddings for violence detection
Sarthak Jain, Orchid Chetia Phukan, Arun Balaji Buduru et al.
The Second DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments
Shareef Babu Kalluri, Prachi Singh, Pratik Roy Chowdhuri et al.
The speech motor chaining web app for speech motor learning
Jonathan L Preston, Nina R Benway, Nathan Prestopnik et al.
The sub-band cepstrum as a tool for locating local spectral regions of phonetic sensitivity: A first attempt with multi-speaker vowel data
Michael Lambropoulos, Frantz Clermont, Shunichi Ishihara
The Use of Modifiers and f0 in Remote Referential Communication with Human and Computer Partners
Iona Gessinger, Bistra Andreeva, Benjamin R. Cowan
The Use of Phone Categories and Cross-Language Modeling for Phone Alignment of Panãra
Emily P. Ahn, Eleanor Chodroff, Myriam Lapierre et al.
The Whole Is Bigger Than the Sum of Its Parts: Modeling Individual Annotators to Capture Emotional Variability
James Tavernor, Yara El-Tawil, Emily Mower Provost
This Paper Had the Smartest Reviewers - Flattery Detection Utilising an Audio-Textual Transformer-Based Approach
Lukas Christ, Shahin Amiriparian, Friederike Hawighorst et al.
Thunder: Unified Regression-Diffusion Speech Enhancement with a Single Reverse Step using Brownian Bridge
Thanapat Trachu, Chawan Piansaddhayanon, Ekapol Chuangsuwanich
tinyCLAP: Distilling Constrastive Language-Audio Pretrained Models
Francesco Paissan, Elisabetta Farella
TM-PATHVQA: 90000+ Textless Multilingual Questions for Medical Visual Question Answering
Tonmoy Rajkhowa, Amartya Roy Chowdhury, Sankalp Nagaonkar et al.
TokSing: Singing Voice Synthesis based on Discrete Tokens
Yuning Wu, Chunlei Zhang, Jiatong Shi et al.
Total-Duration-Aware Duration Modeling for Text-to-Speech Systems
Sefik Emre Eskimez, Xiaofei Wang, Manthan Thakker et al.
Toward Fully-End-to-End Listened Speech Decoding from EEG Signals
Jihwan Lee, Aditya Kommineni, Tiantian Feng et al.
Towards a better understanding of receptive multilingualism: listening conditions and priming effects
Wei Xue, Ivan Yuen, Bernd Möbius
Towards a General-Purpose Model of Perceived Pragmatic Similarity
Nigel G. Ward, Andres Segura, Alejandro Ceballos et al.
Towards an End-to-End Framework for Invasive Brain Signal Decoding with Large Language Models
Sheng Feng, Heyang Liu, Yu Wang et al.