Papers
Investigating Acoustic Cues for Multilingual Abuse Detection
Yash Thakran, Vinayak Abrol
Investigating model performance in language identification: beyond simple error statistics
Suzy J. Styles, Victoria Y. H. Chua, Fei Ting Woon et al.
Investigating Pre-trained Audio Encoders in the Low-Resource Condition
Hao Yang, Jinming Zhao, Gholamreza Haffari et al.
Investigating Range-Equalizing Bias in Mean Opinion Score Ratings of Synthesized Speech
Erica Cooper, Junichi Yamagishi
Investigating Reproducibility at Interspeech Conferences: A Longitudinal and Comparative Perspective
Mohammad Arvan, A. Seza Doğruöz, Natalie Parde
Investigating the cortical tracking of speech and music with sung speech
Giorgia Cantisani, Amirhossein Chalehchaleh, Giovanni Di Liberto et al.
Investigating the dynamics of hand and lips in French Cued Speech using attention mechanisms and CTC-based decoding
Sanjana Sankar, Denis Beautemps, Frédéric Elisei et al.
Investigating the Perception Production Link through Perceptual Adaptation and Phonetic Convergence
Lena-Marie Huttner, Noël Nguyen, Martin J. Pickering
Investigating the Syntax-Discourse Interface in the Phonetic Implementation of Discourse Markers
Mathilde Hutin, Liesbeth Degand, Marc Allassonnière-Tang
Investigating the Utility of Synthetic Data for Doctor-Patient Conversation Summarization
Siyuan Chen, Colin A. Grambow, Mojtaba Kadkhodaie Elyaderani et al.
Investigating wav2vec2 context representations and the effects of fine-tuning, a case-study of a Finnish model
Tamas Grosz, Yaroslav Getman, Ragheb Al-Ghezi et al.
Investigation of Music Emotion Recognition Based on Segmented Semi-Supervised Learning
Yifu Sun, Xulong Zhang, Jianzong Wang et al.
Investigation of Training Mute-Expressive End-to-End Speech Separation Networks for an Unknown Number of Speakers
Younggwan Kim, Hyungjun Lim, Kiho Yeom et al.
iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka et al.
ITALIC: An Italian Intent Classification Dataset
Alkis Koudounas, Moreno La Quatra, Lorenzo Vaiani et al.
Iterative autoregression: a novel trick to improve your low-latency speech enhancement model
Pavel Andreev, Nicholas Babaev, Azat Saginbaev et al.
Iteratively Improving Speech Recognition and Voice Conversion
Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe
JAMFN: Joint Attention Multi-Scale Fusion Network for Depression Detection
Li Zhou, Zhenyu Liu, Zixuan Shangguan et al.
Joint Autoregressive Modeling of End-to-End Multi-Talker Overlapped Speech Recognition and Utterance-level Timestamp Prediction
Naoki Makishima, Keita Suzuki, Satoshi Suzuki et al.
Joint Blind Source Separation and Dereverberation for Automatic Speech Recognition using Delayed-Subsource MNMF with Localization Prior
Mieszko Fraś, Marcin Witkowski, Konrad Kowalczyk
Joint compensation of multi-talker noise and reverberation for speech enhancement with cochlear implants using one or more microphones
Clément Gaultier, Tobias Goehring
Joint-Former: Jointly Regularized and Locally Down-sampled Conformer for Semi-supervised Sound Event Detection
Lijian Gao, Qirong Mao, Ming Dong
Joint Instance Reconstruction and Feature Subspace Alignment for Cross-Domain Speech Emotion Recognition
Keke Zhao, Peng Song, Shaokai Li et al.
Joint Prediction of Audio Event and Annoyance Rating in an Urban Soundscape by Hierarchical Graph Representation Learning
Yuanbo Hou, Siyang Song, Cheng Luo et al.