Papers
Target Vocabulary Recognition Based on Multi-Task Learning with Decomposed Teacher Sequences
Aoi Ito, Tatsuya Komatsu, Yusuke Fujita et al.
Task-Agnostic Structured Pruning of Speech Representation Models
Haoyu Wang, Siyuan Wang, Wei-Qiang Zhang et al.
TaylorBeamixer: Learning Taylor-Inspired All-Neural Multi-Channel Speech Enhancement from Beam-Space Dictionary Perspective
Andong Li, Weixin Meng, Guochen Yu et al.
Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages
Anusha Prakash, Arun Kumar, Ashish Seth et al.
Tensor decomposition for minimization of E2E SLU model toward on-device processing
Yosuke Kashiwagi, Siddhant Arora, Hayato Futami et al.
Text Injection for Capitalization and Turn-Taking Prediction in Speech Models
Shaan Bijwadia, Shuo-Yiin Chang, Weiran Wang et al.
Text-only domain adaptation for end-to-end ASR using integrated text-to-mel-spectrogram generator
Vladimir Bataev, Roman Korostik, Evgeny Shabalin et al.
Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation
Jiaxu Zhu, Weinan Tong, Yaoxun Xu et al.
Text-only Domain Adaptation using Unified Speech-Text Representation in Transducer
Lu Huang, Boyu Li, Jun Zhang et al.
Text Only Domain Adaptation with Phoneme Guided Data Splicing for End-to-End Speech Recognition
Wei Wang, Xun Gong, Hang Shao et al.
TFECN: Time-Frequency Enhanced ConvNet for Audio Classification
Mengwei Wang, Zhe Yang
Thai Dialect Corpus and Transfer-based Curriculum Learning Investigation for Dialect Automatic Speech Recognition
Artit Suwanbandit, Burin Naowarat, Orathai Sangpetch et al.
The 2022 NIST Language Recognition Evaluation
Yooyoung Lee, Craig Greenberg, Eliot Godard et al.
The Androids Corpus: A New Publicly Available Benchmark for Speech Based Depression Detection
Fuxiang Tao, Anna Esposito, Alessandro Vinciarelli
The ART of Conversation: Measuring Phonetic Convergence and Deliberate Imitation in L2-Speech with a Siamese RNN
Zheng Yuan, Aldo Pastore, Dorina de Jong et al.
The co-use of laughter and head gestures across speech styles
Bogdan Ludusan, Marin Schröer, Martina Rossi et al.
The DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments
Shikha Baghel, Shreyas Ramoji, Sidharth et al.
The effect of clinical intervention on the speech of individuals with PTSD: features and recognition performances
Alexander Kathan, Andreas Triantafyllopoulos, Shahin Amiriparian et al.
The effect of masking noise on listeners’ spectral tilt preferences
Olympia Simantiraki, Yannis Pantazis, Martin Cooke
The effect of stress on Mandarin tonal perception in continuous speech for Spanish-speaking learners
Lixia Hao, Qi Gong, Jinsong Zhang
The Effect of Whistled Vowels on Whistled Word Categorization for Naive Listeners
Anais Tran Ngoc, Fanny Meunier, Julien Meyer
The Effects of Input Type and Pronunciation Dictionary Usage in Transfer Learning for Low-Resource Text-to-Speech
Phat Do, Matt Coler, Jelske Dijkstra et al.
The emergence of obstruent-intrinsic f0 and VOT as cues to the fortis/lenis contrast in West Central Bavarian
Jasmin Pöhnlein, Felicitas Kleber
The Hidden Dance of Phonemes and Visage: Unveiling the Enigmatic Link between Phonemes and Facial Features
Liao Qu, Xianwei Zou, Xiang Li et al.