Papers
The Importance of Calibration: Rethinking Confidence and Performance of Speech Multi-label Emotion Classifiers
Huang-Cheng Chou, Lucas Goncalves, Seong-Gyun Leem et al.
The MALACH Corpus: Results with End-to-End Architectures and Pretraining
Michael Picheny, Qin Yang, Daiheng Zhang et al.
The MASCFLICHT Corpus: Face Mask Type and Coverage Area Recognition from Speech
Adria Mallol-Ragolta, Nils Urbach, Shuo Liu et al.
There is more than one kind of robustness: Fooling Whisper with adversarial examples
Raphael Olivier, Bhiksha Raj
The Role of Formant and Excitation Source Features in Perceived Naturalness of Low Resource Tribal Language TTS: An Empirical Study
Ashwini Dasare, Pradyoth Hegde, Supritha Shetty et al.
The SpeeD–ZevoTech submission at DISPLACE 2023
Gabriel Pirlogeanu, Dan Oneata, Alexandru-Lucian Georgescu et al.
The Tag-Team Approach: Leveraging CLS and Language Tagging for Enhancing Multilingual ASR
Kaousheik Jayakumar, Vrunda N. Sukhadia, A Arunkumar et al.
Time-Domain Speech Enhancement for Robust Automatic Speech Recognition
Yufeng Yang, Ashutosh Pandey, DeLiang Wang
Time-domain Transformer-based Audiovisual Speaker Separation
Vahid Ahmadi Kalkhorani, Anurag Kumar, Ke Tan et al.
Time-frequency Domain Filter-and-sum Network for Multi-channel Speech Separation
Zhewen Deng, Yi Zhou, Hongqing Liu
Time-synchronous one-pass Beam Search for Parallel Online and Offline Transducers with Dynamic Block Training
Yui Sudo, Shakeel Muhammad, Yifan Peng et al.
TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition
Hakan Erdogan, Scott Wisdom, Xuankai Chang et al.
Tonal coarticulation as a cue for upcoming prosodic boundary
Jianjing Kuang, May Pik Yu Chan, Nari Rhee
Topological Data Analysis for Speech Processing
Eduard Tulchinskii, Kristian Kuznetsov, Laida Kushnareva et al.
TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection
Chenglong Wang, Jiangyan Yi, Jianhua Tao et al.
Towards Attention-based Contrastive Learning for Audio Spoof Detection
Chirag Goel, Surya Koppisetti, Ben Colman et al.
Towards continually learning new languages
Quan Pham, Jan Niehues, Alex Waibel
Towards Cross-Language Prosody Transfer for Dialog
Jonathan E. Avila, Nigel G. Ward
Towards Dialect-inclusive Recognition in a Low-resource Language: Are Balanced Corpora the Answer?
Liam Lonergan, Mengjie Qian, Neasa Ní Chiaráin et al.
Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems
Mingyu Cui, Jiawen Kang, Jiajun Deng et al.
Towards Fully Quantized Neural Networks For Speech Enhancement
Elad Cohen, Hai Victor Habi, Arnon Netzer
Towards hate speech detection in low-resource languages: Comparing ASR to acoustic word embeddings on Wolof and Swahili
Christiaan Jacobs, Nathanaël Carraz Rakotonirina, Everlyn Asiko Chimoto et al.
Towards Multi-Lingual Audio Question Answering
Swarup Ranjan Behera, Pailla Balakrishna Reddy, Achyut Mani Tripathi et al.
Towards Multi-task Learning of Speech and Speaker Recognition
Nik Vaessen, David A. van Leeuwen
Towards Paralinguistic-Only Speech Representations for End-to-End Speech Emotion Recognition
Georgios Ioannides, Michael Owen, Andrew Fletcher et al.