Biometric Russian Audio-Visual Extended MASKS (BRAVE-MASKS) Corpus: Multimodal Mask Type Recognition Task

Maxim Markitantov, Elena Ryumina, Dmitry Ryumin et al.

2022 INTERSPEECH

BIT-MI Deep Learning-based Model to Non-intrusive Speech Quality Assessment Challenge in Online Conferencing Applications

Miao Liu, Jing Wang, Liang Xu et al.

2022 INTERSPEECH

Blind Language Separation: Disentangling Multilingual Cocktail Party Voices by Language

Marvin Borsdorf, Kevin Scheck, Haizhou Li et al.

2022 INTERSPEECH

Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation

Keqi Deng, Shinji Watanabe, Jiatong Shi et al.

2022 INTERSPEECH

Boosting Self-Supervised Embeddings for Speech Enhancement

Kuo-Hsuan Hung, Szu-wei Fu, Huan-Hsin Tseng et al.

2022 INTERSPEECH

Bottleneck Low-rank Transformers for Low-resource Spoken Language Understanding

Pu Wang, Hugo Van hamme

2022 INTERSPEECH

Bottom-up discovery of structure and variation in response tokens (‘backchannels’) across diverse languages

Andreas Liesenfeld, Mark Dingemanse

2022 INTERSPEECH

Bring dialogue-context into RNN-T for streaming ASR

Junfeng Hou, Jinkun Chen, Wanyu Li et al.

2022 INTERSPEECH

Building African Voices

Perez Ogayo, Graham Neubig, Alan W Black

2022 INTERSPEECH

Building Vietnamese Conversational Smart Home Dataset and Natural Language Understanding Model

Thi Thu Trang NGUYEN, Trung Duc Anh Dang, Quoc Viet Vu et al.

2022 INTERSPEECH

Bunched LPCNet2: Efficient Neural Vocoders Covering Devices from Cloud to Edge

Sangjun Park, Kihyun Choo, Joohyung Lee et al.

2022 INTERSPEECH

ByT5 model for massively multilingual grapheme-to-phoneme conversion

Jian Zhu, Cong Zhang, David Jurgens

2022 INTERSPEECH

Calibrate and Refine! A Novel and Agile Framework for ASR Error Robust Intent Detection

Peilin Zhou, Dading Chong, Helin Wang et al.

2022 INTERSPEECH

CALM: Contrastive Cross-modal Speaking Style Modeling for Expressive Text-to-Speech Synthesis

Yi Meng, Xiang Li, Zhiyong Wu et al.

2022 INTERSPEECH

Can Humans Correct Errors From System? Investigating Error Tendencies in Speaker Identification Using Crowdsourcing

Yuta Ide, Susumu Saito, Teppei Nakano et al.

2022 INTERSPEECH

CaTT-KWS: A Multi-stage Customized Keyword Spotting Framework based on Cascaded Transducer-Transformer

Zhanheng Yang, Sining Sun, Jin Li et al.

2022 INTERSPEECH

CAUSE: Crossmodal Action Unit Sequence Estimation from Speech

Hirokazu Kameoka, Takuhiro Kaneko, Shogo Seki et al.

2022 INTERSPEECH

CCATMos: Convolutional Context-aware Transformer Network for Non-intrusive Speech Quality Assessment

Yuchen Liu, Li-Chia Yang, Alexander Pawlicki et al.

2022 INTERSPEECH

Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training

Bowen Zhang, Songjun Cao, Xiaoming Zhang et al.

2022 INTERSPEECH

Chain-based Discriminative Autoencoders for Speech Recognition

Hung-Shin Lee, Pin-Tuan Huang, Yao-Fei Cheng et al.

2022 INTERSPEECH

Challenges and Opportunities in Multi-device Speech Processing

Gregory Ciccarelli, Jarred Barber, Arun Nair et al.

2022 INTERSPEECH

Challenges in Metadata Creation for Massive Naturalistic Team-Based Audio Data

Chelzy Belitz, John H.L. Hansen

2022 INTERSPEECH

Challenges of using longitudinal and cross-domain corpora on studies of pathological speech

Catarina Botelho, Tanja Schultz, Alberto Abad et al.

2022 INTERSPEECH

Challenges remain in Building ASR for Spontaneous Preschool Children Speech in Naturalistic Educational Environments

Satwik Dutta, Sarah Anne Tao, Jacob C. Reyna et al.

2022 INTERSPEECH

Characterizing Therapist's Speaking Style in Relation to Empathy in Psychotherapy

Dehua Tao, Tan Lee, Harold Chui et al.

2022 INTERSPEECH
