Papers - Conftrace

Streaming Align-Refine for Non-autoregressive Deliberation

Weiran Wang, Ke Hu, Tara Sainath

2022 INTERSPEECH

Streaming Automatic Speech Recognition with Re-blocking Processing Based on Integrated Voice Activity Detection

Yui Sudo, Muhammad Shakeel, Kazuhiro Nakadai et al.

2022 INTERSPEECH

Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification

Chao Zhang, Bo Li, Tara Sainath et al.

2022 INTERSPEECH

Streaming Intended Query Detection using E2E Modeling for Continued Conversation

Shuo-Yiin Chang, Guru Prakash, Zelin Wu et al.

2022 INTERSPEECH

Streaming model for Acoustic to Articulatory Inversion with transformer networks

Sathvik Udupa, Aravind Illa, Prasanta Ghosh

2022 INTERSPEECH

Streaming Multi-Talker ASR with Token-Level Serialized Output Training

Naoyuki Kanda, Jian Wu, Yu Wu et al.

2022 INTERSPEECH

Streaming parallel transducer beam search with fast slow cascaded encoders

Jay Mahadeokar, Yangyang Shi, Ke Li et al.

2022 INTERSPEECH

Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings

Naoyuki Kanda, Jian Wu, Yu Wu et al.

2022 INTERSPEECH

Streaming Target-Speaker ASR with Neural Transducer

Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai et al.

2022 INTERSPEECH

STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent

Yuki Saito, Yuto Nishimura, Shinnosuke Takamichi et al.

2022 INTERSPEECH

Sub-8-Bit Quantization Aware Training for 8-Bit Neural Network Accelerator with On-Device Speech Recognition

Kai Zhen, Hieu Duy Nguyen, Raviteja Chinta et al.

2022 INTERSPEECH

Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training

Chengyi Wang, Yiming Wang, Yu Wu et al.

2022 INTERSPEECH

SVTS: Scalable Video-to-Speech Synthesis

Rodrigo Schoburg Carrillo de Mira, Alexandros Haliassos, Stavros Petridis et al.

2022 INTERSPEECH

Syllable sequence of /a/+/ta/ can be heard as /atta/ in Japanese with visual or tactile cues

Takayuki Arai, Miho Yamada, Megumi Okusawa

2022 INTERSPEECH

Synthesizing Near Native-accented Speech for a Non-native Speaker by Imitating the Pronunciation and Prosody of a Native Speaker

Raymond Chung, Brian Mak

2022 INTERSPEECH

TALCS: An open-source Mandarin-English code-switching corpus and a speech recognition baseline

Chengfei Li, Shuhao Deng, Yaoping Wang et al.

2022 INTERSPEECH

Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription

Xianrui Zheng, Chao Zhang, Phil Woodland

2022 INTERSPEECH

Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches

Zifeng Zhao, Dongchao Yang, Rongzhi Gu et al.

2022 INTERSPEECH

TaylorBeamformer: Learning All-Neural Beamformer for Multi-Channel Speech Enhancement from Taylor’s Approximation Theory

Andong Li, Guochen Yu, Chengshi Zheng et al.

2022 INTERSPEECH

TB or not TB? Acoustic cough analysis for tuberculosis classification

Geoffrey T. Frost, Grant Theron, Thomas Niesler

2022 INTERSPEECH

Telling self-defining memories: An acoustic study of natural emotional speech productions

Veronique Delvaux, Audrey Lavallée, Fanny Degouis et al.

2022 INTERSPEECH

Temporal coding with magnitude-phase regularization for sound event detection

Sangwook Park, Sandeep Reddy Kothinti, Mounya Elhilali

2022 INTERSPEECH

Temporal Self Attention-Based Residual Network for Environmental Sound Classification

Achyut Tripathi, Konark Paul

2022 INTERSPEECH

Text aware Emotional Text-to-speech with BERT

Arijit Mukherjee, Shubham Bansal, Sandeepkumar Satpal et al.

2022 INTERSPEECH

Text-driven Emotional Style Control and Cross-speaker Style Transfer in Neural TTS

Yookyung Shin, Younggun Lee, Suhee Jo et al.

2022 INTERSPEECH