Papers
Improving Target Sound Extraction with Timestamp Information
Helin Wang, Dongchao Yang, Chao Weng et al.
Improving the Training Recipe for a Robust Conformer-based Hybrid Model
Mohammad Zeineldeen, Jingjing Xu, Christoph Lüscher et al.
Improving Transformer-based Conversational ASR by Inter-Sentential Attention Mechanism
Kun Wei, Pengcheng Guo, Ning Jiang
Improving Visual Speech Enhancement Network by Learning Audio-visual Affinity with Multi-head Attention
Xinmeng Xu, Yang Wang, Jie Jia et al.
Improving Voice Trigger Detection with Metric Learning
Prateeth Nayak, Takuya Higuchi, Anmol Gupta et al.
Incorporating Dual-Aware with Hierarchical Interactive Memory Networks for Task-Oriented Dialogue
Yangyang Ou, Peng Zhang, Jing Zhang et al.
Incremental Layer-Wise Self-Supervised Learning for Efficient Unsupervised Speech Domain Adaptation On Device
Zhouyuan Huo, Dongseong Hwang, Khe Chai Sim et al.
Incremental learning for RNN-Transducer based speech recognition models
Deepak Baby, Pasquale D'Alterio, Valentin Mendelev
Independence-based Joint Dereverberation and Separation with Neural Source Model
Kohei Saijo, Robin Scheibler
Induce Spoken Dialog Intents via Deep Unsupervised Context Contrastive Clustering
Ting-Wei Wu, Biing Juang
Integrating Discrete Word-Level Style Variations into Non-Autoregressive Acoustic Models for Speech Synthesis
Zhaoci Liu, Ningqian Wu, Yajie Zhang et al.
Integrating Form and Meaning: A Multi-Task Learning Model for Acoustic Word Embeddings
Badr M. Abdullah, Bernd Möbius, Dietrich Klakow
Intent classification using pre-trained language agnostic embeddings for low resource languages
Hemant Yadav, Akshat Gupta, Sai Krishna Rallabandi et al.
Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning
Chen Chen, Nana Hou, Yuchen Hu et al.
Interactive Co-Learning with Cross-Modal Transformer for Audio-Visual Emotion Recognition
Akihiko Takashima, Ryo Masumura, Atsushi Ando et al.
InterAug: Augmenting Noisy Intermediate Predictions for CTC-based ASR
Yu Nakagome, Tatsuya Komatsu, Yusuke Fujita et al.
Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition
Zhong Meng, Yashesh Gaur, Naoyuki Kanda et al.
Internal Language Model Estimation Through Explicit Context Vector Learning for Attention-based Encoder-decoder ASR
Yufei Liu, Rao Ma, Haihua Xu et al.
Interpretability of Speech Emotion Recognition modelled using Self-Supervised Speech and Text Pre-Trained Embeddings
K V Vijay Girish, Srikanth Konjeti, Jithendra Vepa
Interpretable dysarthric speaker adaptation based on optimal-transport
Rosanna Turrisi, Leonardo Badino
Interrelate Training and Searching: A Unified Online Clustering Framework for Speaker Diarization
Yifan Chen, Yifan Guo, Qingxuan Li et al.
INTERSPEECH 2022 Audio Deep Packet Loss Concealment Challenge
Lorenz Diener, Sten Sootla, Solomiya Branets et al.
Intra-speaker phonetic variation in read speech: comparison with inter-speaker variability in a controlled population
Nicolas Audibert, Cécile Fougeron
Introducing Auxiliary Text Query-modifier to Content-based Audio Retrieval
Daiki Takeuchi, Yasunori Ohishi, Daisuke Niizumi et al.