Papers
Gradual Improvements Observed in Learners' Perception and Production of L2 Sounds Through Continuing Shadowing Practices on a Daily Basis
Takuya Kunihara, Chuanbo Zhu, Nobuaki Minematsu et al.
Gram Vaani ASR Challenge on spontaneous telephone speech recordings in regional variations of Hindi
Anish Bhanushali, Grant Bridgman, Deekshitha G et al.
Graph-based Multi-View Fusion and Local Adaptation: Mitigating Within-Household Confusability for Speaker Identification
Long Chen, Yixiong Meng, Venkatesh Ravichandran et al.
Hesitations in Urdu/Hindi: Distribution and Properties of Fillers & Silences
Farhat Jabeen, Simon Betz
Heterogeneous Target Speech Separation
Efthymios Tzinis, Gordon Wichern, Aswin Shanmugam Subramanian et al.
Hierarchical and Multi-Scale Variational Autoencoder for Diverse and Natural Non-Autoregressive Text-to-Speech
Jaesung Bae, Jinhyeok Yang, Taejun Bak et al.
Hierarchical Attention Network for Evaluating Therapist Empathy in Counseling Session
Dehua Tao, Tan Lee, Harold Chui et al.
Hierarchical Tagger with Multi-task Learning for Cross-domain Slot Filling
Xiao Wei, Yuke Si, Shiquan Wang et al.
High level feature fusion in forensic voice comparison
Michael Carne, Yuko Kinoshita, Shunichi Ishihara
Homophone Disambiguation Profits from Durational Information
Barbara Schuppler, Emil Berger, Xenia Kogler et al.
How bad are artifacts?: Analyzing the impact of speech enhancement errors on ASR
Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix et al.
How do our eyebrows respond to masks and whispering? The case of Persians
Nasim Mahdinazhad Sardhaei, Marzena Zygis, Hamid Sharifzadeh
How to Listen? Rethinking Visual Sound Localization
Ho-Hsiang Wu, Magdalena Fuentes, Prem Seetharaman et al.
Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS
Kenta Udagawa, Yuki Saito, Hiroshi Saruwatari
Humanizing bionic voice: interactive demonstration of aesthetic design and control factors influencing the devices assembly and waveshape engineering
Konrad Zieliński, Marek Grzelec, Martin Hagmüller
Human Sound Classification based on Feature Fusion Method with Air and Bone Conducted Signal
Liang Xu, Jing Wang, Lizhong Wang et al.
Hybrid Handcrafted and Learnable Audio Representation for Analysis of Speech Under Cognitive and Physical Load
Gasser Elbanna, Alice Biryukov, Neil Scheidwasser-Clow et al.
HYU Submission for the SASV Challenge 2022: Reforming Speaker Embeddings with Spoofing-Aware Conditioning
Jeong-Hwan Choi, Joon-Young Yang, Ye-Rin Jeoung et al.
iCNN-Transformer: An improved CNN-Transformer with Channel-spatial Attention and Keyword Prediction for Automated Audio Captioning
Kun Chen, Jun Wang, Feng Deng et al.
iDeepMMSE: An improved deep learning approach to MMSE speech and noise power spectrum estimation for speech enhancement
Minseung Kim, Hyungchan Song, Sein Cheong et al.
Idiosyncratic lingual articulation of American English /æ/ and /ɑ/ using network analysis
Carolina Lins Machado, Volker Dellwo, Lei He
Impact of Acoustic Event Tagging on Scene Classification in a Multi-Task Learning Framework
Rahil Parikh, Harshavardhan Sundar, Ming Sun et al.
Impairment Representation Learning for Speech Quality Assessment
Lianwu Chen, Xinlei Ren, Xu Zhang et al.