Papers
Inclusive ASR for Disfluent Speech: Cascaded Large-Scale Self-Supervised Learning with Targeted Fine-Tuning and Data Augmentation
Dena Mujtaba, Nihar R. Mahapatra, Megan Arney et al.
Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer
Peng Wang, Yifan Yang, Zheng Liang et al.
IndicMOS: Multilingual MOS Prediction for 7 Indian languages
Sathvik Udupa, Soumi Maiti, Prasanta Kumar Ghosh
Information-theoretic hypothesis generation of relative cue weighting for the voicing contrast
Annika Heuser, Jianjing Kuang
Infusing Acoustic Pause Context into Text-Based Dementia Assessment
Franziska Braun, Sebastian P. Bayerl, Florian Hönig et al.
Instruction Data Generation and Unsupervised Adaptation for Speech Language Models
Vahid Noroozi, Zhehuai Chen, Somshubra Majumdar et al.
Integrating Speech Self-Supervised Learning Models and Large Language Models for ASR
Ling Dong, Zhengtao Yu, Wenjun Wang et al.
InterBiasing: Boost Unseen Word Recognition through Biasing Intermediate Predictions
Yu Nakagome, Michael Hentschel
Interface Design for Self-Supervised Speech Models
Yi-Jen Shih, David Harwath
Interference Aware Training Target for DNN based joint Acoustic Echo Cancellation and Noise Suppression
Vahid Khanagha, Dimitris Koutsaidis, Kaustubh Kalgaonkar et al.
Interleaved Audio/Audiovisual Transfer Learning for AV-ASR in Low-Resourced Languages
Zhengyang Li, Patrick Blumenberg, Jing Liu et al.
Interpretable Temporal Class Activation Representation for Audio Spoofing Detection
Menglu Li, Xiao-Ping Zhang
INTERSPEECH 2009 Emotion Challenge Revisited: Benchmarking 15 Years of Progress in Speech Emotion Recognition
Andreas Triantafyllopoulos, Anton Batliner, Simon Rampp et al.
Intrusive schwa within French stop-liquid clusters: An acoustic analysis
Minmin Yang, Rachid Ridouane
Investigating ASR Error Correction with Large Language Model and Multilingual 1-best Hypotheses
Sheng Li, Chen Chen, Chin Yuen Kwok et al.
Investigating Confidence Estimation Measures for Speaker Diarization
Anurag Chowdhury, Abhinav Misra, Mark C. Fuhs et al.
Investigating Decoder-only Large Language Models for Speech-to-text Translation
Chao-Wei Huang, Hui Lu, Hongyu Gong et al.
Investigating self-supervised speech models' ability to classify animal vocalizations: The case of gibbon's vocal signatures
Jules Cauzinille, Benoît Favre, Ricard Marxer et al.
Investigating the Effect of Label Topology and Training Criterion on ASR Performance and Alignment Quality
Tina Raissi, Christoph Lüscher, Simon Berger et al.
Investigating the Influence of Stance-Taking on Conversational Timing of Task-Oriented Speech
Sara Ng, Gina-Anne Levow, Mari Ostendorf et al.
Investigation of Layer-Wise Speech Representations in Self-Supervised Learning Models: A Cross-Lingual Study in Detecting Depression
Bubai Maji, Rajlakshmi Guha, Aurobinda Routray et al.
Investigation of look-ahead techniques to improve response time in spoken dialogue system
Masaya Ohagi, Tomoya Mizumoto, Katsumasa Yoshikawa