Papers
Boosting Chinese ASR Error Correction with Dynamic Error Scaling Mechanism
Jiaxin Fan, Yong Zhang, Hanzhang Li et al.
Boosting Punctuation Restoration with Data Generation and Reinforcement Learning
Viet Dac Lai, Abel Salinas, Hao Tan et al.
Branch-ECAPA-TDNN: A Parallel Branch Architecture to Capture Local and Global Features for Speaker Verification
Jiadi Yao, Chengdong Liang, Zhendong Peng et al.
Bridging Speech Science and Technology — Now and Into the Future
Shrikanth Narayanan
Build a SRE Challenge System: Lessons from VoxSRC 2022 and CNSRC 2022
Zhengyang Chen, Bing Han, Xu Xiang et al.
Bulgarian Unstressed Vowel Reduction: Received Views vs Corpus Findings
Mitko Sabev, Bistra Andreeva, Christoph Gabriel et al.
Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts
Dongji Gao, Matthew Wiesner, Hainan Xu et al.
C²A-SLU: Cross and Contrastive Attention for Improving ASR Robustness in Spoken Language Understanding
Xuxin Cheng, Ziyu Yao, Zhihong Zhu et al.
CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center
Yuki Saito, Eiji Iimori, Shinnosuke Takamichi et al.
CAM++: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking
Hui Wang, Siqi Zheng, Yafeng Chen et al.
Can Better Perception Become a Disadvantage? Synthetic Speech Perception in Congenitally Blind Users
Gerda Ana Melnik-Leroy, Gediminas Navickas
Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding
Mutian He, Philip N. Garner
Can Contextual Biasing Remain Effective with Whisper and GPT-2?
Guangzhi Sun, Xianrui Zheng, Chao Zhang et al.
Can Self-Supervised Neural Representations Pre-Trained on Human Speech distinguish Animal Callers?
Eklavya Sarkar, Mathew Magimai.-Doss
CAPTDURE: Captioned Sound Dataset of Single Sources
Yuki Okamoto, Kanta Shimonishi, Keisuke Imoto et al.
Capturing Formality in Speech Across Domains and Languages
Debasmita Bhattacharya, Jie Chi, Julia Hirschberg et al.
Capturing Mismatch between Textual and Acoustic Emotion Expressions for Mood Identification in Bipolar Disorder
Minxue Niu, Amrit Romana, Mimansa Jaiswal et al.
Careful Whisper - leveraging advances in automatic speech recognition for robust and interpretable aphasia subtype classification
Mario Zusag, Laurin Wagner, Theresa Bloder
CASA-ASR: Context-Aware Speaker-Attributed ASR
Mohan Shi, Zhihao Du, Qian Chen et al.
Cascaded encoders for fine-tuning ASR models on overlapped speech
Richard Rose, Oscar Chang, Olivier Siohan
Cascaded Multi-task Adaptive Learning Based on Neural Architecture Search
Yingying Gao, Shilei Zhang, Zihao Cui et al.
CASEIN: Cascading Explicit and Implicit Control for Fine-grained Emotion Intensity Regulation
Yuhao Cui, Xiongwei Wang, Zhongzhou Zhao et al.
Causal Signal-Based DCCRN with Overlapped-Frame Prediction for Online Speech Enhancement
Julitta Bartolewska, Stanisław Kacprzak, Konrad Kowalczyk
CauSE: Causal Search Engine for Understanding Contact-Center Conversations
Anup Pattnaik, Tanay Narshana, Aashraya Sachdeva et al.
CFTNet: Complex-valued Frequency Transformation Network for Speech Enhancement
Nursadul Mamun, John H. L. Hansen