Papers
Rapid Lexical Alignment to a Conversational Agent
Rachel Ostrand, Victor S. Ferreira, David Piorkowski
RASR2: The RWTH ASR Toolkit for Generic Sequence-to-sequence Speech Recognition
Wei Zhou, Eugen Beck, Simon Berger et al.
Real-Time Causal Spectro-Temporal Voice Activity Detection Based on Convolutional Encoding and Residual Decoding
Jingyuan Wang, Jie Zhang, Li-Rong Dai
Real Time Detection of Soft Voice for Speech Enhancement
Hector A. Cordourier, Georg Stemmer, Sinem Aslan et al.
Real-Time Joint Personalized Speech Enhancement and Acoustic Echo Cancellation
Sefik Emre Eskimez, Takuya Yoshioka, Alex Ju et al.
Real-Time Personalised Speech Enhancement Transformers with Dynamic Cross-attended Speaker Representations
Shucong Zhang, Malcolm Chadwick, Alberto Gil C. P. Ramos et al.
Real time spectrogram inversion on mobile phone
Oleg Rybakov, Marco Tagliasacchi, Yunpeng Li et al.
ReCLR: Reference-Enhanced Contrastive Learning of Audio Representation for Depression Detection
Pingyue Zhang, Mengyue Wu, Kai Yu
Record Deduplication for Entity Distribution Modeling in ASR Transcripts
Tianyu Huang, Chung Hoon Hong, Carl Wivagg et al.
Recursive Sound Source Separation with Deep Learning-based Beamforming for Unknown Number of Sources
Hokuto Munakata, Ryu Takeda, Kazunori Komatani
Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation
Kangwook Jang, Sungnyun Kim, Se-Young Yun et al.
Reducing Barriers to Self-Supervised Learning: HuBERT Pre-training with Academic Compute
William Chen, Xuankai Chang, Yifan Peng et al.
Reducing the Prior Mismatch of Stochastic Differential Equations for Diffusion-based Speech Enhancement
Bunlong Lay, Simon Welker, Julius Richter et al.
Regarding Topology and Variant Frame Rates for Differentiable WFST-based End-to-End ASR
Zeyu Zhao, Peter Bell
Rehearsal-Free Online Continual Learning for Automatic Speech Recognition
Steven Vander Eeckt, Hugo Van hamme
Re-investigating the Efficient Transfer Learning of Speech Foundation Model using Feature Fusion Methods
Zhouyuan Huo, Khe Chai Sim, Dongseong Hwang et al.
Relation-based Counterfactual Data Augmentation and Contrastive Learning for Robustifying Natural Language Inference Models
Heerin Yang, Seung-won Hwang, Jungmin So
Relationship between auditory and semantic entrainment using Deep Neural Networks (DNN)
Jay Kejriwal, Štefan Beňuš
Relationship between LTAS-based spectral moments and acoustic parameters of hypokinetic dysarthria in Parkinson’s disease
Jan Svihlik, Vojtěch Illner, Petr Kryze et al.
Relationships Between Gender, Personality Traits and Features of Multi-Modal Data to Responses to Spoken Dialog Systems Breakdown
Kazuya Tsubokura, Yurie Iribe, Norihide Kitaoka
Remixing-based Unsupervised Source Separation from Scratch
Kohei Saijo, Tetsuji Ogawa
Remote Assessment for ALS using Multimodal Dialog Agents: Data Quality, Feasibility and Task Compliance
Vanessa Richter, Michael Neumann, Jordan Green et al.
Resolution Consistency Training on Time-Frequency Domain for Semi-Supervised Sound Event Detection
Won-Gook Choi, Joon-Hyuk Chang
Resource-Efficient Fine-Tuning Strategies for Automatic MOS Prediction in Text-to-Speech for Low-Resource Languages
Phat Do, Matt Coler, Jelske Dijkstra et al.
Respiratory distress estimation in human-robot interaction scenario
Eduardo Alvarado, Nicolás Grágeda, Alejandro Luzanto et al.