Papers
Multimodal Segmentation for Vocal Tract Modeling
Rishi Jain, Bohan Yu, Peter Wu et al.
MultiPA: A Multi-task Speech Pronunciation Assessment Model for Open Response Scenarios
Yu-Wen Chen, Zhou Yu, Julia Hirschberg
Multi-speaker and multi-dialectal Catalan TTS models for video gaming
Alex Peiró-Lilja, José Giraldo, Martí Llopart-Font et al.
MultiStage Speech Bandwidth Extension with Flexible Sampling Rate Control
Ye-Xin Lu, Yang Ai, Zheng-Yan Sheng et al.
MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset
Kim Sung-Bin, Lee Chae-Yeon, Gihun Son et al.
MUSE: Flexible Voiceprint Receptive Fields and Multi-Path Fusion Enhanced Taylor Transformer for U-Net-based Speech Enhancement
Zizhen Lin, Xiaoting Chen, Junyu Wang
Nasal Air Flow During Speech Production In Korebaju
Jenifer Vega Rodriguez, Nathalie Vallée, Christophe Savariaux et al.
NAST: Noise Aware Speech Tokenization for Speech Language Models
Shoval Messica, Yossi Adi
Navigating the Minefield of MT Beam Search in Cascaded Streaming Speech Translation
Rastislav Rabatin, Frank Seide, Ernie Chang
Neural ATSM: Fully Neural Network-based Adaptive Time-Scale Modification Using Sentence-Specific Dynamic Control
Jaeuk Lee, Sohee Jang, Joon-Hyuk Chang
Neural Blind Source Separation and Diarization for Distant Speech Recognition
Yoshiaki Bando, Tomohiko Nakamura, Shinji Watanabe
Neural Codec-based Adversarial Sample Detection for Speaker Verification
Xuanjun Chen, Jiawei Du, Haibin Wu et al.
Neural Codec Language Models for Disentangled and Textless Voice Conversion
Alan Baade, Puyuan Peng, David Harwath
Neural Compression Augmentation for Contrastive Audio Representation Learning
Zhaoyu Wang, Haohe Liu, Harry Coppock et al.
Neural Network Augmented Kalman Filter for Robust Acoustic Howling Suppression
Yixuan Zhang, Hao Zhang, Meng Yu et al.
NeuRO: an application for code-switched autism detection in children
Mohd Mujtaba Akhtar, Girish, Orchid Chetia Phukan et al.
Neurocomputational model of speech recognition for pathological speech detection: a case study on Parkinson's disease speech detection
Sevada Hovsepyan, Mathew Magimai.-Doss
Neuromorphic Keyword Spotting with Pulse Density Modulation MEMS Microphones
Sidi Yaya Arnaud Yarga, Sean U N Wood
Noise-aware Speech Enhancement using Diffusion Probabilistic Model
Yuchen Hu, Chen Chen, Ruizhe Li et al.
Noise-robust Speech Separation with Fast Generative Correction
Helin Wang, Jesús Villalba, Laureano Moro-Velazquez et al.
Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment
Takuto Igarashi, Yuki Saito, Kentaro Seki et al.
Non-Intrusive Speech Intelligibility Prediction for Hearing Aids using Whisper and Metadata
Ryandhimas E. Zezario, Fei Chen, Chiou-Shann Fuh et al.
Non-Linear Inference Time Intervention: Improving LLM Truthfulness
Jakub Hoscilowicz, Adam Wiacek, Jan Chojnacki et al.
No-Reference Speech Intelligibility Prediction Leveraging a Noisy-Speech ASR Pre-Trained Model
Haolan Wang, Amin Edraki, Wai-Yip Chan et al.
NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription
Alon Vinnikov, Amir Ivry, Aviv Hurvitz et al.