Papers
Dynamic Sliding Window Modeling for Abstractive Meeting Summarization
Zhengyuan Liu, Nancy Chen
Dynamic Vertical Larynx Actions Under Prosodic Focus
Miran Oh, Yoonjeong Lee
Dysarthric Speech Recognition From Raw Waveform with Parametric CNNs
Zhengjun Yue, Erfan Loweimi, Heidi Christensen et al.
E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR
W. Ronny Huang, Shuo-Yiin Chang, David Rybach et al.
ECAPA-TDNN Based Depression Detection from Clinical Speech
Dong Wang, Yanhui Ding, Qing Zhao et al.
EDITnet: A Lightweight Network for Unsupervised Domain Adaptation in Speaker Verification
Jingyu Li, Wei Liu, Tan Lee
EdiTTS: Score-based Editing for Controllable Text-to-Speech
Jaesung Tae, Hyeongju Kim, Taesu Kim
Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems
Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata et al.
Effect of Head Orientation on Speech Directivity
Samuel Bellows, Timothy W. Leishman
Effects of Language Contact on Vowel Nasalization in Wenzhou and Rugao Dialects
Yan Li, Ying Chen, Xinya Zhang et al.
Effects of laryngeal manipulations on voice gender perception
Zhaoyan Zhang, Jason Zhang, Jody Kreiman
Effects of Noise on Speech Perception and Spoken Word Comprehension
Jovan Eranovic, Daniel Pape, Magda Stroińska et al.
Efficient Speech Enhancement with Neural Homomorphic Synthesis
Wenbin Jiang, Tao Liu, Kai Yu
Efficient Training of Audio Transformers with Patchout
Khaled Koutini, Jan Schlüter, Hamid Eghbal-zadeh et al.
Efficient Training of Neural Transducer for Speech Recognition
Wei Zhou, Wilfried Michel, Ralf Schlüter et al.
Efficient Transformer-based Speech Enhancement Using Long Frames and STFT Magnitudes
Danilo de Oliveira, Tal Peer, Timo Gerkmann
Eliciting and evaluating likelihood ratios for speaker recognition by human listeners under forensically realistic channel-mismatched conditions
Vincent Hughes, Carmen Llamas, Thomas Kettig
ELO-SPHERES intelligibility prediction model for the Clarity Prediction Challenge 2022
Mark Huckvale, Gaston Hilkhuysen
ema2wav: doing articulation by Praat
Philipp Buech, Simon Roessig, Lena Pagel et al.
Emotion-Shift Aware CRF for Decoding Emotion Sequence in Conversation
Chun-Yu Chen, Yun-Shao Lin, Chi-Chun Lee
Emphasis Control for Parallel Neural TTS
Shreyas Seshadri, Tuomo Raitio, Dan Castellani et al.
Empirical Sampling from Latent Utterance-wise Evidence Model for Missing Data ASR based on Neural Encoder-Decoder Model
Ryu Takeda, Yui Sudo, Kazuhiro Nakadai et al.
Enabling Off-the-Shelf Disfluency Detection and Categorization for Pathological Speech
Amrit Romana, Minxue Niu, Matthew Perez et al.
End-to-End Audio-Visual Neural Speaker Diarization
Mao-Kui He, Jun Du, Chin-Hui Lee