Daniel Povey
48 papers · 2012–2025 · 4 conferences · across top CS/AI conferences
Achievements
Taxonomy Completionist (23)
Keyword Pioneer
Renaissance Researcher (5)
Interdisciplinary Bridge
Hot Topic Early Bird
Conference Polyglot (4)
Cross-Pollinator (9)
Conference Loyalist (44)
Keyword Champion (3)
Topic Evolution
Mega-Team (20)
Deep Specialist (19)
Dynamic Duo (36)
Conference Pioneer
Unstoppable (11)
Prolific Year (5)
Century Club (48)
Keyword Collector (95)
Trend Setter
Conferences
INTERSPEECH (44)
ICLR (2)
AISTATS (1)
EMNLP (1)
Keywords
automatic speech recognition (13)
deep neural network (10)
word error rate (6)
speaker diarization (5)
speaker recognition (5)
speech recognition (5)
connectionist temporal classification (4)
speaker embedding (4)
probabilistic linear discriminant analysis (4)
neural transducer (3)
neural network (3)
acoustic modeling (3)
speaker verification (3)
acoustic model (3)
time delay neural network (3)
lattice-free maximum mutual information (3)
feature extraction (3)
stochastic gradient descent (2)
temporal modeling (2)
speech corpus (2)
Papers
CR-CTC: Consistency regularization on CTC for improved speech recognition
ICLR 2025
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization
INTERSPEECH 2024
Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment
INTERSPEECH 2024
Zipformer: A faster and better encoder for automatic speech recognition
ICLR 2024
Enhancing Neural Transducer for Multilingual ASR with Synchronized Language Diarization
INTERSPEECH 2024
Improving Neural Biasing for Contextual Speech Recognition by Early Context Injection and Text Perturbation
INTERSPEECH 2024
Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts
INTERSPEECH 2023
Blank-regularized CTC for Frame Skipping in Neural Transducer
INTERSPEECH 2023
GPU-accelerated Guided Source Separation for Meeting Transcription
INTERSPEECH 2023
Delay-penalized CTC Implemented Based on Finite State Transducer
INTERSPEECH 2023
Pruned RNN-T for fast, memory-efficient ASR training
INTERSPEECH 2022
speechocean762: An Open-Source Non-Native English Speech Corpus for Pronunciation Assessment
INTERSPEECH 2021
GigaSpeech: An Evolving, Multi-Domain ASR Corpus with 10,000 Hours of Transcribed Audio
INTERSPEECH 2021
An Alternative to MFCCs for ASR
INTERSPEECH 2020
Efficient MDI Adaptation for n-Gram Language Models
INTERSPEECH 2020
Lattice-Free Maximum Mutual Information Training of Multilingual Speech Recognition Systems
INTERSPEECH 2020
Wake Word Detection with Alignment-Free Lattice-Free MMI
INTERSPEECH 2020
Neural Language Modeling with Implicit Cache Pointers
INTERSPEECH 2020
PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for End-to-End ASR
INTERSPEECH 2020
Improving Emotion Identification Using Phone Posteriors in Raw Speech Waveform Based DNN
INTERSPEECH 2019
Advances in Automatic Speech Recognition for Child Speech Using Factored Time Delay Neural Network
INTERSPEECH 2019
Multi-PLDA Diarization on Children's Speech
INTERSPEECH 2019
State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18
INTERSPEECH 2019
x-Vector DNN Refinement with Full-Length Recordings for Speaker Recognition
INTERSPEECH 2019
Speaker Recognition Benchmark Using the CHiME-5 Corpus
INTERSPEECH 2019
The JHU Speaker Recognition System for the VOiCES 2019 Challenge
INTERSPEECH 2019
The JHU ASR System for VOiCES from a Distance Challenge 2019
INTERSPEECH 2019
Self-Attentive Speaker Embeddings for Text-Independent Speaker Verification
INTERSPEECH 2018
Output-Gate Projected Gated Recurrent Unit for Speech Recognition
INTERSPEECH 2018
Acoustic Modeling from Frequency Domain Representations of Speech
INTERSPEECH 2018
End-to-end Deep Neural Network Age Estimation
INTERSPEECH 2018
End-to-end Speech Recognition Using Lattice-free MMI
INTERSPEECH 2018
Recurrent Neural Network Language Model Adaptation for Conversational Speech Recognition
INTERSPEECH 2018
Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks
INTERSPEECH 2018
Emotion Identification from Raw Speech Signals Using DNNs
INTERSPEECH 2018
Diarization is Hard: Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge
INTERSPEECH 2018
A GPU-based WFST Decoder with Exact Lattice Generation
INTERSPEECH 2018
An Exploration of Dropout with LSTMs
INTERSPEECH 2017
Phone Duration Modeling for LVCSR Using Neural Networks
INTERSPEECH 2017
Deep Neural Network Embeddings for Text-Independent Speaker Verification
INTERSPEECH 2017
Backstitch: Counteracting Finite-Sample Bias via Negative Steps
INTERSPEECH 2017
The Kaldi OpenKWS System: Improving Low Resource Keyword Search
INTERSPEECH 2017
Acoustic Data-Driven Lexicon Learning Based on a Greedy Pronunciation Selection Framework
INTERSPEECH 2017
Acoustic Modelling from the Signal Domain Using CNNs
INTERSPEECH 2016
Far-Field ASR Without Parallel Data
INTERSPEECH 2016
Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI
INTERSPEECH 2016
A Coarse-Grained Model for Optimal Coupling of ASR and SMT Systems for Speech Translation
EMNLP 2015
Krylov Subspace Descent for Deep Learning
AISTATS 2012