Rohit Prabhavalkar

23 papers · 2010–2024 · 3 top CS/AI conferences

Achievements

🐣 Hot Topic Early Bird 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🗺️ Taxonomy Completionist (10) 🌍 Conference Polyglot (3) 🧬 Topic Evolution 🔬 Deep Specialist (12) 🗃️ Keyword Collector (91) ⚡ Prolific Year (6) 🚀 Conference Pioneer 💎 Century Club (23) 🔥 Unstoppable (9) 📈 Trend Setter ❓ The Questioner

Conferences

INTERSPEECH (19) NAACL (3) NIPS (1)

Top co-authors

Tara Sainath (9) Zhong Meng (6) Yanzhang He (6) Tara N. Sainath (5) Weiran Wang (5) Bo Li (4) Cal Peyser (4) David Rybach (4) Trevor Strohman (4) Zelin Wu (3)

Keywords

automatic speech recognition (11) word error rate (6) speech recognition (5) end-to-end speech recognition (5) attention mechanism (3) model compression (3) connectionist temporal classification (2) representation learning (2) semi-supervised learning (2) on-device speech recognition (2) contextual biasing (2) sequence-to-sequence model (2) recurrent neural network transducer (2) collaborative learning (1) sequence alignment (1) discriminative training (1) optimal transport (1) cross attention (1) cross-modal representation (1) gradient clipping (1)

Papers

Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers · NIPS 2024
Contextual Biasing with the Knuth-Morris-Pratt Matching Algorithm · INTERSPEECH 2024
Efficiently Train ASR Models that Memorize Less and Perform Better with Per-core Clipping · INTERSPEECH 2024
Text Injection for Neural Contextual Biasing · INTERSPEECH 2024
Massive End-to-end Speech Recognition Models with Time Reduction · NAACL 2024
Deferred NAM: Low-latency Top-K Context Injection via Deferred Context Encoding for Non-Streaming ASR · NAACL 2024
How to Estimate Model Transferability of Pre-Trained Speech Models? · INTERSPEECH 2023
Improving Joint Speech-Text Representations Without Alignment · INTERSPEECH 2023
E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR · INTERSPEECH 2022
Improving Rare Word Recognition with LM-aware MWER Training · INTERSPEECH 2022
A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes · INTERSPEECH 2022
Improving Deliberation by Text-Only and Semi-Supervised Training · INTERSPEECH 2022
Dynamic Encoder Transducer: A Flexible Solution for Trading Off Accuracy for Latency · INTERSPEECH 2021
Dissecting User-Perceived Latency of On-Device E2E Speech Recognition · INTERSPEECH 2021
Anti-Aliasing Regularization in Stacking Layers · INTERSPEECH 2020
Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models · INTERSPEECH 2019
On the Choice of Modeling Unit for Sequence-to-Sequence Speech Recognition · INTERSPEECH 2019
Two-Pass End-to-End Speech Recognition · INTERSPEECH 2019
Compression of End-to-End Models · INTERSPEECH 2018
A Comparison of Sequence-to-Sequence Models for Speech Recognition · INTERSPEECH 2017
An Analysis of “Attention” in Sequence-to-Sequence Models · INTERSPEECH 2017
On the Efficient Representation and Execution of Deep Acoustic Models · INTERSPEECH 2016
Investigations into the Crandem Approach to Word Recognition · NAACL 2010