Preethi Jyothi

77 papers · 2010–2026 · 10 top CS/AI conferences

Achievements

🧭 Keyword Pioneer 🗺️ Taxonomy Completionist (16) 🌈 Renaissance Researcher (5) 🌉 Interdisciplinary Bridge 🌍 Conference Polyglot (10)

🏃 Academic Marathon (15) 🏠 Conference Loyalist (23) 🔬 Deep Specialist (22) 👥 Mega-Team (20) 🏆 Keyword Champion (4) 🤝 Dynamic Duo (15) 🚀 Conference Pioneer 📈 Trend Setter ⚡ Prolific Year (13) 💎 Century Club (74) 🗃️ Keyword Collector (50) 🔥 Unstoppable (8)

Conferences

INTERSPEECH (23) ACL (17) EMNLP (15) EACL (7) NAACL (5) COLING (3) ICLR (2) IJCAI (2) IJCNLP (2) NIPS (1)

Top co-authors

Ganesh Ramakrishnan (16) Pushpak Bhattacharyya (13) Sunita Sarawagi (9) Ashish Mittal (8) Rishabh Kumar (5) Darshan Prabhu (4) Amrith Krishna (4) Devaraja Adiga (4) Vineet Bhat (3) Ishan Tarunesh (3)

Keywords

automatic speech recognition (25) low-resource language (14) cross-lingual transfer (13) word error rate (9) speech recognition (8) machine translation (7) zero-shot learning (6) question answering (6) named entity recognition (5) language model (5) sentiment analysis (5) domain adaptation (5) transfer learning (5) multilingual model (5) data augmentation (5) multilingual nlp (5) natural language inference (5) text generation (4) neural machine translation (4) accent adaptation (4)

Papers

Post-ASR Correction in Hindi: Comparing Language Models and Large Language Models in Low-Resource Scenarios EACL 2026
Improving Language Identification for Code-Switched Speech: The Pivotal Role of Accented English EACL 2026
SrcMix: Mixing of Related Source Languages Benefits Extremely Low-resource Machine Translation EACL 2026
AMPS: ASR with Multimodal Paraphrase Supervision NAACL 2025
DeFT-X: Denoised Sparse Fine-Tuning for Zero-Shot Cross-Lingual Transfer EMNLP 2025
LASER: An LLM-based ASR Scoring and Evaluation Rubric EMNLP 2025
Language-Specific Neurons Do Not Facilitate Cross-Lingual Transfer NAACL 2025
LexGen: Domain-aware Multilingual Lexicon Generation ACL 2025
LoFTI: Localization and Factuality Transfer to Indian Locales ACL 2025
LEVOS: Leveraging Vocabulary Overlap with Sanskrit to Generate Technical Lexicons in Indian Languages ACL 2025
CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving COLING 2025
Zero-Shot Cross-Lingual Transfer using Prefix-Based Adaptation EMNLP 2025
Cross-lingual Transfer Dynamics in BLOOMZ: Insights into Multilingual Generalization EMNLP 2025
RECAST: Retrieval-Augmented Contextual ASR via Decoder-State Keyword Spotting EMNLP 2025
DIMSIM: Distilled Multilingual Critics for Indic Text Simplification ACL 2024
Translation Errors Significantly Impact Low-Resource Languages in Cross-Lingual Learning EACL 2024
MULTI-CONVFORMER: Extending Conformer with Multiple Convolution Kernels INTERSPEECH 2024
STORiCo: Storytelling TTS for Hindi with Character Voice Modulation EACL 2024
Improving Self-supervised Pre-training using Accent-Specific Codebooks INTERSPEECH 2024
Emotion Arithmetic: Emotional Speech Synthesis via Weight Space Interpolation INTERSPEECH 2024
DictDis: Dictionary Constrained Disambiguation for Improved NMT EMNLP 2024
SALSA: Speedy ASR-LLM Synchronous Aggregation INTERSPEECH 2024
WikiDO: A New Benchmark Evaluating Cross-Modal Retrieval for Vision-Language Models NIPS 2024
Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR EMNLP 2024
In-context Mixing (ICM): Code-mixed Prompts for Multilingual LLMs ACL 2024
Boosting Zero-Shot Crosslingual Performance using LLM-Based Augmentations with Effective Data Selection ACL 2024
Part-of-speech Tagging for Extremely Low-resource Indian Languages ACL 2024
Zero-shot Cross-lingual Transfer With Learned Projections Using Unlabeled Target-Language Data ACL 2023
Improving Pretraining Techniques for Code-Switched NLP ACL 2023
DITTO: Data-efficient and Fair Targeted Subset Selection for ASR Accent Adaptation ACL 2023
Adversarial Training for Low-Resource Disfluency Correction ACL 2023
Accented Speech Recognition With Accent-specific Codebooks EMNLP 2023
Speech-enriched Memory for Inference-time Adaptation of ASR Models to Word Dictionaries EMNLP 2023
DISCO: A Large Scale Human Annotated Corpus for Disfluency Correction in Indo-European Languages EMNLP 2023
In-Situ Text-Only Adaptation of Speech Models with Low-Overhead Speech Imputations ICLR 2023
Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration IJCAI 2023
Unsupervised Code-switched Text Generation from Parallel Text INTERSPEECH 2023
DisfluencyFixer: A tool to enhance Language Learning through Speech To Speech Disfluency Correction INTERSPEECH 2023
Improving RNN-Transducers with Acoustic LookAhead INTERSPEECH 2023
Narrator or Character: Voice Modulation in an Expressive Multi-speaker TTS INTERSPEECH 2023
Zero-shot Disfluency Detection for Indian Languages COLING 2022
SPLICEOUT: A Simple and Efficient Audio Augmentation Method INTERSPEECH 2022
Linguistically Informed Post-processing for ASR Error correction in Sanskrit INTERSPEECH 2022
VAgyojaka: An Annotating and Post-Editing Tool for Automatic Speech Recognition INTERSPEECH 2022
CoCoa: An Encoder-Decoder Model for Controllable Code-switched Generation EMNLP 2022
Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training EMNLP 2022
Accurate Online Posterior Alignments for Principled Lexically-Constrained Decoding ACL 2022
Aligning Multilingual Embeddings for Improved Code-switched Natural Language Understanding COLING 2022
MUCS 2021: Multilingual and Code-Switching ASR Challenges for Low Resource Indian Languages INTERSPEECH 2021
Reduce and Reconstruct: ASR for Low-Resource Phonetic Languages INTERSPEECH 2021
Meta-Learning for Effective Multi-task and Multilingual Modelling EACL 2021
The Effect of Pretraining on Extractive Summarization for Scientific Documents NAACL 2021
Perturb, Predict & Paraphrase: Semi-Supervised Learning using Noisy Student for Image Captioning IJCAI 2021
Automatic Speech Recognition in Sanskrit: A New Speech Corpus and Modelling Insights ACL 2021
From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text IJCNLP 2021
Automatic Speech Recognition in Sanskrit: A New Speech Corpus and Modelling Insights IJCNLP 2021
Low Resource ASR: The Surprising Effectiveness of High Resource Transliteration INTERSPEECH 2021
Cross-Modal Learning for Audio-Visual Video Parsing INTERSPEECH 2021
From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text ACL 2021
Disfluency Correction using Unsupervised and Semi-supervised Learning EACL 2021
The Effectiveness of Intermediate-Task Training for Code-Switched Natural Language Understanding EMNLP 2021
Generating Fluent Translations from Disfluent Text Without Access to Fluent References: IIT Bombay@IWSLT2020 ACL 2020
Black-Box Adaptation of ASR for Accented Speech INTERSPEECH 2020
Caption Alignment for Low Resource Audio-Visual Data INTERSPEECH 2020
Improving Low Resource Code-Switched ASR Using Augmented Code-Switched TTS INTERSPEECH 2020
How Accents Confound: Probing for Accent Information in End-to-End Speech Recognition Systems ACL 2020
Cross-Lingual Training for Automatic Question Generation ACL 2019
Exploiting Monolingual Speech Corpora for Code-Mixed Speech Recognition INTERSPEECH 2019
Revisiting the Importance of Encoding Logic Rules in Sentiment Classification EMNLP 2018
Code-switched Language Models Using Dual RNNs and Same-Source Pretraining EMNLP 2018
Time Aggregation Operators for Multi-label Audio Event Detection INTERSPEECH 2018
Dual Language Models for Code Switched Speech Recognition INTERSPEECH 2018
Improved Accented Speech Recognition Using Accent Embeddings and Multi-task Learning INTERSPEECH 2018
Generalizing Across Domains via Cross-Gradient Training ICLR 2018
Automatic Speech Recognition Using Probabilistic Transcriptions in Swahili, Amharic, and Dinka INTERSPEECH 2016
Large-scale discriminative language model reranking for voice-search NAACL 2012
Investigations into the Crandem Approach to Word Recognition NAACL 2010