Anoop Kunchukuttan

53 papers · 2013–2026 · 9 top CS/AI conferences

Achievements

🏃 Academic Marathon (12) 🌍 Conference Polyglot (9) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🐣 Hot Topic Early Bird 🗺️ Taxonomy Completionist (58) 🔬 Deep Specialist (24) 🧬 Topic Evolution 🤝 Dynamic Duo (24) 🏆 Keyword Champion (7) 🗃️ Keyword Collector (172) ❓ The Questioner ⚡ Prolific Year (5) 🚀 Conference Pioneer 💎 Century Club (51) 🔥 Unstoppable (13)

Conferences

ACL (17) EMNLP (13) CONLL (4) EACL (4) IJCNLP (4) NAACL (4) COLING (3) AAAI (2) AACL (2)

Top co-authors

Raj Dabre (26) Ratish Puduppully (12) Mitesh M. Khapra (11) Pratyush Kumar (10) Pushpak Bhattacharyya (10) Isao Goto (6) Ondřej Bojar (6) Mohammed Safi Ur Rahman Khan (6) Shantipriya Parida (6) Toshiaki Nakazawa (6)

Research topics

Linguistics (1)

Keywords

machine translation (14) low-resource language (11) indic language (11) multilingual nlp (8) cross-lingual transfer (7) neural machine translation (7) large language model (7) indian language (7) transfer learning (5) asian language (5) multilingual language model (4) shared task (4) multilingual model (4) automatic evaluation (4) multilingual translation (3) zero-shot learning (3) sequence-to-sequence model (3) language model (3) parallel corpus (3) instruction tuning (3)

Papers

The Reasoning Lingua Franca: A Double-Edged Sword for Multilingual AI · EACL 2026
RiddleBench: A New Generative Reasoning Benchmark for LLMs · EACL 2026
PRALEKHA: Cross-Lingual Document Alignment for Indic Languages · AACL 2025
PRALEKHA: Cross-Lingual Document Alignment for Indic Languages · IJCNLP 2025
Data and Model Centric Approaches for Expansion of Large Language Models to New languages · EMNLP 2025
RomanLens: The Role Of Latent Romanization In Multilinguality In LLMs · ACL 2025
Towards Building Large Scale Datasets and State-of-the-Art Automatic Speech Translation Systems for 14 Indian Languages · ACL 2025
Cross-Lingual Auto Evaluation for Assessing Multilingual LLMs · ACL 2025
RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization · ACL 2024
How Good is Zero-Shot MT Evaluation for Low Resource Indian Languages? · ACL 2024
Findings of WMT 2024’s MultiIndic22MT Shared Task for Machine Translation of 22 Indian Languages · EMNLP 2024
Synthetic Data Generation and Joint Learning for Robust Code-Mixed Translation · COLING 2024
An Empirical Comparison of Vocabulary Expansion and Initialization Approaches For Language Models · EMNLP 2024
CharSpan: Utilizing Lexical Similarity to Enable Zero-Shot Machine Translation for Extremely Low-resource Languages · EACL 2024
An Empirical Comparison of Vocabulary Expansion and Initialization Approaches For Language Models · CONLL 2024
IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages · ACL 2024
Aksharantar: Open Indic-language Transliteration datasets and models for the Next Billion Users · EMNLP 2023
IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian Languages · AAAI 2023
Naamapadam: A Large-Scale Named Entity Annotated Data for Indic Languages · ACL 2023
Towards Leaving No Indic Language Behind: Building Monolingual Corpora, Benchmark and Models for Indic Languages · ACL 2023
IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation Metrics for Indian Languages · ACL 2023
Bhasa-Abhijnaanam: Native-script and romanized Language Identification for 22 Indic languages · ACL 2023
Evaluating Inter-Bilingual Semantic Parsing for Indian Languages · ACL 2023
CTQScorer: Combining Multiple Features for In-context Example Selection for Machine Translation · EMNLP 2023
DecoMT: Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models · EMNLP 2023
Towards Building ASR Systems for the Next Billion Users · AAAI 2022
IndicBART: A Pre-trained Model for Indic Natural Language Generation · ACL 2022
Overview of the 9th Workshop on Asian Translation · COLING 2022
IndicNLG Benchmark: Multilingual Datasets for Diverse NLG Tasks in Indic Languages · EMNLP 2022
IndicXNLI: Evaluating Multilingual Inference for Indian Languages · EMNLP 2022
Bilingual Tabular Inference: A Case Study on Indic Languages · NAACL 2022
Itihasa: A large-scale corpus for Sanskrit to English translation · ACL 2021
A Large-scale Evaluation of Neural Machine Transliteration for Indic Languages · EACL 2021
Itihasa: A large-scale corpus for Sanskrit to English translation · IJCNLP 2021
Overview of the 8th Workshop on Asian Translation · IJCNLP 2021
Overview of the 8th Workshop on Asian Translation · ACL 2021
Multilingual Neural Machine Translation · COLING 2020
Learning Geometric Word Meta-Embeddings · ACL 2020
IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages · EMNLP 2020
Contact Relatedness can help improve multilingual NMT: Microsoft STCI-MT @ WMT20 · EMNLP 2020
Overview of the 7th Workshop on Asian Translation · AACL 2020
Proceedings of the 6th Workshop on Asian Translation · EMNLP 2019
Overview of the 6th Workshop on Asian Translation · EMNLP 2019
Addressing word-order Divergence in Multilingual Neural Machine Translation for extremely Low Resource Languages · NAACL 2019
Judicious Selection of Training Data in Assisting Language for Multilingual Neural NER · ACL 2018
Utilizing Lexical Similarity between Related, Low-resource Languages for Pivot-based SMT · IJCNLP 2017
Statistical Machine Translation between Related Languages · NAACL 2016
Orthographic Syllable as basic unit for SMT between Related Languages · EMNLP 2016
Substring-based unsupervised transliteration with phonetic and contextual knowledge · CONLL 2016
Brahmi-Net: A transliteration and script conversion system for languages of the Indian subcontinent · NAACL 2015
Tuning a Grammar Correction System for Increased Precision · CONLL 2014
TransDoop: A Map-Reduce based Crowdsourced Translation for Complex Domain · ACL 2013
IITB System for CoNLL 2013 Shared Task: A Hybrid Approach to Grammatical Error Correction · CONLL 2013