Monojit Choudhury

92 papers · 2006–2026 · 10 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (10) 🏃 Academic Marathon (19) 🌉 Interdisciplinary Bridge 🧭 Keyword Pioneer 🐝 Cross-Pollinator (11)

🌈 Renaissance Researcher (9) 🗺️ Taxonomy Completionist (101) 🏠 Conference Loyalist (21) 🤝 Dynamic Duo (22) 👥 Mega-Team (69) 🔬 Deep Specialist (25) 🏆 Keyword Champion (5) ⚡ Prolific Year (15) 🔥 Unstoppable (10) 📈 Trend Setter 🗃️ Keyword Collector (267) 💎 Century Club (87) ❓ The Questioner (9)

Conferences

EMNLP (22) ACL (21) EACL (12) IJCNLP (11) COLING (9) NAACL (7) AACL (6) AAAI (2) CONLL (1) CVPR (1)

Top co-authors

Sandipan Dandapat (23) Kalika Bali (20) Sunayana Sitaram (17) Somak Aditya (11) Animesh Mukherjee (10) Sougata Saha (8) Kabir Ahuja (7) Niloy Ganguly (7) Sebastin Santy (6) Saurabh Kumar Pandey (6)

Research topics

Applications (1)

Keywords

large language model (24) low-resource language (10) multilingual model (8) cross-lingual transfer (7) few-shot learning (7) machine translation (7) multilingual language model (7) multilingual nlp (6) performance prediction (5) multilingual evaluation (5) text classification (5) zero-shot transfer (5) cultural bias (4) sentiment analysis (4) zero-shot learning (4) natural language inference (4) natural language processing (3) transfer learning (3) text generation (3) neural machine translation (3)

Papers

The Anthropology of Food: How NLP can Help us Unravel the Food cultures of the World EACL 2026
Do LLMs model human linguistic variation? A case study in Hindi-English Verb code-mixing EACL 2026
Nanda Family: Open-Weights Generative Large Language Models for Hindi EACL 2026
Viability of Machine Translation for Healthcare in Low-Resourced Languages EMNLP 2025
All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages CVPR 2025
CULTURALLY YOURS: A Reading Assistant for Cross-Cultural Content COLING 2025
Women, Infamous, and Exotic Beings: A Comparative Study of Honorific Usages in Wikipedia and LLMs for Bengali and Hindi EMNLP 2025
An Interdisciplinary Approach to Human-Centered Machine Translation EMNLP 2025
Music for All: Representational Bias and Cross-Cultural Adaptability of Music Generation Models NAACL 2025
Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability NAACL 2025
SMAB: MAB based word Sensitivity Estimation Framework and its Applications in Adversarial Text Generation NAACL 2025
Reading between the Lines: Can LLMs Identify Cross-Cultural Communication Gaps? NAACL 2025
Meta-Cultural Competence: Climbing the Right Hill of Cultural Awareness NAACL 2025
To Generate or Discriminate? Methodological Considerations for Measuring Cultural Alignment in LLMs IJCNLP 2025
LITMUS++ : An Agentic System for Predictive Analysis of Low-Resource Languages Across Tasks and Models IJCNLP 2025
sPhinX: Sample Efficient Multilingual Instruction Fine-Tuning Through N-shot Guided Prompting ACL 2025
LITMUS++ : An Agentic System for Predictive Analysis of Low-Resource Languages Across Tasks and Models AACL 2025
To Generate or Discriminate? Methodological Considerations for Measuring Cultural Alignment in LLMs AACL 2025
Lost in Transcription, Found in Distribution Shift: Demystifying Hallucination in Speech Foundation Models ACL 2025
User Behavior Prediction as a Generic, Robust, Scalable, and Low-Cost Evaluation Strategy for Estimating Generalization in LLMs ACL 2025
Cultural Conditioning or Placebo? On the Effectiveness of Socio-Demographic Prompting EMNLP 2024
Do Moral Judgment and Reasoning Capability of LLMs Change with Language? A Study using the Multilingual Defining Issues Test EACL 2024
Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation? EACL 2024
Evaluating Large Language Models for Health-related Queries with Presuppositions ACL 2024
Ethical Reasoning and Moral Value Alignment of LLMs Depend on the Language We Prompt Them in COLING 2024
INMT-Lite: Accelerating Low-Resource Language Data Collection via Offline Interactive Neural Machine Translation COLING 2024
Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks COLING 2024
“They are uncultured”: Unveiling Covert Harms and Social Threats in LLM Generated Conversations EMNLP 2024
The Zeno’s Paradox of ‘Low-Resource’ Languages EMNLP 2024
Towards Measuring and Modeling “Culture” in LLMs: A Survey EMNLP 2024
Performance and Risk Trade-offs for Multi-word Text Prediction at Scale EACL 2023
Fairness in Language Models Beyond English: Gaps and Challenges EACL 2023
Ethical Reasoning over Moral Alignment: A Case and Framework for In-Context Ethical Policies in LLMs EMNLP 2023
DiTTO: A Feature Representation Imitation Approach for Improving Cross-Lingual Transfer EACL 2023
X-RiSAWOZ: High-Quality End-to-End Multilingual Dialogue Datasets and Few-shot Agents ACL 2023
Everything you need to know about Multilingual LLMs: Towards fair, performant and reliable models for languages of the world ACL 2023
Prover: Generating Intermediate Steps for NLI with Commonsense Knowledge Retrieval and Next-Step Prediction AACL 2023
Prover: Generating Intermediate Steps for NLI with Commonsense Knowledge Retrieval and Next-Step Prediction IJCNLP 2023
LLM-powered Data Augmentation for Enhanced Cross-lingual Performance EMNLP 2023
DUBLIN: Visual Document Understanding By Language-Image Network EMNLP 2023
On the Economics of Multilingual Few-shot Learning: Modeling the Cost-Performance Trade-offs of Machine Translated and Manual Data NAACL 2022
LITMUS Predictor: An AI Assistant for Building Reliable, High-Performing and Fair Multilingual NLP Systems AAAI 2022
Vector Space Interpolation for Query Expansion AACL 2022
Multilingual CheckList: Generation and Evaluation AACL 2022
The SUMEval 2022 Shared Task on Performance Prediction of Multilingual Pre-trained Language Models AACL 2022
Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models ACL 2022
SyMCoM - Syntactic Measure of Code Mixing A Study Of English-Hindi Code-Mixing ACL 2022
Beyond Static models and test sets: Benchmarking the potential of pre-trained models across tasks and languages ACL 2022
Global Readiness of Language Technology for Healthcare: What Would It Take to Combat the Next Pandemic? COLING 2022
On the Calibration of Massively Multilingual Language Models EMNLP 2022
Too Brittle to Touch: Comparing the Stability of Quantization and Distillation towards Developing Low-Resource MT Models EMNLP 2022
Vector Space Interpolation for Query Expansion IJCNLP 2022
“Diversity and Uncertainty in Moderation” are the Key to Data Selection for Multilingual Few-shot Transfer NAACL 2022
Sample-efficient Linguistic Generalizations through Program Synthesis: Experiments with Phonology Problems IJCNLP 2021
A Linguistic Annotation Framework to Study Interactions in Multilingual Healthcare Conversational Forums EMNLP 2021
Analyzing the Effects of Reasoning Types on Cross-Lingual Transfer Performance EMNLP 2021
Comparing Grammatical Theories of Code-Mixing EMNLP 2021
GCM: A Toolkit for Generating Synthetic Code-mixed Text EACL 2021
Use of Formal Ethical Reviews in NLP Literature: Historical Trends and Current Practices ACL 2021
Sample-efficient Linguistic Generalizations through Program Synthesis: Experiments with Phonology Problems ACL 2021
Use of Formal Ethical Reviews in NLP Literature: Historical Trends and Current Practices IJCNLP 2021
How Linguistically Fair Are Multilingual Pre-Trained Language Models? AAAI 2021
BERTologiCoMix: How does Code-Mixing interact with Multilingual BERT? EACL 2021
TaxiNLI: Taking a Ride up the NLU Hill EMNLP 2020
GLUECoS: An Evaluation Benchmark for Code-Switched NLP ACL 2020
The State and Fate of Linguistic Diversity and Inclusion in the NLP World ACL 2020
TaxiNLI: Taking a Ride up the NLU Hill CONLL 2020
INMT: Interactive Neural Machine Translation Prediction EMNLP 2019
Processing and Understanding Mixed Language Data IJCNLP 2019
INMT: Interactive Neural Machine Translation Prediction IJCNLP 2019
Processing and Understanding Mixed Language Data EMNLP 2019
Phone Merging For Code-Switched Speech Recognition ACL 2018
Word Embeddings for Code-Mixed Language Processing EMNLP 2018
Language Modeling for Code-Mixing: The Role of Linguistic Theory based Synthetic Data ACL 2018
Accommodation of Conversational Code-Choice ACL 2018
Estimating Code-Switching on Twitter with a Novel Generalized Word-Level Language Detection Technique ACL 2017
All that is English may be Hindi: Enhancing language identification through automatic ranking of the likeliness of word borrowing in social media EMNLP 2017
Understanding Language Preference for Expression of Opinion and Sentiment: What do Hindi-English Speakers do on Twitter? EMNLP 2016
Automatic Discovery of Adposition Typology COLING 2014
POS Tagging of English-Hindi Code-Mixed Social Media Content EMNLP 2014
Crowd Prefers the Middle Path: A New IAA Metric for Crowdsourcing Reveals Turker Biases in Query Segmentation ACL 2013
Global topology of word co-occurrence networks: Beyond the two-regime power-law COLING 2010
Large-Coverage Root Lexicon Extraction for Hindi EACL 2009
Syntax is from Mars while Semantics from Venus! Insights from Spectral Analysis of Distributional Similarity Networks IJCNLP 2009
Syntax is from Mars while Semantics from Venus! Insights from Spectral Analysis of Distributional Similarity Networks ACL 2009
Discovering Global Patterns in Linguistic Networks through Spectral Analysis: A Case Study of the Consonant Inventories EACL 2009
Invited Talk: Breaking the Zipfian Barrier of NLP IJCNLP 2008
Modeling the Structure and Dynamics of the Consonant Inventories: A Complex Network Approach COLING 2008
Social Network Inspired Models of NLP and Language Evolution IJCNLP 2008
Redundancy Ratio: An Invariant Property of the Consonant Inventories of the World’s Languages ACL 2007
Analysis and Synthesis of the Distribution of Consonants over Languages: A Complex Network Approach ACL 2006
Analysis and Synthesis of the Distribution of Consonants over Languages: A Complex Network Approach COLING 2006