Anna Korhonen

145 papers · 2000–2026 · 10 conferences · across top CS/AI conferences

Achievements

🌍 Conference Polyglot (10) 🗺️ Taxonomy Completionist (15) 🧭 Keyword Pioneer 🌉 Interdisciplinary Bridge 🏃 Academic Marathon (25)

🌟 Keyword Trendsetter Combo (5) 🏠 Conference Loyalist (40) 🔬 Deep Specialist (37) 🏆 Keyword Champion (2) 🧬 Topic Evolution 🤝 Dynamic Duo (74) ⚡ Prolific Year (16) ❓ The Questioner (7) 🗃️ Keyword Collector (375) 💎 Century Club (144) 📈 Trend Setter 🔥 Unstoppable (10) 🚀 Conference Pioneer

Conferences

EMNLP (48) ACL (40) COLING (19) NAACL (12) IJCNLP (9) EACL (7) CONLL (6) SEMEVAL (2) AAAI (1) NIPS (1)

Top co-authors

Ivan Vulić (74) Roi Reichart (26) Goran Glavaš (22) Edoardo Maria Ponti (20) Nigel Collier (12) Fangyu Liu (9) Lin Sun (8) Qianchu Liu (8) Diana McCarthy (8) Yi Zhu (6)

Research topics

Linguistics (1)

Keywords

cross-lingual transfer (23) large language model (17) zero-shot learning (15) word embedding (11) few-shot learning (10) multilingual nlp (10) bilingual lexicon induction (10) low-resource language (9) pretrained language model (9) task-oriented dialogue (7) transfer learning (7) representation learning (7) language model (7) contrastive learning (6) cross-lingual word embedding (6) lexical semantics (6) zero-shot transfer (5) text classification (5) cross-lingual embedding (4) natural language inference (4)

Papers

When Meanings Meet: Investigating the Emergence and Quality of Shared Concept Spaces during Multilingual Language Model Training EACL 2026
Iterative Multilingual Spectral Attribute Erasure EMNLP 2025
Explainability and Interpretability of Multilingual Large Language Models: A Survey EMNLP 2025
Quantifying Language Disparities in Multilingual Large Language Models EMNLP 2025
Threading the Needle: Reweaving Chain-of-Thought Reasoning to Explain Human Label Variation EMNLP 2025
A Rose by Any Other Name: LLM-Generated Explanations Are Good Proxies for Human Explanations to Collect Label Distributions on NLI ACL 2025
Improving Preference Extraction In LLMs By Identifying Latent Knowledge Through Classifying Probes ACL 2025
Cultural Learning-Based Culture Adaptation of Language Models ACL 2025
Large Language Models are Miscalibrated In-Context Learners ACL 2025
TopViewRS: Vision-Language Models as Top-View Spatial Reasoners EMNLP 2024
Fairer Preferences Elicit Improved Human-Aligned Large Language Model Judgments EMNLP 2024
Spectral Editing of Activations for Large Language Model Alignment NIPS 2024
Investigating the Potential of Task Arithmetic for Cross-Lingual Transfer EACL 2024
Reranking Overgenerated Responses for End-to-End Task-Oriented Dialogue Systems COLING 2024
LoSST-AD: A Longitudinal Corpus for Tracking Alzheimer’s Disease Related Changes in Spontaneous Speech COLING 2024
Self-Augmented In-Context Learning for Unsupervised Word Translation ACL 2024
Can Rule-Based Insights Enhance LLMs for Radiology Report Classification? Introducing the RadPrompt Methodology. ACL 2024
Your Prompt Is My Command: On Assessing the Human-Centred Generality of Multimodal Models (Abstract Reprint) AAAI 2024
DIALIGHT: Lightweight Multilingual Development and Evaluation of Task-Oriented Dialogue Systems with Large Language Models NAACL 2024
SQATIN: Supervised Instruction Tuning Meets Question Answering for Improved Dialogue NLU NAACL 2024
Are Large Language Models Temporally Grounded? NAACL 2024
“Seeing the Big through the Small”: Can LLMs Approximate Human Judgment Distributions on NLI from a Few Explanations? EMNLP 2024
LongForm: Effective Instruction Tuning with Reverse Instructions EMNLP 2024
TurkishMMLU: Measuring Massive Multitask Language Understanding in Turkish EMNLP 2024
SynthEval: Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists EMNLP 2024
Quantifying the Dialect Gap and its Correlates Across Languages EMNLP 2023
Translation-Enhanced Multilingual Text-to-Image Generation ACL 2023
Cross-Lingual Transfer with Target Language-Ready Task Adapters ACL 2023
Multi3NLU++: A Multilingual, Multi-Intent, Multi-Domain Dataset for Natural Language Understanding in Task-Oriented Dialogue ACL 2023
Distilling Efficient Language-Specific Models for Cross-Lingual Transfer ACL 2023
Can Pretrained Language Models (Yet) Reason Deductively? EACL 2023
Probing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders EACL 2023
Delving Deeper into Cross-lingual Visual Question Answering EACL 2023
Unifying Cross-Lingual Transfer across Scenarios of Resource Scarcity EMNLP 2023
Transfer-Free Data-Efficient Multilingual Slot Labeling EMNLP 2023
A Systematic Study of Performance Disparities in Multilingual Task-Oriented Dialogue Systems EMNLP 2023
Detecting and Mitigating Hallucinations in Multilingual Summarisation EMNLP 2023
On Bilingual Lexicon Induction with Large Language Models EMNLP 2023
Language-Agnostic Bias Detection in Language Models with Bias Probing EMNLP 2023
Survival of the Most Influential Prompts: Efficient Black-Box Prompt Search via Clustering and Pruning EMNLP 2023
BAD-X: Bilingual Adapters Improve Zero-Shot Cross-Lingual Transfer NAACL 2022
Data Augmentation and Learned Layer Aggregation for Improved Multilingual Language Understanding in Dialogue ACL 2022
Improving Word Translation via Two-Stage Contrastive Learning ACL 2022
Composable Sparse Fine-Tuning for Cross-Lingual Transfer ACL 2022
Improving Bilingual Lexicon Induction with Cross-Encoder Reranking EMNLP 2022
Measuring Context-Word Biases in Lexical Semantic Datasets EMNLP 2022
Verb Knowledge Injection for Multilingual Event Processing ACL 2021
A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters ACL 2021
Learning Domain-Specialised Representations for Cross-Lingual Biomedical Entity Linking IJCNLP 2021
Verb Knowledge Injection for Multilingual Event Processing IJCNLP 2021
A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters IJCNLP 2021
LexFit: Lexical Fine-Tuning of Pretrained Language Models IJCNLP 2021
LexFit: Lexical Fine-Tuning of Pretrained Language Models ACL 2021
Improving Machine Translation of Rare and Unseen Word Senses EMNLP 2021
MAD-G: Multilingual Adapter Generation for Efficient Cross-Lingual Transfer EMNLP 2021
MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models EMNLP 2021
AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples EMNLP 2021
Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders EMNLP 2021
Combining Deep Generative Models and Multi-lingual Pretraining for Semi-supervised Document Classification EACL 2021
MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models CONLL 2021
Learning Domain-Specialised Representations for Cross-Lingual Biomedical Entity Linking ACL 2021
Multidirectional Associative Optimization of Function-Specific Word Representations ACL 2020
Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity COLING 2020
Emergent Communication Pretraining for Few-Shot Machine Translation COLING 2020
Manual Clustering and Spatial Arrangement of Verbs for Multilingual Evaluation and Typology Analysis COLING 2020
SemEval-2020 Task 2: Predicting Multilingual and Cross-Lingual (Graded) Lexical Entailment COLING 2020
XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning EMNLP 2020
The Secret is in the Spectra: Predicting Cross-lingual Task Performance with Spectral Similarity Measures EMNLP 2020
Towards Better Context-aware Lexical Semantics: Adjusting Contextualized Representations through Static Anchors EMNLP 2020
Probing Pretrained Language Models for Lexical Semantics EMNLP 2020
SemEval-2020 Task 2: Predicting Multilingual and Cross-Lingual (Graded) Lexical Entailment SEMEVAL 2020
Improving Bilingual Lexicon Induction with Unsupervised Post-Processing of Monolingual Word Vector Spaces ACL 2020
Classification-Based Self-Learning for Weakly Supervised Bilingual Lexicon Induction ACL 2020
Investigating Word-Class Distributions in Word Vector Spaces ACL 2020
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics ACL 2019
Enhancing biomedical word embeddings by retrofitting to verb clusters ACL 2019
A Systematic Study of Leveraging Subword Information for Learning Word Representations NAACL 2019
Investigating Cross-Lingual Alignment Methods for Contextualized Embeddings with Token-Level Evaluation CONLL 2019
Show Some Love to Your n-grams: A Bit of Progress and Stronger n-gram Language Modeling Baselines NAACL 2019
Semi-Supervised Bootstrapping of Dialogue State Trackers for Task-Oriented Modelling IJCNLP 2019
Proceedings of TyP-NLP: The First Workshop on Typology for Polyglot NLP ACL 2019
Semi-Supervised Bootstrapping of Dialogue State Trackers for Task-Oriented Modelling EMNLP 2019
Cross-lingual Semantic Specialization via Lexical Relation Induction EMNLP 2019
Towards Zero-shot Language Modeling EMNLP 2019
Do We Really Need Fully Unsupervised Cross-Lingual Embeddings? EMNLP 2019
Do We Really Need Fully Unsupervised Cross-Lingual Embeddings? IJCNLP 2019
Towards Zero-shot Language Modeling IJCNLP 2019
Bayesian Learning for Neural Dependency Parsing NAACL 2019
Cross-lingual Semantic Specialization via Lexical Relation Induction IJCNLP 2019
On the Importance of Subword Information for Morphological Tasks in Truly Low-Resource Languages CONLL 2019
Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources NAACL 2018
Adversarial Propagation and Zero-Shot Cross-Lingual Transfer of Word Vector Specialization EMNLP 2018
On the Relation between Linguistic Typology and (Limitations of) Multilingual Language Modeling EMNLP 2018
Proceedings of the 22nd Conference on Computational Natural Language Learning CONLL 2018
Isomorphic Transfer of Syntactic Structures in Cross-Lingual NLP ACL 2018
Automatic Selection of Context Configurations for Improved Class-Specific Word Representations CONLL 2017
Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules ACL 2017
Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation EMNLP 2017
Evaluation by Association: A Systematic Study of Quantitative Word Association Evaluation EACL 2017
On the Role of Seed Lexicons in Learning Bilingual Word Embeddings ACL 2016
SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity EMNLP 2016
Anchoring and Agreement in Syntactic Annotations EMNLP 2016
Learning Distributed Representations of Sentences from Unlabelled Data NAACL 2016
Robust Text Classification for Sparsely Labelled Data Using Multi-level Embeddings COLING 2016
Survey on the Use of Typological Information in Natural Language Processing COLING 2016
Is “Universal Syntax” Universally Useful for Learning Distributed Word Representations? ACL 2016
An Unsupervised Model for Instance Level Subcategorization Acquisition EMNLP 2014
Learning Abstract Concept Embeddings from Multi-Modal Data: Since You Probably Can’t See What I Mean EMNLP 2014
CRAB 2.0: A text mining tool for supporting literature review in chemical cancer risk assessment COLING 2014
Improving Multi-Modal Representations Using Image Dispersion: Why Less is Sometimes More ACL 2014
Concreteness and Subjectivity as Dimensions of Lexical Meaning ACL 2014
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing EMNLP 2013
Diathesis alternation approximation for verb clustering ACL 2013
Improved Lexical Acquisition through DPP-based Verb Clustering ACL 2013
A Tensor-based Factorization Model of Semantic Compositionality NAACL 2013
Improved Information Structure Analysis of Scientific Documents Through Discourse and Lexical Constraints NAACL 2013
Document and Corpus Level Inference For Unsupervised and Transductive Learning of Information Structure of Scientific Documents COLING 2012
Unsupervised Metaphor Paraphrasing using a Vector Space Model COLING 2012
CRAB Reader: A Tool for Analysis and Visualization of Argumentative Zones in Scientific Literature COLING 2012
Modelling selectional preferences in a lexical hierarchy SEMEVAL 2012
Learning Syntactic Verb Frames using Graphical Models ACL 2012
Using Argumentative Zones for Extractive Summarization of Scientific Articles COLING 2012
Multi-way Tensor Factorization for Unsupervised Lexical Acquisition COLING 2012
Hierarchical Verb Clustering Using Graph Factorization EMNLP 2011
A Weakly-supervised Approach to Argumentative Zoning of Scientific Documents EMNLP 2011
Probabilistic models of similarity in syntactic context EMNLP 2011
Latent Vector Weighting for Word Meaning in Context EMNLP 2011
Investigating the cross-linguistic potential of VerbNet-style classification COLING 2010
Exploring variation across biomedical subdomains COLING 2010
Metaphor Identification Using Verb and Noun Clustering COLING 2010
Improving Verb Clustering with Automatically Acquired Selectional Preferences EMNLP 2009
VerbNet overview, extensions, mappings and applications NAACL 2009
Automatic Classification of English Verbs Using Rich Syntactic Features IJCNLP 2008
The Choice of Features for Classification of Verbs in Biomedical Texts COLING 2008
A System for Large-Scale Acquisition of Verbal, Nominal and Adjectival Subcategorization Frames from Corpora ACL 2007
Automatic Classification of Verbs in Biomedical Texts ACL 2006
Automatic Classification of Verbs in Biomedical Texts COLING 2006
Automatic Acquisition of Adjectival Subcategorization from Corpora ACL 2005
Improving Subcategorization Acquisition Using Word Sense Disambiguation ACL 2003
Clustering Polysemic Subcategorization Frame Distributions Semantically ACL 2003
On the Robustness of Entropy-Based Similarity Measures in Evaluation of Subcategorization Acquisition Systems CONLL 2002
Statistical Filtering and Subcategorization Frame Acquisition ACL 2000
Statistical Filtering and Subcategorization Frame Acquisition EMNLP 2000
Using Semantically Motivated Estimates to Help Subcategorization Acquisition EMNLP 2000
Using Semantically Motivated Estimates to Help Subcategorization Acquisition ACL 2000