Thesis Proposal: Targeted and Unified Cross-Lingual Unlearning from Multilingual Language Models

Jan Bronec; Jindřich Helcl

ACL 2026

Abstract

As large language models (LLMs) trained on massive corpora scraped from the web can reproduce sensitive and copyright-protected data, the field of machine unlearning has emerged to address the resulting ethical and legal concerns. While previous research has provided a unified evaluation of LLM unlearning methods, this unification remains constrained to English-only models and datasets. We aim to address the prevailing fragmentation in recent cross-lingual unlearning research by extending existing unified benchmarks with multilingual data. To that end, we plan to compile a dataset of parallel translations of question-answer pairs consisting of real-world facts and synthetic personally identifiable information. Moreover, we will focus on mitigating model degradation during unlearning by selectively editing only those layers that contain the given knowledge.

Topics

Natural Language Processing > Resources & Methods > Multilingual NLP
Deep Learning > Models > Large Language Models
Machine Learning > Learning Types > Machine Unlearning

Keywords

knowledge editing, machine unlearning, model degradation, personally identifiable information, multilingual language model, cross-lingual unlearning
