A Taxonomy of Bias-Causing Ambiguities in Machine Translation

Michal Měchura

NAACL 2022

Abstract

This paper introduces a taxonomy of phenomena that cause bias in machine translation, covering gender bias (people being male and/or female), number bias (singular you versus plural you), and formality bias (informal you versus formal you). Our taxonomy is a formalism for describing situations in machine translation where the source text leaves some of these properties unspecified (e.g. does not say whether doctor is male or female) but the target language requires the property to be specified (e.g. because it does not have a gender-neutral word for doctor). The formalism described here is used internally by a web-based tool we have built for detecting and correcting bias in the output of any machine translator.
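
To make the idea of an unspecified property concrete, the following is a minimal illustrative sketch in Python. It is not the formalism defined in the paper, nor the internal representation of the tool mentioned above; the names `Axis`, `Ambiguity`, and `describe` are hypothetical and chosen only for this example. It assumes an ambiguity can be recorded as a source word, the property left open (gender, number, or formality), and the values the target language would force a choice between.

```python
from dataclasses import dataclass
from enum import Enum


class Axis(Enum):
    """The three properties named in the abstract (hypothetical encoding)."""
    GENDER = "gender"        # male versus female
    NUMBER = "number"        # singular "you" versus plural "you"
    FORMALITY = "formality"  # informal "you" versus formal "you"


@dataclass(frozen=True)
class Ambiguity:
    """One property the source leaves open but the target must specify."""
    source_word: str   # word in the source text, e.g. "doctor"
    axis: Axis         # which property is unspecified
    options: tuple     # values the target language forces a choice between


def describe(ambiguity: Ambiguity) -> str:
    """Render a one-line, human-readable summary of an ambiguity."""
    choices = " or ".join(ambiguity.options)
    return f"'{ambiguity.source_word}': {ambiguity.axis.value} unspecified ({choices})"


# Examples taken from the abstract: "doctor" (gender) and "you" (number).
doctor = Ambiguity("doctor", Axis.GENDER, ("male", "female"))
you = Ambiguity("you", Axis.NUMBER, ("singular", "plural"))

print(describe(doctor))
print(describe(you))
```

A record like this only names the ambiguity; detecting it in a real sentence and rewriting the translation accordingly are separate steps that this sketch does not attempt.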
