I'm going to talk today about gender bias in machine translation, and the title of my talk is "That's What She Said"; I hope that will become a bit clearer during the talk. First of all, I want to say a little about terminology, because I do think words matter. As most of you will agree, and according to most dictionary definitions, gender is non-binary and non-static; it exists along a continuum because it is a socially constructed concept. Today, however, I'll be talking about gender within the field of linguistics, where different types of gender come into play. I'll first define natural and grammatical gender so that it's clear what I'm talking about. In linguistics, natural gender is quite easy to grasp: it's either masculine, feminine, or neuter, and it's based on the sex (or, for neuter, the lack of sex) of the referent of a noun. For instance, the word girl in English is feminine, and you refer to it using the feminine pronoun she; similarly, boy is masculine (he) and table is neuter (it). This is the only kind of gender English actually has, apart from some exceptions. Many other languages, however, also have grammatical gender, which is a bit more difficult to grasp because it's often used as a synonym for noun class. Some languages distinguish between masculine and feminine, some between masculine, feminine, and neuter, but there are also languages that distinguish, for instance, between animate and inanimate, or even languages with more than three genders: Swahili, for example, has 16 grammatical genders, which actually means there are 16 noun classes. Now, how does grammatical gender manifest itself? It manifests itself differently in different languages: through inflections on nouns, agreement with determiners, pronouns, et cetera.
So there are different ways, and it's also important to note that grammatical gender and natural gender don't necessarily agree. An example of that is das Mädchen in German, which is grammatically neuter but refers to a girl. The same holds in Dutch: het meisje is neuter but refers to a feminine referent. As an extra illustration, here is a passage from Mark Twain's book A Tramp Abroad, which has a section on the awful German language. He explains how awful German is by literally translating part of a German text, and you get something like: "the rain, how he pours, and the hail, how he rattles", because rain, hail, and all the other nouns in German have a particular gender, which is hard when you're trying to learn or understand a new language. A second illustration of grammatical gender comes from David Sedaris's book Me Talk Pretty One Day; the paragraph here is about an American who is trying to learn French. He basically tries to hack the system, because he doesn't get the grammatical gender idea, or rather finds it very hard, since you need to learn the gender of every word by heart. His hack is to refer to everything in the plural, because although the singular distinguishes le and la, for nouns in the plural it's always les, so you don't need to know the gender. He states in this paragraph that this actually solved a lot of problems for him, because he's no longer embarrassing himself in front of his kids and in front of French people. Of course, it somewhat shifted the problem to his fridge, because now whenever he goes shopping he needs to buy everything in plural. And I think that brings us to the actual topic: machine translation. So what's actually the issue with all of this?
Well, if you take a very small, simple sentence like "I'm happy", you basically need to know the gender of the speaker in order to translate it correctly. In French it can become either je suis heureux, with a masculine ending, or je suis heureuse when a woman or girl is speaking, or someone who prefers the feminine form. The current way this is dealt with, or perhaps not dealt with, is that we feed systems a lot of data, and the idea is that we can let the data decide. The problem with that: when we analysed one of the biggest and most popular corpora for machine translation, Europarl, which contains the proceedings of the European Parliament, we saw that only 30% of the speakers are female. This of course has consequences for the translations and for which endings the system prefers to generate. Aside from that, we use word embeddings. Word embeddings are something that has really revolutionized the field of NLP. They are based on a quite old idea from distributional semantics, captured by the quote you see here, "you shall know a word by the company it keeps": you can characterize words by the words they often appear with. As I said, this has really revolutionized the field and has led to great improvements in NLP, but it also has some side effects. Here, for instance, you see the adjectives that are closest to the words he and she, and as you can see, there is some bias in the associations that this makes. You also see that more words are associated with men than with women; the bigger the word, the stronger the connection. So what does that mean for machine translation? Well, if we look at the translations that machine translation systems actually generate, we see things like this.
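The kind of embedding association described above can be sketched with a few toy vectors. Everything below is illustrative: the words, the 3-dimensional vectors, and their geometry are hand-made assumptions, not values from a real trained model, which would use vectors learned from large corpora.

```python
# Minimal sketch: word-embedding associations via cosine similarity.
# The vectors are toy, hand-made examples whose geometry mimics the
# biased associations described in the talk; they are NOT real embeddings.
import numpy as np

def cosine(u, v):
    """Cosine similarity: close to 1 when two words share many contexts."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

emb = {
    "he":        np.array([1.0, 0.1, 0.0]),
    "she":       np.array([-1.0, 0.1, 0.0]),
    "brilliant": np.array([0.9, 0.5, 0.1]),   # toy: leans toward "he"
    "lovely":    np.array([-0.9, 0.5, 0.1]),  # toy: leans toward "she"
}

for adj in ("brilliant", "lovely"):
    print(adj,
          "he:",  round(cosine(emb[adj], emb["he"]), 2),
          "she:", round(cosine(emb[adj], emb["she"]), 2))
```

With real embeddings you would load pretrained vectors (for instance with gensim) and query nearest neighbours the same way; the point here is only that "bias" shows up as asymmetric similarity scores.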
"I'm beautiful" is translated into the masculine form, je suis beau, in French, and "I'm clever" also into the masculine form. But when you combine them in a different way, as "I'm beautiful but not clever", you suddenly get a translation where the speaker is female in French. I want to stress that none of these translations is wrong, but you can see that there are some biases. Sometimes, however, this also leads to actual mistakes. For instance, here you see a sentence in Bulgarian that is feminine because of its ending, and when you translate it, you get a masculine version in French. So this is an actual mistake, probably because Google uses a pivot language like English in between and so loses the gender information. Another example is "the speaker is a woman", which is translated into l'orateur est une femme, where orateur is the masculine form of "speaker" while femme is, of course, "woman". So there is a disconnect here, and something similar happens with "my wife" and some other examples. Of course, this doesn't only happen for sentences that should agree with a feminine form; it can also happen the other way around, as in "the nurse is a man", because in the corpora there are usually more feminine occurrences of the word nurse. So what are the proposed solutions? My work from 2018 was actually the first to address this issue: we appended a tag to the text during training indicating the gender of the speaker, and this way we could actually control the output of the sentences we were generating. This only covers the gender of the speaker, of course, so it's quite limited. Soon after my work, there was another work by Moryossef et al. who did something similar: they prepended "she said" or "he said" to the sentences. And I think that's what inspired the title of my talk.
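The speaker-gender tagging idea can be sketched as a simple preprocessing step. Note that the token names `<2F>`/`<2M>` and the function below are my own illustrative choices, not necessarily the exact format used in either paper; the point is only that an artificial token carrying the speaker's gender is prepended to each source sentence so the model can condition on it.

```python
# Hedged sketch of gender-tag preprocessing for MT training data.
# Token names "<2F>"/"<2M>" are illustrative assumptions.
def tag_source(sentence: str, speaker_gender: str) -> str:
    """Prepend an artificial speaker-gender token to a source sentence."""
    token = {"female": "<2F>", "male": "<2M>"}[speaker_gender]
    return f"{token} {sentence}"

print(tag_source("I'm happy", "female"))  # <2F> I'm happy
print(tag_source("I'm happy", "male"))    # <2M> I'm happy
```

At inference time the user (or an upstream component) supplies the tag, which is what gives the control over heureux versus heureuse mentioned above.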
These two methods offer you some control, but aside from that, other solutions have been proposed, like diversifying the data sets. This can be done by, for instance, duplicating sentences; it's called counterfactual data augmentation, and it means that if you have a sentence in the masculine form, you also create an equivalent in the feminine form. Here, too, there are some issues, because this doesn't really offer any control over the translations. Another quite interesting solution is debiasing the embeddings using the vectors themselves. You basically look at the difference between word pairs like female and male, or she and he, and you identify that subspace, which is then the bias. You can do that for multiple word pairs, average them, and then neutralize words: the word receptionist, for instance, shouldn't be biased towards a specific gender, so you can remove the gender dimension from its vector and get a neutralized version of the word. That brings me, I think, to the closing remarks already. One question I get very often is: why is this relevant? As a matter of fact, while I was preparing the talk and this particular slide, I got exactly this question on LinkedIn, because a teaser about my talk had been shared (I've hidden the name of the person): why is this relevant, and doesn't this just reflect the world as it is? Well, I think I have some answers to that. The first is that it actually leads to translation errors; I've given some examples of that. Second, the data we're using is outdated: we use big data sets that are usually already older, so we're not actually encoding the present situation. Third, there are examples of real-world consequences of using NLP technology; for instance, on the slide: "Amazon scraps secret AI recruiting tool that showed bias against women".
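The two mitigation ideas just mentioned, counterfactual data augmentation and embedding neutralization, can both be sketched in a few lines. This is a simplified illustration under stated assumptions: the swap list covers only a few pronouns, and the 2-dimensional vectors for he, she, and receptionist are toy values, not from a real model.

```python
# Two simplified mitigation sketches; word lists and vectors are toy assumptions.
import numpy as np

# 1) Counterfactual data augmentation: duplicate each training sentence
#    with gendered words swapped, so both variants appear in the data.
SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his"}

def augment(sentence):
    swapped = " ".join(SWAPS.get(w, w) for w in sentence.split())
    return [sentence, swapped]

# 2) Neutralization: remove the gender direction from the vector of a
#    word (like "receptionist") that should be gender-neutral.
def neutralize(word_vec, gender_dir):
    gender_dir = gender_dir / np.linalg.norm(gender_dir)
    return word_vec - np.dot(word_vec, gender_dir) * gender_dir

he, she = np.array([1.0, 0.2]), np.array([-1.0, 0.2])
gender_dir = he - she                     # one pair's estimate of the bias subspace
receptionist = np.array([-0.6, 0.8])      # toy vector, leaning "feminine"
fixed = neutralize(receptionist, gender_dir)

print(augment("he is clever"))            # ['he is clever', 'she is clever']
print(np.dot(fixed, gender_dir))          # ~0: gender component removed
```

In practice the gender direction is averaged over many word pairs rather than taken from a single he/she pair, and only words that should be neutral are projected out; definitional words like mother keep their gender component.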
A fourth element is that, on top of the bias that is in the models, there is an indication of an algorithmic bias, in the sense that the systems seem to exacerbate the situation, leading to a loss of diversity and richness in language in general; this is a paper we published very recently. Fifth, I do believe in the power of language: I think language can actually affect the way we think, or the other way around, which is not my idea but something that has been discussed by linguists for a long time. And the last point is that I think AI could, and probably should, be a force for inclusion. So even if our world is biased at the moment and we communicate biases consciously or unconsciously, I think we can at least try to do a better job. And yeah, that's it. Thank you very much for your attention.