Vector embeddings, whether context-sensitive or context-oblivious, necessarily capture the distributional properties of the language they're trained on. That is, of course, the key to their success, but sometimes they capture statistics that you don't want reflected in your models, either because they're historical (professions that used to be all male, say, and you don't want to capture that) or because they build in biases that you don't want. Now, the word bias is a terrible word because it means lots of different things, so we'll try to be careful about what we mean in each of these cases.

So let's just look at the genderedness of different embeddings. Here are the embeddings of a bunch of different words, like nurse, secretary, carpenter, and engineer, and on the x-axis is the percentage of women in each occupation in the US: secretaries very high, carpenters very low, engineers quite male-dominated, housekeepers and dancers quite female-dominated. So this is an empirical observation about the world. Now let's look at the language embeddings. We can map the embedding of each of these words onto a dimension between, for example, he and she: we'll call a vector embedding that's closer to he more male, and one that's closer to she more female. So what do we see? Nurse is, of these words, the most female-embedded one, and carpenter the most male-embedded one. In general, the femaleness of the vector embeddings correlates fairly strongly with the femaleness of the profession empirically in the world.

Some cases seem rather misguided. Nurse is embedded as far more female than the profession actually is in the real world; there are an increasing number of male nurses, and I think this is partly historical, in that nursing used to be more heavily female. Other words, like secretary, are fairly neutral, very slightly male. Why would this be, even though 85 or 90 percent of all secretaries are female? Think of how the word secretary is used in English: it's the secretary of state and the secretary of the treasury. Even though some of those are women, a bunch of them are men, and historically, before the Hillary Clinton era (it's been about 20 years now), most secretaries of state were male. So this word probably reflects different meanings and usages of secretary. Note, then, that the words reflect the distribution in the world because they reflect the distribution in text, and this may be something that you want in your model, or it may be something that you don't.
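To make that he-she dimension concrete, here is a minimal sketch of the projection, assuming the gensim library and its downloadable word2vec-google-news-300 vectors (the choice of vector set is my assumption, not something specified in the lecture):

```python
# A minimal sketch: score words along a he-she axis, assuming gensim
# and its downloadable word2vec-google-news-300 vectors.
import numpy as np
import gensim.downloader

vectors = gensim.downloader.load("word2vec-google-news-300")

# The gender direction: the vector pointing from "he" toward "she".
gender_axis = vectors["she"] - vectors["he"]
gender_axis /= np.linalg.norm(gender_axis)

occupations = ["nurse", "secretary", "housekeeper", "dancer",
               "carpenter", "engineer"]

for word in occupations:
    w = vectors[word] / np.linalg.norm(vectors[word])
    # Positive scores lean toward "she", negative toward "he".
    print(f"{word:12s} {np.dot(w, gender_axis):+.3f}")
```

With most pretrained vector sets you should see nurse toward the "she" end and carpenter toward the "he" end, matching the pattern described above.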
Similarly, BERT embeddings have biases reflecting the language they were trained on: lots of the web, lots of other things. We take a sentence pair, not a very nice one, sorry: "Jamal raped Leslie in prison," then the separator token, then "he was part of a gang." And you can see where the attention head goes: "he" attends, in the self-attention, to Jamal. Okay, that makes sense, even though Jamal and Leslie could both be male names, so "he" could in principle refer to either. But let's do a funny probe: let's just swap the two names, "Leslie raped Jamal in prison," and now "he" still attends mostly to Jamal, the victim, not the perpetrator (a rough sketch of this probe appears at the end of this section). Why is BERT treating these two quasi-interchangeable names differently? Well, Jamal is much more of an African American name; it shows up a lot more with African Americans, and we get different associations for "he was part of the gang." The model is assuming that Jamal is more likely to be part of the gang, so the "he" is more likely to refer to the guy who's in the gang than to the other guy. Black names are more associated with gangs than white names like Leslie. Okay, that reflects the language the model was trained on, but perhaps not a reflection that you want to perpetuate if you're, say, writing software to decide who to hire as a programmer at Google, and you're going to use a pre-trained model that brings in all of these biases, biases reflecting, again, the language it was trained on.

So there's a very active field of research right now on de-biasing models. The simplest version, if you think of a simple vector embedding like word2vec or fastText, is to take whatever dimension goes between he and she and project it out: take your vector embeddings and remove, in some fashion, the gendered direction, leaving everything else in. And what people find is that this often helps in translation. A number of languages, like Spanish, do what's called pro-drop: they drop the pronoun. So an article about some person, like Frida Kahlo, will often have sentences where in English you would say she or he; you'd say she for Frida, because Frida was a woman. Chinese is similar: you don't necessarily need to specify whether someone is male or female. If you then have a computer translate from Spanish to English or Chinese to English, it tends to reflect the overall biases of the data it was trained on, the statistical background, and it ignores information that any human reader would use. Setting aside the picture, which gives a hint as to her gender, there is enough in the piece itself: she was married to Diego Rivera at a time when only men and women were married in Mexico, which is a hint that she's a woman, and lots of other facts in the document say that she's a woman. Nonetheless, Google Translate as of a couple of years ago would translate some of these pronouns as she and some as he, while a modern translation system with gender-bias removal will do a much better job of filling in all the missing she's. So de-biasing the models can help remove the base-rate assumption that most people in some given field (painting, writing, whatever) are male, and replace it with models that do a better job of taking account of the actual cues in the text.
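Here is a minimal sketch of that simplest projection-out idea (this is roughly the first step of hard debiasing in the style of Bolukbasi et al.; full methods do more, such as re-equalizing pairs like he/she):

```python
import numpy as np

def remove_gender_component(vec, gender_axis):
    """Project out the gender direction from a word vector.

    The simplest form of debiasing: subtract the component of the
    embedding that lies along the he-she axis, leaving the rest intact.
    """
    axis = gender_axis / np.linalg.norm(gender_axis)
    return vec - np.dot(vec, axis) * axis

# Hypothetical usage with the vectors and gender_axis from the earlier
# sketch: after projection, "nurse" no longer leans toward he or she,
# i.e. its dot product with the gender axis is ~0.
# debiased = remove_gender_component(vectors["nurse"], gender_axis)
```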
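And going back to the Jamal/Leslie probe: here is a rough sketch of how you might inspect where BERT's attention from "he" goes, assuming the Hugging Face transformers library and bert-base-uncased. The original probe looked at a particular attention head; averaging over all layers and heads, as done here, is just a crude stand-in.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

def attention_from_he(sentence_a, sentence_b):
    # Encode the two sentences as a pair, separated by [SEP].
    inputs = tokenizer(sentence_a, sentence_b, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    he_idx = tokens.index("he")
    # Average the attention paid by "he" over all layers and heads.
    attn = torch.stack(outputs.attentions).mean(dim=(0, 2))[0, he_idx]
    for token, weight in zip(tokens, attn.tolist()):
        print(f"{token:12s} {weight:.3f}")

attention_from_he("Jamal raped Leslie in prison.", "He was part of a gang.")
attention_from_he("Leslie raped Jamal in prison.", "He was part of a gang.")
```

If the effect described above holds, the attention mass from "he" shifts toward Jamal's token(s) in both versions, whether Jamal is the perpetrator or the victim.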