Hi everyone, thanks for coming to my talk. Today I'll be talking about how we can uncover bias in machine learning models and what the implications might be. I'm Aylin, as John introduced me. I just became an assistant professor, and today is my second talk in that role; yesterday was my first. Let me check that this is working.

Yesterday I talked about supervised machine learning being applied to the language of individuals so that we can identify those individuals from their linguistic style, which is a serious privacy concern. But when we look at language at the aggregate level, the language of society, we see fairness problems, because linguistic data that comes from society carries the biases of society and of humans with it. That is what we will look at today. Yesterday's talk was about a method for identifying the style of individuals, who can also be programmers, so that we can perform attribution. That has security-enhancing properties, but at the same time it is very privacy-infringing in some real-world cases. Today I'll be talking about a method for quantifying and detecting bias in linguistic data and in linguistic machine learning models. For this, I essentially adapt the implicit association test, designed for humans, to machines. And given the language universal that, in every language, meaning emerges from the contexts in which words appear, the method I came up with can be applied to essentially any language.

Under the umbrella of AI, today we'll be looking at natural language processing and machine learning, in particular deep learning and unsupervised learning. In the past there has been work on supervised machine learning to see where bias might be happening and how it can be removed, but it is more difficult to study this in unsupervised machine learning models, since they don't have direct classification outputs. Comparing individual language with society's language again: for an individual, the syntax they use is the most identifying signal, even in source code, but today we will focus on semantics, the meaning in language at the societal level.

Here is a brief summary of my work on security, privacy, and machine learning, where I use stylometry, the study of linguistic style. Because it is linguistic style, we can look at natural language or at artificial languages such as programming languages. Some examples are identifying the authors of English text; analyzing English written as a second language to identify the author's native language; analyzing translated text to identify the original language or the translator; and analyzing underground forum texts, where forum users engage in business transactions. Even though these are very noisy data sets, we can still identify the authors, and sometimes these are suspect sets for intelligence agencies. Because of that, the tools we developed, which are open source on GitHub and which any of you can download and use, are currently being used by the FBI, by expert witnesses as scientific evidence in court, and by European high-tech crime units to identify suspects online based on their language.
But again, this is about individuals' language and the privacy implications it has. In artificial languages, focusing on programming languages, in particular Python, C, and C++, source code as well as binaries, this work is being used by DARPA, and since DARPA is part of the Department of Defense you can imagine why they would be interested in attribution problems: for example, finding the authors of malicious software, or, under oppressive regimes, the authors of censorship-circumvention software. Expert witnesses, again, can now use this information as scientific evidence in court, and I've been collaborating with the US Army Research Laboratory, where they are still working on programmer de-anonymization, programmer attribution.

Today, though, I want to talk more about fairness and language at the societal level, and I'll start with an example. Do any of you know the author Robert Galbraith? There was a crime novel called The Cuckoo's Calling by Robert Galbraith, but some people suspected it was not written by Robert Galbraith, and was instead by J.K. Rowling. After stylometric analysis, it was shown that it was indeed written by the famous J.K. Rowling of the Harry Potter series. Afterwards, Rowling confirmed that she was the author of the book, but said she had wanted to use a man's name because it was a crime novel, and the publisher also thought it would sell better if it was published under the name of a man. At the same time, readers wouldn't know it was written by J.K. Rowling, so she would get a more realistic evaluation of her work. So we can see, even with high-profile people, how bias affects society, even for a product as important as the book they are publishing.

This reminds me of the interplay between privacy and fairness, Big Data's evil twins, as I call them. With privacy, we have a serious problem when sensitive information is leaked; with fairness, we have a problem when sensitive information or protected attributes are abused. In the upcoming slides we will also see that privacy does not imply fairness.

Now let's focus on natural language processing models, linguistic models, and semantic spaces in machine learning. For example, Google recently released its Cloud Natural Language API; Amazon, Google, and many other companies and researchers are making these tools available, including for commercial purposes. Such tools are used by developers, researchers, and ordinary citizens. Whenever we are dealing with a smart or digital application that includes any text, these linguistic machine learning models are usually in use: web search, where suggestions complete your query as you type, which is sequence prediction or text generation; machine translation, which also relies on linguistic semantic spaces at the context level; sentiment analysis, especially for market prediction, to see whether a commercial product is perceived negatively or positively, using words as tokens in certain context windows; and named entity recognition and text generation.
When you receive an automated call on the phone, that text is usually generated automatically. Why would this be a problem? Let's look at an example. One of my native languages is Turkish, and Turkish is a genderless language: there are no gendered pronouns, just one pronoun that means he, she, or it. So let's translate from English to Turkish. "She is a doctor" is translated as "he/she/it is a doctor." Taking the Turkish sentence and translating it back to English, it comes out as "He is a doctor." The system doesn't even ask whether it should be he, she, or it. Let's grant that it is smart enough to understand this is a human, so it shouldn't be "it"; it still doesn't say "he or she" but simply chooses "he" as the most likely answer. Is this one rare exception, or is it happening at a larger scale? "He is a nurse" translated to Turkish becomes "he/she is a nurse," and translated back to English it becomes "She is a nurse." You can see the difference between doctor and nurse in terms of prestige and salary, and we see a pattern: "he/she/it is a professor" comes back as "He is a professor," while "teacher" comes back as "She is a teacher." Is this only happening with English? German is also a gendered language, more gendered than English: not only pronouns but other parts of speech are gendered. Translating from Turkish, we again see that a doctor comes out male and a nurse comes out female. Then there is Bulgarian, where almost everything is gendered, verbs and adjectives included, and again a doctor is male whereas a nurse is female.

It has been 62 years, I believe, since the term artificial intelligence was coined, and people have been telling doomsday stories about superintelligence and machines taking over, but they haven't been thinking much about the immediate problems we might have with artificial intelligence. We know that when garbage goes into machine learning models, what comes out is usually garbage as well: if that is the quality of your training data, the output reflects the same quality. One example of how text is collected for generating semantic spaces and linguistic models is Microsoft's Twitter bot Tay, which was taken down the same day it was introduced because it very quickly turned into a highly offensive, racist, biased bot. Microsoft wasn't ready to account for such cases in the linguistic model. This was done deliberately; it was essentially model poisoning, adversarial machine learning, and it worked very quickly. These were some of the tweets the bot started producing, and of course it was taken down, but we can see how easily bias can be embedded, with a strong effect size in this example, within a few hours.

How does bias get into these models? Let's take a step-by-step look. We know that humans are biased. This is not necessarily a bad thing, because there are neutral biases as well, and sometimes biases are helpful in some conditions; I'll give examples of those too. But when we are biased and we speak, we reflect that bias in our semantics.
We have values; for example, we say the snake is ugly or this butterfly is beautiful. These are neutral biases, but they are there. As we speak and form language, similar patterns tend to appear in the same context windows. Say we have a negative context window, and "snake" tends to appear in it: that is a neutral bias, but it is still a negative bias toward snakes. That is called distributional meaning: we see in the statistics that certain words end up in, for example, negative contexts and context windows. Machine learning models, especially semantic spaces, look at this distributional meaning and the co-occurrence statistics, and they learn what certain terms, or certain people's names, are associated with. This is reflected in the models: the bias is propagated through the entire process, and sometimes it is even increased and amplified, not just perpetuated.

How can we measure it? Especially with unsupervised learning, we don't have anything to directly control for and measure, unless we look at the model at the construct level, like the intelligence of the model, or the understanding of the world that the model has. For humans, the implicit association test has been used to measure the implicit biases we might have; Greenwald, at the University of Washington, introduced it in 1998. There is plenty of criticism of the method, but at the same time it is revealing patterns about the world and about humans, including subconscious biases we might not even be aware of. The test asks you to associate members of certain societal groups, or terms, with certain stereotypical words: how fast do you associate a butterfly with being positive or negative, versus how fast do you associate a snake with being positive or negative? When you do these associations on a computer, you are asked to click right or left to classify a positive term with butterfly or snake, and the differential reaction time in associating congruent versus incongruent stereotypes gives you the effect size of the implicit bias you might have.

This example shows a girl taking the implicit association test for male versus female and science versus arts; the general bias is that males are associated with science whereas women are associated with arts. You can take this test online at Harvard's Project Implicit website. I'm showing it because several implicit association tests are listed there, and in my experiments I took previous tests generated by experts in social psychology, since I'm not an expert in that area at all. I wanted to see whether we can replicate these biases, versus what happens when I try random things or things that are not biases, so that I'm not just cherry-picking the biases I want to show you. These are the main categories that were there, and I use the same ones to see if they are reflected, also because these tests have been taken by millions of people over decades.
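To make the differential-reaction-time effect size concrete, here is a minimal sketch in Python with made-up reaction times; the real IAT scoring procedure (the D score) involves more steps, but the core quantity is a difference in mean reaction times scaled by their variability.

```python
import numpy as np

# Hypothetical reaction times in milliseconds (illustrative, not real data).
# congruent:   stereotype-consistent pairings, e.g. flower + pleasant
# incongruent: stereotype-inconsistent pairings, e.g. flower + unpleasant
congruent = np.array([612, 598, 640, 575, 630, 605])
incongruent = np.array([701, 688, 720, 676, 695, 710])

# Cohen's-d-style effect size: difference of mean reaction times,
# scaled by the standard deviation over all trials.
all_trials = np.concatenate([congruent, incongruent])
effect_size = (incongruent.mean() - congruent.mean()) / all_trials.std(ddof=1)
print(f"implicit-bias effect size: {effect_size:.2f}")
```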
Another example: you can also take the test in German if you are in Germany, say, because these tests are based on contexts and words, so they can exist in any language, and we can generate linguistic models in any language as well. So we can apply these tests in other languages, for other cultures or countries.

Now let's look into the details of generating language models. Here is another example of where text comes from: we can crawl the web and take all kinds of text, structured and unstructured. Here are some tweets from Donald Trump before he became president, and we see a certain pattern in the text, perhaps certain biases. This text is fed blindly into, in this case, neural networks. The neural network looks at the co-occurrence statistics and the pointwise mutual information in the data, and produces a semantic space. In this semantic space, the common setup has 300 dimensions: we essentially have a dictionary of a language, and each word in that dictionary is represented as a numeric vector with 300 dimensions, where each dimension is a combination of certain contexts. When we look at these words, similar words are projected to nearby points in the space: things about positiveness, things about feelings, things about females, and so on. Based on their vicinity, we can understand or answer many questions.

The types of semantic spaces I focused on in my study were word2vec, the algorithm from Google research, and GloVe from Stanford researchers. When word2vec was introduced it was extremely popular, and the models it produced are used by many developers, researchers, and app developers, so a lot of people use these in their applications. GloVe is similar; it was produced by Stanford researchers around the same time, and the two semantic spaces have about the same accuracy after evaluation, even though it is not very clear how to evaluate these methods; it is an approximation of an evaluation for semantic spaces. The word2vec model is based on Google News data, and we would expect Google News data to be more neutral and objective because it is news, but we will see that this is not the case, based on the co-occurrence statistics. The data GloVe uses is Common Crawl, about 800 billion tokens from the internet, essentially a crawl of the worldwide web: the language of internet users, I would say, not society in general but the internet-using population.

What can we do with these word embeddings? First of all, we can capture syntax and understand the meaning of a word better; we can perform analogies and get answers, or look for semantic similarity. We can ask questions such as: Rome is to Italy as Paris is to what? And it will be able to answer: France. So it has some understanding of language and semantics, but there is also knowledge and maybe even statistics embedded here, and by statistics I don't mean the co-occurrence statistics, I mean statistics about the world.
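As a rough illustration of how such embeddings are queried in practice, here is a small sketch using the gensim library; the file name is the commonly distributed pretrained word2vec Google News model, so treat the path as an assumption about your local setup.

```python
from gensim.models import KeyedVectors

# Load a pretrained 300-dimensional embedding (the path is an assumption;
# point it at wherever your copy of the Google News vectors lives).
vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True
)

# Analogy: "Rome is to Italy as Paris is to ?"  ->  Italy - Rome + Paris
print(vectors.most_similar(positive=["Italy", "Paris"], negative=["Rome"], topn=3))

# Semantic similarity between two words (cosine similarity of their vectors).
print(vectors.similarity("doctor", "nurse"))
```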
Looking into the details of these vectors, each word is listed by frequency; the first entry is usually "the" or "a" or just a comma. There are 300 features representing each word in the semantic space, and in GloVe we have about two million tokens, two million words, in this dictionary. In an ordinary dictionary you would expect far fewer words, but since this takes every word above a certain frequency on the internet, we get entries such as Obama or Michael Jackson.

What can we do with vector arithmetic? When we project these vectors to a 2D space, we see, for example, that on the lower side there is "brother," which is male, and here we see "sister." Now we see the direction of gender. Once we have the direction of gender, we can look at "king" and find what corresponds to the female version of king, which is "queen." So we can perform vector arithmetic, with cosine similarity most of the time, or by taking principal components of these vectors, and try to answer the questions we have in syntactic form, analogy form, or semantic form.

How can I use this information to measure bias in machine learning models? The first thing I came up with was the Word Embedding Association Test, by analogy with the implicit association test. What I do is quantify the implicit, or in this case actually explicit and deterministic, associations between societal categories and evaluative attributes, which are stereotypes here. I take the distances between the stereotype attributes and the two societal groups, and look at the difference between the means of these associations in units of their standard deviation. That gives me the effect size of a certain bias, and we can also measure statistical significance by generating a null hypothesis and checking whether the effect size we get is significant or not.

The first thing I wanted to start with was neutral stereotypes that are universally accepted, or called so: flowers being considered pleasant and insects unpleasant, for some reason most of the population just naturally, intuitively has this stereotype, or musical instruments being considered pleasant whereas weapons are considered unpleasant. Since this is not dangerous or harmful to society, it is considered neutral, and in this case we see that the effect size is around 1.5 for both, which is large, anything above 0.8 is a large effect size, and the highest the effect size can get is 2, because it is bounded by the standard deviation. Both results are statistically significant with high effect sizes, and for the upcoming examples it will be the same: they are all statistically significant with high effect sizes.

Now let's look at the other major implicit association test categories, such as white people's names versus black people's names, and whether they are considered pleasant or unpleasant. We get the congruent stereotype: white people's names are considered pleasant in this case.
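Concretely, those effect sizes and significance values come out of a computation like the following sketch in Python; the target and attribute word vectors (X, Y, A, B) would come from the embedding model, and the permutation count is an arbitrary choice.

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    # s(w, A, B): how much closer word vector w is to attribute set A than to B.
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    # X, Y: target word vectors (e.g. flowers vs. insects)
    # A, B: attribute word vectors (e.g. pleasant vs. unpleasant)
    x_assoc = [association(x, A, B) for x in X]
    y_assoc = [association(y, A, B) for y in Y]
    # Difference of means, in units of the standard deviation over all targets;
    # this is what keeps the effect size within roughly +/- 2.
    return (np.mean(x_assoc) - np.mean(y_assoc)) / np.std(x_assoc + y_assoc)

def weat_p_value(X, Y, A, B, iterations=10_000, seed=0):
    # One-sided permutation test as the null hypothesis: how often does a
    # random re-split of the targets give a statistic at least as large?
    rng = np.random.default_rng(seed)
    targets = list(X) + list(Y)
    observed = (sum(association(x, A, B) for x in X)
                - sum(association(y, A, B) for y in Y))
    hits = 0
    for _ in range(iterations):
        perm = rng.permutation(len(targets))
        Xp = [targets[i] for i in perm[: len(X)]]
        Yp = [targets[i] for i in perm[len(X):]]
        stat = (sum(association(x, A, B) for x in Xp)
                - sum(association(y, A, B) for y in Yp))
        if stat >= observed:
            hits += 1
    return hits / iterations
```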
When we look at the differences between genders, we see that males are associated with career and females with family, and again I am using the exact same words the implicit association test uses when it asks you to perform the classification. When we look at science versus arts, again males are associated with science and females with arts. I'm not going to talk about stereotype threat here, but at least we can see that bias is certainly perpetuated by these linguistic models. The two models I'm showing here are from 2014; we are in 2018 and these are still the state-of-the-art models used by many people, and they are not updated frequently because they are quite large files and require a lot of data.

Let's look at some other, health-related stereotypes: young people being considered pleasant whereas old people are considered unpleasant, or physical diseases being considered controllable whereas mental diseases are considered uncontrollable, and we see that stigma reflected in these models; or attitudes toward heterosexual versus homosexual individuals, straight versus gay or transgender people. We can also run these tests in German, looking at the main categories after generating the linguistic model, and you can generate your own linguistic models: for example, you can download corpora online or use Google Ngrams data from different years, decades, countries, and languages to analyze what might have been going on in those years. When we look at the most recent version of Google Ngrams for German, we can also replicate the stereotype, or the prejudice, against Turkish people. Millions of Turkish people went to Germany decades ago as immigrants, there isn't a very positive attitude toward them, and we are able to replicate that from the Google Ngrams data as well by performing the WEAT.

So we saw that semantics and bias are embedded in these models. What about empirical information that doesn't depend on a context or a feeling but is about statistics in the world? Can we replicate those as well, or can those be a reason for these biases? For example, say I want to know how strongly a certain name is associated with being male or female. Taylor, for example, is an androgynous name, almost 50-50 male and female, and based on this I can perform a similar computation to see how much a word is associated with a certain stereotypical group. I collected data from the US Census Bureau, from 1990 I believe, which includes the gender of people with certain names and how many of them there were. I took the most frequent names and calculated their association with being female or male, and these names gave a correlation coefficient of 0.84. So there is roughly 84 percent agreement between the statistics of the world and the association I am getting purely from a semantic model, which is very interesting to me.
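The single-word version of the test just described, associating one word such as a name with a gender and correlating it with real-world statistics, can be sketched as follows; the name-to-vector and name-to-census-percentage mappings are placeholders you would build from the embedding and the census table.

```python
import numpy as np
from scipy.stats import pearsonr

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def gender_association(w, female_attrs, male_attrs):
    # How strongly a single word vector w is associated with the female
    # attribute vectors relative to the male ones, normalized by the spread
    # of its similarities to all attribute vectors.
    sims_f = [cosine(w, a) for a in female_attrs]
    sims_m = [cosine(w, b) for b in male_attrs]
    return (np.mean(sims_f) - np.mean(sims_m)) / np.std(sims_f + sims_m)

# Hypothetical wiring (placeholders): `name_vectors` maps names to embedding
# vectors, and `pct_female` maps the same names to the census percentage of
# women with that name. Correlating the two gives the number reported above.
# scores = [gender_association(name_vectors[n], female_attrs, male_attrs)
#           for n in name_vectors]
# truth  = [pct_female[n] for n in name_vectors]
# r, _ = pearsonr(scores, truth)   # the talk reports r of about 0.84 for names
```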
Here is an illustration of that. Taylor is in the middle, shown almost white, 50 percent male and 50 percent female, and the effect size I get is almost zero in this case; Carmen is almost 100 percent female, whereas Chris is 100 percent male. What about employment statistics, occupation statistics by gender? The Bureau of Labor Statistics publishes this information every year, and I took the data, I'm trying to remember which year, it might be 2017, took the occupation names, and looked at their association with gender. The correlation coefficient was 0.9, which is remarkable. Looking at the result, on the upper left we see that "programmer" is almost 100 percent male, whereas "nurse" is almost 100 percent female. And when you search Google Ngrams for "she's a programmer," the result you usually get is zero, because phrases below a certain frequency are simply cut from the Ngrams; until recently "she's a programmer" was at zero, and we can see that reflected here as well.

So now we can understand that there are different types of bias embedded in semantic spaces, and there may be different reasons they get into these models, but we can identify three main categories. The first is veridical information, gender and occupations for example: this is not exactly bias, it is basically the statistics we have in the world, which may have been caused by injustices or biases in the past, but we don't have any information about that; we just have the statistics, and the models learn from them. We also see that the universal biases are embedded in these models, as well as things that are really prejudice, such as black versus white names being considered pleasant versus unpleasant. I would like to remind you that in some cases you want certain types of bias in your machine learning models, because they can be very useful; it depends on what kind of task you are dealing with.

Some people have suggested fairness through blindness: just remove the protected attributes, or debias the system by removing the bias component from the vector space, from all words. But this cannot be the right solution; it is just turning a blind eye to the problem. First of all, once we remove this information, we are also removing statistical information about the world. Second, we would end up with redundant encodings that don't have the same quality as before, and we don't know exactly what we are losing. And another very important problem is proxies: even if you remove protected attributes, there are still proxies through which bias can operate. For example, when automated systems decide whether to give loans to certain people, the zip code is a proxy for the address. So even if you remove all protected attributes, the zip code, being a proxy for a certain financial status, gives you the redlining example, where certain people are simply denied loans because of their zip code.
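As a brief aside, the "remove the bias component from the vector space" idea mentioned a moment ago can be sketched as a simple projection; this is the naive approach being criticized here, not a recommended fix, and the gender direction shown in the comment is just one common approximation.

```python
import numpy as np

def remove_direction(vectors, direction):
    # "Fairness through blindness": strip each word vector of its component
    # along a bias direction. As argued above, this also discards veridical
    # information about the world and leaves proxies (e.g. zip codes) intact.
    d = direction / np.linalg.norm(direction)
    return {word: v - (v @ d) * d for word, v in vectors.items()}

# A gender direction is often approximated by the difference of paired vectors
# (e.g. "she" - "he") or by a principal component of several such pairs.
# debiased = remove_direction(embedding, embedding["she"] - embedding["he"])
```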
And by the way, in law the main discrimination criterion is the use of protected attributes; if those protected attributes are removed and you are proxying through the zip code, it is currently completely legal to use a system like that. I am suggesting instead fairness through awareness. First, we need to understand cultural bias; then, based on that bias, we need to understand the protected attributes that come with it; and we also need to understand the machine learning task. For example, in bioinformatics or health informatics, we need to make sure that certain biases, for genders or for certain ethnic or racial backgrounds, are taken into account. One example is cardiac disease: the symptoms of men and women are different, treatment should be different as well, and we have to make sure we take that into account in whatever model we are building. That is why I say fairness is task-specific.

This work was published last year in Science, and there was a lot of media coverage. One piece, in a screenshot I took and really liked, said that in 2017 society started taking AI bias seriously. So in 2017 people really understood that there is a serious problem caused by these automated systems and by the bias they perpetuate in a society that already has a huge problem to deal with, and this is now happening at a large scale.

What am I going to work on next? This roughly covers the project I was working on, and I'm trying to wrap up quickly so that you can ask questions; I meant to mention at the beginning that this could be interactive, but I think it's too late now. I am not going to focus on the singularity or transhumanism or when machines are going to gain cognition, because we have much more immediate problems right now. For example, computer vision and joint semantic-visual spaces are in the news, for instance in automated surveillance. Computer vision systems are known to have bias as well, but it is much harder to quantify when you are not dealing with supervised machine learning. Imagine a system that is biased against certain skin colors or ethnicities, and how big a problem that becomes, because many of these automated systems are used to identify targets, are even used in war zones, or are deployed all over the streets for anomaly detection. We don't exactly know how these systems work yet, and they may very well be biased, because all the ones we can analyze are showing bias.

What about algorithmic transparency and interpretable machine learning? Driverless cars have vision systems too. Take the classical trolley problem in its modern form: the driverless car is going to crash into someone, there will be an accident it cannot avoid, and it has to decide whether to crash into the white male executive right in front of the car or to run into the elderly black lady on the other side. We don't know the answer to this yet, and because of that we have to be very careful about what kinds of products we build, because we don't want to build the digital analogs of Robert Moses' racially motivated low overpasses.
Robert Moses is considered one of the best urban planners; he planned much of New York City. But the overpasses he built over the parkways were quite low, so buses, the public transportation that people of lower financial status had to use, couldn't pass under those low overpasses; people needed their own cars to reach the beaches on Long Island or the parks he built. With these low overpasses, people were effectively separated, and this led to decades of segregation. We have to make sure we are not causing the same problems again with the digital products we are blindly putting out there. For this, I would like to keep working on a fairness framework to uncover bias in artificial intelligence and to come up with ways to mitigate it while preserving the utility of systems, and to develop fairness algorithms. There are a lot of privacy and security implications here as well: for example, when these machine learning models go from our phones to the cloud, or from our fitness trackers to the cloud, can we guarantee fairness in a way that is both secure and private? How can we avoid adversarial poisoning of these systems? There are many unanswered questions in this area, and it is a very exciting one. I would like to thank all of my collaborators; none of this work would have been possible without them, and I am really grateful to them. I think we have a few minutes for questions now, so I would be happy to take questions, comments, anything.