In this talk I'm going to look at how we can identify and address gender stereotyping in machine learning methods. In particular, I'm going to show how the methodology outlined by Rebecca Cook and Simone Cusack in 2010 can be used to identify gender stereotyping in machine learning, and this paper forms part, as David said, of my ESRC-funded PhD research as part of the Human Rights, Big Data and Technology project. So let's start with: what are gender stereotypes? What are stereotypes? A stereotype is a belief about the characteristics of individuals, and Cook and Cusack in 2010 defined gender stereotypes as a structured set of beliefs about the personal attributes of women and men. This definition implies a binary model of gender, but I think that it can easily be extended to encompass individuals who identify, or are in fact identified by other people, outside of a simple binary. These beliefs, these stereotypes, are not the same across cultures, but all societies construct beliefs about the characteristics and the roles that are considered appropriate for women and for men, and those beliefs shape how individuals are perceived and treated. Stereotypes might be used to underpin prejudices, so that's a hostile or aversive attitude towards somebody just because they belong to a particular group, or even just because they're perceived to belong to a particular group, and therefore they're assumed to have the negative qualities associated with that group. So this kind of prejudice, which is underpinned by stereotypes, has a clearly harmful effect. It leads to discrimination or hostility towards a given individual. But that isn't the only way in which stereotypes can cause harm. Stereotypes may or may not be based on observation, on everyday interactions or even on statistics, but even where stereotypes do arise from observation of groups, that doesn't necessarily mean that they're legitimate to use in decision making of any kind, particularly in a decision about an individual, or that they're relevant to the situation at hand. So where does human rights come in?
Article 5(a) of the Convention on the Elimination of All Forms of Discrimination Against Women states that States Parties shall take all appropriate measures to modify the social and cultural patterns of conduct of men and women, with a view to achieving the elimination of prejudices and customary and all other practices which are based on the idea of the inferiority or the superiority of either of the sexes, note the binary there again, or on stereotyped roles for men and women. So in other words, States Parties to CEDAW, which is the majority of countries in the world, have an obligation to end practices which are based on notions of inferiority or superiority, or on stereotypes related to gender. So back to identifying gender stereotyping: how do we know that it's happening? Again, as I mentioned at the beginning, I'm going to be using Cook and Cusack's methodology for identifying gender stereotypes. The idea they propose is that naming gender stereotyping is the first step to eliminating the harm caused by prejudices and practices based on gender stereotyping. So they identify a set of indicators that can point to the presence of gender stereotyping: category-based judgment, that is, judgments based on membership or not of a particular category; decisions based on tangentially relevant information; selective perception and interpretation of information; and judgment based on extreme interpretation of limited evidence. We're going to come back to that. I'm going to talk briefly about some feminist legal approaches to stereotyping and discrimination. Feminist legal scholars, as I imagine a lot of you in the room will be aware, have argued that law isn't neutral, that law is a product of social forces. It's constructed by those in power, it's built up over time, and it reflects the priorities of the people who have been in power over time. Historically, those have usually been men: powerful men, wealthy men, and men of the dominant ethnicity in a particular country. So as a result, law isn't neutral, and legislation and judicial decisions can uphold existing power structures and therefore reinforce existing discriminatory structures, including gender stereotypes. A lot of feminist legal scholars argue that challenging discrimination in a substantive way requires that law, as well as policy and practice, promotes substantive equality, which requires recognising the substance of inequality, its roots, its causes and its consequences, and working to address these in their totality. That means, from a women's rights perspective, fully recognising women's circumstances, their needs and their abilities in order to have effective measures to end discrimination. As with many other disciplines, feminist legal scholars have called for a more intersectional approach to international human rights law in general, in order to truly address the human rights violations experienced by individuals who are marginalised on more than one ground. So I'm going to move on slightly and talk about what machine learning is, because I'm going to use these ideas about gender stereotyping to look at machine learning specifically. Cat has already covered a little bit of this, so I apologise for repeating a little. Any of you who've been following the discussions in the media will have seen the word algorithm being bandied around.
An algorithm isn't a new technology; the name comes from a Persian mathematician, the ninth-century Persian mathematician Muhammad ibn Musa al-Khwarizmi. An algorithm is a process, a set of instructions for obtaining a result. And the first algorithms, and a lot of the algorithms I'll show you today, were originally a set of discrete, intelligible steps, sort of like a flowchart. So the idea is that you can start with a particular set of information, you can take a number of steps, and you get an output. So they're a natural thing to computerise: a set of discrete steps that you can put data into and get an outcome from. But the increase in available data has allowed the development of increasingly complicated algorithms, and this is where machine learning comes in. Machine learning is the practice of using algorithms to parse data, to learn from it, and then to make a determination or a prediction about something in the world. It sometimes gets called part of artificial intelligence, particularly in the popular sphere, particularly in the media. I'm probably not going to use the term artificial intelligence, because I think artificial intelligence tends to mean whatever computers can't do yet but we think they'll be able to do very soon, so I'm not sure it's a very helpful definition to use. But I'm going to give a short overview of some specific ways in which machine learning is used. In supervised machine learning, we start with a set of labelled data, a training set, which tells us something about what we're interested in: basically a set of data points with a number of features and labels for those features. There are lots of different methods that we can use to analyse this data in order to make predictions, but what they have in common is that they're built on this set of training data, which means data that's already been labelled by humans with the features and the variables that we're interested in. So data, like so many words in science, comes from Latin, from the Latin dare, to give. But in fact it's probably more accurate to call it capta, from the Latin capere, to take, because it's not actually something that is given to us by nature. It's something that somebody somewhere has chosen to collect; it's the outcome of a conscious decision to record some data but not others. So data isn't a representation of the world as it is. It's a representation of the world as it is seen by the person or the team or the organisation that has collected it. A data set is a collection of these observations, each with a set of features to which values of some kind are assigned. But the choices of which items to include in your data set, which features to look at and what values to assign often aren't neutral. They're made, often implicitly or explicitly, by the data collectors. So, for example, data sets may over- or under-represent certain types of data. Some of you might be familiar with a particularly well-known data set that's used for facial recognition, called Labelled Faces in the Wild. This is a set of photos of famous people and their names, and the idea is that you can use this data, which is available on the internet, to train facial recognition systems. But this data set was collected from articles put out by Yahoo News in the period, I think, of about 2002 to 2004.
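As a brief, hypothetical illustration of the point above about data sets over- or under-representing certain groups: this is a minimal sketch of counting who actually appears in a labelled face data set. It assumes the images come with a metadata file of human-assigned labels; the file name and field names here are invented for illustration and are not the real Labelled Faces in the Wild metadata.

```python
# Hypothetical sketch: auditing who is over- or under-represented in a
# labelled face data set. The metadata file and its fields are invented
# for illustration only.
from collections import Counter
import csv

def audit_composition(metadata_path):
    """Count how often each human-assigned gender and ethnicity label appears."""
    gender_counts = Counter()
    ethnicity_counts = Counter()
    with open(metadata_path, newline="") as f:
        for row in csv.DictReader(f):
            gender_counts[row["gender_label"]] += 1        # hypothetical column
            ethnicity_counts[row["ethnicity_label"]] += 1  # hypothetical column
    total = sum(gender_counts.values())
    for label, n in gender_counts.most_common():
        print(f"{label}: {n} faces ({100 * n / total:.1f}%)")
    return gender_counts, ethnicity_counts

# audit_composition("faces_metadata.csv")  # hypothetical metadata file
```

A count like this doesn't fix anything by itself, but it makes the kind of skew described next visible before the data is used to train anything.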
So it's not representative of people as a whole; it's representative of who Yahoo News thought was important and newsworthy in the years 2002 to 2004. As a result, the faces are 77% male and more than 80% white, and the most represented face is that of George W. Bush. So it's a data set, but it's not a data set that represents humanity as a whole; it's a data set that represents, quite a lot, George W. Bush. Any machine learning algorithm trained on this data, or on similar data sets, will have more data to work with for white male faces and American politicians. And facial recognition has been in the news a lot recently. A tool called Rekognition, put together by Amazon, was, according to the company, trained on a database of one million faces, a different data set, not Labelled Faces in the Wild. But an independent assessment found that it performed a lot better on white male faces than on black female faces. This can be inconvenient and demeaning if a facial analysis tool developed using this kind of data is used to identify who is in a photograph, or to give you access to your smartphone or your tablet. But it's dangerous if a tool that is more accurate for one set of people than for others is being used to identify people who are accused of committing crimes. So that's an example of machine learning that uses training data. There are other kinds of machine learning methods as well. Another type of method, called unsupervised learning, starts with a data set, but instead of trying to predict something, we're trying to organise the data. We use machine learning to structure a data set depending on the data itself. So, for example, you might use a type of unsupervised machine learning to cluster data into sets based on how similar observations are to each other. If you have a set of data points, imagine them plotted on a graph, you might use machine learning to divide them into different groups. In this way, clustering algorithms, which look for points that are similar to each other in the data set, are a fairly new technology, but they have echoes all the way back to Aristotle's idea of equality: the idea that equality is when you treat things that are alike, alike, and things that are unalike, unalike. Aristotelian models of discrimination have, rightly so, come under a lot of criticism, again from a lot of feminist legal theorists, for centring what is considered normal and marginalising what is considered different: who gets to decide what things are alike, and who gets to decide what things are unalike? These criteria are often determined by who is in power. In law, it's legislators and judges who sit and decide what things are alike, how alike two things are, in order to determine whether somebody has been discriminated against. And in machine learning, it's the people who create the data sets and who decide whether an algorithm, and the output of an algorithm, are useful enough. So I'm drawing an analogy between the decision makers in law and the decision makers in machine learning. Without careful screening, it's very easy for training data, which I mentioned earlier, to contain gender stereotypes. It can be as simple as who is overrepresented in the data, as in Labelled Faces in the Wild, which I mentioned earlier.
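Going back to the unsupervised clustering idea for a moment: below is a minimal sketch, assuming scikit-learn and invented toy data, of grouping unlabelled points purely by how similar they are to one another, which is the "treat like things alike" step described above.

```python
# A minimal sketch of unsupervised clustering: grouping unlabelled points by
# how similar they are to one another. scikit-learn is assumed; the data is toy.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Toy, unlabelled data: two loose blobs of points in two dimensions.
points = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2)),
    rng.normal(loc=[3.0, 3.0], scale=0.5, size=(50, 2)),
])

# Ask the algorithm to divide the points into two groups based only on
# distance, i.e. on which points are "alike". No human labels are involved,
# but the choice of k and of the distance measure is still a human choice.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_[:10])       # cluster assignment for the first ten points
print(kmeans.cluster_centers_)   # the centre of each discovered group
```

Even in this tiny example, human decisions remain: how many clusters to ask for, and which notion of distance counts as "alike".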
It's clear that decisions about who is considered newsworthy, which were made by editors at Yahoo News, influenced the content of the data set, which as we saw before is 77% male. Another example is a corpus of 3 million words that was taken from Google News articles. A team from Boston University and Microsoft used a technique called word embedding, which essentially represents different terms, different words, in a geometric way, and then measures how close they are to each other using that geometry. They found words that could be linked together in a way that demonstrated that they were connected: for example, you could link the words Tokyo and Japan in the same way as you could link the words Paris and France. Essentially, they found a geometric way to describe how words are connected to each other. But they also found that, using this methodology, certain occupational words were much more closely linked to men than they were to women. For example, occupational words that were linked to men, or linked to male, included architect, broadcaster and boss, while words that were gendered female included homemaker, librarian and receptionist. So it's really clear that socially constructed stereotypes about which roles are appropriate for men and which roles are appropriate for women were mathematically encoded into this data set. And I think this is important because machine learning, although it sometimes gets called artificial intelligence, isn't intelligent in the way that people are intelligent. It is essentially interpolation: it makes decisions based on information it already has. It's very, very hard to teach a machine learning system to generalise from data. And this means that it relies on what has gone before. It makes decisions based on what it already knows, and in this case that's the training data, whether it's labelled training data, as in supervised learning, or unlabelled training data, as in unsupervised learning. So once you've got this algorithm and you've trained it, when a new piece of information comes in, it's evaluated based on the features that are designed in and on the previous training data. In other words, the new piece of information, which might be a picture, or might be details about a person, is evaluated not on its own merits but on how close it is to previous data. And this is important because, going back to Cook and Cusack's indicators for gender stereotyping, which I mentioned earlier, it's clear that many of the features that they identify as pointing to the presence of gender stereotypes are present in machine learning systems. Category-based judgment is a feature of these systems: by definition, they're using existing variables to make decisions about new pieces of information. It's clearly possible for decisions made using machine learning systems to be based on tangentially relevant information. The nature of the mathematical models being used in these machine learning systems is that they are imperfect views of the world: they weight certain features over others, they pay attention to certain things about the world and they ignore others. Selective perception and interpretation is a feature, not a bug, of data sets which capture only some of reality, only some of what's actually there. And so, too, is judgment based on extreme interpretation of limited evidence. And so, by its very nature, I'm arguing that machine learning can be a form of gender stereotyping.
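To make the word-embedding point concrete, here is a toy sketch of the geometry involved. The vectors below are invented for illustration, not the real Google News embeddings, but the probe, projecting occupation words onto a "he minus she" direction, is in the spirit of the analysis described above.

```python
# A toy sketch of the word-embedding idea: words are vectors, and "closeness"
# is measured geometrically (here, cosine similarity). The vectors are
# invented toy values, NOT the real Google News embeddings.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

vectors = {
    # hypothetical 4-dimensional embeddings, purely for illustration
    "he":        np.array([ 1.0,  0.1,  0.0,  0.2]),
    "she":       np.array([-1.0,  0.1,  0.0,  0.2]),
    "architect": np.array([ 0.6,  0.8,  0.1,  0.0]),
    "homemaker": np.array([-0.7,  0.7,  0.2,  0.0]),
}

# One common probe: project each occupation word onto the "he minus she"
# direction. A positive score means the word sits closer to "he".
gender_direction = vectors["he"] - vectors["she"]
for word in ("architect", "homemaker"):
    print(f"{word}: {cosine(vectors[word], gender_direction):+.2f}")
```

In the study described above, it was this kind of geometric score that linked words like architect to male terms and homemaker to female terms.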
So what are we going to do about it? We've already seen that gender stereotyping is a problem under international human rights law; is it a technical necessity for machine learning? Do we need to abandon all machine learning, all of these systems, because they are stereotyping? Is it feasible to use these machine learning systems in a way that doesn't violate human rights? To what extent can we get rid of the harmful impacts while preserving the usefulness of machine learning as a system? Because the computer scientists, for sure, and the data scientists argue that it's useful. It is possible to propose technical solutions in specific cases. For example, the work on word embeddings that I mentioned earlier also proposes a solution in the form of another algorithm to de-bias the data sets, essentially to move data points around to get rid of that geometric relationship that puts certain occupations closer to men than to women. Another possible approach, which is increasingly called design justice, argues that right from the beginning, right from when we start to think about using these kinds of tools, we should be evaluating the values, it's not a great phrase, that we choose to encode when we design, and think not just about avoiding or minimising harm but also about transformational possibilities. So not just stripping out the potential problems, but actually thinking about how we can actively use these methods to promote equality. And if we don't do this evaluation work, then it's very, very likely that existing forms of oppression, existing stereotypes, are going to be implicitly coded into the systems that we're building. In addition, as I've argued, there are parallels between legal theories and algorithmic processes. So we can borrow these ideas, as I've done in this paper, from feminist legal theories to challenge stereotypes and discrimination in machine learning. And we can use analyses of where power lies in machine learning systems. We can analyse what data is included, how accurate it is, and what kind of accuracy is acceptable, because mathematically there are many different ways to measure whether or not something is accurate. And, importantly, we can analyse what decisions are made on the basis of results obtained through machine learning, which is what Katja was talking about in her presentation. We can identify who is centred in data sets and who is treated as other. And we can situate the decisions made by machine learning in particular contexts. We don't have to look at them all as one big system called machine learning; we can look at them specifically, in context. So, while I think there are potentially a lot of issues with machine learning, I'm arguing that it is possible to design systems that maybe don't violate human rights, but it requires a lot more effort than is currently being put in. Thank you very much.
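Returning to the de-biasing algorithm mentioned above: one common way to "move data points around" is to project the gender direction out of the occupation vectors. This is a minimal sketch with toy vectors, not a production implementation and not the full published method.

```python
# A minimal sketch of one de-biasing idea: remove the component of a word
# vector that lies along a learned gender direction, so that an occupation
# word no longer sits closer to one gender term than the other.
# Toy vectors only, matching the earlier illustration.
import numpy as np

def neutralize(vector, direction):
    """Subtract the projection of `vector` onto the (normalised) `direction`."""
    direction = direction / np.linalg.norm(direction)
    return vector - np.dot(vector, direction) * direction

he = np.array([1.0, 0.1, 0.0, 0.2])
she = np.array([-1.0, 0.1, 0.0, 0.2])
architect = np.array([0.6, 0.8, 0.1, 0.0])

gender_direction = he - she
debiased = neutralize(architect, gender_direction)

# After neutralising, the occupation word has (by construction) no component
# along the gender direction, so it is equidistant from "he" and "she" there.
unit = gender_direction / np.linalg.norm(gender_direction)
print(np.dot(debiased, unit))  # approximately 0.0
```

This only sketches the projection step; published approaches along these lines also typically involve deciding which words should be neutralised at all and which should keep a gendered meaning.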