There's an announcement for later that I welcome. OK, just a quick announcement. We thought about organizing the last coffee break and discussion session, which starts at 4 PM, by taking our coffees here and then walking to the Maximilian Castle, which has a very nice, big park. You can even buy a coffee there, or there will be plastic cups here so you can pick up your coffee and then walk to the park, OK? So that's the plan. OK, our first speaker today is Professor Eirini Ntoutsi, talking about fairness beyond a single identity dimension. As you noticed, there has been a swap in the order. So Eirini, please.

Hi, everyone. Thank you for the invitation, and thank you for joining this talk. I hope you are not disappointed by the change in the order; let's see. This is a very diverse crowd, and I'm a computer scientist, so I'm sorry if I get a bit technical at some points. I have tried not to be too technical, but please interrupt me at any point. I want to talk today about fairness-aware learning beyond one dimension of discrimination. I will start with a short introduction and then move from mono-discrimination, that is, discrimination in the presence of one protected attribute, to multi-discrimination, where we have more than one protected attribute; this was also mentioned earlier today as intersectional discrimination.

So let's start. Probably many of you know about COMPAS, a tool used in the US to predict the risk of recidivism for defendants. It has been found that the predicted risk for white people is much lower compared to the risk for black people, so we have an example of bias with respect to race. This is not the only example; we also have many examples from computer vision. Nowadays we all use computer vision algorithms, for example while driving or on our phones, and these systems perform very well in general, on average sometimes close to human performance, but they do not perform equally well for all population subgroups. For example, they underperform on black females in comparison to white females. There are a lot of such incidents; these were just two representative ones.

As a result, because AI and automated decision making are used almost everywhere in our society, we can have different types of harms, from allocation harms to representational harms. Allocation harm is what I will mainly talk about, and I think most of the talks so far were about allocation harm: whether a system withholds an opportunity or a resource from certain groups, for example access to education or access to medication. But the other type is also very important, especially nowadays with disinformation, which was also mentioned before, and social media and so on. So it seems that these systems have bias and discrimination problems, and it is really important to try to correct these problems and create technology that does not discriminate on the basis of protected attributes. For this, there is a dedicated field in computer science, or in machine learning: the field of fairness-aware machine learning. It is a pretty young field; it started in 2008 with the seminal work by Pedreschi, Ruggieri and Turini.
At that time it was called discrimination-aware data mining, so you can see a terminology drift as well: it is not data mining anymore, the discussion is now more about machine learning and artificial intelligence. The main goal of this field is really to create AI technology that does not discriminate on the basis of protected attributes like gender, race, et cetera. This is a relatively new topic in computer science, and especially in machine learning, but as we all know it is an old topic, something humans have been interested in since the beginning of human civilization. Despite the recency of the field, there are already many works, so we have a large variety of methods. This is an overview of the different methods. There are many methods focusing on understanding bias: what are the causes of bias, the sociotechnical causes, how can we observe bias in the data, how is bias manifested in the data, what are the definitions of bias, discrimination and fairness, and so on. The second direction is about mitigating bias: how can we mitigate bias and discrimination in these systems? Here there is a large variety of methods, from pre-processing and in-processing to post-processing. And finally, how can we account for bias, either proactively or retroactively? Proactively, for example, by collecting data in a more responsible way, which is part of the discussion of one group for tomorrow, or retroactively, for example, by using explainability methods to understand what exactly the models have learned and whether protected attributes have been used in the decision-making process. As you can see here, these topics are closely related to other disciplines. Here you see a connection to the law, but it is not only the law; it is also the social sciences, psychology, philosophy and so on. The reason is that this is a very complex problem, and I think what our discipline has understood so far is that it is not a purely technical problem and it cannot be fixed only by technical means. We really need interdisciplinary teams and broader discussions to understand what bias is and how to mitigate it.

So this is the typical machine learning setup for fairness-aware machine learning, and it is pretty basic. It is fully supervised: we assume that we have a data set, like the one shown here, with class labels or feedback information for each and every instance. And it is a batch learning setup, which means we assume that we have all the data in advance. Most of the methods actually focus on this basic setup. Moreover, they assume that there is a single protected attribute, for example only gender or only race; they work with a single protected attribute. In this example of a data set, the instances are humans and they are described in terms of different predictive features, which are not protected, but also the protected feature S, which in this case is gender and is typically assumed to be binary: males and females. And there is the class attribute, which in this case is also binary: positive and negative. The protected value of the protected attribute defines the so-called protected group, maybe the females in this case, and we also have the non-protected group, let's say the males.
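Purely as an illustration of the setup just described (not data from the talk), such a data set could look like the small sketch below: a few predictive features, one binary protected attribute S, and a binary class label. The column names and values are made up.

```python
# Illustrative only: a tiny data set in the shape described above -- predictive
# features, one binary protected attribute (gender), and a binary class label.
import pandas as pd

data = pd.DataFrame({
    "feature_1": [3.1, 2.4, 5.0, 4.2, 1.9, 4.8],
    "feature_2": [0.7, 1.1, 0.2, 0.9, 1.4, 0.3],
    "gender":    ["male", "female", "male", "female", "female", "male"],  # protected attribute S
    "label":     [1, 0, 1, 1, 0, 1],                                      # class attribute
})

protected     = data[data["gender"] == "female"]   # protected group
non_protected = data[data["gender"] == "male"]     # non-protected group
print(len(protected), len(non_protected))
```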
Of course, what is protected and what is not depends on the context; I am just using gender here as an example. The goal of classification is to learn a mapping from the feature space to the output space, the class space. This mapping should have good predictive performance, so we should be able to predict accurately for new instances of the problem, and it should also eliminate discrimination, because we care about fairness-aware learning. With respect to predictive performance, we know how to model that; this is what machine learning typically does, with some objective function, usually a loss function. The question now is how we model discrimination. Again, we need some way to define what discrimination is, and in this direction there are a lot of fairness measures used in the machine learning domain. Here is an overview, with two main categories: group fairness and individual fairness. I think most of the talks so far have been about group fairness. In group fairness, the idea is that the protected and non-protected groups should be treated similarly by the model, so both males and females should get similar responses from the model. In individual fairness, the main idea is that similar individuals should receive similar outcomes from the model. Here are just some representative measures. There is a lot of discussion about which definition of fairness to use and why we have so many definitions, and although in the beginning there was perhaps a lot of confusion, it is now pretty clear that we need all these different definitions because fairness depends on context, and for different contexts we need different definitions of fairness. Fairness is also something that evolves across time and space, so maybe there will be new definitions of fairness in the future, who knows.

These are three of the most popular measures of fairness used in machine learning: statistical parity, equal opportunity and disparate mistreatment. Statistical parity is pretty simple: the main idea is that both the protected and the non-protected group should have the same probability of being assigned to the positive class by the classifier, by the model. You see that this measure focuses on the positive class and on the predictions; it ignores the ground truth, which means it is somehow disconnected from the data upon which the model was trained. A better measure in that respect is equal opportunity, which focuses not only on the predictions but on the errors. Again it compares protected and non-protected groups, but now in terms of error: the model should make similar errors for the two groups. It still focuses on the positive class. Disparate mistreatment, the third measure, is actually a generalization of equal opportunity that also covers the negative class, so both classes are taken into account: this measure checks the differences in prediction errors between the protected and non-protected groups for both classes and then aggregates these differences. So these are some basic measures. And just to show you the problem with a small example, which maybe you have seen before.
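Before that example, here is a minimal sketch, not from the talk, of how these three measures can be computed for a single binary protected attribute; the function names are illustrative and do not come from any particular library.

```python
# Hedged sketch of the three group-fairness measures for one binary protected attribute s.
import numpy as np

def statistical_parity_diff(y_pred, s):
    """P(y_hat=1 | s=0) - P(y_hat=1 | s=1): gap in positive-prediction rates."""
    return y_pred[s == 0].mean() - y_pred[s == 1].mean()

def equal_opportunity_diff(y_true, y_pred, s):
    """Gap in true-positive rates between the two groups (errors on the positive class)."""
    tpr = lambda g: y_pred[(s == g) & (y_true == 1)].mean()
    return tpr(0) - tpr(1)

def disparate_mistreatment(y_true, y_pred, s):
    """Aggregated gaps in error rates on both classes (TPR and TNR differences)."""
    tpr = lambda g: y_pred[(s == g) & (y_true == 1)].mean()
    tnr = lambda g: 1 - y_pred[(s == g) & (y_true == 0)].mean()
    return abs(tpr(0) - tpr(1)) + abs(tnr(0) - tnr(1))

# Toy usage with made-up labels and predictions; 0 = non-protected, 1 = protected.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 0, 0, 1])
s      = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(statistical_parity_diff(y_pred, s),
      equal_opportunity_diff(y_true, y_pred, s),
      disparate_mistreatment(y_true, y_pred, s))
```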
It is the simplest example: a binary classifier defined over two predictive attributes. You have a data set of instances, defined in this two-dimensional space of feature one and feature two, and what you see are the class labels, the positive instances and the negative instances. Moreover, we have the protected attribute, gender for example, encoded by color: the green instances are the males and the orange instances are the females. This is a simple binary classification problem, and the goal is to build a classifier. A traditional classifier optimizes for predictive performance, so it really tries to find a model that predicts as accurately as possible for both classes. In this case the classifier might look like this: a linear classifier whose decision boundary is shown here; everything above is positive, everything below is negative. And what we see now, if we consider the protected attribute, is that all the females, all the orange instances, are on the negative side of the boundary, so they are all rejected by the model, although if we look at the training data there are actually quite a few positive women: 50% of the women in the training set are positive. The problem is that this population is underrepresented, and underrepresentation is one of the causes of bias and discrimination in AI systems.

So how can we correct for these problems, how can we build a classifier that is also fair? There are many different approaches, which focus either on the data, on the algorithm or the model, or end to end on the whole pipeline from the data to the model. The main idea of the data pre-processing approaches is that we try to correct the data for discrimination. This means balancing the representation of the different groups, and this balancing, this tweaking of the data, should not result in losing the utility of the data: we want to intervene in the data, but we do not want to lose its utility for the learning problem. The second category are the in-processing approaches, which work directly on the algorithm: the main idea is that we change the objective function, and instead of optimizing only for predictive performance we add another term that accounts for fairness, so that we get both fairness and good predictive performance. Of course the main challenge here is how to balance these two different objectives. The final category are the so-called post-processing approaches, where the main idea is that we build a model in the traditional way, optimizing for predictive performance, and then fine-tune this model for fairness. Again, the idea is that we want to change the model, but not too much, because then we would destroy its predictive performance; minimal intervention on the model is the design principle in this case. And of course there are many different applications from which, for example, the notion of similarity could come. So there are a lot of approaches already, despite the recency of the domain.
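One concrete data-level intervention in the spirit of the pre-processing approaches just described is reweighing, in the style of Kamiran and Calders: each (group, class) cell is weighted so that the protected attribute and the class label look statistically independent to the learner. The sketch below is only an illustration of that idea; the function name is mine.

```python
# Hedged sketch of reweighing-style pre-processing: weight = expected / observed
# cell frequency, so that the protected attribute and the class label appear
# independent to a weight-aware learner.
import numpy as np

def reweighing_weights(y, s):
    """y: binary class labels, s: binary protected attribute; returns per-instance weights."""
    n = len(y)
    w = np.empty(n)
    for g in np.unique(s):
        for c in np.unique(y):
            mask = (s == g) & (y == c)
            observed = mask.sum() / n
            expected = (s == g).mean() * (y == c).mean()
            w[mask] = expected / observed if observed > 0 else 0.0
    return w

# Cells that are rarer than independence would predict get weights > 1, so a
# weight-aware classifier pays relatively more attention to them during training.
```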
However, most of these approaches follow this basic learning setup: a batch, supervised learning problem with a single protected attribute, so they focus on mono-discrimination. In many cases, however, humans are described in terms of different identities; it is not a single dimension, a single protected attribute, that characterizes an individual. We actually have many protected attributes, and our identity is defined based on them, so we have multi-dimensional identities. For example, I have a race, I have a gender, I have an age, maybe I have a disability, and so on; we are multi-dimensional humans. And if we have several protected attributes, then it is clear that discrimination can be due to the interaction or combination of more than one protected attribute, and we talk about multi-discrimination. Multi-discrimination is not only a theoretical or academic construct: it seems that people also often feel that they are discriminated against on more than one ground. This is a study from 2015 which shows that, among the participants who felt discriminated against, one fourth, so 25% of them, felt that they were discriminated against on more than one ground. So this is what people feel: discrimination can be due to many different grounds. It is also what we observe in machine learning systems, as in the example of the computer vision systems that I mentioned before: these systems perform less well for black females, that is, in the intersection of race and gender. So it seems that we really have such multi-dimensional problems.

Now the question is how we can work towards solving such multi-dimensional problems of discrimination. What we typically do in computer science is to try to carry over or adapt existing solutions from one domain to a more complex domain. So let's see whether this approach could work: can we really exploit all the understanding and the solutions that we have for mono-discrimination to solve multi-discrimination problems? To do that, we face two main challenges. The first challenge is how we define multi-discrimination. I mentioned before some definitions of fairness, like equal opportunity, equalized odds, statistical parity, et cetera, but all of them are defined based on a single protected attribute. So how do we define discrimination or fairness for more than one dimension, for multi-discrimination? The second challenge is, if we want to use existing algorithms, or if we want to learn while fighting multi-discrimination, what exactly are the challenges that multi-discrimination brings for machine learning? Do we face new challenges, or can we apply existing methods? With respect to the first challenge, this presentation is based on a review paper that was recently accepted at the FAccT conference. In this paper we surveyed the literature from computer science, from machine learning, and from law; the reason is that one of the authors is a lawyer, a PhD student in law, and the other two are computer scientists.
So with respect to the first challenge, we look at definitions of multi-discrimination from the law field, and with respect to the second challenge, how to mitigate discrimination, we try to understand what new challenges multi-discrimination brings for machine learning. Let's start with the first question: how is multi-discrimination defined, with inspiration from the legal domain, from law? What are the types of multi-discrimination in law? In law, the interaction of different grounds of discrimination started being considered already in 1989, and at that time the term used was intersectionality. But intersectional discrimination is just one type of multi-dimensional discrimination; there are actually three types. The first is cumulative, or additive, discrimination, the second is intersectional discrimination, and the third is sequential discrimination. Let's start with the first one, cumulative or additive discrimination. The main idea here is that you have discrimination which is due to more than one ground, for example due to race and gender, but these two grounds, these two protected attributes, can be conceptually separated: you can see the impact of each attribute, and you can also see the combined impact. The second category is intersectional discrimination. The idea here is that the grounds of discrimination are merged into a single ground and you cannot really isolate the effect of each attribute; it is the combination, for example of gender and race, which means you really need to focus on the subgroups. And finally we have sequential discrimination, where discrimination occurs on the basis of the same or different grounds, for example gender and race, or only gender, but the incidents occur over time, in a sequence.

I have some examples here; let's start with intersectional discrimination. I am not sure this is the best example ever, so if you have better suggestions please let me know. The idea is that we have two protected attributes, gender and religion, both binary here: with respect to gender we have men and women, and with respect to religion we have Muslim and non-Muslim people. What we see here is the impact of a prohibition of headscarves. This impact falls specifically on Muslim women: the prohibition affects only this subgroup, which is defined by the intersection of gender and religion, so it affects only Muslim women. Now, if we talk about cumulative discrimination, here is another example. Again we have two attributes: nationality, Japanese and German, and sex, males and females. What we see here are the four subgroups that are formed and the average height for each subgroup; this height comes from some data set, it is the mean height of 19-year-olds from a particular year which I cannot recall now, but it is written on the slide. We see that the height differs across the four subgroups. If we look across the nationality axis, comparing Japanese and German people, we see an effect with respect to nationality: Japanese people are typically shorter compared to German people. And if we look along the sex axis, comparing males and females, we see that females are typically shorter compared to males.
So if you now have some height requirement, for example for a particular job like the police or whatever job might require a specific height, then we can see that this requirement has an impact across nationality and across sex, and we also see the combined impact: the different subgroups are affected differently. The Japanese females are actually the shortest group, so they are affected the most, but we see different effects in the different subgroups, and this is highlighted here by color. So this is cumulative discrimination. And finally we have sequential discrimination, where we have different incidents, for example in a multi-step process like a hiring process that consists of three different steps. In each step we might have a discrimination incident, which might be due to the same ground, for example gender, or due to different grounds. Here we have the example of an older woman with a disability, and we see that during the CV review she was discriminated against because of her gender; in the assessment center she was discriminated against because of her disability, due to the use of some automated system; and in the interview she was discriminated against, maybe, because of her age. So we have different incidents over time; this is an example of sequential discrimination.

So how can we now mitigate discrimination if we have more than one protected attribute? The setup, at least for most of the methods, is pretty similar to the previous one: again a fully supervised, batch learning setup, and the only difference is that now we have more than one protected attribute, say K protected attributes. Here I have two protected attributes, gender and race, and for each attribute, again, most works assume that the attributes are binary: male, female, black, white, and so on. With respect to a single protected attribute we call the resulting populations groups, so for example, with respect to gender, we have the male group and the female group. And we speak of subgroups when they are defined on the intersection of more than one protected attribute, for example the subgroup of black males. In this case we have two attributes with two values each, so we have four different subgroups. So "groups" refers to a single protected attribute, and "subgroups" to the intersection of different protected attributes. Our goal is the same as before: we want to learn a classifier that predicts well, so predictive performance is important, and that also does not discriminate. The question now is what "does not discriminate" means in the multi-dimensional setting, and the main idea of the methods in the machine learning domain is to really try to make the definitions from the legal domain operational. This is what comes in the next slides: how the definitions of cumulative, intersectional and sequential discrimination have been made operational in the machine learning domain. Let's start with cumulative discrimination; this is the example with the height requirement, where we have nationality and sex and we see the impact of the requirement on the different subgroups, the example I mentioned before. So how can we make this definition of cumulative discrimination operational in machine learning?
The main idea is that the adopted fairness notion, for example equal opportunity, should hold for each and every protected attribute. So for each protected attribute we create some sort of fairness constraint, and we try to learn a model that respects all these fairness constraints. This is the typical approach, and it is what the early works in this area were doing: they formulated the problem as solving a set of fairness constraints, one per protected attribute. Here you see four protected attributes, gender, race, age and religion, and the discrimination score according to some fairness definition for each of these attributes; the main idea of the early methods was to try to respect the fairness constraints with respect to each and every attribute, so the attributes were treated independently. More recently there are works which formulate this as minimizing or bounding the discrimination of the most discriminated subgroup. Yes, yes. Sorry, you mean how it looks in terms of the optimization? So there are different approaches. One way to formulate it is to minimize the loss subject to these constraints, with one constraint per attribute. Some approaches keep the original loss and try to minimize the discrimination for each attribute, so the discrimination behaviour comes in through these constraints, and this depends on the measure that you use: if you use equal opportunity, it is related to equal opportunity; if you use equalized odds, it is related to equalized odds. Was this your question? Okay. So this is one way, solving it as a constrained optimization problem. There are also other approaches which include a fairness-related term in the objective function, so they have a combined objective which combines fairness behaviour and predictive behaviour, and then of course the question is how you combine them, because one might dominate the other. There are many different lines of work here, and not all of them have been explored in the multi-dimensional setting. Okay. So... Just a short observation: I think you have a bit more than five minutes. Really? Okay, I will be a bit faster. So I will not explain the details, but the idea is that you can focus on the most discriminated group and try to bound the discrimination for that group. Cumulative discrimination is nice because you can just reuse work from the mono-discrimination domain. However, it has a lot of flaws, and the main flaw is that you treat all of these dimensions independently, whereas sometimes a joint consideration is needed, because decreasing discrimination with respect to gender might increase discrimination with respect to race; this independent treatment is not optimal. Another problem is the so-called fairness gerrymandering problem. I am not sure if you are aware of it, but the main idea is that you might have a model that appears to be fair with respect to the individual dimensions, with respect to gender and with respect to race, but is not fair in the intersection of the dimensions. So you think the classifier is fair, but it is not fair in the intersection, okay? And here is an example of the fairness gerrymandering problem.
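Stepping back to the combined-objective idea for a moment: as a rough sketch, and not the formulation of any particular paper, one way to realise it is to add one fairness penalty per protected attribute to a standard logistic loss, with an assumed trade-off weight `lam`.

```python
# Hedged sketch: logistic loss plus one statistical-parity penalty per protected attribute.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def penalized_loss(w, X, y, S, lam=1.0):
    """X: (n, d) features, y: binary labels, S: (n, K) binary protected attributes."""
    p = sigmoid(X @ w)
    log_loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    fairness = 0.0
    for k in range(S.shape[1]):                       # one penalty term per attribute
        gap = p[S[:, k] == 1].mean() - p[S[:, k] == 0].mean()
        fairness += abs(gap)
    return log_loss + lam * fairness
```

Minimizing such a function with a generic optimizer is the soft-penalty variant; the alternative described above keeps the plain loss and imposes one hard fairness constraint per attribute instead. And as a small numeric illustration of the gerrymandering problem itself (my own construction, not the data on the slide), here is a classifier whose positive-prediction rates are identical across gender and across race, yet one intersectional subgroup is always rejected.

```python
# Gerrymandering illustration: marginally fair, intersectionally unfair.
cells = {  # (gender, race): (subgroup size, number of positive predictions)
    ("female", "white"): (10, 0),
    ("female", "black"): (10, 5),
    ("male",   "white"): (10, 5),
    ("male",   "black"): (10, 0),
}

def rate(select):
    n = sum(size for key, (size, _) in cells.items() if select(key))
    pos = sum(p for key, (_, p) in cells.items() if select(key))
    return pos / n

print(rate(lambda k: k[0] == "female"), rate(lambda k: k[0] == "male"))   # 0.25 0.25
print(rate(lambda k: k[1] == "white"),  rate(lambda k: k[1] == "black"))  # 0.25 0.25
print(rate(lambda k: k == ("female", "white")))                           # 0.0
```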
So I will skip the details now, but I can explain later if someone is interested. What you see here is the outcome of a classifier which is fair with respect to gender and fair with respect to race, but is not fair in the intersection, because it rejects all the white females: all the white females are predicted as suspects. So cumulative fairness does not focus on the intersection of these groups, and this is problematic because discrimination might occur in the intersection. That is why the other category, intersectional discrimination, is more popular in machine learning: here people focus exactly on the subgroups created by the intersection of the different protected attributes, and the main idea is that the subgroups should be treated similarly. There are a number of works here, not too many to be honest, because it is a recent topic, and these works differ with respect to how they define fairness for each subgroup, so which fairness definition they adopt for the subgroup, and also how they compare the subgroups: do I compare all to all, do I compare to the overall population, do I compare to the weakest subgroup, and so on. I think we had this discussion, about what you compare to, in some of the previous talks as well. One approach, actually the first one which introduced this fairness gerrymandering problem, uses a notion of statistical parity, so this is the answer to question one; and with respect to the subgroups, it compares each subgroup with the overall population, weighted by the size of the subgroup. So this is one idea. It is not the optimal idea, because the size of the subgroup comes from the population, so it is biased in that sense, but it is one idea. I will not go through more methods here; there are some more in the paper, but this is a new domain, so of course a lot of work remains to be done.

Intersectionality is nice, but it is also problematic, because what are the subgroups, and how deep can you go? The deeper you go, the more dimensions you have, the smaller these subgroups become, maybe even empty, so you have a data scarcity problem. There are actually two aspects of this data scarcity problem: you might have population imbalance, so the different subgroups have different cardinalities, and also class imbalance, different class imbalance ratios within the subgroups, as shown on this slide. I am sorry, I do not know why this took me so long, so I think I have to stop here. Yeah, I'll go ahead, okay? Okay, thanks. What we see here is from the Adult data set, which you probably know, a very popular data set in the fairness-aware learning domain, and what you see is the distribution of the different subgroups. You see, for example, that black young females, a subgroup defined by three protected attributes, are tiny in comparison to white old males. And it is not only about the population imbalance, the different cardinalities; it is also about the class imbalance ratio. For the black young females the class imbalance ratio is really bad, compared, for example, to the white old males or to the overall population. So you have a more severe class imbalance problem in the minority subgroups.
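A hedged sketch of a size-weighted subgroup audit of this kind, in the spirit of the gerrymandering line of work described above, might look as follows; the function and variable names are illustrative, and the empty-subgroup check makes the data-scarcity issue concrete.

```python
# Sketch of a size-weighted subgroup statistical-parity audit over all intersections.
import numpy as np
from itertools import product

def worst_subgroup_disparity(y_pred, S):
    """y_pred: binary predictions, S: (n, K) binary protected attributes."""
    n = len(y_pred)
    base_rate = y_pred.mean()
    worst = 0.0
    for values in product(*[np.unique(S[:, k]) for k in range(S.shape[1])]):
        mask = np.all(S == np.array(values), axis=1)
        if mask.sum() == 0:            # empty subgroup: the data-scarcity problem in action
            continue
        gap = abs(y_pred[mask].mean() - base_rate) * (mask.sum() / n)
        worst = max(worst, gap)
    return worst
```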
And if we have such data problems, this is problematic for machine learning, because machine learning methods learn from data, so eventually they will learn more from the majority groups. And the final type, sequential discrimination, is really unexplored in the multi-dimensional case: we could not find any methods that actually target sequential multi-discrimination. The main idea here is that the data arrive as a sequence of batches, D1, D2, ..., DT; we have a sequence of events and the final outcome at the end of the process. There are some works from the mono-discrimination domain, with a single protected attribute, and these could actually be an inspiration for sequential multi-discrimination. One work, for example, is this one, where the idea is that you do not focus only on the final outcomes but also take into account intermediate feedback; this is what is done here with this product of the discrimination from the previous steps. So this could be an idea, but one major problem is that we do not have data with such intermediate feedback. Maybe you have knowledge here; I think you create a lot of synthetic, simulated data, and that would be really great, because we really need feedback within the process in order to apply such measures. So yeah, this was my talk: a lot of challenges, it is mainly about challenges, and I just outlined some of them here, which I think I already mentioned. Thank you for your attention, and sorry for being a bit fast.

Yeah, so thank you for the interesting talk. First a quick clarification: the protected variables, the protected features, are they used in the training or not? There are different approaches. Usually they are not... sorry, they are used in the training, but they are not used during testing. Okay, because naively, from a physicist's view, the reason for this unfairness is that these algorithms are learning some spurious correlations, right? And it seems that the majority of the methods you mentioned are trying to fix this a posteriori, without really understanding why these algorithms are making these mistakes. I mean, the example of the lack of balance is a good one, because then we understand why and we can act on the reasons for it, but is there a way of trying to address the root problem, maybe the missing feature, the missing correlation, instead of just tweaking things around to get the performance we want? That is a good question. I think we do not always have a good understanding of where the problem comes from. It can be the imbalance, as in this case, but imbalance is really simple, and it is easy to find an indicator for it. It is not the only cause: maybe some class is more complex compared to another one, maybe there is more variety in the class, maybe there are overlaps between the classes, or maybe there are proxy attributes. There are many different reasons, and there are works which really try to find these causes of discrimination, but you are right, many methods try to correct for the problem without really understanding which properties of the data lead to it. And this is not always on purpose, I think; sometimes it is really hard to understand these properties.
Of course, cardinality is something easy to measure, but it is not always easy to measure how complex it is to learn one particular class versus another. But we do need these features in order to be able to define fairness, right? So we need some way; and there are some works on this, because here we assume that the protected features are given, so you observe them, and there are some works which try to relax this assumption. Can I suggest keeping the answers short, because we are already a bit late? Yes, yes, sorry, I talk too much. Thank you. I think my question is actually related to this one, if not just another way of saying it, I am not sure. Do I understand correctly that you are doing supervised learning? That is what you said at the beginning. Then I cannot understand: if your data is already labeled, I would assume that you probably have a lot of discrimination already in your training data, right? So the data is one of the sources of discrimination, but of course the algorithm also contributes, in the sense that the algorithm is trying to find patterns in the data: if it is in the data, it will pick it up, and it will pick up the shortcuts from the data; these algorithms are pretty clever, so they don't... Okay, but is the goal to train a model that correctly guesses the labels that you already have, which... No, so in this case... So it is not a classical supervised learning problem where you are just trying to get the labels, right? You know that the labels are unfair and you are trying to propose an algorithm that would change the labels. I see, that is a good point; that is another assumption. In this case we also assume that we trust the labels. There are some works which assume that the labels are biased and try to find the hidden unbiased labels, but here we trust the data. So we use the training set, we do not try to change the labels; we try to build a function that predicts, and predicts in a way that separates the classes well but also respects some notion of fairness, according to what you want to have, for example equal opportunity: it predicts equally well for males and females, for instance, and not only well with respect to the class, positive versus negative. But do you check the data that you are using to train the algorithm? No, we do interventions: either we change the data in some way, for example by sampling, or we can also change the labels, which is called massaging, or we change the algorithm by adding, as we mentioned before, some fairness term to the objective function, or by changing the model, fine-tuning it afterwards. So there are different ways to enforce, let's say, fairness in the model. Yes, yes. Yeah, so thank you very much. I think it is an important topic. Examples of unfair predictions by AI, there are plenty, right? Plenty, plenty, plenty. The oldest, or a very famous one, is the one about predicting recidivism. Yes, yes. Meaning COMPAS, exactly. And in that case there is this one article in which they say that a very simple model can explain why those predictions are biased against young African-Americans.
And so I am wondering whether you have investigated whether the use of a simpler approach for your supervised problem would help you understand other sources of bias, when it is not data-related. When it is just the fractions, when the number of ones is low in your training set, the most you can expect is this sort of statistical calibration, where your test set ends up with as many ones as you had in your training set; you can enforce this. But in these other cases, where it gets a little more complicated, have you investigated this? Quick answer. Quick answer. There are people who are really in favor of using simpler models instead of big models, and in some applications simpler models work well and they are explainable. There is a trend nowadays towards big and complex models; a deep neural network, for example, relies more on the data, because that is how all its parameters are learned, so of course it picks up more problems from the data; it is more prone, I would say, to such problems. But such models also perform very well on other problems, so you cannot really... yes. But I agree with you that, depending on the inductive bias of the model, some models rely more on the data compared to other models. Okay, so let's thank the speaker again. Thank you.