Well, hi everyone, welcome to this session. My name is Ajay, and I'm delighted to be here with you. We're going to share some perspectives about bias, a systems view of bias that helps you as developers, designers, and product leaders think about it when you build AI systems, and a technique for detecting bias in data sets. This is a talk by me and my colleague Ramya Srinivasan. We'll start by talking about bias in a big-picture sense, and then in the second half of the talk we'll speak about a data set technique. Let me introduce where we work and come from, because that informs our perspective on these questions. We both work at Fujitsu Labs. Fujitsu is Japan's number one IT services provider and the fifth largest IT services provider in the world. We work in the California labs, based in Silicon Valley. Our lab is primarily a human-in-the-loop systems lab, so we work on problems at the intersection of computing science and cognitive science. Fujitsu also has many enterprise customers, and working with enterprises in many verticals gives us a rich set of research problems. The perspective we take in our AI research team is to look at the evolution of businesses over the next decade or so. It's our firm belief that, across the world, businesses are reorganizing themselves from how they are structured today, which is with intelligences apart: the intelligences in this room are apart from the intelligences in your pockets and on your laps, and machine intelligence and human intelligence are not computationally linked together in the way they are going to start to be linked. There will come a time, a few decades from now, when we will have autonomous intelligences on the planet that we will recognize as first-order, in the same way we recognize our own intelligence as first-order. Until then, over the next one to two decades, we're going to live through an era of increasingly augmented intelligence, with businesses
augmenting our sensing capabilities with IoT, our decision-making capabilities with AI, and our acting capabilities with software and hardware robots. When you think about this as the mega-change happening over the next couple of decades, it informs what kind of research questions you look at, and specifically it informs how we look at bias in the design of AI systems that provide true augmentation to our intelligence. Within this view, what we really want are AI systems that we can manage. We don't want black-box AI systems that are very good at doing something but whose interfaces are very brittle because they cannot be edited and cannot be engaged with by people outside of the folks who created them. Our customers, and I think customers globally, are expecting the new promise of AI to be realized by AI systems that enable true partnership with people across the technical-ability spectrum. To give an example of how AI systems are built today, let's take an abstracted view of building a supervised machine learning model. Say you work for a company that has a customer service portal, and somebody on the business or product management side asks you to build a recommendation system that finds a relevant prior question for a new question that a customer asks. This question might be given to somebody on the data science team who has some graduate-level coursework, on NLP in this case, or some relevant machine learning background. This person would start by framing the problem as a data problem and collecting data they think is relevant to building the right kind of learning model. They would split the data into training and test data.
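As a rough illustration of that workflow, here is a minimal sketch of the customer-service example: collect prior questions, hold some out as test data, "train" a simple bag-of-words matcher, and check it on a query. The toy questions and the matcher are illustrative assumptions, not what any real team would ship:

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words vector: token -> count."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. "Collect" data the team thinks is relevant (toy prior questions).
prior_questions = [
    "how do I reset my password",
    "how do I cancel my subscription",
    "why was my card charged twice",
    "how do I change my shipping address",
]

# 2. Split into training and test data (here: hold out the last item).
train, test = prior_questions[:3], prior_questions[3:]

# 3. "Train": precompute vectors for the training questions.
index = [(q, vectorize(q)) for q in train]

def most_relevant(new_question):
    """Return the prior question most similar to the new one."""
    v = vectorize(new_question)
    return max(index, key=lambda item: cosine(v, item[1]))[0]

# 4. "Test": check behavior on a query, then decide if it is acceptable.
print(most_relevant("I want to cancel my subscription"))
# prints: how do I cancel my subscription
```

A real system would use learned embeddings and proper evaluation metrics; the point is only the shape of the loop the talk describes: data in, split, fit, test, deploy if good enough.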
They would construct an AI pipeline based on their knowledge of research and industry practice, then train and test this pipeline using the data sets they have, and essentially see whether the performance of the pipeline is acceptable on the metrics they have been given from the product side. The overall interaction between the people building these AI systems can essentially be summarized as: data to AI pipeline, train the AI pipeline on the data, test, and as long as the performance is good enough, deploy it. Bias creeps in at every step of this process. Bias is not the responsibility of only one person in this pipeline to manage or avoid introducing; it's essentially a systems problem. If you think about the interface between business or product and data science, articulating, or perhaps abstracting, a given business problem as a data science problem can by itself bring in perspective bias and shift the problem away from the original intention. Certainly, as practitioners, we all understand today that there's a lot of bias in data, and that is a key source of bias in the predictions of AI systems. There's also the fact that we're all human beings. We know that machine learning is accelerating at a rapid pace; on the order of a hundred papers in some form of AI or machine learning research are published every week. So we certainly do not know much of what is out there in terms of the best techniques to solve certain problems, and our own human bias creeps in through the limits of our knowledge. And finally, we're all working at enterprises that want to build products to serve people, and there is a certain time scarcity for building those things, which brings in the "just make it work" kind of bias.
I see a lot of smiling faces around the room at this kind of bias, so I think you have all perhaps felt or lived it. When we started our research project on addressing these kinds of limitations of current AI systems, we thought it would be best to first articulate the current state of AI practice. It is essentially dark AI: not only are the models themselves mostly black boxes, but the process for getting to those models is also not well lit and not well guided. This kind of AI is not quite suited for the empowerment of everybody with AI, which is the vision, and certainly the possibility, of modern AI. When we look at how we could actually improve things, it's instructive to ask who we might want to empower with a different kind of AI technology. Those of us in this room are probably most comfortable with the left side of this human-in-the-loop axis, but we're really serving at least businesses and consumers, and increasingly there's regulatory attention on the products that we build. And what do we do when we build or use AI services? Let's call that the co-creation axis. Maybe we are selecting or building some AI at the bottom of that axis. We build the AI and we ask it a question.
We hear its answer, and then we grow and deploy that AI. The set of technologies that makes it possible not just for AI scientists and AI engineers, but for everybody, from ten-year-olds to executives who don't have a technical background but have a lot of business intelligence, to select and build AI systems is starting to be recognized in the community as accessible AI. Perhaps the most attention, as a term, has gone to explainable AI, a set of technologies that allows AI systems to complement their predictions and decisions with an explanation of why they came to a particular decision. And finally, most interestingly in terms of increasing everybody's agency to interact with AI systems, an emerging set of technologies that allows you to edit the AI directly is gathered under the umbrella of interactive AI. Our research team broadly works on these three aspects of transparency for different kinds of humans in the loop. At any point in this two-dimensional grid, you can ask: what does a software engineer need to be able to select AI more accessibly? What does a consumer need to get answers and explanations from AI more easily? At each spot there is a wonderful research program, a set of applied research questions that are exciting and relevant to business processes and software. Now, with this big-picture view, let's talk about bias.
It's useful for us to look at bias in the context of artificial intelligence itself, as an evolution of how we write software. When I was going to school, one of the classic books for understanding how we write programs was called Algorithms + Data Structures = Programs. It presented some of the more systematic ways to think about writing programs. The first higher-order programming languages had really appeared in the sixties, and twenty years after that, this book was telling students learning computing: you have to think about the steps of your program, which is the algorithm, and about how you are going to encode your data, which is the data structure. Then you would take this program, give it an input, and it would give you the output. Today we're in a different world, and that equation no longer applies, as you well know, because we are no longer specifying the sequence of steps in an algorithm for a program to give us an answer. We are now building programs that learn. To do that, we're building learning algorithms; gradient descent, for example, is a learning algorithm. We're building learning structures; a neural network, a probabilistic graphical model, or a Q-network are all learning structures. And then we're putting in all this data to get a learned model. The equivalent of a program in the software 2.0 world, as you all know, is the learned model, and to use the learned model you give it an input and it gives you an output. We never used to talk about bias before, because we knew exactly what the program was going to do: no matter how complex the algorithm, it was deterministic. We could see it, we could write it down, and we could reason about its properties through a variety of techniques, all the way from mathematical logic and formal verification to dynamic testing at runtime. The same thing is now
starting to happen with AI systems. There is an entire set of research techniques, just starting to bloom, to bring quality control over AI systems into the modern world. Bias is one aspect of quality control for AI and machine learning systems. In order to think about whether your system is biased, or has some other property that you want or don't want, you have to think hard about what kind of system you want. This is not something an algorithm will tell you. You, and your business, are the ones building these kinds of systems, so you have to think hard about what kind of system you want. One thing algorithms can do is assist you in coming up with specs, based on how the data is being used in the context of actual system features. It's our belief that, just as software testing differentiates robust and resilient software systems from those that are not, AI testing will differentiate robust and resilient AI systems from the rest, and bias testing will become an essential part of this AI testing. Specifically, because the equation we're working with here is learning algorithms plus learning structures plus data equals the learned model, bias testing comes into all three of those things. One of the takeaways from this slide is that you have to think about what kind of system you want. Where can we look for guidance there?
Well, one place we can look for guidance is what laws, policies, and regulations say, because these are not just guidance; they are requirements for us. I think most, if not all, of you build systems that are globally deployed or will be globally deployed, so awareness of where law, policy, and regulation already is in the major international markets, or where it is going, is essential for thinking about what specifications you would like your AI system to meet. Let me talk about three categories of these emerging regulations. I live and work in the United States, so I can speak with a little more confidence about the emerging regulatory trends there, and I will give you an outline; I think it's applicable globally, simply because our software is global. We'll talk about some specifications of bias and some approaches that are emerging for designing ethical products. Within the first topic, emerging regulatory trends, there are a few different places where that emergence is happening. Certainly there is federal legislation, legislation at the level of the national government in whichever geography you're looking at. There's also state and local legislation; some states are so populous that whatever they say becomes the law of the land in their state and actually impacts the central law. This also applies to the EU: the European Union recently made a big regulation around data use, and it has impacted businesses across the world because, as written, the regulation applies to citizens of the EU no matter where they are in the world. So that's the second part.
International law and so on is the third part. So, federal legislation: the takeaway, at least in the United States, is that over the last several decades there has been a broad expansion of anti-discrimination law, whether around sex, religion, race, or other categories; the number of protected categories is increasing. This is not just true in the United States; it's also true in Asia and in the EU. In India itself, the Supreme Court has made some landmark judgments over the last year alone, and if you look at the last two or three decades, the same vector is at work in all geographies. Awareness of these things is important when you build AI systems. Next, state and local legislation. For example, the current government of the United States rolled back some emission requirements for automakers, but California said just a few months ago that it would keep to a certain set of emissions standards, and all the automakers decided that that's what they would agree to at the national level. So states have very broad powers. San Francisco recently passed local legislation banning the use of facial recognition for certain kinds of police activity, and that will constrain how such technology can be used beyond just San Francisco itself. International law is also an interesting and very relevant place to look for where these specifications for not introducing bias into your systems come from, because international treaties, together with many national constitutions, essentially make them the equivalent of local laws in those geographies. I'll also mention that there's a lot of active research happening in the academic community around algorithmic fairness, accountability, and transparency. If you want to look for what's happening,
you can look for the acronym FAT, which stands for fairness, accountability, and transparency, as well as an area in the humanities called critical data studies, which looks at social power structures and how technologies enforce them. Let's get a little more technical and talk about specifying bias, not in terms of the regulations it comes from, but in terms of categories of anti-bias properties. We can think of these in at least four ways. Prediction bias means your AI system is systematically mispredicting with respect to an existing protected category. In hiring decisions, for example, you might be mispredicting for a certain gender for a certain role. That could happen for many different reasons, for instance a bias in your data set, but the point is that the specification you're looking for is to not mispredict with respect to a protected category. Pre-judgment is a legal term for the situation where the person responsible for judging something has already decided what they're going to do before going through the process, and it doesn't matter whether the result they reach at the end of the process is the same as the one they decided on early. Pre-judgment is very clearly something you have to make sure your automated systems don't exhibit. Perception, quite honestly, is reality, and one of the biggest places where we're seeing debates in the media and in technology is about the perception of bias in machine learning systems, whether or not that bias exists. I personally expect to see a lot of activity in the political sphere, the marketing sphere, and the technological sphere. In the United States, certainly, bias around the role of social media in influencing people in the last election is a hot topic of conversation, and I expect the same thing to happen as tools get used for making judicial and policing
decisions in India, in the United States, in Europe, in China, and so on. The last piece to be aware of from a policy perspective, as you think about designing ethical products, is the ways in which people engage with judicial systems to defend their rights. Two bodies of legal work here are procedural rights and substantive rights. Procedural rights are about the processes you can follow to make sure your rights are fairly enforced. If you feel that an automated system violated one of your rights, there is actually a judicial process available in most geographies across the world for seeking redressal. In fact, the European Union's GDPR specifically encodes what you can do if an algorithmic, data-driven system has made a certain decision about you. It may sound like what we are talking about is just what's been happening in the legal sphere, but there are emerging data regulations in different geographies, the largest of which is the GDPR in the EU, that encode in regulation privileges and rights for people, the origins of which come from case law and judicial practice. The second thing you have to think about when you build AI systems is whether you are doing something to the population at large that diminishes their substantive rights: for example, rights to food, clothing, shelter, social services, and so on. Even a system that builds customer service portals for an airline could run into a substantive-rights problem if some substantive right is being diminished for a certain class of people, perhaps because of the way they phrase a certain query, or because of the colloquialisms that exist in a certain area. As you can tell, this has rapidly become an important area of research, and my takeaway for you from all of this is: perception is reality.
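As a minimal sketch of testing for the prediction-bias property described earlier, one simple check is to compare error rates across a protected attribute. The records, group names, and numbers below are invented for illustration:

```python
from collections import defaultdict

def group_error_rates(records):
    """False-negative rate per protected group: how often a qualified
    person (label=1) was predicted 'reject' (pred=0)."""
    qualified = defaultdict(int)
    missed = defaultdict(int)
    for group, label, pred in records:
        if label == 1:
            qualified[group] += 1
            if pred == 0:
                missed[group] += 1
    return {g: missed[g] / qualified[g] for g in qualified}

# Toy hiring data: (protected attribute, true label, model prediction).
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 1, 1), ("A", 1, 0),
    ("B", 1, 0), ("B", 1, 0), ("B", 1, 1), ("B", 1, 0),
]
rates = group_error_rates(records)
print(rates)  # {'A': 0.25, 'B': 0.75}: group B is systematically mispredicted
```

A real audit would use established metrics and tooling, such as the AI Fairness 360 metrics mentioned later in the talk, rather than a single hand-rolled rate.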
So when you're building software systems that use artificial intelligence techniques, think about bias in this simple way: your system is essentially distributing opportunity. Say I am giving out jobs, or I'm a judge deciding somebody's sentence, or I'm building an airline system that decides whether or not to give you a free upgrade from one class of seating to another. For the person receiving the opportunity, the question is: am I blocking your opportunity? That's really what bias is about. At the same time, for the person distributing these things, bias is about risk management and maximizing overall return, and that's really what your systems have to look at. So how does transparency fit in? How does building technologies that enable transparency address bias, which has a lot of legal constraints around it? From our point of view, it's essential to have interaction points where bias can be revealed and reacted to. We spoke earlier about transparency, so if we were to have transparent AI systems, let's see how a business person could interact with one.
The first interaction point would not be to abstract a given business problem into a data problem; it would simply be to ask a trained AI model a question. Accessible AI technologies enable that. Explainable AI technologies allow you, as a person, to get an answer with an explanation, as opposed to a data set or a vector of probabilities that is hard for a person to parse. A data scientist can map mathematical results from prediction algorithms to their expectation of what the results should be, but a person has beliefs, and they are going to map the answer they got against those beliefs. Interactive AI allows you to update the AI itself, or to update your own beliefs. So this new loop looks like: ask the AI, get an answer with an explanation, test it against your beliefs, and update either your beliefs or the AI. The first step, being able to ask questions of an AI without having to be an AI scientist to pose them or encode them in mathematical structures, means you end up with a much more diverse and inclusive set of questions in the first place. As I said about bias, perception is reality, so you can ask the kinds of questions that matter to you, and by doing so you force into your process of building an AI system at least an awareness of how the system will be used. When answers from AI systems come with explanations, it becomes much easier, even for those of us building those systems, to recognize that there is some bias, unintentional in most cases. That bias gets revealed, and with interactivity,
we can react to it. So the general body of research going on around these three pillars of transparency very directly addresses these bias issues. That was the broad systems view. We're now going to talk about a technique for recognizing bias in data sets, and for that I'm going to hand it over to my colleague Ramya. So, this technique is called topological data analysis, and it stems from a branch of mathematics called topology, which is the study of shapes. Interestingly, we know that data also has shape. If you consider a simple logistic regression model, it's basically trying to fit a straight line to a set of data points, and in higher dimensions it's about understanding the separating hyperplanes. Topological data analysis also studies the shape in data, but not in terms of straight lines or hyperplanes; it works in terms of topological features such as clusters, holes, and voids, across different spatial resolutions. What those topological features mean, we will see a little later, but we can think of topological data analysis as an independent and complementary tool to existing machine learning algorithms for analyzing data. The benefit is that it is applicable to sparse data sets and can also be used in conjunction with machine learning algorithms for a variety of purposes. TDA, or topological data analysis, offers two fundamental advantages. First, it can be used as an effective feature extraction algorithm: you can take any type of input data, categorical, image, or any other type, extract topological features from it, and then feed them to supervised, unsupervised, or even reinforcement learning algorithms. The second advantage TDA offers is in terms of data visualization. There are different types of visualizations, and we'll be seeing a couple of them: the one on the right is called a Mapper graph, and the one on the left shows barcodes.
The left one is barcodes So these are useful for analyzing different types of data as you will see So in the rest of this talk, we will understand first. What are these topological features? How can they be identified in data sets? Basically, the idea is to identify invariant topological features. So those are the characteristic features of your data So how can we do that? And then how can we visualize them and coming back to the topic of this talk How can this be used to identify a bias in data sets and we'll see some method the results on a real-world data set So first question. So what are topological features? I mentioned these are like clusters and holes in data sets So you have set of points and cluster or a connected component is basically like Bunch in your data set as simple as the one you see here on your left These are like components which are connected within your data set So this is called as a zero-dimensional topological feature a one-dimensional topological feature is a hole Which is like characterized by a cavity in the center at high in higher dimensions You have voids like you can imagine that to be a spear with a cavity or a hole in the center and so on in higher dimensions So in order to identify these topological features We see that these data points which are your individual data samples in your training data set have to be connected in some format So that's how we can identify topological features So in order to identify these kinds of topological features We need to get some kind of structures in data like in kind in traditional machine learning We try to get in terms of so in neural networks you identify certain kinds of Patterns in your data. 
In a similar way, here we identify structure in terms of simplices. A simplex is just a generalization of a triangle to higher dimensions: a point is a zero-dimensional simplex, a line is a one-dimensional simplex, and so on, and a combination of simplices gives a simplicial complex. These are the fundamental units for identifying topological features in your data set, because with a two-dimensional simplex you can easily visualize a hole, a three-dimensional simplex gives a void, and so on. Now, to connect the data points and build these simplicial complexes, there are different methods, and the choice of method depends on the type of data you have. For image data you use Morse complexes; for categorical data, typically a Rips complex is used, and that's what we will be using for the rest of this talk. A Rips complex works like this: say you have a set of data points like the ones on the right, a bunch of red dots, which are zero-dimensional simplices. To connect them, we simply say that if two points are within a certain distance threshold, we connect them. So you get one-dimensional simplices, the black lines you see there, because two points are connected when their distance is less than the threshold, and you repeat this process for all the points. If three points are close enough, you also get a two-dimensional simplex, a triangle, which is the light blue region here, and similarly a tetrahedron for a three-dimensional simplex. So you get a simplicial complex like this. Coming back to the question of how we identify invariant topological features, the idea is to construct these simplices across different spatial resolutions, where the spatial resolution is varied by varying the distance threshold. Here we can set a threshold r and connect all points within distance r, but r can be varied so that we can get
different kinds of structures. That is what the method called persistent homology does. If you look at this figure, there are several points, shown by the red dots, which are all disjoint initially, for a radius of, say, 0.3. As I increase the radius to 0.7, some of these points get connected, and you begin to see certain topological features; for example, there is a hole in the second figure, the central one on the top. As you increase the radius further, you can see a central hole very clearly, and the small hole that was there initially is now shrinking. As the radius is increased further still, the central hole shrinks too, and at some point it disappears. So there are certain topological features which appear and disappear, but there are others that persist across different values of the radius. The ones which persist across different radii are more likely to be characteristics of your data, and that is the idea we will use to understand bias in data sets as well. How do we visualize this? As I said, there are different methods, and here we are using what's called a persistence barcode. For example, suppose your data set is somewhat like a circle, so the figure on the left is a set of points which represents a circle. When the radius is small, all these points are disjoint, but as you increase the radius, some of the points get connected, and then you can see that there will be a central hole. The barcodes represent exactly this: the duration for which a topological feature exists. The red lines you see on the rightmost figure are the individual points, the zero-dimensional simplices, but as you increase the radius to some value, you can see the central hole, which persists for a long time; that is the blue bar on the top. So a distinguishing, characteristic feature of your data is
one that persists for a long time. Those are the topological features of interest for us. Let's connect this back to identifying bias in data sets. This is a pre-processing step: bias can be addressed at different points, pre-processing, in-processing, and post-processing, meaning before you apply any classification algorithm, while you apply it, and after you apply it. This technique works before you apply it; it's a data pre-processing step. Remember, we just saw that persistence barcodes capture the invariant features in your data set. For example, suppose these two are toy data sets representing features in your data, say age or gender, which are protected attributes. If a particular attribute carried bias in your data set, for example in a loan application scenario where all the young people, or all the women, were denied a loan for some reason, then if you plot the points, with the x-coordinate being the protected attribute, like age or gender, and the y-coordinate being the decision, in this case, say, a loan default, the group that was biased against will show up as a cluster within the data set. That cluster can be easily seen as a barcode which is long in length. You can also validate this with statistical tests, though because the topological features are not well characterized statistically,
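As a rough sketch of the zero-dimensional part of this idea: every point is born at radius zero, and a connected component "dies" at the radius where it merges into an older one, which a union-find over distance-sorted edges computes directly. A real analysis would use a TDA library such as GUDHI or Ripser; the toy points below are invented to stand in for a (protected attribute, decision) plot with one strongly separated group:

```python
import math
from itertools import combinations

def h0_barcode(points):
    """Zero-dimensional persistence barcode of a Rips filtration:
    each point is born at radius 0; a bar dies at the edge length
    where its component merges into another (union-find over edges
    sorted by length). The last surviving component never dies."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d)          # one component dies at radius d
    return deaths + [math.inf]        # bars [0, d), plus one infinite bar

# Two tight clusters, e.g. a uniformly decided protected group far from the rest.
biased = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
bars = h0_barcode(biased)
print(sorted(b for b in bars if b != math.inf))
```

The four short bars come from merges inside each cluster; the one long finite bar marks the well-separated cluster, exactly the kind of long barcode that flags a possibly biased attribute.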
we have to use non-parametric permutation tests; I won't go into the details of that. Let me jump to the case study. This was on the German credit scoring data set, where the goal was loan default prediction. There were a thousand instances, and the protected attributes were age and gender. These are the results. We can easily see that the barcodes for age are longer than those for gender, and whenever there is a long barcode that persists for a long time, it is more likely to indicate bias due to that particular attribute in your data set. If you look at gender or job, those are more or less uniform, so there is no bias due to those attributes. We can also validate this with AI Fairness 360, an open-source tool from IBM, and that validates our result: out of five metrics, four show bias due to age, but none of them shows bias due to gender. I think I'm running out of time, so these are just some FAQs, and at this point we'll be happy to take questions. If you want to connect with us, or engage or work with us, please feel free to write to us here. Thank you. Thank you for the talk; this topic is very exciting and much needed at this moment, because sooner or later we need to start using this in general use cases, where it matters a lot. So, yes, on the explainable AI part: it is good, but do we have some sort of library as well, something added to Python, that we can import and directly use those functions, or do we need to write our own? You mean for explainable AI in general?
Yes, there are several open-source tools. I wouldn't say libraries, but there is a lot of work. I don't know if you have heard about LIME, which is very popular and widely used, but there are several; it depends on what kind of datasets you have. So there are several open-source codebases, and probably even libraries as such.

Okay, thank you.

Thank you for the talk. My name is Ranga. I just wanted your thoughts, Dr. Chandran. In the community there is talk about the trade-off between explainable AI and performance, right? As an example, I had a use case where a company wanted to roll out an offer to many users on their platform; let's say out of one million you want to select a thousand people. You can build a model very quickly and it gives you the answers, but then the next question you ask is why these people were picked: were they interacting on the platform, did their network grow? And depending on what they did on the platform, let's say you wanted to customize the message so that they are more likely to take that offer; those things are not possible with an algorithm or a model that peaks on performance. You need explainable AI, but to do explainable AI you have to move to simpler, less performant models or something like that. So what is your general experience and take on this trade-off that, at least, is spoken about in the community?

Sure, I can offer my perspective. You may have heard that there is a program sponsored by DARPA in the US; it started about two years ago and it's called the XAI program. There are 11 universities and institutions in the United States that are funded by it, and the founding question for that program two years ago was exactly what you raised: does there need to be a trade-off between explainability and performance? Over the last two years, every six months or a year,
they make public presentations about the progress, and those 11 institutions include places like MIT, Berkeley, Stanford, and others. Researchers are actually making rapid progress: the trade-off is not as stark as it was, I would say, even three years ago. So that's one thing; there is very promising news in terms of technical innovations.

The other thing is the way we think about explainability, which is that explainability is risk management. When you hire a new person on your team, you give them interviews; you maybe ask somebody for a reference: how does this person work, how does she respond in these situations? What are you doing there? You're essentially trying to understand that person even though you've not had a chance to see that person deployed in a certain scenario for your work. So in life, explainability is often a risk-management tool. And with these AI black boxes, we don't necessarily need the black boxes themselves to be explainable in order to do proper risk management around them; we can do increasing amounts of testing of those black boxes. There is now a field of AI testing; the first IEEE conference on AI testing started in April of this year, I believe. So there are more methods that do risk management, rather than just explanation, in order to increase our confidence in deploying these systems, perhaps now with greater test coverage through new mechanisms.

Hey, thank you for the talk. It was really interesting.
Can you talk a little bit about use cases where the extracted topological features have been directly integrated into machine learning algorithms? It would be nice if you could take an example from the banking or finance domain.

Yeah, so the one we discussed here was for loan default. As far as I'm aware, most topological data analysis has been used in computer vision, for pose estimation and structure recognition, in natural language processing, and mostly in time-series analysis for detecting anomalies and so on. I'm not aware of other work where they have used this for banking or for detection of bias.

Hi, this is Abhishek here. Thank you so much for the talk. I have a very naive question, actually. In the case study you showed, you talked about gender and, I think, age, to find whether there was a bias or not. I'm assuming they were not part of the model which actually gave the decision; this was an independent analysis to see if there is still a bias. If that was the case, why not just use a simple method like a correlation or a chi-square test to see if it is working? Why should we go ahead and do the barcode analysis that you presented?

Yes. So you cannot; I mean, if you just compute the skewness of the data or the correlation of the data, it is not guaranteed to identify all the features. The advantage of using a technique such as TDA is that it can identify bias at different spatial resolutions, which is not the case with a metric like skewness or correlation. Secondly, the tool that I compared against, the IBM AI Fairness 360 tool, is also not using simple techniques like correlation or skewness; those are metrics like statistical parity difference, equal opportunity difference, and so on. In the fairness and accountability community, these are, as I mentioned, the known, accepted metrics to analyze this.
Correlation is definitely not going to capture the ones that you need from the data.

Thank you for your talk. So, you mentioned applying your technique after the model has been trained... no, before. Before, yeah. So how can you deal with data that is inherently biased? Even in the data collection process there might be biases, right?

This is going to identify bias; this is not going to eliminate bias.

No, of course, but one thing I didn't understand is this: in this case, at least, what you're saying applies in places where, for example, let's take a variable like age, sorry, like gender, which is sensitive, right? And we don't want to discriminate based on this particular variable. What I'm trying to say is that if biases creep in during the data collection process itself, can this deal with that as well?

So this is a method to identify bias in data. Data collection is not in the control of an algorithm, right? It's mostly in the control of people.

Hey, this is Teja. Thanks for a wonderful talk. My question is mostly about the removal of bias.
For example, you mentioned that this bias identification can come in at any point in the pipeline, like when you're building a model or preparing the data. What if it comes at the cost of model performance, and that difference in performance holds for a considerable amount of time? Businesses then have little incentive to invest in bias removal, because they're getting a high return on investment precisely because the bias has not been removed. What happens in that scenario? Would they be dictated by the rules and the fear of law, or would they do something else to ensure that performance does not dip just because the bias is removed from the data and the models?

So, both my opinion and what I'm seeing in the market runs along two dimensions in the context of your question. One is definitely regulation. If GDPR had not become law in 2018, businesses would certainly treat bias as they treated security for decades; I think it's only in this last decade, when we've had hacks exposing the data of millions of people, that businesses recognized they have to invest in security. Because of GDPR, many data-driven businesses now have a way to quantify the cost of a decision that they cannot defend: GDPR specifically sets fines of up to about 4% of annual revenues, and in fact there are already cases of the maximum fine being imposed on some international organizations. So certainly that is acting as a clear business imperative for businesses to look at their entire process, and not just their data, for where bias might creep in. It actually starts, just to touch on one of the earlier questions, by asking a whole set of diverse questions at the product management level, sorry, at the service design level; while discussing those questions, you come up with data collection
processes and data sanitization processes that reduce the chances of bias. I think the second aspect of the business imperative is profit-driven: more diverse and inclusive businesses do a better job of providing services. If I can provide loans across various attributes of people, and those loans perform well because my AI systems are not overly sensitive to, or overfitting with respect to, certain attributes, that's good business for me. So the second, very positive, factor is that we can serve more people, and have more revenue, because our systems are more aware of how to serve more people.

Thanks. Thanks.
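For reference, the two group-fairness metrics named in the answers above, statistical parity difference and equal opportunity difference, can be sketched in plain Python using their standard definitions. The toy arrays below are invented for illustration and are not the German credit data; AIF360 provides production implementations of these and related metrics.

```python
def statistical_parity_difference(y_pred, protected):
    """P(favorable | unprivileged) - P(favorable | privileged).
    y_pred: 1 = favorable decision (e.g. loan granted), 0 = unfavorable.
    protected: 1 = unprivileged group, 0 = privileged group.
    0 means parity; a large negative value means the unprivileged
    group receives favorable decisions less often."""
    unpriv = [p for p, g in zip(y_pred, protected) if g == 1]
    priv   = [p for p, g in zip(y_pred, protected) if g == 0]
    return sum(unpriv) / len(unpriv) - sum(priv) / len(priv)

def equal_opportunity_difference(y_true, y_pred, protected):
    """Difference in true-positive rates between the unprivileged and
    privileged groups, computed over truly-favorable instances only."""
    def tpr(group):
        pos = [p for t, p, g in zip(y_true, y_pred, protected)
               if g == group and t == 1]
        return sum(pos) / len(pos)
    return tpr(1) - tpr(0)

# Toy data: 8 applicants; protected = 1 marks, say, the younger group.
y_true    = [1, 1, 0, 1, 1, 1, 0, 1]
y_pred    = [1, 1, 0, 1, 0, 1, 0, 0]
protected = [0, 0, 0, 0, 1, 1, 1, 1]

print(statistical_parity_difference(y_pred, protected))        # -> -0.5
print(equal_opportunity_difference(y_true, y_pred, protected)) # -> about -0.667
```

Both metrics are 0 under perfect group parity, which is why a tool reporting several of them far from 0 for age, but near 0 for gender, points to age-related bias in the dataset.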