Hi, everyone. Thanks for attending my talk. Today I'll be covering some aspects of explainable AI: what it's all about, why we need it, and specific tools and frameworks we can use to explore black-box models and open up their opacity, so that we can understand the kinds of decisions they are making and why they are making them. Slides and code are available here, so feel free to check them out; they will also be posted on the conference website. Briefly, we'll cover the importance of explainable AI, talk about the different types of model interpretation strategies, and then move into the hands-on part. There are a couple of notebooks, at least, which I want to cover, and we'll see how much we can get through today. We'll end with where we can go from here. About myself: I'm a data scientist working at Red Hat, I'm an editor at Towards Data Science, and recently I became a Google Developer Expert in machine learning. Besides that, I do some consulting on the side. Let's move on to the importance of explainable AI. The real problem with machine learning is summarized in this XKCD comic: at the end of the day, we have a bunch of data, we pour it through a pile of models, we try to get the best model out, and then we potentially use it in the future. But the implications of this can be really serious if you look at all these news articles collated here. Algorithms have started discriminating based on race, color, religion, and so on. Recently, Amazon took down their resume-screening algorithm, which was biased against women. And there are so many other examples out there. If you look at some of these quotations from people in the industry, even Satya Nadella tells us it's important to make sure there is no bias in our data, which is of course practically impossible, because there will always be some kind of bias. But the real problem doesn't lie with the model so much as with the data on which it is trained. We should also take diversity into account with regard to our data set: if we are doing image classification and we only take images of people of a particular race, the model will not perform as well elsewhere as it does on that data set. We usually do a train/test split, see 90% accuracy, and call it done, but is that enough? What would be the impact of deploying these models without doing this due diligence? Your housing loan application, say, can get rejected if the model was trained on data skewed toward a particular creed, caste, or race: if people historically received loans based on those categories, and new applicants come in from a completely different category, their loans get rejected. The same thing happens with computer vision algorithms that have been trained on stereotypical data; when people of a different color come in, the model may fail to predict correctly. And if you look at all these stock photos of supposed CEOs, all of them belong to a particular race.
That's the thing: even the stock images you use will often be very stereotypical, and when we train our models, we usually train on open data sets, especially for computer vision, because that's where the data is. If you train on data like this, the model will be heavily biased toward that category. The same thing here: this camera asks "did someone blink?" because it was obviously trained on data that didn't include, say, Asian faces. These are some of the ways AI can have an adverse effect on society. And this is a very famous example, as many of you might already know. All of us, or at least most of us, have used word embeddings trained on large public corpora like Wikipedia or the Google Common Crawl corpus, and these pre-trained embeddings work well in most of our use cases, at least to an extent. But since the data itself carries bias, when you run these similarity-based queries you see things like this: if you relate man to computer programmer and ask what the corresponding term is for woman, the vector arithmetic basically points toward homemaker. If you actually run this in code, you will see it. Due to the bias in the data, the model becomes biased too, and there is not enough diversity in the source data itself. These are things to keep in mind when you think about putting AI in charge of critical decisions in society. And if you run this, at least as of when I checked a month back, if you do a Google Translate round trip from English to Turkish and back, it will flip the pronouns and say "she is a nurse" and "he is a doctor". The reason this happens is that the model has been trained on so much data, and the data itself carries this bias. So it's not just about training models blindly, but also doing due diligence on your own data. That is something we sometimes neglect; it has definitely happened to me. Now, there are two schools of thought here. One says: if something is working well and I'm getting 90% accuracy, why do we need to question how it works? The main answer is that when you deploy AI in your enterprise or anywhere else, the people consuming it, in some form or other, will mostly be humans, and most of them will not be data scientists like us. We like to reason about and question everything, so people will definitely ask why the system is making a particular decision, especially people from a non-technical background. What do we tell them then? This is where the paradigm shift is happening: if you deploy AI solutions, they will be questioned when they make decisions, especially in critical areas.
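As a concrete illustration of that embedding example, here is a minimal sketch using gensim and the pre-trained Google News word2vec vectors. The file name and token spellings are assumptions on my part; check whatever corpus you actually load.

```python
from gensim.models import KeyedVectors

# Load pre-trained Google News vectors (large download; path assumed local)
vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True
)

# man : computer_programmer :: woman : ?
# Computed as vector(computer_programmer) - vector(man) + vector(woman)
print(vectors.most_similar(
    positive=["woman", "computer_programmer"],
    negative=["man"],
    topn=3,
))
# On the stock Google News vectors this famously surfaces "homemaker" near the top.
```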
The main thing here is, as I said, that biased models can have adverse effects. But another thing to remember is that you don't need to push explainable AI into each and every thing, because there is this other school of thought that says you have to question every single model. Say I'm building a model that recommends songs; a lot of us use Spotify or some other app. We don't really need to interrogate why it recommended this next song to me. You can always do that, but it's not critical. So understand that explainable AI is good, but you don't need to force-fit it into every solution. Only when you really need to understand or explain decisions to your customer or to society, around something critical being consumed, do you need to think about explainable AI. So what are the different ways to interpret the models we build? With regard to interpretability, there are three questions: the why, the what, and the how. These are the three things we need in order to look inside our model, and they ensure fairness, accountability, and transparency in how the models are working. First, what is driving the model's predictions? We should be able to query the model and find out the key latent features, how they interact with each other, and how important they are to the model's decisions as a whole. That ensures the fairness of the model. The second part is accountability: when the model makes a specific prediction, why are these particular features coming out as the most important ones leading to that outcome? The last one is transparency, which is the most important part. This is the local kind of interpretation, which we'll cover shortly: how can we trust the model's predictions? When we are evaluating very specific data points, people may ask: why did it say the loan would be rejected for this person? You should be able to explain, for that pointed prediction, what the reasoning was behind the model's decision. That ensures the transparency of the model. As for the criteria for model interpretation methods, there are three major distinctions. One is intrinsic versus post hoc: intrinsic means any model that is already interpretable, like linear models or tree-based models; post hoc interpretability is the model-agnostic kind, where you treat any model as a black box, whether an ensemble or a deep learning model, and you try to infer why it is making a given prediction. Model-specific interpretation tools depend purely on the specific model: if I'm using a linear model, I can check the coefficients; if I'm using a tree-based model, I can look at feature importances based on the Gini impurity or entropy. Model-agnostic tools are what we will talk about more today, things like LIME and Shapley values, which can be used to open up any black-box model and make those interpretations; a quick sketch of the model-specific side follows.
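For instance, a minimal model-specific sketch in scikit-learn looks like this; you just read off what the model already exposes. X_train and y_train are hypothetical names for any fitted training split.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# X_train, y_train: assumed training data, not defined here

# Intrinsic / model-specific interpretation: the model exposes its own reasoning
lr = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(lr.coef_)  # signed contribution of each feature in a linear model

dt = DecisionTreeClassifier(max_depth=4).fit(X_train, y_train)
print(dt.feature_importances_)  # impurity-based (Gini) importances
```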
With regard to local versus global, it's what I said: the overall picture of why the model makes decisions as a whole, versus pointed predictions for very specific data points. That is the scope of model interpretation: global interpretation is about understanding how the model makes decisions as a whole across the entire data set, while local interpretation asks why the model made a specific decision for this particular data point. For local interpretation you usually take a neighborhood around that point and build an approximate model around it; that's typically how the LIME technique works. Those are the strategies and scope for model interpretation. With regard to techniques, we have the traditional ones: exploratory data analysis, clustering, dimensionality reduction, performance evaluation, removing some features and re-checking performance, and so on. The other option is to just use interpretable models like linear or logistic regression or tree-based models. But that is the main challenge: the accuracy-versus-interpretability trade-off. At the top you have the highly performing models, the nonlinear ones, neural networks, and ensemble methods like gradient boosting or bagging; all the way at the bottom you have the linear models like logistic regression. If we want more interpretable models, we may have to sacrifice performance; but if we want really high-performing models, which most of the time we will (why would we want a lower-accuracy model?), we then have to think about how interpretability is affected. So what do we do? We look at model-agnostic interpretation techniques, both for structured data, the tabular data sets we get from databases, CSVs and so on, and for unstructured data like text or images, which we will look at shortly. The techniques differ depending on the type of data. For structured tabular data there are several options. The first: just use interpretable models like decision trees or linear models, or use something like a RuleFit model, which fits a model, finds decision rules, and spits out a list of rules of the form "if this is happening and this is happening, then it is this label or the other label". The second is feature importance, which you can get either model-specifically, as with a decision tree, or model-agnostically by perturbing the data available to you: you vary the values of each feature and see how that affects the model's predictions. The third is partial dependence plots, which some of you may already know: you take one feature whose effect you want to see, keep all the other features constant, and vary that feature. That is called perturbation: we keep changing the value of that feature across different ranges, watch how the model's predictions change, and usually average across all data points; a minimal sketch of that computation follows.
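Here is a minimal sketch of that partial dependence computation, assuming a fitted binary classifier with a scikit-learn-style predict_proba and a pandas DataFrame X. All names, including the "Education-Num" column, are hypothetical.

```python
import numpy as np

def partial_dependence(model, X, feature, grid):
    """Average predicted probability of the positive class as `feature`
    is swept over `grid`, holding every other column fixed."""
    averages = []
    for value in grid:
        X_perturbed = X.copy()
        X_perturbed[feature] = value                  # perturb only this feature
        proba = model.predict_proba(X_perturbed)[:, 1]
        averages.append(proba.mean())                 # average over all rows
    return np.array(averages)

# e.g. sweep education level over 20 bins:
# grid = np.linspace(X["Education-Num"].min(), X["Education-Num"].max(), 20)
# pd_curve = partial_dependence(model, X, "Education-Num", grid)
```

Dropping the .mean() and keeping one curve per row gives exactly the ICE plots described next.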
Now, if you want to see the effect for individual data points, that is known as an ICE plot, an individual conditional expectation plot: you take, say, data points A, B, and C, keep all the other features constant, and vary that one particular feature for each of the three. You get three curves showing how the probability changes as you perturb that feature. We'll look at an example of this as a graph shortly, and it will become clearer. The other part is local versus global predictions. If you want a global, interpretable approximation of your model, one model-agnostic option is this: you build a model, say an XGBoost model, on some data; then you take the input features and the predictions of the XGBoost model, and you fit an approximation model on top of them, known as a surrogate model, say a decision tree. In that way you are trying to interpret how the XGBoost model itself makes predictions. That is known as a proxy or surrogate model. The other aspect is local prediction, as I already mentioned with LIME, where you do model-agnostic explanations for local data points. You take one data point and a neighborhood around it, perturb the values, and see how that affects the model's predictions for that local point; you then fit a local surrogate model and interpret which features mattered most to the model predicting that point's outcome. But remember, LIME is not very robust: depending on the size of your neighborhood, the conclusion about which features are most important can change. That is a caveat with LIME. That is where Shapley values try to overcome the weaknesses of LIME: they take the ordering of features into account. The idea is based on conditional expectations over orderings of features entering the model. The concept is: assuming some features have already entered in some order, and a new feature now enters, what effect does it have on the model's prediction? A Shapley value doesn't just consider one ordering, say B and C have entered and now feature A enters, what is the prediction probability; it considers all possible orderings and averages over them. That is how you get an average Shapley score per feature for each data point, and it is potentially more robust than a plain LIME-based interpretation; at least, that's the claim. So let's briefly look at some examples of interpreting machine learning models on structured data, covering what we just discussed. You can access this code later; it's already on GitHub. What I'm doing here is loading a very common data set, the census income data set. It has features like age, work class, education, and so on. Age is the person's age; work class is the kind of work they do; the education number is their qualification level, where the higher the number, the more qualified they are; marital status is, well, their marital status.
Then occupation, the kind of job they do, and so on. A lot of these are categorical variables; we have around 12 to 15 base features in total. First we run pandas get_dummies to one-hot encode them, because these are categorical. Now every category of job, country, and so on gets its own column, as you can see; it's one-hot encoded and we have a total of 91 features. The feature space has exploded, so model interpretation definitely becomes tougher here. Next we build an XGBoost model with 500 trees, fit it, and make some predictions just to check performance. What do class zero and one mean here? Class zero is the person making less than $50,000 a year; class one is making more than $50,000. So based on all these features about educational qualification, demographics, and so on, will the outcome be higher or lower income? That's the main question: can we understand the model's decisions, the reasons it predicts that a person will make more or less, given the features it was trained on? The first thing you can do is just use the interpretable side of the model. In this case you average the feature importances across all 500 decision trees, based on which features the nodes were split on, from the root down to the leaves. You don't need to do that manually: if you call feature_importances_, or get_score here, it directly gives you the importances. Because this is a tree-based model, we get these importances, and they tell us: if the person is married, has a decent capital gain, and their educational qualification is high, those are the top three most influential features for the model's predictions. But what else is out there? There is this library called ELI5, which literally stands for "explain like I'm five": if I'm five years old, can you explain how the model works? What it does here is nothing magical; this is not black-box interpretation. It lifts the same feature importances from the tree-based model, normalizes the scores, and shows them to you. Right now it works only on tree-based and linear models, but it is a great way to get started, because it also offers local interpretations where you take one particular data point. In this case the person earns less than $50,000, and our model predicted that correctly. So what were the reasons? It shows you: this person is probably single, their age is low, and the hours per week they put in are not a whole lot compared to other people. Those could be some of the key features that led the model to that prediction. Using these, you can directly get an idea of the most influential features; a rough sketch of the calls follows.
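Roughly, the ELI5 calls look like this. The names xgb_model, feature_names, and X_test are assumptions carried over from the notebook, not part of ELI5 itself.

```python
import eli5

# Global view: normalized importances lifted from the tree ensemble
eli5.show_weights(xgb_model, feature_names=feature_names)

# Local view: per-feature contributions to one person's prediction
eli5.show_prediction(xgb_model, X_test.iloc[0],
                     feature_names=feature_names,
                     show_feature_values=True)
```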
Here is another local example: this person earns more than $50,000, and it is very evident from the first two features that their educational qualification is really high and they appear to be in a managerial position; they are also married. Kind of common sense, but you can read all of this off the model's decisions. Now, what else is out there? One thing first, yes: these are local predictions; for one particular example it's saying why these features are most important. We'll take questions after the session, because there's a lot of content to cover. The next part is RuleFit models, which fit a sparse linear model, introduce some nonlinearity in the form of decision-tree-based terms, and learn specific rules: which combinations of features and values lead to a particular outcome. For these rule-fit models there is a package called skope-rules, under the scikit-learn-contrib repository, which hosts downstream projects from scikit-learn. You fit this directly on your input data; you're not fitting it on another model. It is itself the model that builds all these rules for you. Once you fit it on the data, it takes some time to train, and then it builds the model. It's obviously not the best model, reaching around a 79% F1 score, but once you start printing the rules it gives you things like: if the person's age is greater than 26.5, their capital gain is less than a particular amount, their capital loss is less than a particular amount, their education number is greater than 12.5, and the married indicator is greater than 0.5 (it's a zero/one feature, so obviously they are married), then with 68% precision this person is going to earn more than $50,000. These are rules you can get immediately and explain to your business: the key influential combinations of features and values that can lead the model to a decision, whether that's rejecting or accepting a loan. But again, a rule-fit model is an approximation built on your data; it's not like you're interpreting the XGBoost model here. Remember that. This comes back to the first point of using models that are interpretable by nature but may not be high-performing; a small sketch of the fit is below.
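A minimal sketch of that fit, assuming the skope-rules package and the one-hot-encoded census frame from earlier. The parameter values are illustrative, not the talk's exact settings.

```python
from skrules import SkopeRules

# SkopeRules mines decision rules from an ensemble of shallow trees,
# then filters them by out-of-bag precision/recall.
clf = SkopeRules(
    n_estimators=30,
    precision_min=0.5,        # keep only reasonably precise rules
    recall_min=0.01,
    feature_names=list(X_train.columns),  # X_train, y_train assumed from the notebook
)
clf.fit(X_train, y_train)

# Each entry: (rule string, (precision, recall, n_occurrences))
for rule in clf.rules_[:5]:
    print(rule)
```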
So from there you go to model-agnostic interpretation. We've fit our XGBoost model; what can we do to understand how important a feature is to the model's prediction probabilities? This is where, as I mentioned, the most important part of this code is: we have all our features, and we take one particular feature whose partial dependence we want to analyze. If I change this feature's values across its range, how does it affect my model's predictions, keeping all the other features constant? The first thing we do is divide the input data for that feature into bins. In this case we're doing it on the education number, and as you can see it's divided into 20 bins across the education levels. Then each of those 20 values is run against every data point: if we have, say, 10 data points, then for the first data point we run 20 predictions, one per bin value, with all other features held constant. If my education number is 10.75, what is the model probability? Now I raise it to 11.5; what's the probability? We sweep the range of that one feature, watch how the predicted probability changes, and we get a plot like this: as the education number rises, meaning the person is more qualified, the model's predicted probability rises. So a partial dependence plot, at the end of the day, is just simple math, nothing special: hold every feature constant, change one thing. And that is the concept in evidence here: it does this for each and every feature and tries to find out which features are most influential. Partial dependence plots are a great way to understand individual features and how they relate to the model's predictions. The other aspect is ICE, the individual conditional expectation plot: if I have 10 data points, I get 10 curves, varying the ranges for each point separately; the partial dependence plot is what you get when you aggregate across them. If you look at this plot, which I'll just scroll down to: I ran it for 10 data points, and as you can see, the average of the curves is exactly our partial dependence plot. The percentiles here mean I sorted the data by the model's predicted probability and picked 10 points: the 99th percentile is an example where the model predicted with 99% confidence that the person earns more than $50,000; percentile zero is where the model predicted with roughly 0% confidence. Across these percentiles, if the feature is genuinely influential, the curve should rise as the feature value increases. If that doesn't happen, the feature may not have a consistent effect across all data points; maybe it only works for a subset of the data. So with ICE plots you can really see whether a feature matters across the whole model or only for particular points. Usually we will have at least, say, 10,000 samples in our data, so the practical approach is to take percentiles of your data and build the ICE plot from those. The next part is a framework called Skater. It was released by DataScience.com, which is now part of Oracle, so it's still open source. It does model-agnostic black-box interpretation and gives you features like LIME as well as partial dependence plots and feature importances. One way to interpret a model is, of course, the global scope: which features are important as a whole; the setup looks roughly like the sketch below.
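A minimal sketch of that Skater setup, assuming the Skater API of the time and the notebook's xgb_model and X_test names; the "Education-Num" column is likewise an assumption.

```python
from skater.core.explanations import Interpretation
from skater.model import InMemoryModel

# Wrap the data and the black-box prediction function
interpreter = Interpretation(X_test, feature_names=list(X_test.columns))
annotated_model = InMemoryModel(xgb_model.predict_proba, examples=X_test)

# Perturbation-based (not Gini-based) global feature importances
importances = interpreter.feature_importance.feature_importance(annotated_model)

# Partial dependence for a single feature
interpreter.partial_dependence.plot_partial_dependence(
    ["Education-Num"], annotated_model, grid_resolution=20
)
```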
Once you build this interpretation object on your input data and pass in your XGBoost model, it gives you feature importances that look similar to what we saw from the base XGBoost model itself. But remember, these importances are not coming from the Gini impurity or anything like that; Skater does the perturbations we talked about, sweeping the range of values for each feature and measuring how much the model probabilities move. Overall, as you can see, being married, age, education number, capital gain, and hours per week are the key features for earning more. You can also get partial dependence plots; the code is very similar to what I just showed you. If you look at capital gain, for example: if the person has more than about 5,000 in capital gains, the model probability immediately shoots up to around 0.75. These are the kinds of useful insights you can get. The other one here is age: people above 30 and maybe below 60 have a somewhat higher probability than the rest. It doesn't dramatically shoot up, because one feature by itself only has so much effect; it won't single-handedly push the probability to 0.7 or 0.8. And here is the education number, which is very similar to what we got when we did it manually, if you remember. So you can visualize the effect of each feature on the model's predictions. You can even combine multiple features and view two-way partial dependence plots: as age increases and education number increases, say people between 30 and 40 with an education number of at least 14 or 15, the model probability rises to more than 0.5. Those are the kinds of patterns you can see once you build these plots. But honestly, you can't go beyond two dimensions, and even interpreting the 2D plots is not always easy. That's the limitation of these tools: you can only look at one or two features at a time. The next part is local interpretability. If my model says this person will not get the loan, what is the reason behind it? This is where LIME comes into the picture, and you can use the LIME algorithm inside Skater itself; you just have to import it. We've built an XGBoost model on our data, and now, say, we are trying to interpret why person zero earns less than $50,000. You pass the point to the LIME explainer, and it produces this nice chart telling you the reasons: their capital gain is basically zero, and they are most probably not married. It's one-hot encoded again, so these "Married-civ-spouse" style features being zero means the person is not married. From features like these you get the key reasons why the model predicts this person is not earning more than $50,000; a standalone sketch is below.
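The talk drives LIME through Skater; an equivalent minimal sketch with the standalone lime package looks like this. Names like X_train, X_test, and xgb_model are assumed from the notebook.

```python
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=list(X_train.columns),
    class_names=["<=50K", ">50K"],
    discretize_continuous=True,
)

# Perturb a neighborhood around person zero and fit a local linear surrogate
exp = explainer.explain_instance(
    X_test.values[0], xgb_model.predict_proba, num_features=5
)
exp.show_in_notebook()
```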
The same works in the other direction, for predicting that a person earns more than $50,000. Some features push the prediction toward "less than $50,000", but there are also aspects like, in this case, the person's education level being quite high, which could be one of the reasons they are predicted to earn more. The problem with LIME, though, is that if you change the number of samples, the size of your neighborhood, the interpretation can also change. That problem is typically not there with Shapley values; they are at least somewhat more consistent than LIME, and that is what we'll see here. Before we get to Shapley values, a quick look at the tree surrogate. As I said, once you've trained your actual XGBoost model, you can build a tree surrogate on it: it takes the actual input features of your data, takes the predictions of the XGBoost model, fits, say, a decision tree on top, and then tries to explain why the XGBoost model is making specific predictions. It gives you a nice decision tree you can interpret, and you can even print the rules. When I fit this surrogate on our XGBoost model and plotted it, the kind of reasoning I got was: if the person is not married, their capital gain is less than around 4.6K, and their education level is less than 13.5, then, since it's a leaf node, once all these conditions are satisfied we can say with 93.5% confidence that the person will make less than $50,000. These are interpretable rules that approximate why the XGBoost model makes those predictions. But remember how you are getting them: you're literally fitting one model on top of another model. That's the surrogate model. The last part here is the Shapley values, which I already covered: they take the ordering of the features into account, and via conditional expectations they compute an average Shapley value per feature for each data point. The shap library is already available online; you just install it, fit the explainer on your XGBoost model and your data, and you get the Shapley values for every data point. With those, you can make local explanations: why is this person earning less than $50,000? You can see they are not married, their age is 27, their hours per week are 38; those are the features pushing the prediction toward the negative side of the Shapley values, meaning higher confidence that the model will predict this person doesn't earn that much. With these force plots you can easily get pointed explanations of why the model decided A, B, or C. And the best part about Shapley values is that you can visualize and explain multiple predictions at once: you can go across all the data points and see why these people are predicted, correctly, to earn more than $50,000. In most cases here the people are married, so that is clearly one of the key features behind the model making that decision; the calls are sketched below.
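A minimal sketch of those shap calls, assuming the notebook's xgb_model and X_test. In a notebook you call shap.initjs() once to enable the interactive force plots.

```python
import shap

# TreeExplainer computes (near-)exact Shapley values for tree ensembles
explainer = shap.TreeExplainer(xgb_model)
shap_values = explainer.shap_values(X_test)

shap.initjs()

# Local: which features push this one prediction up or down
shap.force_plot(explainer.expected_value, shap_values[0, :], X_test.iloc[0, :])

# Many predictions at once: stacked force plot across the test set
shap.force_plot(explainer.expected_value, shap_values, X_test)
```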
You can even inspect each individual data point directly from that plot. Overall, you can also get a feature importance from the Shapley values, as well as a summary plot. As you can see here, for most data points, if the person is married the Shapley value is positive, meaning the model will predict the person earns more than $50,000. For capital gain it's sparse and scattered: for a small number of points a high capital gain pushes the prediction up, but for many points, since they don't have a high capital gain, the impact is negative, and the model will predict they won't earn more than $50,000. From the density of points you can see what the decision tends to be for the bulk of the data versus the scattered minority. The last part is the dependence plots. Just like partial dependence plots, you can use the Shapley values to show how a feature drives the decision; it's very similar to what you get from Skater or a regular partial dependence plot, just computed from the Shapley values. You'll see similar plots for capital gain, for age, and also for education number: as the education number rises, the model makes more and more confident predictions that the person will earn more money. So from these you get an idea of how the model makes its decisions and why. The next part, still on structured data: what if you're using a deep learning model? There is something called DeepSHAP, which decomposes the output predictions of the neural network by backpropagating the contributions and attributing them to the input features of your model. If you look here, we'll go through this quickly: we are loading the same census income data set again, but what we are doing differently this time is building a dense neural network, basically a three-layer network, and passing the same input features to it. Now, how do we interpret this model? We trained it for around 100 epochs and got this level of performance. You can use DeepSHAP here, which is a variant of DeepLIFT: if you pass the model with the input data, you get the average Shapley values for every data point again, and you can get similar local explanations: given a data point, it tells you the reasons that person is or isn't predicted to earn enough money. So with DeepSHAP from the shap package you can do this even on deep learning models for tabular data, and you get results very similar to what we got earlier, except that capital gain has become the most important feature this time. It really depends on the type of model you are using; different models will not have exactly the same feature importances. A sketch of the DeepSHAP call follows.
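A minimal DeepSHAP sketch for that tabular Keras model. The names keras_model, X_train, and X_test are assumed, and the background sample size of 100 is a common choice rather than the talk's exact setting.

```python
import shap

# DeepExplainer (DeepSHAP, built on DeepLIFT) needs a background
# sample to estimate the expected value of the network's output.
background = X_train.values[:100]
explainer = shap.DeepExplainer(keras_model, background)

# Shapley values for a handful of rows; one array per output
shap_values = explainer.shap_values(X_test.values[:10])
shap.summary_plot(shap_values[0], X_test.iloc[:10])
```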
The exact same techniques work for text data too; you can use Shapley values again. What happens here is, say we have some text data, the IMDB reviews: we do some preprocessing and convert the reviews to TF-IDF features, where each word is a column and you have the TF-IDF score for it. Now, how do you know which features matter for the model calling a review positive or negative? You can start by building a logistic regression model and checking its performance; once you've done that, you build a LinearExplainer, because we used a logistic regression, and pass the model with the input data. It averages the SHAP values for each data point, and the summary plot gives you clear interpretations of which words lead to a negative sentiment, as you can see here, versus which words lead to a positive one. So in this model-agnostic way you can inspect these models, and the best part is you also get pointed explanations: for a particular review, which key words led the model to predict negative versus positive. You can even build a deep learning model on this; in this case we built a stacked LSTM on the text data. Now, how do we interpret this? Again with DeepSHAP, based on DeepLIFT, which I just talked about for the previous data set: you pass your deep learning model along with the data, and you can make pointed explanations of which tokens push the model toward predicting the movie review as negative or positive. But here you can't do the summary plot, because the input is embeddings and the word sequences change from review to review; we no longer have a fixed matrix like the TF-IDF one, so you'd have to do that aggregation manually. The last part I want to cover is image data. There are a few techniques here. There is GradientExplainer from shap, which covers different aspects of integrated gradients, Shapley values, and SmoothGrad. There is SmoothGrad itself, which averages gradient-based sensitivity maps. Some of these you may have used or heard of already, like Grad-CAM, which uses the gradients flowing into the class activation maps and shows a heat map of which part of the image is being activated. Occlusion sensitivity is where we take an image, mask out parts of it, and see how that affects the model's predictions. And the activation-layers approach is where we extract specific activation maps from the CNN and see which parts light up. Looking at image data: we load the VGG16 model, a pre-trained model, and some sample images; these are four different types of images, and we want to see what the model predicts and the heat map for it. The predictions come out as chain, owl, desktop computer, Egyptian cat. Now, why did it predict those? You import GradientExplainer, and you have to visualize specific layers of the CNN; you can't visualize everything as a whole. The same is true of Grad-CAM, where you also have to pick layers, and of occlusion sensitivity and so on; a rough sketch follows.
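A rough sketch of the GradientExplainer call on the whole model; the notebook version explains a specific intermediate layer, which takes a bit more plumbing. The image file names are illustrative assumptions.

```python
import numpy as np
import shap
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.preprocessing import image

model = VGG16(weights="imagenet")

def load(path):
    # VGG16 expects 224x224 inputs
    return preprocess_input(image.img_to_array(image.load_img(path, target_size=(224, 224))))

# Hypothetical image files; any small batch works
X = np.array([load("owl.jpg"), load("cat.jpg")])

# Background for the expectation; here we simply reuse the same small batch
explainer = shap.GradientExplainer(model, X)
shap_values, indexes = explainer.shap_values(X, ranked_outputs=1)

# Overlay per-pixel SHAP values for the top predicted class of each image
shap.image_plot(shap_values, X)
```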
With GradientExplainer you pass, say, the seventh layer of the CNN model, and now you can see: for "chain" it looks at the overall structure; for "chain mail", which is a kind of armor, it looks more at this part of the image. For the owl it shows you which part gets activated; for the prairie chicken it looks more at the eyes. For the desktop computer, as you can see, it has completely ignored the person. For the Egyptian cat it focuses on the overall cat structure, and for the tabby it focuses on specific parts and patterns of the cat's face. You can also visualize deeper layers of the VGG16 model, because as you go deeper, the model focuses more on specific parts of the image rather than generic features like edges and corners. Here you can see the attribution getting more condensed around the parts of the image that lead the model to its prediction: for the owl it's the overall structure around the face, but for the prairie chicken, only the eyes. And if you look at the desktop computer, the model focuses on both the keyboard and the screen, but the second most probable prediction, "screen", completely ignores the keyboard part. That is the reasoning behind those predictions, and with techniques like this we can say why the model is predicting that class. The last part, which I'll cover quickly, is a library called tf-explain, which was released a few weeks back. I think it works for TensorFlow 2 only, so you obviously need TensorFlow 2 for it; you just install tf-explain, and it has four types of techniques. The first is looking at the activation maps, the activation layers. We load the Xception model, a huge pre-trained model that improves on the Inception model, and take the image of a cat. The model is huge, so scrolling down: here is the image of the cat, predicted as Egyptian cat, tiger cat, and so on. Using the activations you can look at which parts get activated. Occlusion sensitivity didn't really work well for me, maybe because the model was very confident it's a cat, but Grad-CAM is interesting: if you look here, the first panel is tabby, the second is Egyptian, and it shows you which parts are activated. Interestingly, for the tabby cat it looks not only at the face but also at the paws; you can see this part getting activated, because it uses the overall shape of the cat to make the decision. In some cases that's good, in some cases not so good. And the last one is SmoothGrad, where you can see which pixels get activated; the image isn't showing up here, but if you run it you should be able to see it. I think we've covered all these topics today, which you can go and check out, and we're out of time, so the takeaways: models are not biased by themselves, the bias also lies in the data; most of these tools are not scalable yet, so it's going to take some time until they become more mature; and don't force-fit AI into all the different types of problems. Feel free to try out all these libraries and methods, reuse the code, and hopefully it can help you interpret your own models in the future.
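To close the loop on tf-explain, here is a minimal Grad-CAM sketch. Argument names follow the tf-explain README of the time; the class index 281 (tabby cat in ImageNet) and the image path are illustrative assumptions.

```python
import numpy as np
from tensorflow.keras.applications.xception import Xception, preprocess_input
from tensorflow.keras.preprocessing import image
from tf_explain.core.grad_cam import GradCAM

model = Xception(weights="imagenet")

# Load and preprocess one image at Xception's expected 299x299 size
img = image.img_to_array(image.load_img("cat.jpg", target_size=(299, 299)))
data = (np.array([preprocess_input(img)]), None)  # (images, labels) tuple

# Heat map of the gradients flowing into the class activation maps
explainer = GradCAM()
grid = explainer.explain(data, model, class_index=281)
explainer.save(grid, ".", "grad_cam_cat.png")
```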
Q: For your IMDB use case, after TF-IDF for sentiment analysis, did you use a POS-tagging corpus, one created manually or from some reference?
A: We didn't use POS tagging, because the text itself was enough. If we were trying to look at grammar or something like that, then maybe it would be useful, but we didn't use it.
Q: That is my confusion: I saw a few stopwords in there. Without removing stopwords, can we do this properly?
A: We do remove stopwords; if you check the code you'll see it. But if you don't remove them, there is a problem with count-based models like TF-IDF, as always. Thank you.