So, guys, I'm sorry, but not everyone understands mathematics, data science, and AI, and we shouldn't expect them to. And that's the fundamental idea behind my talk today. So, good afternoon. I want to welcome you all to my talk; we're going to be talking about synergizing AI and domain expertise. I'd love to present this talk as a discussion of thoughts and ideas, so feel free to interrupt me at any point during the talk if you have any questions, right? I'd also like to sincerely thank the EuroPython organizing community for putting together such an amazing event, because this is my first time at an event like this, and I'm really appreciating the bonds and networks I'm making today. So, okay, I'll start with the inspiration and the vision that I've had for the last two years, and that's what's written on the slide there: to build trust, transparency, and confidence between models and decision makers. There's a fundamental gap that exists today between human intelligence, the people who are experienced in their domain, who've been doing this for the last 20, 30, 40 years, who know their way around things, and machine intelligence: those experts are not comfortable when a machine makes a decision without being very clear about the problem statement, or about the environment in which the problem has been sitting over a period of time. And that's what I like to do: I like to bridge those gaps between human and machine intelligence. I work as an AI scientist for Polymerize.
We are a product-based startup based out of Singapore. We've built a next-generation, AI-powered platform to insanely accelerate material science research and R&D across the globe. I'm a Pythonista, which we all are, using data science and machine learning to solve data-based problems and enable data-driven decisions. Previously I built a fully AI-powered platform, now in production, to solve marketing attribution. For all of you who do not know what attribution is: if I spend some amount in a particular marketing channel, I want to know how much return I got from that channel. That, specifically, is called marketing attribution, and once I know what my attribution is, I'd really like to plan for the next quarter and optimize my spend to maximize my ROI in the next quarter. So that's what we've achieved: we've got 1.5x ROIs, which is amazing. Over the period of these two years, while experimenting for marketing as well as for material science, I've grown my interest in the following core areas. The first of those is explainable AI, because it has helped me a lot in communicating between business decision makers and my team of AI scientists, machine learning engineers, and data scientists, right? I've also worked with a lot of tabular neural networks; if you consider material science and marketing, they don't have a lot of images to work with, so no ConvNets or RNNs, but mostly tabular neural networks. And I've developed design-thinking strategies, the ability to checkpoint models, and ways to find loopholes in all of your neural networks. I've also done some reinforcement learning and optimization to exploit the patterns learned by the models, and finally, to deploy models in production, scalable distributed training and tuning. Cool, moving on to the next slide. This is my experience, and my experiments with truth, in the domain. There are a lot of situations which involve high-stakes decisions, right?
You want to do your marketing planning for the next six months. Once you initialize that plan, you're going all in; it's like a poker game. You're going all in and you cannot back off in the middle. And the other problem is that, unlike poker, the results are not received immediately. They're going to arrive in days, weeks, and sometimes even months, on timelines where, if you are late, you might miss the entire opportunity, or you might be in a situation where you can't back off. So the business users are risk-averse, because they cannot give a machine the responsibility for their entire business, which is scary, but also interesting for us designers of these algorithms. So let's take an example from material science. Recently we were developing a material for one of our clients based out of Japan. They have a five-stage process for developing a material, and in those five stages the properties of the ultimate material are evaluated almost seven times. But the decision they take in the first stage, over the course of six months, is going to actually impact the material they develop, and every individual experiment they do is worth about 50,000 US dollars. So they cannot go wrong with the understanding and analysis of the entire material they're developing. The same example that I gave for marketing works well here as well. So what do we do? Let's bring in an expert. He's been trained at this for 30 years. He's done a PhD, he's done a master's, he's read the field, he has experience. He meets 500 other people with similar problem-solving capabilities, and he gives you a suggestion. Is that always efficient? Not so much. Is that consistent? Yes. But is that revolutionary for the business, in an environment that's changing every day? Supposedly not. So what do we do?
We bring in this expert and we just let things work the way they do. But now we want to make a change: we want to make sure that machines try to combine and acknowledge the learning from the domain experts, and then develop things together in synergy, so the decisions are made not just by individuals and not just by machines, but by a combination of the two. So let's find a way, and that way is explainable AI. Starting with Simon Sinek's favorite line: let's start with why. Okay, so if you look at the curve here, you can see there's a performance ranking on the y-axis and an explainability ranking on the x-axis. If you look at a model, let's say a linear regression model, it's a super, super explainable model. You know how much each input variable impacts the output, and whether it has a positive or a negative impact. So quantitative and qualitative impact, and that's exactly what we need to know from the model: it makes a prediction, but it also tells us how it makes the prediction. But on the other end there's a neural network, right? And the moment we try to understand a neural network, we're all puzzled, because there are so many variables: there are biases, there are weights, there are all kinds of different layers, and we have no clue how things are being impacted at each layer.
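That explainable end of the curve is easy to see in code. Here's a minimal sketch on synthetic data (the feature names and coefficients are made up for illustration): a fitted linear regression hands you its explanation directly as signed coefficients.

```python
# A minimal sketch of why a linear model sits at the explainable end of the
# curve: its fitted coefficients directly give the quantitative and
# qualitative (sign) impact of each input. Data and names are synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
# Assumed ground truth: feature_0 impacts positively, feature_1 negatively.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)
for name, coef in zip(["feature_0", "feature_1"], model.coef_):
    print(f"{name}: {coef:+.2f}")  # sign = direction, magnitude = strength
```

The fitted coefficients recover roughly +3 and -2, which is the whole "explanation" a business user needs for this kind of model.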
That's when we need explainable AI. There are also other models in the middle, but trust me, most of these models are pretty difficult to explain, and even if you get close, there might be some variation that you would never have known of, and the entire explanation might just fail. I've also used a lot of explainable AI over the last two or three years just for model tuning: say I've got my five top models. Usually what we do is create an ensemble; we take the average prediction from all five models, and that is a robust prediction of the output. But when you look at situations where those five models each have a different explanation, you can be pretty sure a domain expert can filter out and separate the really good models from the ones that are just lucky, or have been overfit, or have some serious programming flaw in them. Right, moving to the next slide: interpretability versus explainability. On the left you're going to see a linear regression model and a simple decision tree model. When you look at these models, the parameters and the model summary will really tell you what the model is doing inside. You can look under the hood of these models, and you'll be fairly sure about how they're making a prediction. But when you look at the models on the right, the first one is a random forest classifier; it's a combination of multiple decision trees.
They're split at different levels, they have different densities and different depths, and they differ in how they reduce from the input to the output. And on the right you obviously have a neural network, which is way more complicated than we can actually interpret. And you can see the difference: interpretability is where the models can actually explain themselves, where the models tell you how they're doing it. Explainable models are the models that need an algorithm, a statistical backing, to tell us how they are making a prediction, and that communication is what explainable AI is getting us into. Talking about a few fundamental principles before moving on to the advanced section: there are two interesting ideas, correlation and causation, and a lot of people, including a lot of high-end engineers I've seen, often get confused between the two. Causation is what the businesses are looking for: they want to know the levers that are impacting their business outputs, their KPIs. But what we usually see with a lot of models is not causation; it's something called correlation. You will have heard about the statistical method: it's just a way to represent the statistical relationship between two variables. And a side issue is that correlation is only valid for linear relationships. So if you have a non-linear relationship, like a hyperbolic curve or a parabolic curve or an exponential curve, correlation is not able to detect the statistical similarity between those two variables. So how do we move from correlation, which is a statistical property of the existing data, to causation, which is the real learning behind how the inputs of the business are impacting the output of the business? That's where explainable AI comes in handy.
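That linearity caveat is worth a tiny numeric aside. A quick NumPy sketch: on a symmetric parabola, the output is completely determined by the input, yet Pearson correlation reads essentially zero.

```python
# Sketch of the linearity caveat: Pearson correlation only picks up linear
# association. On y = x**2 over a symmetric range, y is fully determined by
# x, but the correlation coefficient is essentially zero.
import numpy as np

x = np.linspace(-1, 1, 201)
r_linear = np.corrcoef(x, 2 * x)[0, 1]     # perfectly linear relationship
r_parabola = np.corrcoef(x, x ** 2)[0, 1]  # deterministic but non-linear

print(round(r_linear, 3))         # 1.0
print(round(abs(r_parabola), 3))  # 0.0
```

So a correlation of zero does not mean "no relationship"; it means "no linear relationship", which is exactly why it can mislead business analysis.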
Sorry, I forgot the example: a lack of RAM would cause the phone to freeze. But when the phone freezes and text messages don't work, those are correlated properties; the text messages might not be working because of an app failure, or because of any of the other ten reasons that all of us engineers know about. But whenever a lack of RAM occurs, a lot of things stop, because the lack of RAM has caused them to stop. Right, moving forward, I'm happy to take any questions along the way, because I'm discussing and jumping across multiple issues, so if there are any, just let me know. Okay, coming to the second important fundamental principle, and a problem that explainable AI is going to solve: interconnected effects. Let me give you an example. If you look at the image at the top, there is a chemical reaction that I'm interested in. If I run it with a catalyst, two different ingredients come out; it's a multi-stage process. And if I run it without a catalyst, a different set of chemicals is created. So the addition of that catalyst significantly changes the output of the chemical reaction, right? Now, the catalyst also has to be added in a certain quantity that satisfies the requirement for the catalyst. So it's an interconnected effect between the input variables: it's not just one input, it's multiple inputs acting together, and in significant quantities, for the ultimate result to take place. That is what we call an interconnected effect. When you go to a marketing situation, let's say I advertise on Instagram, Facebook, Snapchat, LinkedIn, and multiple other places. There's always a sweet spot in marketing: if a client, or your customer, observes you in multiple places, they might have a higher chance of conversion than if they see you a lot in a single space. When I market to my consumers on Instagram too many times, they might just get irritated.
They might not like the brand. But if they see me everywhere in the environment they are present in, they might be more willing to make a purchase. Right, so the interconnected effects are really derived from situations where explainable AI really works, right? As I've illustrated with the examples: validating these experiments against experiential learning, which is the learning from the domain experts, and then checkpointing the models by matching those understandings, is what really drives value for the businesses. Okay, let me give you a very interesting case study. Anyone who's ever gone through some amount of material science or chemical engineering knows that tensile strength and elongation at break are two different properties of a material. The way they work with temperature is the relationship that you see here: if you increase temperature, your tensile strength goes down, and if you have a lower temperature while making the material, your elongation at break is low. So what we really want to achieve, for the high-quality materials that we use in cars and in manufacturing units, is a balance of tensile strength and elongation at break. Now, when you look at this chart, it actually is exactly represented by the explainable AI chart that you see below it. SHAP, the algorithm we're going to be talking about soon, is an algorithm that gives you just the idea you need from the top curve, and this is what one of our engineers, in fact one of our scientists, actually validated when we started pushing explainable AI. Let's look at the chart below, right?
So temperature for elongation has a curve which starts with a lot of blue scatter points and has a lot of red and pink scatter points towards the end. If you look at the x-axis, the left-hand side is a negative impact and the right-hand side is a positive impact. And if you look at the color scale, you will see blue represents low and red represents high: blue represents low values of temperature, red represents high values, and the impact on the final property, let's say tensile strength or elongation, is on the curve. So if you observe carefully, when temperature for elongation has a low value, on the blue scale, it has a negative impact on elongation, which is exactly what's represented on the chart above. And if you look at temperature for tensile strength, the red color, the high value of temperature, gives a negative impact on tensile strength. This is what we are going to talk about. This is what's interesting; this is where a synergy can be created, and a domain expert can tell you that yes, the learning of the model is real, it's not a fluke, and I think this is going to work in a real-life situation, right? So scientific literature, years of experience, and domain knowledge are going to work in synergy, and examples across domains have already been observed. I've tried it in four different projects; I've consulted for a lot of companies on explainable AI, because of my research and talks in multiple places, and they've really liked it as an addition, not as a feature but as an idea, right? Okay, so what are model explainers? A model explainer is an algorithm that works with the model and the dataset to give you an explanation and deliver the learning and prediction explanations, right?
So SHAP is one of those algorithms that can work with a black box, and a black box is basically any function: you put in an input, you get an output. So the algorithm SHAP works as an explainer to derive learnings from these black-box models. Below that, you would also see something called model-specific and model-agnostic explainers. These are categories: if you want to work with a specific model, you need a specific explainer, and that's called model-specific. Model-agnostic is where the explainer treats the model as a function, and it's going to work as long as you have a function that can do a .predict, right? If you look at the chart below, the original image is of a watch being held by a human hand, and if you look at the right, plain gradients can only get you so far: you see that white area, which is just showing roughly which part of the image matters. But if you look at a specific algorithm called Integrated Gradients, which is curated for deep learning models, you will see that it could actually detect the entire clock face in a much better way. So you know which part of the image is driving the prediction that the image is of a watch, right? Okay, we can start with local explanations. These explanation algorithms can be of two kinds: they can give you a local explanation.
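Before those, the "model as a function" idea above can be sketched quickly. This is not SHAP itself, just an illustrative hand-rolled permutation importance (on synthetic data) that only ever touches the model through its predict callable, which is what makes an explainer model-agnostic.

```python
# Sketch of the model-agnostic idea: the explainer never opens the model, it
# only calls a prediction function. This hand-rolled permutation importance
# (not SHAP) treats any callable f(X) -> y as a black box. Synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
y = 4.0 * X[:, 0] + 0.5 * X[:, 1]  # feature 2 is pure noise

f = RandomForestRegressor(random_state=0).fit(X, y).predict  # all we keep

def black_box_importance(f, X, y):
    base = np.mean((f(X) - y) ** 2)
    scores = []
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])  # break feature j's link to y
        scores.append(np.mean((f(Xp) - y) ** 2) - base)
    return np.array(scores)

print(black_box_importance(f, X, y))  # feature 0 dominates, feature 2 ~ 0
```

Swap in any other model with a predict method and the explainer code doesn't change; that is the whole point of the model-agnostic category.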
They can give you a global explanation. A local explanation is for each record in the dataset: you know every input and how it impacted the prediction. So you can see that, for the Boston housing dataset, LSTAT got us 5.79, RM got us negative, minus 2.17, NOX negative, minus 0.73, and so on and so forth for the different ingredients. And you can also see something called f(x), which is the final prediction of 24.019. So we start with an average, which is the model's learning that an average housing price starts at, let's say, $24,000, and then every single input among the house's characteristics adds some amount of value. So you know which individual variables are contributing negatively and which of them are contributing positively, right? Looking at the global explanation: this is what quantifies the impact of the inputs, right? So you see, again, LSTAT is the highest-impact variable, the input, and it is also sorted in order. But there's something interesting that I want all of you to observe and answer me quickly here: do you notice something interesting in the first chart? Anything that strikes you? Look closely at all the input variables. Do you find something interesting? Yeah, so the last one says "sum of 4 other features". So what did the algorithm just do? This is going to be a hint towards feature engineering as well. If you look at the explainable algorithms, and you look at the input variables and how they're causing impact, you'll be able to notice that there are certain features that are adding absolutely no value. So it's an amazing tool for feature engineering as well. Right, on the bottom you see the fully explained chart that I mentioned: low to high values of the inputs, and the impact they cause towards the negative or the positive side, sorted by the strength of the impact. So these are the charts that explainable AI can give you, once you learn how to interpret them and communicate them to businesses.
They get amazing insights, remarkable discoveries that they can eventually plan into the business and then add more value. There's another interesting chart that I want to take you through: an interaction and dependence plot. This is not just the output prediction and its explanation; it's how the two variables, performance and sales, interact with each other. So a smaller value of performance with a higher value of sales, or a higher value of performance with a smaller value of sales, might not achieve the full agreement of the business; you might want to have a balance of performance as well as sales, and that might be the sweet spot. So when you're looking at new candidates, new customers, you want to evaluate their performance and sales from this dependence point of view. So it can directly add a lot more value to your business. Okay, coming to the algorithm, a very simplistic, standard one. I don't want to go into the most complicated algorithms, because that's not the intention of this talk; it's just to get you excited about the future of AI explainability, right? So SHAP, SHapley Additive exPlanations, which is the long form of SHAP, is a game-theory-based, intuitive approach to explanation. It is the average marginal contribution of an input feature among all coalitions, and coalitions, I would say, are just permutations of all possibilities. So let's say I have three ingredients in my chemical formulation: there's a polymer, there's a capping agent, and there's a diisocyanate. I make all possible permutations where I switch off some ingredients.
I take combinations: out of three, I can take just one, two of them, or all three, in different combinations, and you can imagine it will be a very high number of permutations that I'll have to do. And once you include the argument of quantity, the quantity of polymer, the quantity of capping agent, and the quantity of diisocyanate, it's going to blow up the entire space of permutations. But once we have all of these permutations, and we pass them to our model so that the model gives predictions, I can find out the average marginal contribution of each input feature to my final output. Because of the average, I will know the impact, and because of the individual permutations, I will know the local explanations for any one experiment. When I combine all of this, I will be able to produce the explanation curves that I showed just five minutes ago. At Polymerize we've written a very interesting white paper on this; you can find the link at the bottom, it's polymerize.io slash whitepaper. We wrote it for the material scientists, but I think it will also add a lot of value to the Pythonistas here. Okay, so I'll just quickly show you some code, so you know how easy it is, because we all love Python and open source. So I import the necessary libraries; these are all standard machine learning libraries: scikit-learn datasets, model selection, ensemble, everything else. I get the dataset, I get my X and y, split the input and the output, and then I do my training and testing split, which is just to look at the real performance of my model, which is out-of-sample testing. I also build a random forest model: I build it, I fit my model, I make a prediction, and I look at my R² and MAPE scores. MAPE is mean absolute percentage error.
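A hedged reconstruction of the pipeline just described might look like the following. The talk used the Boston housing data, which recent scikit-learn releases have removed, so the bundled diabetes dataset stands in here.

```python
# Hedged reconstruction of the training step described in the talk.
# The diabetes dataset stands in for the removed Boston housing data.
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_percentage_error, r2_score
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

pred = model.predict(X_test)  # out-of-sample predictions on the held-out split
print("R2:  ", r2_score(y_test, pred))
print("MAPE:", mean_absolute_percentage_error(y_test, pred))
```

By default train_test_split holds out a quarter of the rows, which is the out-of-sample testing mentioned above.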
It's a very interesting metric that we've been using at Polymerize, and I see a great performance. Then pip install shap; not like it's very unusual, we all know how this works. So pip install shap, import shap, and these are the lines you need to code, right? So explainer = shap.Explainer(...); this is going to take the model, which is the random forest regressor, and the input data, which is capital X here. The SHAP values are going to be calculated from the explainer using the input data. I've added check_additivity=False, because certain models and the precision of Python sometimes do not match up. And then you can look at the SHAP waterfall plot. This is a local explanation: you can see I've just pinpointed the first experiment that I conducted, so I see all the variations, the impact caused, on that local one. I can also do a beeswarm plot for the SHAP values, where I can actually filter the data points that I want explanations on; the beeswarm plot gives me the global explanations, the ones that we saw on the slide deck. I can also look at the bar chart, which gives me absolute impact; it does not have a direction, positive or negative, but it has the absolute impact. And something very interesting: I can convert these and automate them into a domain-expertise synergy. Let's say when I talk to my domain experts, they tell me that these things work like this or like that; I can then actually create an automated script which, post-explanation, can give me the relevance of those ideas, right? I'd also like to show you something interesting, if I can. If I run everything, you'll see the problem with machine learning models in this example. If I run this the first time, let's look at the bar chart at the bottom. It takes some time; this is actually showing you the number of iterations, the permutations it has gone through. Look, LSTAT is at the top for this particular iteration, right?
And now, on a second run, you see LSTAT is the highest-impacting factor for one of the experiments, one of the iterations, and the other one is RM, right? So these kinds of interesting insights and discoveries are easily found with explainable AI. I'm running over time, so I'd like to take some Q&A before I have to leave. Thank you very much. If you have questions, please come to the microphone, so that people watching from home can also understand the question. Any questions?

Audience: A random forest will give you feature importances. Did you do a comparison between feature importance and the Shapley values, and have you ever had a case where the two didn't agree?

Yes, and I would love to show you that example. If I just add a random variable to the data and train the random forest regressor with it, the random forest will actually rank it second or third in importance when you do feature importance, even though we know it's random, right? So I did that experiment, and it was actually going to be part of my talk, where I was going to show how feature importance can actually fool us. Feature importance is driven by the changes, the variance, in the input data; it does not actually understand the dataset in that particular way. So with feature importance on a random forest regressor, you can just try adding a random variable to the data, and you will see it gets a second or third ranking in importance. It's not going to work. Thank you. Any other questions? No? Then I'd like to thank you. Thank you.