Thank you everyone for joining. As you know, this is our series of conversations, brought to you by Hasgeek and by Scribble Data, on making data science work. Some of our past conversations have had the flavour of what it means to actually put data science to work, into production, to go earn us some money or do something good for us. After a few initial conversations around how data science problems are framed, we are now moving into the aspects around being conscientious about data science, and in fact today's topic is responsible AI. It's a fairly broad-based term, and thankfully I've got Suchana to co-host with me. Suchana is an AI ethics expert; some of you will have seen her in an earlier session with us where she was the panelist. Along with her, our panelists today are Yanis and Josh, whom I will introduce in just a second. Welcome everybody. Thank you. You're most welcome.

Let me start with Yanis. Yanis is the founder and CEO at Code4Thought, based out of Athens in Greece. He has over 17 years of experience in evaluating large-scale software systems as a PhD researcher, as a management consultant, and as a practice leader for an international consulting firm in Greece. Yanis's PhD in computer science is from the University of Manchester. And of course we have Josh as well. Josh Rubin is a senior data scientist at Fiddler Labs, fiddler.ai; do check them out if you haven't already. What I found interesting is that his PhD, from Urbana-Champaign, is in experimental particle physics. He's extremely passionate about challenging computational problems and particularly enthusiastic about data modeling and semantic matching using deep learning. Fiddler.ai of course plays deeply in the explainable AI space, which is one of the branches we will definitely be talking about today as part of our responsible AI topic.

So, my first question, just to get everybody warmed up and maybe even a little hot under the collar: what is responsible AI in our minds? What do we think of when we are talking about responsible AI? And of course we'll touch on each of these slivers. Who wants to take this? Can I start? Please.

Yeah, responsible AI, or trusted AI, are two terms that for me are kind of interchangeable. Essentially it means AI that has three things in place. First, AI that is accountable: the organizations using it make sure they can govern the AI appropriately, they have some risk-management properties in place, and the teams that design and implement the AI follow best practices in order to control it and make it manageable, let's say, from a social point of view. The second attribute of responsible AI is fairness. We have to make sure that the models we use and the data we feed those models are as bias-free as possible, or that we consciously know where the biases are and devise ways to mitigate them. And the third property is transparency: how you make AI explainable, as you just mentioned. Which means either we develop AI which is explainable in the first place or, if we use methods that are not considered explainable, for instance deep learning, then we have the appropriate mechanisms in place to provide some additional explanation, the so-called post hoc explanations.
So these are the three things I have in mind when we talk about trusted or responsible AI. Of course we can talk about privacy and security and all those things, but in my opinion those, while related to responsible AI, are mostly properties that come with the software itself and the data themselves, not with the combination of software and data where AI applies.

That makes sense — the fact that privacy and security are slightly decoupled from this but still, in some sense, part of it. Yes, yes. Yeah, absolutely. Josh, any thoughts around this?

I think that was an excellent definition, and I think you touched on most of the main points. I might just take a bird's-eye view of the whole thing and approach it from a slightly more philosophical perspective. AI changes the way we do business operations that were traditionally performed by lots of experts, for lots of different functions, in businesses and in society. So I think responsible AI has a lot to do with understanding the stake of all of the stakeholders involved and making sure the models are transparent to everyone and that everyone has the same controls and inputs that they originally had. In a way it's not just a technical problem; it's really about applying a technology that tends to be complicated to a lot of different problems that are human or societal or business or customer related, in a new way. So being able to communicate the reasons why something is behaving the way it is, and being able to put appropriate controls on things, is really important. And there's the time dimension too: making sure that something being used in production continues to perform the way it was intended. At Fiddler we have a monitoring product — operationalized ML — where we watch things like performance and bias as a function of time, as the model may begin to get stale or the world may change a little from the original conditions. So we see that as part of it also.

I wanted to ask: what do you think it will take, what forces will compel people to spend money on this?

Oh, that's a good question. Let's start with the compelling factors. For me, to be pragmatic, a regulation or a legal obligation is compelling enough for a company to invest in these things. But on the other hand, that shouldn't be the whole story. If you ask me, from a commercial point of view I'd like to see organizations that are proactive and invest in technology that will render them FAT or FACT ready, or FACT compliant, before legislation has passed, and there are two reasons for this. First of all because — Yanis, would you mind just expanding that acronym for everybody, please? Yes, you're right: by FACT we mean fairness, accountability, and transparency. The point is, and I see it also when it comes to quality aspects of code-driven, traditional software systems, that when companies invest in improving their own systems, they're actually investing in improving their own people. So if a company wants to attract talent, a good way to show that they appreciate talent and are investing in it is also to invest in the behaviour of their AI and ML models.
I mean, recently in the United States, and also last year, there were movements within big tech companies like Facebook or Google from employees who felt that their companies were not acting in an ethical way, or that the software they produce is problematic. So imagine you're a company that actively invests in those things: how nice that is for your employees, and what a good way it is to lift the spirits within a company. So I'd say don't wait for the legislation, which is of course the most compelling reason, but be proactive and see it as a mechanism first of all to improve your products and your models, but also to improve your own people. That's my message.

I love that. I remember Suchana and I were talking about one other aspect of this, and in fact when I was linking to this particular talk I said it might be a little bit like sourcing fair-trade coffee, where there will be a market in which ethical or conscientious customers pay with their dollars to go to companies that engage in best-practice, ethical, responsible AI. The only gap, and this was the conversation Suchana and I were having, is that in the case of fair-trade coffee, or no child labour involved in the manufacture of a product, that straight line is very visible to the consumer. In the case of responsible AI that's a little harder to qualify — there are no certifications. Any thoughts around this?

If I may just jump in here: I think one of the interesting things about AI is that it has the ability to create its own liability as it goes along. Yanis was talking about the way that privacy and security are somewhat decoupled from AI ethics issues, and what came to my mind is that, for instance, we've seen a lot of AI applications recently — in fact they've been around for a while — that try to predict a user's gender from a bunch of proxy variables in the data, even if gender itself has obviously been excluded from the data set. This brings us into the territory where, even if you take a few easy, obvious steps to protect people's privacy in certain ways, de-anonymization is getting easier and easier, and machine learning is obviously one of the best ways to de-anonymize data. So that's one way in which, as we go along building more and more sophisticated machine learning algorithms, we are generating our own liability, so to speak.

Sorry, I said "that makes sense" but I was on mute.
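[As an aside for readers: here is a minimal sketch of the proxy-variable problem just described — even with the protected attribute dropped from the features, a model can often reconstruct it from what remains. All data and feature names below are synthetic and purely illustrative; in practice you would run the same audit on your real feature table.]

```python
# Can the remaining features reconstruct a "removed" protected attribute?
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 5000
gender = rng.integers(0, 2, n)                   # protected attribute we intend to exclude
proxy_1 = 1.5 * gender + rng.normal(0, 1, n)     # features correlated with gender
proxy_2 = -0.8 * gender + rng.normal(0, 1, n)
noise = rng.normal(0, 1, (n, 3))                 # genuinely uninformative features
X = np.column_stack([proxy_1, proxy_2, noise])   # note: gender itself is NOT in X

# If this AUC is well above 0.5, the "excluded" attribute is still encoded in the
# remaining features, and downstream models can implicitly act on it.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
auc = cross_val_score(clf, X, gender, cv=5, scoring="roc_auc").mean()
print(f"protected attribute recoverable from other features: AUC = {auc:.2f}")
```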
As part of the three branches of responsible AI that Yanis introduced right at the start, I'm wondering if we can go out of order a little bit, because I would love to hear Josh talk about explainability as one of the big branches, given that it's one of the core functions he looks at day in and day out. Josh, maybe something about Shapley values — just introduce us to this world.

Sure, sure. So the purpose of explainability is to try to expose the underlying reasoning behind why a model is making the prediction that it is, and there are a number of different techniques for explaining a particular model inference. We call that a point explanation: the model makes one prediction and you ask the question "why?" for this one — why was this loan rejected, for example; given this set of inputs, why did the model produce this output? The other kind of thing you can look at is global feature importance: generally speaking, what are the kinds of things the model cares about?

At Fiddler we like to think of our offering as a bunch of different pieces orbiting the ML life cycle, starting from data collection and model creation and going all the way through to operationalization, production, and monitoring. What we do that I think is special, and is a great way to frame this problem, is that we try to bring these explainability tools into each step of that process. For example, in the monitoring product I mentioned before, something may happen in your data over time: you may see that your prediction accuracy starts to drift, you're getting a distribution of results that is different from what you've had in the past, so probably something in the real world is changing, and you don't know exactly what that is. That's where we start to leverage these explainability tools.

This goes back to the core of explainability. For us, the go-to first stop is usually a Shapley-values-based explanation method, which is grounded in cooperative game theory from the 1950s but has recently, in the last five years or so, been applied to understanding the behaviour of machine learning models. The underlying concept is that if you're trying to split some reward that a team is responsible for fairly among the players, for what that team as a coalition has achieved, then by reconstructing the team in all permutations and combinations and replaying the game, you can figure out the marginal contribution of each of the players, and that nicely includes all of the interactions between the players — places where two players perform great together but are lousy by themselves — and it distributes the reward fairly. When you apply it to machine learning, the game is played by the features going into the model and the reward is the difference in the model prediction, so the idea is to attribute a change in the model prediction directly to a marginal contribution of each of the inputs. What's nice about this method is that there's a black-box formulation of it, so you can apply it to non-differentiable models, things like decision trees, where you otherwise wouldn't have access to gradient information like you would with a neural net. To do it exactly is very computationally expensive, but there are approximate techniques, things like SHAP and LIME, that work reasonably well.
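[As an aside for readers: a minimal sketch of point and global Shapley explanations using the open-source `shap` package, on a toy credit-style model. The feature names and data are invented for illustration; a real deployment would explain the production model on real inputs.]

```python
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 2000
X = pd.DataFrame({
    "income": rng.normal(50, 15, n),
    "debt_ratio": rng.uniform(0, 1, n),
    "years_employed": rng.integers(0, 30, n),
    "num_late_payments": rng.poisson(1.0, n),
})
# Synthetic credit score the model will learn to reproduce
score = X["income"] / 50 - X["debt_ratio"] - 0.3 * X["num_late_payments"] + rng.normal(0, 0.2, n)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, score)

explainer = shap.TreeExplainer(model)      # exact Shapley values for tree ensembles
sv = explainer.shap_values(X)              # shape: (n_samples, n_features)

# Point explanation: why did the model score applicant 0 the way it did?
print(dict(zip(X.columns, np.round(sv[0], 3))))

# Global importance: mean absolute attribution per feature across the data set
print(pd.Series(np.abs(sv).mean(axis=0), index=X.columns).sort_values(ascending=False))
```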
If I may, Indra, to complement what Josh said: what we also see at our end at Code4Thought is that Shapley values indeed provide very useful insights and spark lots of discussions within an organization. And there is another technique, called contrastive explanations, which is model agnostic and can help you understand where the limits are, where the decision can change. For instance, for a credit-scoring algorithm, if my loan gets rejected and yours is approved but our profiles are more or less similar, then using contrastive explanations you can understand what the tiny variations are that change the algorithm's decision. So my bank can guide me, for instance, on how to improve my credit score and get my loan accepted the next time.

Yeah, absolutely. I wanted to find out: explainability, the way both of you explained it, I can completely see how it's tied to the performance of the model. If I can explain it better, if I can actually use these Shapley values to detect drift and to figure out the importance of features for particular predictions, my model is going to get better. I can completely see that alignment of incentives: doing something that is good and right — explainable in this sense — lines up with actually making my model that much better and accounting for changes in the real world. Any thoughts on how that intersects with fairness?

Yeah, we've seen cases where, if you get the results and the explanations and start inspecting those explanations and those Shapley values, our experience shows that within the top ten Shapley values you can get a lot of insight. So if a model has, say, forty features, the top ten can give you enough insight to be able to explain a decision, and by inspecting and examining them you can find out whether there is potential bias. For instance, if a woman and a man got different results but their profiles were largely similar, you can identify which features played the most important role for each of them and influenced the decision, and then you can see whether there is bias in the algorithm or not. Or, for instance, we once used our explanation technique for image classification, and while experimenting we identified several pictures of men wearing makeup that the algorithm had classified as women. Then you can see the bias: because on TV you usually wear makeup, the algorithm thought the guy was a woman, simply because the regions of the picture that drove the decision were the ones around the makeup area of the face.

I'd like to offer something in a slightly different direction. If you can identify a place where a model is behaving badly, if you can identify an example of an inference where the reasoning is inconsistent with human intuition, then it's a little bit like finding a bug in code. If you're a software engineer and you know there's a condition where your code is going to crash or do something wrong, that gives you a thread to pull on for a region where the behaviour might be radically unpredictable.
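[As an aside for readers: a deliberately brute-force toy version of the contrastive idea just described — search for the smallest single-feature change that flips a rejected applicant's decision. Real counterfactual tooling searches far more cleverly and respects feature constraints; the data and feature names here are synthetic.]

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 1000
X = rng.normal(size=(n, 3))                        # e.g. standardised income, debt ratio, tenure
y = (X[:, 0] - X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 0.5, n) > 0).astype(int)
model = LogisticRegression().fit(X, y)
features = ["income", "debt_ratio", "tenure"]

def minimal_flip(model, x, features, step=0.05, max_steps=100):
    """Smallest single-feature change (in either direction) that flips model.predict(x)."""
    base = model.predict(x.reshape(1, -1))[0]
    best = None
    for i, name in enumerate(features):
        for sign in (+1, -1):
            for k in range(1, max_steps + 1):
                x_new = x.copy()
                x_new[i] += sign * k * step
                if model.predict(x_new.reshape(1, -1))[0] != base:
                    if best is None or k * step < best[1]:
                        best = (name, k * step, sign)
                    break
    return best    # (feature, magnitude of change, direction) or None if nothing flips

rejected = X[model.predict(X) == 0][0]             # pick one rejected applicant
print(minimal_flip(model, rejected, features))
# e.g. ('debt_ratio', 0.35, -1): lowering debt_ratio by 0.35 would flip this decision
```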
By creating these explainability windows, by making these models more transparent, the benefit isn't just the model-debugging perspective of the data scientist. If you're able to provide explanations to somebody who's a domain expert — a fraud auditor or a chief risk officer who has an intuition for how the model is supposed to behave, above and beyond the data — and they're able to look at a specific example and say, "this one's weird, and now that you're telling me the model's reasoning, I agree this is weird," that taps into debuggability. What nobody wants — data scientists, or anybody else in the organization, any of the stakeholders — is a model making inferences in some regime where it's extrapolating wildly. And to come back to fairness and bias: when you've identified something like that, really all bets are off for how you expect the model to behave. So by bringing all the stakeholders into the same room and being able to evaluate the explanations with these explainability tools, you get the debugging edge you need to address a wide variety of responsibility-related problems.

I would agree with Josh. Having all the stakeholders at the same table is a rare moment, at least in large corporations, and it's very valuable when you do it and you get everyone on the same page. That's when you can have impact within an organization and start improving things, and that's what we also experience on our side: you need all the stakeholders at the same table in order to achieve results. And sometimes the same explanations can be presented in different ways to different stakeholders. For non-technical people you need more visual explanations — you need to visualize them, or transfer them into natural-language text — whereas when you present your results to a data scientist or a machine learning engineer you can be as technical as you want. So the challenge is how you translate your technique, your measurements, your findings, into the appropriate form for each group of stakeholders, and then also to get them at the same table. Because, and I couldn't agree more with Josh on this, sometimes I think the usefulness of our tooling is not the tooling per se but the opportunity it gives different people to sit at the same table and discuss the results.

That's wonderful, especially because it seems like a great opportunity. Let's say, for example, a tool was able to spit out something about drift, where decisions are being made that are counter to human intuition. You use that opportunity to get all of these stakeholders to the table — and when I say stakeholders I mean people who have an interest in the revenue these models are generating, people who are domain experts, people who are social scientists, people who philosophize about this. If you're able to get them together, you are potentially able to create an ethical set of principles around which these models should operate, and maybe, again, those are constraints — and I would love to hear something on constraints from you — maybe those are constraints that make your model work a little bit better.
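[As an aside for readers: a small sketch of tailoring the same explanation to a non-technical audience, as described above — turning a feature-attribution vector (for example, the Shapley values from the earlier sketch) into a plain-language summary. The phrasing template is made up; a real system would be domain specific.]

```python
def explain_in_words(attributions, top_k=3):
    """attributions: dict of feature name -> signed contribution to the model score."""
    ranked = sorted(attributions.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top_k]
    parts = []
    for feature, value in ranked:
        direction = "pushed the score up" if value > 0 else "pushed the score down"
        parts.append(f"{feature} {direction} by {abs(value):.2f}")
    return "Main drivers of this decision: " + "; ".join(parts) + "."

print(explain_in_words({"income": 0.42, "debt_ratio": -0.31,
                        "num_late_payments": -0.07, "years_employed": 0.02}))
# -> "Main drivers of this decision: income pushed the score up by 0.42; ..."
```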
Absolutely. I think one of the great things Josh was talking about — the difference between global, model-level explanations based on feature importance and individual, inference-level explanations — is that both have their place in model risk management and in the entire data science life cycle. One of the biggest benefits of the global, model-level, feature-importance type of explanations is that you can implement process fairness, and that's important because it lets all the stakeholders have a say on what predictors should even go into the model. You can intervene at the design or solution-framing stage and ask what even counts as a fair predictor for this particular problem. Do we really want to use gender to predict this particular target variable? Do we want to use age to predict this particular target variable? So you have an opportunity to troubleshoot or de-bias your model from a fairness perspective even before you go down the road of a lengthy prototype development process, and then learn from a contrastive or counterfactual account of the actual explanations your model is giving.

Yeah, but I think you need both, including the local explanations, because as a bank client or as a citizen I'm mostly interested in the decision that concerns me, whereas the global explanation is really nice for all the stakeholders to understand the overall behaviour of the model. These things come hand in hand, and that is why I say you need to decide which kind of explanation is more relevant for which stakeholder.

Sorry, Josh, you were going to say something, because I was going to ask about tools. Yeah, maybe I'll transition into that. We've been working with hired.com, and so, to bring it back to tools, I think an interesting thing worth differentiating here, particularly for a mostly data science audience, is that we find it really helps if you can get the explanations out of the notebook, out of the data science substrate, and into some more appropriate venue for operationalization. Shoot — I had a connection to something Yanis said, but it's slipped out of my mind.

Maybe I can prompt a little, because there's a question from Venkata: what are the major lessons learned during the product journeys that both of you have individually been part of? And what I thought I heard you say already is about surfacing explanations, in the right format, into the hands of the right people. Maybe start from there.

Yeah, this totally gets to both of Yanis's comments about developing the right kinds of metrics. As I mentioned, the thing we focus on is the operationalization of these tools — how you get them out of the notebook — and a nice thing we've found is that as you start to develop tools for other stakeholders, there's a feedback cycle about how they want to see these things visualized and what the right metrics are for them. The case of hired.com is a really interesting case study there. We started working with them around constructing a set of data-scientist-focused explainability tools for models they used to predict how appropriate a resume match is for a job. What they realized might be helpful is if they could feed some of this information to their curators:
people who really don't have much to do with data science at all, but who are ingesting these predictions from the models. So all of a sudden we're at the interface between two groups of people: the data science side, and someone in a business vertical who has expert knowledge about what makes a candidate good or bad, and how to coach a candidate to change their resume or get more experience and become a better candidate. So we used the API integration with Fiddler, which I think is another one of our strong use cases, to develop a little dashboard that was specific to the curators. Instead of looking at the model predictions in a notebook like a data scientist might, or in the Fiddler platform proper, which is a web interface, we provided a web experience, built on integrations with Fiddler, that could be ingested into the workflow the curators already have. One thing they realized right away is that when the curators are looking at the same data, not just with a prediction that says "this is a great candidate for this job at an 88 percent level" but with the reasons why the model thinks so, the curator can use their expert knowledge to evaluate whether that model reasoning is sound. We also think of this as the human-in-the-loop AI case: it's the best of the human and the machine working together, because the explanation can help the curator do their job more efficiently and not miss important details that they otherwise might.

Josh, quick question, sorry for the interruption, but you were talking about the feedback loop. I see you've put the human in the loop; the human now knows why the AI made a particular recommendation. How does the model that made the recommendation get this fed back, so that next time it makes a better one?

Yeah, right. So the important and sort of unexpectedly valuable feature is that all of a sudden the curators are talking the same language as the data scientists. We provided what they wanted right away — this was the day-two feature: "can I right-click this link, drop it in an email, and send it to the data science team when I see something weird?" They'd never had that power before. They were just being fed a prediction for a particular candidate, and they were just ingesting it. Now they're seeing a reason, and that reason has been carefully sculpted for their purpose in terms of the visualization and the explanation type. They are able to evaluate whether it's weird or not, and when something weird does happen, they can right-click a link, describe why this is inconsistent with their expert knowledge, and send that to the data science team. So all of a sudden we have a connection across the wall that otherwise separates two really different business verticals, where the domain expert is able to feed a responsible, appropriate way to validate the model back to the data science team, and they're not working in a vacuum anymore. And from what we hear back, everybody is really excited by that, in ways they hadn't initially appreciated.
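[As an aside for readers: a hypothetical sketch only — this is not the Fiddler API — of the general shape a "prediction plus explanation plus escalation link" payload for a curator-facing dashboard might take. All function, field, and URL names here are invented for illustration.]

```python
import json

def curator_payload(candidate_id, match_score, attributions, top_k=3,
                    review_base_url="https://reviews.example.internal"):
    ranked = sorted(attributions.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top_k]
    return {
        "candidate_id": candidate_id,
        "match_score": round(match_score, 2),        # e.g. 0.88 -> shown as "88% match"
        "top_reasons": [
            {"feature": f, "direction": "for" if v > 0 else "against", "weight": round(abs(v), 3)}
            for f, v in ranked
        ],
        # one-click escalation: lets the curator send this exact case to the data science team
        "flag_for_review_url": f"{review_base_url}/flag?candidate={candidate_id}",
    }

print(json.dumps(curator_payload(
    "cand_4711", 0.88,
    {"years_of_python": 0.31, "title_similarity": 0.22, "location_mismatch": -0.12}), indent=2))
```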
We had a similar experience in a project we did about a year ago, maybe two, with a high-tech company in the United States that was developing a model to identify potential data leakages within a company's enterprise-wide network. The end users of the model's decisions were the network administrators, who had to flag a certain user as potentially malicious or not. When we provided explanations with the top five or top ten reasons why a particular user was considered malicious, they could relate to the situation, just as Josh mentioned, and it made them more efficient — it augmented their capabilities and their decision-making. On the other hand, it was also a good way to retrain the model, because where the experts, the network admins, were finding inconsistencies, that was good feedback for the data science and machine learning team to improve the model. So I would say responsible AI is not only about inspiring trust in the end user but also about making end users more efficient and augmenting the way they work. That is one of the important takeaways data scientists should take out of our webinar today, because we need to build business cases for organizations to start, and continue, investing in these things.

Yeah, I think the efficiency point is a very important reason — just to underscore that, I think it's a great point. The explanation is a first-class output of the model, in a sense: you think about just making a prediction, but there's huge value in supplying prediction plus explanation in any case involving a human.

That's a fantastic point that both Josh and Yanis brought up, and I would love to hear your thoughts on this notion of surfacing the uncertainty associated with a prediction along with the prediction itself to the end user. Explanations do fill that role to a certain degree, but not completely: an explanation can in some sense be strong or it can be weak, and that can enable your end users to think about how uncertain this prediction really is. And it's not always possible to supply a simple confidence score along with the predictions. So I'd love to hear your thoughts on how we can operationalize that.

Well, our team is working on, and thinking extensively about, how we can evaluate explanations — how you can make sure you can at least quantitatively assess the quality of your explanations. We've read several papers and books, and it all comes down to the fact that you cannot actually do it in a very trustworthy manner; you have to seek the help of the experts, the domain experts, your end users — the curators from hired.com or the network admins of our clients, for instance. Which means the human-in-the-loop principle is quite important, because it will help you evaluate those explanations. But we continue our research. I don't know if Josh has any concrete ways, apart from a couple of existing metrics, to showcase the trustworthiness of their explanations. There are metrics, but we would still like to give priority to the domain experts' opinions rather than to metrics that provide some insight into the quality of an explanation.

Yeah, so I think there are a few tools here, but this may be a case where the specific approach you take depends on the use case a little bit. A couple of things come to mind.
For the approximate-Shapley-values case, there are mechanisms: the Shapley value depends on counterfactual examples, so typically there's a baseline of some sort. We have a formulation of approximate Shapley values that allows us to evaluate a prediction with respect to a distribution of counterfactuals, rather than looking at some mean expectation of an input, and that means that what we actually get out the other end is a distribution of explanations. This is a little bit specific to Fiddler, but we do have a preprint out on the technique.

It's really fantastic to hear that you're taking that approach, Josh, because it also puts me in mind of this idea of data drift: how, on an ongoing basis, with the underlying data drifting, do you continue to rely on the quality of your explanations, or even the quality of your predictions?

Yeah, definitely. One interesting piece of that is that we can provide an error bar, basically a confidence interval, with our Shapley explanations. But also, the space of what's a meaningful explanation is actually much broader than this question of Shapley values versus contrastive explanations versus something else. Really, a good explanation is whatever is satisfying to a human being, and so sometimes simple things are really valuable. As an alternative to something like Shapley, which gives you a nice attribution vector you can show as a tree or a bar chart, in some cases something like just showing someone the three most similar examples that had the same prediction from the model, and the three most similar examples that had the prediction from the other class, if you're talking about a binary classifier — if you can just surface those with similarity-search techniques over the training data set — that immediately tells you a lot about how well supported the prediction is in that area. If you see three similar examples that are almost identical to the inference you just made, you have a lot of confidence that the model is well supported by its training data in the regime where the prediction is being made. If the three most similar examples are all over the place and wonky in some way, you know the model is interpolating or even extrapolating, that the support there is missing, and that your confidence in what the model is inferring should be reduced. So there are different ways to get at those questions. Some of it has to do with evaluating model support, but I think the deeper question is what's satisfying to humans, and there really are a lot of ways to get at that, and it's very domain and user specific.
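[As an aside for readers: a minimal sketch of the "show me the most similar training examples" idea just described — surface the nearest neighbours with the same prediction and with the opposite prediction as a rough check on how well supported an inference is. The data is synthetic, and the distance metric and k are arbitrary choices.]

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 4))
y = (X[:, 0] - X[:, 1] + rng.normal(0, 0.5, 1000) > 0).astype(int)
model = LogisticRegression().fit(X, y)

def similar_examples(X_train, y_train, model, x, k=3):
    pred = model.predict(x.reshape(1, -1))[0]
    out = {}
    for label, name in [(pred, "same_prediction"), (1 - pred, "other_class")]:
        idx = np.where(y_train == label)[0]
        nn = NearestNeighbors(n_neighbors=min(k, len(idx))).fit(X_train[idx])
        dist, pos = nn.kneighbors(x.reshape(1, -1))
        out[name] = {"train_indices": idx[pos[0]].tolist(),
                     "distances": np.round(dist[0], 3).tolist()}
    # Tight, coherent same-class neighbours -> the prediction is well supported;
    # distant or scattered neighbours -> the model is likely extrapolating.
    return out

print(similar_examples(X, y, model, X[0], k=3))
```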
Okay, everybody, there's a question I want to address from Andreas, who's listening in. Andreas asks: how can intellectual property issues be dealt with when a company's ML model is shared with third-party platforms to check whether or not their AI model is being responsible? I think this has to do with whether it is audited externally, who does the auditing, or whether it can be done entirely in-house, on-prem — I think that's the flavour the question was going for. Any thoughts on that?

For us, we usually sign NDAs with our clients. We have our own template, but we're usually flexible enough to sign the client's NDA in order to respect any IP and so on. Building the technology for FAT properties is our core business, so it's in our best interest to respect the IP of our clients and make sure it remains intact, and we're willing to sign an agreement with them.

But I want to make another point, on Suchana's comment about the metrics and how we combine them with the explanations. When you do monitor things, as we do and as the Fiddler folks do as well, it is really important to give the metrics some context, to take them within context. Just blindly following a metric can actually derail your reasoning and the insight you get, so you always have to put your metrics into context and in combination with each other. It's also important to see the trend of a metric over time, because one thing I've seen over the years with software metrics in general, not just for AI, is that software developers or machine learning engineers will try to game the metric. That is why you have to combine the metrics, take them in context, and derive a story that is valuable for everyone, rather than forcing people to game or mimic them.

Yanis, you just made a point, and I think Josh made this earlier as well, about running this over time, because it's not just a point-in-time audit we've been talking about today. The sense I've gotten is that everybody trying to do responsible AI is best served when this is a continuous, parallel-running process of doing the explainability and checking for fairness continuously; it's making your models that much better.

So, to paraphrase it with your coffee example, it's also a matter of sustainability. You want to have sustainable coffee production, and you also want to have sustainable systems, and to make sure that the decisions they make through time are as fair as possible, as transparent as possible, and as safe as possible for the common good. And at least in Europe, sustainability and prosperity are very important pillars for the adoption of AI by the EU countries, so that's how I connect them.

It sounds to me that, beyond the tools you're building, this is going to be a legitimate career option: people involved in in-house teams that are continuously looking at the responsibility of the AI being implemented. Any thoughts on emerging career paths? In our audience today we have a bunch of data science folks who are looking at slivers of the data science space where either their passions or their skill sets line up, so if you can shed a little light on career paths in this space, that might be interesting.

Okay, for me it's an excellent path, because you need to combine technical skills, but you also need to have some empathy when it comes to social context. I really love the description of the AI ethicist as a profession, although it may mean a lot of things, so it needs to be made more concrete. But let's say if you want to be a FAT or a FACT engineer, then you have to have
good technical knowledge, which needs to be broad, and then you have to combine it with some social skills, some soft skills if I may say. I was reading an article that, for instance, an explainer of models, or an examiner of model decisions, will be a profession of the future, and to me that makes perfect sense — we need more of those people to feel safer, at least.

I really like the idea of that sort of role: somebody who is either an explainer or is responsible for doing the work to break down the silos a little bit — the person whose job is to adapt tools to the stakeholders outside of data science and connect the dots within the organization. It takes technical know-how, and, like you said, it takes incredible soft skills for doing the cross-functional work to understand and hear what the other stakeholders in the organization need, and to intuit the most valuable way to deliver the information: what's the right explainability technique, what's the right way to visualize it, what's the most ergonomic way to feed it into their existing, familiar workflow, which probably isn't the data science workflow. So that's one role that's super interesting to me.

A second role I think we're starting to see is ML operations: people who work in this monitoring, ongoing, time-dependent diagnostic dimension. We're starting to see data scientists who are maybe not involved directly in model development but who do the job of monitoring what are essentially service metrics for a model over time, watching dashboards, and doing on-the-spot diagnoses of where a model might be drifting and whether it might need to be retrained, in real time. And it's super important, because companies stand to save a lot on their bottom line if they can catch model drift before it hits their business indicators. Companies watch profits and losses on a weekly or monthly basis, and they have business people keeping an eye on KPIs, but those are trailing indicators of what's happening inside the business. If you can catch model drift in quasi real time — before you've lost money for a week and gone back through the slow feedback cycle — if you can catch it right away and adapt, there are strong financial incentives for companies to be able to monitor what ML is doing in real time.
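[As an aside for readers: a minimal sketch of the kind of scheduled drift check an ML-ops monitor might run — the Population Stability Index (PSI) of this week's model scores against a reference window. The thresholds below are common rules of thumb, not universal constants, and the score distributions are synthetic.]

```python
import numpy as np

def psi(reference, current, bins=10):
    """PSI of `current` vs `reference` for one numeric feature or model score."""
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    current = np.clip(current, edges[0], edges[-1])     # extreme live values land in end bins
    ref_frac = np.histogram(reference, edges)[0] / len(reference)
    cur_frac = np.histogram(current, edges)[0] / len(current)
    ref_frac = np.clip(ref_frac, 1e-6, None)            # avoid log(0) / division by zero
    cur_frac = np.clip(cur_frac, 1e-6, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

rng = np.random.default_rng(4)
reference_scores = rng.beta(2, 5, 10_000)               # scores at validation time
live_scores = rng.beta(2, 3, 2_000)                     # this week's production scores (shifted)
value = psi(reference_scores, live_scores)
print(f"PSI = {value:.3f} -> " + ("stable" if value < 0.1
                                  else "watch" if value < 0.25 else "investigate"))
```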
Absolutely. I just want to jump in with one quick point on the earlier role, before the ML ops one: the AI ethicist, where you made the point that this person needs some technical skill as well as the soft skills, the social and cultural context. Because I can completely imagine a situation where we see drift and then start chasing down a rabbit hole of technical reasons why a model is not performing as well as it should — but imagine if the reasons are actually social or cultural, or have something to do with race or with gender, and you don't think about that until much later, when the horse has left the barn. By then the reputation of your company might be in bad shape, because you were chasing down a technical problem whereas the signs were all there for you to attribute it to something much larger.

If I may weigh in: in my experience, probably the single most valuable skill an AI ethicist brings to the table is their ability to bridge different domains and to translate between them. In the same day, in the same conversation, you might be talking to a lawyer, to an ML engineer, to a product manager, and your entire effort goes into aligning them and making sure that the values and concerns they're articulating map onto something tangible, some metric that can be tracked somewhere. That's a very non-trivial skill to acquire, and I think we don't yet see AI ethics curriculums in place in computer science programs that would build that kind of skill.

There's a question, unless somebody wants to jump in: we're being asked how we see the landscape emerging over the next year in terms of a few things — laws, business interests, jobs, companies. I think we've talked a little about jobs and potential career paths, and a little about business interest and why it might make sense for companies. From a law perspective, do you see anything on the horizon? We have data protection and data privacy, but nothing that I'm aware of in terms of responsibility.

In Europe, at least, there was a push for legislation to pass by Q3 or Q4 2020, but due to COVID-19 this has been delayed, and it is now expected to be voted on or passed in the EU parliament in Q1 or Q2 2021, so I'm optimistic that this will happen. In the United States, to the best of my knowledge — sorry, Yanis, what type of law, what flavour of law? About responsibility? Right, it is about AI adoption, and part of it is also responsible AI; it prescribes things like fairness, transparency, and accountability, so those are in the mix. In the United States I know about the Algorithmic Accountability Act, which has been under discussion since 2018 or 2019 if I'm not mistaken, but I don't think there has been a lot of progress since then. I'm optimistic, at least, that in the European Union in 2021 there will be some legislation, so I'm optimistic that this will start taking off. We are at the beginning of the phase — or a little bit before the beginning.

I just wanted to touch on one other point — Suchana, feel free to jump in if you have other questions. I wanted to ask about the potential for further tools to be built. We talked at one point about being able to create an ethical layer, a layer of principles by which a company operates, that might interface with the laws themselves but might also have to do with the culture of the company. A company can say "don't be evil" and then let that erode over a period of time, but imagine if you bake that into technology. Do you think that in this world of responsible AI, beyond the tools you all are thinking of, there is a market for more tools, a need for other flavours of tools, and if so, what comes to mind?

I think there will be room for more tooling. There are already a lot of open-source tools, and in the discussion we had before we mentioned that this kind of tooling is enterprise tooling,
so that is why there is room for several ideas and several tools in the mix. For instance, you can differentiate yourself using some benchmarks, some comparisons, or you can provide some very cool visualizations that your competitors do not provide. So I'm quite optimistic about this type of market — it's a niche market, of course, but I'm optimistic that there will be more tooling, and actually we need more tooling. I don't think a couple of players alone are enough to fill the void, and there is already a critical mass of open-source libraries you can use to build your own mix, your own flavour of tooling. So yes, there is plenty of room for good ideas.

I'm optimistic in the sense that I think right now we live in a world where everyone's riding horses on the street and a few people are going to start showing up with automobiles. As there start to be a few visible examples of really great explainable AI, there's going to be a rush of companies who have lots of incentives to see what other companies are doing. It's a bit like the hired.com case: the exact solution to that problem wasn't obvious when we started, but there was kind of a pop at the end when suddenly people realized "this is a really valuable new workflow for us that we didn't quite know how to describe or ask for." So I think we're maybe at the beginning of people seeing examples of things that work really well and bring business value they didn't realize they needed, and there'll be an increasing clamour for great tools — particularly as people who are maybe feeling a little powerless in executive roles or governance roles or ethics roles at businesses realize they really can have a voice and some visibility into what this kind of scary but valuable technology is doing. As soon as those mechanisms start to become visible in other use cases, I think a lot of companies are going to want that.

Yeah, and in fact, in your analogy of the horses on the streets, I'm reminded of New York: until people saw the automobiles, they were just making do with the fact that all of these horses were dropping everything they ate for lunch all over the streets, and that was just accepted as a consequence of a technology nobody had thought through how to improve. So I like that analogy. I think there is a lot of room for improvement, and in fact, as I love saying time and again, all of the interests of our capitalistic society can still be aligned: not only can models become better at doing the jobs they're meant to do, but with a little bit of care and thought and a little bit of publicity, consumers can start to demand that the companies they buy from be able to showcase how responsible and how fair they are in the way they employ their AI.

Building off of what Indra was saying, I think one space in the tooling domain where I would love to see more tools or more startups emerge is the time evolution of incentive dynamics, if that makes sense. One example that comes to mind is the ML fairness gym, where what you do is let a bunch of reinforcement learners play out over time to see what happens if you make an ethics policy intervention. Let's say you attempt to make your ML algorithm fair over a period of time and you set certain fairness metrics as constraints; then you let that play out over time and you see how, perhaps, other kinds of unfairness accumulate in the system, because you cannot simultaneously optimize for all kinds of fairness. So as these discussions get more complex and nuanced, it would be really interesting to see other players emerge in this space, where you're looking at the time evolution of your fairness interventions.
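[As an aside for readers: a deliberately crude toy in the spirit of simulations like Google's ml-fairness-gym — this is not that library — showing the kind of question just described: how a static fairness intervention plays out over repeated rounds. Two groups, a lender that approves above a threshold, and credit scores that move up on repayment and down (more) on default; all numbers are invented.]

```python
import numpy as np

rng = np.random.default_rng(5)

def simulate(equalize_approval, rounds=20, n=2000):
    group = rng.integers(0, 2, n)                                    # 0 = group A, 1 = group B
    score = np.where(group == 0, rng.normal(650, 40, n), rng.normal(600, 40, n))
    gap = []
    for _ in range(rounds):
        if equalize_approval:
            # approve the top 40% within each group (a demographic-parity-style rule)
            thr = np.array([np.quantile(score[group == g], 0.6) for g in (0, 1)])
            approved = score >= thr[group]
        else:
            approved = score >= 640                                  # single profit-style cutoff
        repay_prob = 1.0 / (1.0 + np.exp(-(score - 620) / 30))       # repayment likelier at high scores
        repaid = approved & (rng.random(n) < repay_prob)
        defaulted = approved & ~repaid
        score = score + 10 * repaid - 30 * defaulted                 # default hurts more than repayment helps
        gap.append(round(float(score[group == 1].mean() - score[group == 0].mean()), 1))
    return gap

print("score gap (B - A), profit threshold:   ", simulate(False))
print("score gap (B - A), equalized approvals:", simulate(True))
```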
I love that concept, an ML fairness gym. Very nice. Wonderful. We are at five minutes to the hour, so I would love to take a couple of minutes to thank everybody for having attended; this has been fantastic. As usual, for everybody listening, if you visit hasgeek.com/fifthelephant — The Fifth Elephant is Hasgeek's data conference, one of the premier data conferences in India — you can see videos of past sessions and session summaries, including this one. I also want to mention that both Suchana and Yanis will be putting out notes that will be visible on hasgeek.com, and Hasgeek will make an announcement when those notes are available.

Just before we close: are there any open challenges, any big open questions in the field of responsible AI, that you think aren't being addressed, or that you would love to see people working on?

That's quite the question — it's like asking what you want Santa Claus to bring you for Christmas, what kind of gift you want. For me, I mentioned it earlier: how we can evaluate explanations in a very trustworthy way, how good our transparency mechanisms really are. That is something I would really like to see, for instance.

Yeah, that's good. I'd like to see better tooling around fairness and around making the balancing of fairness interests more accessible to domain experts, because I think it's challenging when we get asked "can you give us a fairness monitoring product?" — it's so specific to the use case. The space of possible fairness metrics is so large, there are so many tensions and trade-offs, and most of them are not satisfiable simultaneously, so it would be good to create a mechanism, or engage our customers in a mechanism, where they can work through those trade-offs for their specific case.

Sorry everyone, I cut out for a moment, but I'm back for the wrap-up, and I liked the last bit of what I heard as well. I think those are good closing thoughts. Wonderful. For everybody listening, if you have questions you didn't get around to asking, you can always visit the Hasgeek page under Making Data Science Work and add your comments there, which we will be sure to pass on to Yanis and Josh. You can also find them on LinkedIn: Yanis Kanellopoulos and Josh Rubin. You can find Hasgeek on all of the platforms. Suchana, of course, is co-hosting with us for the first time today, and this was Indra from Scribble Data — we make a feature store; you should check it out sometime. With that I will say goodbye to everybody who joined, and thank you so much to the panelists and to my co-host.
Thank you, thank you. Bye, take care.