Hello everyone, and welcome to DICE, the 16th edition. Today we have the pleasure of having Rumi Chunara here, an associate professor at NYU, jointly appointed in Computer Science at the Tandon School of Engineering and in biostatistics and epidemiology at the School of Global Public Health. Rumi received her PhD from the Harvard-MIT Division of Health Sciences and Technology, and she has received many awards and grants from various sources for her work. Today she's going to talk to us about a paper called Machine Learning for Health and Equity and Health and Equity for Machine Learning.

All right, thanks Atoosa. I wanted to point out the somewhat whimsical title I chose for the talk. When we work in domains like health and healthcare, we often think about how we can use machine learning for those spaces, and in this talk I'm going to motivate and give an example of machine learning development that was inspired by scholarship from the health and equity side as well: machine learning for health and equity, and health and equity for machine learning. Right now it's really in the milieu that algorithms can encode biases, and I've put up some of the very famous papers that folks are likely familiar with. Gender Shades looked at disparities in how facial recognition algorithms work for people with different skin tones, across different genders as well. In natural language processing, there's work on the bias created in word embeddings: because algorithms are trained on datasets where certain words commonly occur close to each other, certain gendered words can end up more strongly correlated with certain professions when you create word embeddings.
And on the bottom is another work, looking not at bias in the dataset but at what happens when you have an algorithm optimized for a certain outcome, in this case healthcare cost as a proxy for the underlying health condition. In that work they found that because certain groups, in this case Black Americans, used healthcare differently, they commonly had different healthcare costs, and therefore their risk of different illnesses was not well captured; there was bias in who was getting selected into further care management programs based on these risk assessment algorithms. But I want to take our thinking a little deeper. As machine learning has matured, given the production of data and the infusion of algorithms into different parts of society and different applications, we have many uses of machine learning and AI in different places, and there's no doubt that a lot of these projects are relevant and come from very good intentions. For example, on the top we have AI being used to predict where wildlife poachers might show up next. In the middle is work on predicting which people fall under certain poverty lines, using for example satellite imagery, which is then used to assign cash transfers. On the bottom is IBM Watson Health doing a multitude of tasks, and on the left is some work from Microsoft on public sector issues in Argentina; for example, they looked at predicting which kids would drop out of school. So we're now at a point where these uses of algorithms are not completely new. There have been deployments over a timescale of several years, and in some of these fields, whether ecology or economics, much has been written about the effects of such efforts.
I wanted to dive into a couple of examples in a little more depth to bring us to a point. First, poaching prediction, which is of course important where wildlife is endangered or going extinct. However, and there's a very recent report I wanted to highlight here, a lot of the issues when we use AI, for example installing cameras to detect poaching and monitor other aspects of the wildlife, come down to the cost of the technology. National parks can sometimes have a hard time even paying their rangers, let alone investing in new technologies and funding them sustainably. You can imagine these technologies also require upkeep, or the knowledge to do the upkeep, training, and data storage, and these all come at an added cost. Even the simplest tools aren't being implemented because of basic problems such as connectivity, lack of electricity, and digital literacy; these are all outlined in these works. But beyond affording, implementing, and sustaining the technology, those who have studied poaching for decades have concluded that community partnerships are essential: educating and informing people about the value of wildlife, making it a win-win, to decrease poaching. And besides using AI to predict poaching, some of these works call for data that better captures shifting ecologies and human migratory patterns, to get a complete picture of the reasoning and practices around poaching. You may be wondering why I'm talking about poaching in such detail, but the idea is to highlight some AI efforts that have been in practice for a while and have attracted this kind of commentary.
The other example I wanted to discuss is predicting who falls under a poverty line and using that to allocate cash or services. In the economics literature there have been several evaluations of these cash transfers, including the use of new methods like satellite imagery for targeting them, and of their effects. I'll start with the work on the top left, which looked at identity verification: using digital tools to create digital identity numbers and tying those to citizen beneficiary systems. This work examined the use of the Aadhaar system in India and found that in practice it has unfortunately deepened the exclusion of marginalized and vulnerable groups. There are technical and data challenges here as well, like requiring a valid phone number, maintaining a number, keeping the same number, and not sharing it with others, which is a very common practice. Beyond those technical and data challenges, evaluations of the long-term effects of these kinds of programs haven't shown that they benefit populations in any substantial way. There have even, unfortunately, been people who lost their lives because they were not able to access services due to limitations of the digital system. This has also been flagged at the United Nations, given that the human rights to social security and an adequate standard of living are violated if we allocate these things only on such a narrow basis. Cash transfers have also been critiqued for how they de-emphasize the need for a social safety net that would enable more resilience for people.
And even with these creative methods, using satellite imagery or mobile phone data to target people living under a poverty line, the work that has been done shows that the proportion of people reached is actually quite low compared to how many people are really in need. If you look at the amounts of money transferred, in some cases it amounts to less than a loaf of bread per day, which doesn't aid people appropriately. But these reports don't treat this as a dead end: they ask how welfare budgets could be transformed through technology to ensure a higher standard of living, and to help people with entrepreneurship or skills. These are the kinds of ideas that come up with deeper knowledge of these settings. I highlighted these two cases to suggest that the specific problems we work on would really benefit from more embeddedness in the specific domains. So is there a principled path forward when we try to integrate new technology into one of these domains? Working in the sphere of health, it's always helpful to see how the goal is defined and then ask how we get there, so I pulled out the WHO definition of health equity, written here: the absence of unfair, avoidable, or remediable differences in health among population groups, which they define here. In different fields, like economics, you have similar definitions of equity, in that case related to how income and opportunity are distributed between different groups in society. So how do we capture this principle, represent it, and manifest it in our approaches?
One useful framework is the socio-ecological framework I've put on the right-hand side, from the public health literature. It places the individual in the middle concentric circle and captures all of the forces that act upon them. When we're trying to think about equity, we can think across that whole gradient and, from a really upstream or root-cause perspective, make sure we're not going to augment any inequities. The individual is embedded in multi-level factors: interpersonal, such as family networks and other social interactions; organizational; community; and of course the policies we live under, which can affect us. This also highlights how working with experts in different domains, public policy, sociology, can inform these efforts. When we think about the design and use of machine learning or AI, I'd like to stratify it across three areas: the data we're using, and who it represents; how we determine whose priorities we optimize for, and whether our algorithms account for the concerns of different stakeholders; and finally, which I spent a little more time motivating, which tasks we even use machine learning for. This gives a window into the type of problems my group often looks at: designing machine learning methods that reflect and respond to community needs, leveraging domain principles to improve machine learning development and performance, and deriving and integrating those multi-level factors into our machine learning methods.
I'm going to take the next half of the talk for a deeper dive on one particular paper that goes through this, and then I look forward to the discussion. This paper is basically a new definition and demonstration of algorithmic fairness, an idea that has also come up in the milieu and that I'm sure folks are familiar with, one that accounts for these structural factors. I won't go into too many details; I'll refer you to the full paper for all the equations, but I'll give an overview of the approach. We use a path-specific causal fairness framework. Before going into the actual details of the approach, I wanted to say that it leverages the idea of a causal graph, which is good for making assumptions clear: it shows what the variables (the nodes) are and how they relate to each other, it requires us to provide a rationale for the graph we're assuming, it can tell us which variables might need to be adjusted for, and we can add our methodological developments within this framework. On the other hand, domain experts might not naturally think in causal graphs, and you can of course have unmeasured confounders; I just wanted to list those caveats, and we can talk about them more.
So first, an overview of this idea of path-specific fairness. I'll walk through the causal graph on the left, which comes from a canonical example of sex bias in graduate admissions at Berkeley; it's in Silvia Chiappa's paper as well, so you can read more about it there. In that setting, female applicants were rejected more often than male applicants for Berkeley's graduate admissions, but when people dove deeper, they found the reason was that women tended to apply to departments with lower acceptance rates. To depict this in graph format, we have the admissions decision Y, the gender of the applicant A (male versus female), and the department Q, along with other application information. Given this setting, we define path-specific fairness. Let me first walk through the different arrows. If the decision to admit were based on gender, that's the red line from A to Y, we wouldn't want that, so it's considered an unfair path. However, if women apply to different departments and that ends up making their admission rate lower, we might consider that acceptable, and therefore a fair path. Path-specific counterfactual fairness states that the decision is fair towards an individual if it coincides with the one that would have been taken in a counterfactual world in which the sensitive attribute along the unfair pathways was different. We define the unfair pathways; in this case the sensitive attribute is the gender of the applicant, and we don't want the admission decision to be directly defined by gender. So in path-specific counterfactual fairness, we're looking at the sensitive attribute and the decision specifically on the unfair pathways; we don't want the decision to differ there. That's the idea of path-specific fairness.

I just switched the color to green, which is a little easier to see. This idea of path-specific fairness has been developed previously; you do require the causal graph, but it helps you really unpack where fair or unfair decisions might be in your algorithm. We're going to extend this because, as I introduced, we have the individual and all the factors they're affected by. This is where we look at the epidemiology literature, the social epidemiology literature in particular, which has really discussed how certain variables that we often treat as sensitive, such as race, are not just properties of the individual: they are socially constructed, where someone can be considered of a race based on perceptions, socioeconomics, and other factors outside of them. As the quote from the VanderWeele and Robinson paper says, if we simply ascribe race to the individual, we'll miss the population-level factors that affect any outcome. This leads us to define multi-level path-specific counterfactual fairness. I'm not going to read through the whole definition, but in words, the basic idea is that we now consider a decision fair towards the individual if it coincides with the one in the alternate universe where not just the sensitive attribute but also another, population-level, multi-level attribute is set differently along the unfair paths. Now the graph gets more complicated, so let's break it down.
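[Editor's note: to make the single-level idea concrete, here is a minimal numeric sketch of the Berkeley example. The coefficients are made up for illustration and are not from the paper; the point is just the comparison between the factual decision and the counterfactual where the sensitive attribute is flipped only along the unfair direct path.]

```python
def admit_prob(a, q):
    # Toy admission model for the Berkeley graph: q (department) sits on
    # the fair mediated path A -> Q -> Y, while the direct dependence on
    # a (gender) is the unfair path A -> Y. Coefficients are made up.
    return 0.2 + 0.3 * q - 0.1 * a

def path_specific_counterfactual(a, q):
    # Flip the sensitive attribute ONLY along the unfair direct path;
    # the department q keeps its factual value, so the fair mediated
    # path is left untouched.
    return admit_prob(1 - a, q)

a, q = 1, 0  # e.g. a female applicant (a=1) in department q=0
factual = admit_prob(a, q)
counterfactual = path_specific_counterfactual(a, q)
# Path-specific counterfactual fairness would require these to coincide;
# here the direct -0.1 * a term makes them differ, flagging the unfair path.
print(factual, counterfactual)
```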
Again we have Y, the outcome. I here is any individual-level factor; it could be gender, like the A in the previous graph, and A, in red, is the sensitive component of whatever the individual-level factors are. Here we add a population-level factor P. We define this in the paper in a lot more detail, but broadly it's a factor that affects the individual-level factor and also the outcome. One example is nationality: you might want to be fair towards different nationalities as well, and nationality could influence which genders you see in your application pool. Another good example is a smoking prediction algorithm, where you want to predict whether people smoke in order to assign them to some intervention. You might want to do that fairly for men versus women, but we also know that the neighborhood you live in can influence your propensity to smoke, maybe based on the availability of cigarettes or other factors, so you'd also want to be fair to people according to these macro properties, like neighborhood factors. Those are a couple more examples of why you might want to consider these multi-level factors. This graph is an exemplar of how they could be arranged, and we also show how you could have unfair parts of the graph, in green, and how an edge can have both an unfair and a fair component, which is why some are dashed. I encourage you to look at the paper for a few more examples described in more detail.

Looking at the time, what I'll do now is walk through the approach for actually addressing multi-level path-specific fairness. Again, I took out all the equations, so check the paper if you're interested, but the broad idea is this. We take our graph, with the multi-level factors and the interactions between them that we're assuming, and we isolate the path-specific effects: from the data-generating process, if you have data on the different variables, you can learn what the effects are, specific to the different arrows. Then the idea is to compare the total effect in the observed scenario and in the counterfactual scenario, that is, when the sensitive attributes take one value versus the other, taking into account sensitive attributes at both the individual and the population level. To make that a little more tangible: you can derive the path-specific effects, and we walk through that in the paper. It depends on what your graph looks like; you use properties of the directed acyclic graph to work out the probabilities of certain values given how the other nodes connect to each node, and you can compute the path-specific effects, with the parameters in the table on the right-hand side coming from the data-generating process. Once you have a path-specific effect for each of the arrows, our algorithm computes a fair estimate: it takes the estimate of Y and removes the path-specific effects of the unfair paths, where beta is a factor that tells you how much unfairness to remove. The algorithm is just what I said in words, walking through each step. The broad idea is that as you remove more and more of the unfairness, you should get a smaller difference between the unfair Y and the fair Y.
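[Editor's note: in code, the correction step described here might look like the following sketch. The function name, the additive form, and the numbers are assumptions for illustration, not the paper's exact estimator.]

```python
def fair_estimate(y_hat, unfair_pses, beta=1.0):
    """Remove unfair path-specific effects from a model estimate.

    y_hat: the model's estimate of Y for an individual.
    unfair_pses: path-specific effects computed for each path declared
        unfair (covering both individual- and population-level
        sensitive attributes).
    beta: how much of the unfairness to remove (0 = none, 1 = all).
    """
    return y_hat - beta * sum(unfair_pses)

# As beta goes from 0 to 1, the corrected estimate moves from the raw
# (potentially unfair) estimate toward the fully corrected one.
y_hat = 0.80
unfair_pses = [0.15, 0.05]  # made-up effects for two unfair paths
for beta in (0.0, 0.5, 1.0):
    print(beta, fair_estimate(y_hat, unfair_pses, beta))
```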
That's what's indicated in this graph. What we do in the paper, to explain what's happening in the graph a little better, is show that this works in the scenario where you have multi-level variables, a population-level one and an individual-level one, and that it works better than when you consider only one of them, or two variables that aren't arranged in a multi-level fashion.

For the last five minutes, since I'm sure we'll have questions, I wanted to wrap up with a flavor of some other work in the same vein, thinking about the same socio-ecological framework. The couple of examples I'll show are not about algorithmic fairness but instead use machine learning to better understand these multi-level factors that relate to individual outcomes. In one of them we're predicting mosquito-borne disease, in this case dengue. We know that where you live can really affect your risk of dengue, because different aspects of the environment will drive mosquito populations more or less. That's a known factor, though not usually well captured, because it's really hard to understand at large scale what those environmental factors are in different places. So the overall pipeline is to use image segmentation on satellite images: we segment the images according to the kinds of features known to affect mosquito populations, aggregate that information by place, and use it in a time-series model to better predict where and when dengue might occur. There are a lot of details from the paper, and you don't need to worry about the whole table at the bottom, but the images show what we're segmenting. In our paper we also validate this across different geographies.
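[Editor's note: the aggregation step of the pipeline, turning a segmented image into per-place landscape features, might be sketched as below. The class labels and array are illustrative, not from the paper.]

```python
import numpy as np

def landscape_fractions(segmented, n_classes=4):
    # segmented: 2-D array of per-pixel class labels, e.g.
    # 0 = other, 1 = building, 2 = stagnant water, 3 = trees.
    # Returns the fraction of pixels in each class; aggregated by place,
    # these fractions become features for the time-series model.
    counts = np.bincount(segmented.ravel(), minlength=n_classes)
    return counts / counts.sum()

seg = np.array([[1, 1, 0, 2],
                [3, 1, 2, 2],
                [0, 0, 3, 1]])
print(landscape_fractions(seg))  # fractions for classes 0..3
```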
We want to make sure that if it works in London, it also works well in Pakistan, where things might look different in the satellite images. You see examples of satellite images in the top two rows, then the features we're looking at, like buildings, roads, or trees, and at the bottom a segmented image. So we create an image for which we now know, say, the percentage of buildings, running water, or stagnant water, and we can use that in our prediction model. The bottom table shows that doing this improves prediction: we get a better R-squared on our model, that's the bold row, when we incorporate all these landscape features. Let me skip down to one more example, here in New York, where we're looking at better measures of the discrimination people face. That's another multi-level factor that can get under the skin of individuals; it's well known in the social and behavioral sciences, but it's typically hard to measure how much discrimination is faced on a daily basis. So we use the cultural milieu, what's happening online, as a proxy for cultural racism by place. This is New York City, with the zip codes outlined, and we created a clustering algorithm that accounts for the fact that different places have different amounts of tweets coming in. We have some preliminary work using that to see how this affects quality-of-life measures for different minority groups: based on the spaces they frequent, how much of this cultural racism they encounter, and how that affects their health outcomes. All right, so thanks to my wonderful students, who do all the heavy lifting; we have a lot of fun, and I'm happy now to speak with you all and have a good discussion. Thanks.

Okay, great, thank you very much, Rumi, for the wonderful talk.
So we're going to give you a round of applause from where we are, and now I'd like to ask the panelists to raise their hands if they have questions, and the other attendees to write their questions. We're going to start with Brian from the School of Philosophy at ANU.

Hi Rumi, thanks for the talk. I had a question about the slide on, I think it was called, multi-attribute algorithmic fairness, is that right? As I recall, the claim was something like: an algorithm is fair in a given case if the decision that was reached, whether a prediction or an action, coincides with the one that would have been taken had the individual in question had a different group membership, and moreover had the average properties that people in that population have. As an example, just to check my understanding: suppose we're predicting whether people will default on a mortgage in order to decide whether to offer them a loan. We have an African-American applicant who was denied a loan for a given amount, and we're asking whether that decision was fair or unfair. I take it this test would be: is that the decision that would have been reached had that person been white, not holding fixed all their finances? We want to know whether it's the same prediction that would have happened had they been white and moreover had the average levels of wealth, income, credit history, and so on that are prevalent in the white population. Is that correct?

Yeah, it's broadly correct. You're right that here we're looking at variance that's not captured just in the race variable, and specifically at what are called macro, this kind of population-level, attributes,
which isn't a new definition of ours; it's in the epi literature as well. Some of the things you mentioned could fall under that and some might not, but that's right, and I would just highlight that it's specifically these macro attributes, which affect both the individual and the outcome.

Okay, because then I'm a little curious, if I'm understanding it right, whether that's a good test for the algorithm itself being fair, or even for this particular decision being fair, or whether it's really a test for whether there's some unfairness somewhere, maybe in this particular decision or maybe in society as a whole. Offhand, my take would be that in the mortgage case, if this person really just clearly does not have the finances to repay, say, a million-dollar home loan, then I would say it's not unfair that the algorithm makes that prediction and recommends denying the loan; rather, the unfairness lies elsewhere. It's unfair that African-Americans have much lower average levels of wealth and so on than whites, but the decision itself isn't unfair, and the algorithm might not be unfair. If it fails this multi-attribute counterfactual test, all we know is that there is some unfairness somewhere in the system, but it might be prior to the algorithm, upstream of the algorithm. Is that right, or do you disagree with that?

Yeah, I guess the reason we do this is actually to bring that to the forefront: the algorithm could be fair or unfair, but it's not sufficient to stop there. Doing this could, theoretically but also in practice, highlight what might be happening a little more, and that's the goal, to bring those other unfair things that are happening to the forefront;
otherwise it's perpetuating those issues.

Yeah, okay. I would also just want us not to ever lose track of where the unfairness is, because exactly that might matter for how we want to intervene. If the unfairness is in the background conditions of society, maybe we want to address it not by tweaking the algorithm but by something else, some other public policy, whereas if the unfairness is in the algorithm, then we need to tweak the algorithm.

Absolutely, and we wouldn't know where those things might be if we don't account for them.

Okay, cool. I just want to have a quick follow-up on this. This framework sets up endogenous and exogenous variables when we use causal modeling, and one point that Brian is making resonates with me: it seems that some of these unfair background conditions really are, or should be, captured in the set of exogenous variables, because they are the background conditions of a society or an institution; they're not really a property of an algorithm. It's not that if we consider those kinds of variables, see their impact on algorithmic outcomes, and then intervene on them, we are making the algorithm fair; it's more about how we should change the background conditions so that the whole decision-making process is more just or fair. I wonder if there are some limitations on the framework you're suggesting, if we really take it that many of these important structural background conditions are in the set of exogenous variables.

Yeah, Atoosa, thanks for your comments. I definitely hear what you're saying; can you repeat what specifically you'd like me to comment on?

Yes, I wanted to ask whether you think your specific multi-level definition of
algorithmic fairness in counterfactual terms has some limitations in incorporating some structural variables, for the reasons that Brian mentioned.

So is the question whether there's a limitation around which structural variables can be included in this framework? That's a great question. There are conceptual and methodological limitations, and in the paper we talk a bit more about what types of properties can be included as a macro property. One methodological limitation is that if there are any feedback loops, you can't include them in this kind of approach. You can imagine that something like mean neighborhood income might not work in this framework, because that's affected by the individual's income; there's basically an arrow from I to P. So there are methodological details you have to pay attention to in this sense. Conceptually, you can create the graph as you wish; that's the concept around causality, and even creating the graph is creating a picture of the world you want to look at. You can imagine some things are less easy to quantify and you might not put them in, but the idea is to open the imaginary, for people making algorithms, to what we should be including and considering.

Okay, thanks very much. We also have a hand on this from Leshing, and then I'll move to the next question if there's no other hand.

Yeah, thanks for the discussion, Rumi. I was wondering what the example model would do in a toy case. Let's say there are individual attributes, and the zip code is treated as one of the group attributes, and it turns out there are two zip codes where people's incomes are completely non-overlapping: one
zip code is lower than the entire other zip code, in the case of a loan decision. After correcting for this multilevel, path-specific fairness, what would the outcome be? Does that make sense? As I understand your question, Leshing, that's actually a prime example of where it would matter. Say you have two people, both women, the same gender, in two different zip codes, one from a place that's better off and one worse off. This framework would bring up the consideration of whether each loan prediction should be equal across just gender, or also across that background factor. If you didn't include those factors, you might simply treat them equally across gender; with them, you might give more credence to the person whose population-level attribute is disadvantaged, instead of looking only at the individual-level one. I rambled a bit at the end, but your example is exactly what would make a difference in this case versus the non-multilevel case. Then if we look at the final outcome attribute, which is giving the loan or not, would the overall objective deviate from, say, minimizing the loan default rate, after correcting for the group? Well, I think that's up to the premise. I'm not really familiar with loan prediction algorithms, by the way, but if loan prediction algorithms were indeed to be fair even on a factor like race, there's still a lot of variance within that, so you wouldn't guarantee equal amounts of loan defaulting just by taking race into account, because race is a proxy for things like socioeconomics and education level. You're inherently not accounting for those if you don't take those factors into account.
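The feedback-loop limitation mentioned above, that an arrow from an individual property back to a macro property (i to p, as in the mean-neighborhood-income example) breaks the acyclicity the framework requires, can be checked mechanically. The sketch below is illustrative, not from the paper; the variable names (zip_income, income, loan) are assumptions standing in for the loan example discussed.

```python
# Minimal sketch of the acyclicity check that a causal-graph
# framework relies on. Variable names are hypothetical.

def has_cycle(edges):
    """Detect a directed cycle via depth-first search."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, [])

    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:          # back edge -> cycle found
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

# Macro property (zip_income) influences individual income and the
# loan outcome; no feedback, so this is a valid DAG.
dag = [("zip_income", "income"), ("income", "loan"), ("zip_income", "loan")]
print(has_cycle(dag))       # False

# The mean-neighborhood-income case: an arrow from the individual
# back to the macro property (i -> p) creates a cycle, which this
# kind of framework cannot accommodate.
feedback = dag + [("income", "zip_income")]
print(has_cycle(feedback))  # True
```

In practice one would run a check like this on the assumed data-generating graph before applying any path-specific fairness criterion, and flag macro properties that are themselves functions of the individuals they describe.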
All right, thanks. Okay, thanks Leshing. The next question is from Sarah Logan.

Hi Rumi, can you hear me okay? My Zoom is playing up this morning, hence my video; I blame my college IT support. Thank you for your presentation, it's fascinating. I had a question about the last paper you mentioned, on the offline consequences of online hate speech, which I'm super interested in. I just wanted a little more detail, if you're able: what are the offline consequences you're investigating, what in particular are they, and how is that data collected? Yeah, it's a pretty new space, but people have started to look at offline consequences, and we did one study, though of course with varying levels of rigor in terms of how direct the connection is. First of all, a lot of people look at hate speech online and define it in different ways: you might have just derogatory speech, or speech directed towards certain groups, different types of groups, with different definitions. To give an example, in one of our papers we looked at race-based discrimination, anything negative towards a group defined by a race category, and also sexual-orientation discrimination. You can subdivide those further, into something targeted at that group versus just a report of something happening, so there are various details within that kind of definition. As for how those connect to offline consequences, there are a few works, including ours. In one of the papers, as an initial look, we compared about 100 different cities in the United States, applying that kind of definition to Twitter data by city, against the hate crimes reported through the FBI database. There have been other folks looking at more health- and wellness-related outcomes; I think there's a paper looking at these things at the state level, social media and health and wellness for different groups. We're trying to unpack it more specifically in the ongoing work I alluded to. We have a cohort of individuals; a collaborator of mine is a social behavioral scientist who studies discrimination and health outcomes in a longitudinal manner, to see which happens prior to the other. In that group we're also looking at where participants have gone, in terms of discrimination as a proxy for cultural racism in different places, and we're expanding that, with participants' permission, to access their own social media feeds, to see their online discriminatory experiences, thinking of them as online microaggressions. Then we have surveys and other measures from them to see how discrimination as stress manifests in their stress levels and so on. Okay, thank you, that's super helpful. Thanks.

So the next question is from Seth. Okay, it's super windy out here in rural New South Wales, so my connection might be bad. If you can't hear me, send me a message in the chat if it's not working. I'll stop my video. My question is about the reliance on causal graphs: where do the causal models come from, and how do you go about making them non-arbitrary? And the follow-up question is that one of the things we tend to posit in a lot of areas of social science is that causal relations aren't necessarily acyclic in the way one might want for these models to work: social structures cause people to
behave in certain ways, and people then constitute the social structures that influence them. So how does one account for that possibility of mutual influence? Let me know if that got through. Thanks, Seth, I heard that, and I hope you can hear me clearly on the response. I put up that slide on the causal graphs at the beginning because I think it's important: whenever you model something, whenever you capture data, you're making some assumptions, and causal graphs are one framework that at least helps us make those assumptions explicit. In our case specifically, I work a lot in the health space, so we've often used causal graphs in looking at different health processes and outcomes, and a lot of our work to derive the graphs comes from work in epi or biomedicine where those relationships have been studied: what the mediators are, what the moderators are; and on the clinical side people look at different mechanisms. In the studies where we've done that, we've referred to those as the graphs. To give an example from the paper I discussed, we used a simulated graph where we just assumed a structure as the data-generating process, and we also used the UCI Adult dataset, which is fairly common; in that case we used a graph that others had already settled on, for example in the epi work I cited. So it's a combination of scientific knowledge and consistency with previous work. And you're absolutely right: these are very clean models that don't incorporate some more realistic assumptions, and yes, as I mentioned, we don't account for feedback between those properties in this setting. I think that's an interesting problem going forward, even methodologically. There are other interesting issues too, like when you have interference; there are properties that don't come through with the assumptions we make but should then be studied separately, and those could be interesting methodological directions. I haven't come across anything that looks at that feedback in this specific case; I think it would be interesting for us to look at, especially because things like socioeconomics, as you mentioned, naturally have it.

The one thought I had, just following on from that: if you think about certain kinds of epidemiological interventions, some cases are going to be more or less complex, and often the social ones, even something like university admissions, will involve many, many different factors. So I'm curious to know how the algorithms perform once we introduce significant levels of complexity into the causal model. Yep, that's a great question as well, and an important one. I can tell you about some ongoing work in those regards and what people are doing, for instance in the causal discovery space. In this work specifically we looked at this size of graph, with specific variables in it; of course, complexity increases. One idea we're pursuing going forward is to consider whether, if you have knowledge of the local graph structure, you can use it to reduce some of that complexity. That's one direction we're looking at: if you have a larger graph and we just look at the local
structure, how much does that affect things, and what kind of bounds can you give on your results in that case? If you have specific types of missing data, missing mediators, how does that affect what your outcome might be? We're looking at those scenarios to think about more complex structures as well. That's great.

So the next question is from Leshing. Hi Rumi, I wanted to ask a general question that may be a bit cheesy. In the machine learning for health and medicine case, there have been high-profile failure cases, such as when IBM Watson Health and the Memorial Sloan Kettering Cancer Center announced they were not going forward. Could you comment generally, given that your work is maybe more in the epidemiology space than the diagnostic case: what are some examples of dos and don'ts for machine learning for health? Yeah, that's a great question, Leshing, and you make a good point: there's a lot of talk around machine learning for health care, but are there examples of actually implemented algorithms that are helpful or not? First, in broad strokes, when people motivate these things it's often about making decisions from sequential data with missing data, smoothing over those gaps, and making decisions specific to an individual's phenotype; those are the kinds of things people discuss as motivation. How much of that has been implemented? There are a few high-profile examples; one would probably be sepsis prediction, though I'd say with varying degrees of success there as well. You can imagine why it's a fit: there's very high temporal resolution, sequential data that's amenable to this, and of course all of these things are integrated into clinical practice, so it's not autonomously making any decisions, but it's been helpful. On the drug discovery side, phenotype clustering has also been shown to be useful. As for dos and don'ts: I think this actual integration into practice is the biggest gap. How does a practitioner, whether a clinician or a public health practitioner, integrate this kind of thing into their practice, knowing it's going to be a combined effort of the computational output and the person's decision? There's a lot of work going on in that space too, all the way from explainable AI to HCI approaches that visualize and help folks make sense of all of this. Does that get to all of your question? Yes, sure, thank you. Great, thanks.

Our next question is from Pamela. Hi, yeah, thank you. One thing that the issues of complexity, the different ways a causal graph could go, and the different groups you might consider make me wonder about is whether it might be worth distinguishing different types of fairness relative to the assumptions that are made or the groups that are considered. So, as a more general question I'm hoping you can comment on: what is the importance of having a single unified notion of fairness? I feel like it is important, but I'm interested to know your views. Should we just have a proliferation of different types of fairness, with certain qualifications, like: I made these assumptions and only
considered these groups, defined in this way, with these proxies, versus having a single notion? And do you think there is a true fairness in the world that we're approximating with these different things? I'm just curious to know your thoughts. Thanks, Pamela, I appreciate your stepping-back question. First of all, as I motivated at the beginning: if I'm a computer scientist working on some societal area, then there's some expert who knows more about it than me, so that kind of collaboration is necessary at all steps of the process. From the examples I gave, one can imagine I might come up with an algorithm that predicts poaching fairly, but that then isn't really fair to the community. Having that context is so important, and in that sense I think it answers your other question as well, about whether there's one specific notion of fairness. In essence it has maybe a statistical definition, fairness between two elements having their properties equal, but how that gets manifested will depend on the setting and the view being taken. The poaching example is the one I'd repeat: you could make an algorithm that's fair, but that might not work out to be equitable to the people who live around there, to their lives. Overall I agree with you: it's very contextual, and leveraging domain expertise is important. A quick follow-up on that, thank you. One way of putting this more philosophically is whether you think fairness is ultimately a relativistic or contextual property, such that there's never a case where something is just fair: it's always fair relative to a particular group that's considered, or particular assumptions that are made. You did answer most of my question, but I'm curious whether it even makes sense to say that something is fair full stop, or whether it's always fair relative to something, not just because that's useful in particular cases and there are contextual considerations we might want to mention, but because it's impossible for something to just be fair, full stop. I think there can be justice, full stop, and if you take that as a prior assumption, then how it manifests will follow. But when you're talking about specific settings or scenarios, you used two words interchangeably that I would separate out: I would say that things are contextual, but maybe not relative. I'm not sure what you're getting at with the semantics, but I think context is important. Yeah, okay, thank you.

All right, thanks, and then we have another question from Seth. Okay, I'm putting it into the chat as well, just in case it doesn't work. This is a question about extensions of this approach. One of the things I'm really interested in is predictive models that aim to predict future health outcomes for people, which might be connected to insurance, for example, where we want to be able to distinguish between outcomes for which people are responsible and outcomes for which they're not. A simple version of this might be predicting an outcome connected to a habit like smoking versus one connected to, say, genetic
factors. So the question is: would the causal graph approach, as you've deployed it here, help us in separating out outcomes for which people are responsible and not responsible when we're making predictions about their future behavior? This would also connect somewhat to the loan case. And the further thought is: we can obviously build causal graphs for these sorts of outcomes, the ways in which one's choices might lead to a certain negative outcome and the ways in which other factors might lead to it. But in order for machine learning to play a role here, do we also need a particular kind of data to operationalize those causal graphs? Suppose we're making predictions from big data, just gathering together browser behavior, Internet of Things devices, your Fitbit, this big mishmash of stuff. Would we need to curate that in some way to enable us to operationalize the causal graphs we would deploy? It's a very big, general, forward-looking question, but there's a closely related problem in predicting outcomes people will be responsible for. Yeah, thanks, Seth. From the public health perspective, I'm not sure I'd accept the premise that there are things people are or are not responsible for, but that's perhaps a different discussion. Definitely, though, if you want to block something out from the prediction: the approach is very much tuned towards looking at the specific set of effects you're interested in, and that's why the graph is handy, because you can separate those out with some kind of prior decision or knowledge. And as for what kind of data, to get to the bottom part of your question: yes, this approach would call for a more specific integration of those different data, as you just described. People have come to see that as helpful for avoiding things like spurious correlations and proxy variables, and this approach does it more systematically. Does that get to all the things you were looking at, Seth? Yeah, that definitely helps, thanks very much, Rumi.

All right, great, thank you very much, Rumi, for the wonderful presentation, and thanks to everyone who participated in the Q&A.