Let's just go ahead and get started. Nora was pointing out to me during the break that there were a few questions about what some of these different percentage settings in the analysis functions actually mean. So let's clear that up a little bit. There's a little bit of basic statistics that we all need to know in order to fully appreciate what these kinds of scatter plots are actually showing us. So right now, my analysis type is set to between indicators, overall results, and my criteria is set to 33%. This is the default setting; when you open this, it will be set to 33%. My data is looking at ANC 1 visits versus ANC 4 visits, my period is set to years, all of 2020, and for my org units I'm looking at districts. Okay. So what is this 33? You see over here, sorry, the chart is spilling off the screen a bit, but you see plus 33 and minus 33, and if you remember, my analysis type is showing the criteria is 33. Well, this dark line that runs through the middle is your average. It's looking across all of the values that have been reported for ANC 1 and ANC 4 in all districts. Each dot you see here is a district, so it's an aggregate: we are taking an average across the districts, and that average is the black line you see running through the middle. Okay. Now, these lighter gray lines are what we consider our outlier thresholds. In this case, we're saying a value is flagged if it is more than 33% above, or more than 33% below, the average across all districts. So we're defining this threshold as being 33% different from the average. And what does that give us?
So with 33%, we see that we actually have three outliers here: district D2 and district B2, which are both more than 33% above the average across all of our districts, and district C6, which is slightly more than 33% below the average. Now what happens if I change this criteria to, say, 50%? I come in, change it to 50%, and click analyze again. Great. So now the gray lines represent the outlier thresholds of anything that is greater than or less than 50% different from the average across all of our districts. You still see district D2 and district B2, but district C6 is now considered within 50% of the average across all districts. Okay. If we come in here and say 70%, well, 70% is getting pretty high. At 70%, everything is okay; we don't even see the gray lines anymore, they're not even shown. So if we're really using this practically, we typically want to keep the analysis criteria below about 40%. That's what's actually going to show us the data values that really need to be corrected, the things that need to surface. 33% is just the standard default. Now let me show you what this means a little more clearly. There was a question about Z scores. What is a Z score? Well, essentially a Z score is a way to calculate the difference between a value and the average of the entire sample. So in this case, we have all of the districts, and if we look at all the values across those districts, there is an average: there's an average for ANC 1 across those districts, and an average for ANC 4 across those districts, right? And we want to find those districts that fall outside that average.
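To make that threshold logic concrete, here's a minimal sketch in Python. The district names and values are invented for illustration (the app computes this for you from your real data):

```python
# Sketch of the percent-difference outlier check described above.
# District names and values are made up for illustration.
districts = {
    "District A1": 400, "District B2": 760, "District C6": 300,
    "District D2": 760, "District E3": 420, "District F5": 380,
}

mean = sum(districts.values()) / len(districts)

def flag_outliers(values, criteria):
    """Return districts whose value differs from the mean by more
    than `criteria` (a fraction, e.g. 0.33 for 33%)."""
    return {
        name: round((v - mean) / mean * 100, 1)  # % difference from average
        for name, v in values.items()
        if abs(v - mean) / mean > criteria
    }

print(flag_outliers(districts, 0.33))  # B2 and D2 above, C6 below
print(flag_outliers(districts, 0.50))  # only B2 and D2 remain flagged
```

With these made-up numbers, widening the criteria from 33% to 50% drops C6 from the outlier list while B2 and D2 remain, mirroring what happens in the app's scatter plot.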
And so when we talk about Z scores, what we're really talking about is the number of standard deviations from the average. A standard deviation is a calculated value that tells you statistically how different a value is from an average, and this can all be graphed across what we call a bell curve, a bell-shaped curve. Here in the middle, we have zero; zero is our average. One standard deviation from the average, both one standard deviation high and one standard deviation low, means that 68% of the values will fall within that range, okay? If we go to two standard deviations, 95% of all of the values will fall within those two standard deviations. Three standard deviations, 99.7%. This is basically comparing the districts against one another and asking: how similar are those districts to the average? And if they are very far from the average, they will fall out on what we call the tails of the bell curve, either high or low. Why is this useful? Well, it's a very common and robust way to tell us which districts or values are not following the same pattern as everything else, all right? Looking at national-level data, we expect all of our facilities to be doing roughly similar things month after month after month, right? If we're looking at routine data and all of a sudden one month a value is way outside, then this standard deviation, bell curve, Z score process will let us see that clearly. And that's basically what these scatter plots in this WHO app are built on. I won't spend too much more time on this, but I think one of the key takeaways is that basically any time you have data that falls outside of two standard deviations, it's almost guaranteed to be an outlier. Two standard deviations, or even maybe 1.5 standard deviations, is definitely a value that you need to follow up on.
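As a sketch, this is how the Z score flagging described above works; the values are hypothetical, not from the demo data:

```python
import statistics

# Hypothetical reported values across districts (made-up numbers).
values = [120, 130, 125, 118, 122, 400, 128, 121]

mean = statistics.mean(values)
sd = statistics.pstdev(values)  # population standard deviation

# Z score = number of standard deviations a value sits from the average.
z_scores = [(v - mean) / sd for v in values]

# Per the 68/95/99.7 rule, anything beyond 2 standard deviations is rare
# enough that it warrants follow-up.
outliers = [v for v, z in zip(values, z_scores) if abs(z) > 2]
print(outliers)  # the 400 stands out from the cluster around 120
```

The single value of 400 sits well beyond two standard deviations from the average, so it lands on the tail of the bell curve and gets flagged, which is exactly the follow-up trigger described above.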
You need to check, because either you have some kind of serious problem in that health facility or a data problem. Say you're looking at malaria test positivity, right? And a value comes in that's two standard deviations from the average. What does that mean? Maybe they have a malaria outbreak, or maybe they have a data entry error, right? There are multiple ways to interpret it, but in either case it is a value that you need to pay attention to and follow up on. OK, Nora, any other questions before we get going again with the videos? The last video. No, Scott, there are no further questions. There was a comment from someone about your sound, but your sound for me is satisfactory, so I'm not quite sure where the problem is. OK, yeah, we're dealing with bandwidth all over the world right now, so I'm sure I'm coming in and out for some folks. But of course, these sessions are being recorded and the videos are already there. So if you're struggling to hear me now, please tune in, watch the videos, and we'll post the recording of the sessions as soon as we can. All right, we're going to skip back over to our Academy platform, and we have one more video to watch. Now, this video is a little different from the others. This video tells us how we actually train people to use this application. Hopefully by now you've seen that this application can do a lot. It can tell you where your outliers are, it can show you internal consistency and consistency over time, it has very clear dashboards, and it makes many data quality problems very obvious. So the next challenge is how we actually get people to use this application. And that's what this video is about: how we train folks at district level to use this application so that they can use it routinely and follow up on data quality issues as they come in. All right.
Again, I'm not going to spoil too much of Bob's story here. Let's just go ahead and get started; he presents how you can quickly train staff. OK, great. All right, so that wraps up the last video for today. Hopefully, you appreciate that this app is very much usable by folks at the lower levels, district and maybe even facility level, and that it can be very useful to them. You can build it into your standard operating procedures and their routine processes so that they use this application to perform monthly, or maybe even more frequent, data quality checks. All right. So, Nora, are there any questions from the community of practice right now on this video? There is a question about the difference between standard and modified Z scores. OK, yeah. So for this one we have to delve back into a bit of statistics. To put it shortly, a modified Z score allows for weighting and for some skew of this bell curve, right? So you may have a skew towards one side or the other, and modified Z scores allow you to factor that in. A standard Z score assumes that nice symmetric bell curve, and that is going to be applicable for most of your data. You could have situations where a modified Z score makes more sense, for example with facilities that have very high patient loads. Most of your facilities have relatively constant, but few, patients, but then you have maybe one facility that receives 10 to 20 times more patients than the average facility. Well, in that case, that facility is always going to look like an outlier, by and large, because it's just so much bigger, and in that situation you might use a modified Z score to balance that out a little bit. I don't think we have time to go into a detailed explanation of all the different statistical methods.
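The exact formulation of a "modified" Z score varies; one common version replaces the mean and standard deviation with the median and the median absolute deviation (MAD), which makes it robust to skew and extreme values. A minimal sketch with made-up patient counts, including one facility far larger than the rest:

```python
import statistics

# Made-up monthly patient counts; one facility is far bigger than the rest.
counts = [90, 100, 110, 95, 105, 98, 102, 5000]

# Standard Z score: uses the mean and standard deviation, both of which
# the single huge facility drags upward.
mean = statistics.mean(counts)
sd = statistics.pstdev(counts)
standard_z = [(x - mean) / sd for x in counts]

# Modified Z score: uses the median and the median absolute deviation
# (MAD), which the huge facility barely affects. The 0.6745 factor scales
# the MAD to be comparable to a standard deviation under a normal curve.
median = statistics.median(counts)
mad = statistics.median([abs(x - median) for x in counts])
modified_z = [0.6745 * (x - median) / mad for x in counts]
```

Note that the one huge facility also inflates the standard deviation, compressing every other facility's standard Z score toward zero and potentially masking genuinely unusual values among them; the MAD-based score is not distorted that way, because the median and MAD ignore the extreme value.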
But suffice it to say that this is something you can follow up on on your own, and by and large, most data is going to be fine with the standard Z score. Scott, in the Q&A channel there are two questions. The first one is about a standard acceptable rate of data completeness for decision making. And the second one is about the WHO DQ app: does it include the core DHIS2 translations? Yeah, I'll start with the translation question first. So the reality of the situation is that this WHO Data Quality app is actually quite old at this point. It's been around for quite a few years now, even though many countries are just starting to use it. And it's written in older technology, not in the updated programming languages that we're using in other DHIS2 applications. That's why it has a very different user experience; it looks very different from most applications within DHIS2. The base technology that was used to build this app is really no longer supported by most DHIS2 functionality. Unfortunately, that does mean there are some limitations for translations with this application. It does not use the same translation libraries as other applications in DHIS2, and again, that's because the base technology is old now. So there are some strings, we call them strings, that will translate, but most things, unfortunately, probably will not translate well. It's almost a language-by-language difference: some things that work in French may not work in Spanish. It just depends on how much base translation there has been in DHIS2 for these different languages. That being said, we are in the process of rewriting this app.
We are updating this app entirely from the ground up onto an updated code base that will use the latest translation functionality of DHIS2. So this time next year, when you're looking at this app, it will have a different kind of user experience. It will look much more like the other DHIS2 applications you're used to using: dashboards, data visualizer, all the various analytics tools. It will look similar to those, and it will support translations much better. So we appreciate that the app right now certainly has a lot of limitations, but we are working very hard to update it and get it to a place where it can handle translations. Okay, and then the other question was: is there a minimal reporting rate that's necessary to be able to use the data? Well, there are different statistical methods you can employ to say how much data you need to have before you can make a decision, before you can actually start to use the data. We won't really go into that; I don't think many people are actually doing that kind of statistical analysis on their data month to month. But it does work out that you need around 80 to 85% reporting rates to have a sufficient volume of data to say that these are the general trends happening in a district or a country, and to be able to make statistically valid decisions based upon it, to define interventions that are actually going to have some impact. You need to have probably around 80 to 85% reporting rates. Now, there is another caveat to this: do I have all the data from the various sources in one place? I can have an 85% facility reporting rate, but what about the situation where I have maybe only a 20% community health worker reporting rate, and both the facilities and the community health workers are reporting on malaria? What do I do in that situation?
Well, the reality, and what we've seen happen several times, is that if you have one reporting unit like a community health worker that's under-reporting and a health facility that is reporting adequately, you're actually only getting a fraction of your data; you're not getting the whole picture. So if you have 85% reporting rates at health facilities, which is generally fairly acceptable, and then only 20% malaria reporting rates from community health workers, you're probably missing at least maybe 15 to 30% of your actual malaria data. So is that enough? That's probably too much data to be missing to consider the data reliable. So if you're capturing the same data in multiple places, something like malaria, especially between facilities and community health workers, you really have to make sure that both of those have adequate reporting rates before you can say you have sufficient data and can make well-justified decisions based upon it. There's a question asking whether, in the new release, event and tracker data will be considered. This is a really tough question. Essentially, right now the aggregate and the tracker data in DHIS2 live in two different silos. There are two different data models and two different data tables. What we are trying to do is break down these silos and have the data all mixed together in one place. You'll start to see that happening in indicators and predictors, and the next release of the data visualizer app will have a universal search so that you won't have to select program indicator or data element or event data item; you can just search and see all of those mixed together. We're starting to break that down, but as it stands right now, the new version of this application that we're putting out will unfortunately still only be pulling in the aggregated data, the data from the aggregate data model.
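Going back to the completeness question for a moment, that back-of-envelope reasoning about combining facility and community health worker reporting rates can be sketched like this. The split of expected cases between the two sources (80% facility, 20% CHW) is an illustrative assumption, not a figure from the session:

```python
# Rough combined-completeness estimate across two reporting sources.
# The expected-share figures are assumptions for illustration only.
sources = {
    # source: (assumed share of total expected cases, reporting rate)
    "facility": (0.80, 0.85),
    "chw": (0.20, 0.20),
}

captured = sum(share * rate for share, rate in sources.values())
missing = 1 - captured
print(f"captured: {captured:.1%}, missing: {missing:.1%}")
```

Under these assumptions, 85% facility completeness alone looks acceptable, but once the under-reporting CHW stream is weighted in, roughly 28% of the expected malaria data is missing, which falls in the 15 to 30% range mentioned above.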
But again, we have tools, and we'll cover these on Monday, for getting data from the tracker data model into aggregate so that you can use it here. There's also a question: can the analytics tables be used? Potentially, yes. It's a fairly complex situation, and I won't get into the technical details; most folks are probably not so concerned about that, but we can have a conversation offline. It's not so much the analytics tables that are an issue; it's how the calculations are done in the application that's more of the issue. But again, we can talk about that offline. Any other questions, Nora? Yes, there's one under general from Ali: "I have one question regarding the WHO Data Quality app. Can we modify the tool for physical data quality activities in the field? In some countries, the data is not coming in digitally, but to ensure data quality, the program manager conducts physical data quality activities." I think the point is that the WHO DQ tool is set up at a national level, and it uses the current data that's in the system. I don't think you can modify it for physical data quality activities, because, as you see with the outliers, it looks at data over an extended period of time. Right, yeah, that's exactly right. But I would say that this application is built on an existing WHO Excel tool. So here I'm on the WHO's website for tools for data analysis, and in their DQR module for desk review, they have a link to some Excel tools that are available. These Excel tools are built for folks who aren't using DHIS2, or for situations where you're not able to use this application in DHIS2. You can use these Excel tools; maybe that's helpful to you. And if you get the data into these Excel tools, they will produce outputs that are very similar to what you're seeing in this application in DHIS2.
It's a bit more of a manual process, of course, but you can do similar things outside of DHIS2 using these WHO tools. Thank you.