 All right, so in this video we are going to cover the second part of detecting outliers in dashboard using validation rule notifications. The way that we're going to do this is by building predictors and then tying those predictors to our validation rule notifications. But in this video we're just going to go over how to configure the predictors. Before we get into the nitty-gritty of building predictors, let's just quickly take a look at some examples of the kinds of outliers that we're going to be able to detect with these predictors and automatic validation notifications. Here is a very common picture that we see in a lot of countries. We see we have various data elements, sometimes indicators, progressing over a series of months as a line chart, and then we see one month for one of these data values that is significantly higher than all of the others. You see most of them have a fairly linear trend and you see this one peak jutting out right in April 2018. This is clearly an outlier. This data does not follow the normal trend. This data is so high above the normal accepted value range that you can clearly see illustrated here that it's almost certainly a data quality mistake. Now these kinds of data quality mistakes can throw off national statistics. So if every month we're expecting about 54,000, in this case BCG doses, and then in one month you report 170,000 BCG doses, now of course that could be from a massive, massive outreach campaign, maybe some door-to-door vaccination efforts, but by and large almost always these are going to be data entry errors where maybe a person at a health facility or a hospital, maybe they meant to put in 7,000 doses given and they put in 70,000 doses given on accident. It's a very easy thing to do. You just add a couple of extra zeros as you're putting data in and you go to the next cell and you don't look back at, you know, it's very, very common. 
And so we see a clear example of this here, where you see a very common trend across all these various data elements. They're not changing or fluctuating that much month to month, but one month is really jutting out. And if we were to leave this data entry error in place, then for April 2018 you would probably see that the BCG dose coverage was well over 100%, maybe 200%. And that would make folks question the quality and the validity of all of the other data. If you have one month that's 200% BCG coverage, can we trust the rest of the data? So the point is that it's very, very important to identify and correct these kinds of data entry errors. And what we're going to start showing you today is how to have DHIS2 do this for you: have DHIS2 automatically detect these kinds of data entry errors, these kinds of data quality problems, and send you a notification when it has detected them so that you can correct them immediately. You do not want these kinds of data entry errors to linger in the system for very long. So on this slide, we're going to look at just a couple of other examples of some clear outliers. The chart at the top is looking at the BCG doses to measles one doses dropout rate. So that is the proportion of children who are receiving the BCG dose and then not receiving the first measles dose. And we expect this to be zero. We expect every single BCG dose to be matched by a measles dose. We expect every person to get both vaccinations. And we can see that in some org units, and these bars here are representing org units, we have some locations where the children are receiving significantly more BCG vaccines than measles vaccines. Now, of course, this should not be the case. We should always see that the number of BCG vaccines is the same as the number of measles vaccines. This top chart is showing us the BCG to measles one dose dropout rate.
And what we expect to see is that the number of BCG doses given is equal to the number of measles doses given. Here on the top chart, we have a target line of zero. Again, that's indicative that the BCG vaccine doses given should equal the measles doses. There should be a one to one relationship there. Then we're giving ourselves an acceptable maximum range of 20%. And that means that we are allowing, in this particular country scenario, a dropout rate of 20% from BCG vaccine to measles one, meaning that 20% of the children who got the BCG vaccine will not get the measles one dose. And they're saying that that's an acceptable range. Of course, not ideal, but an acceptable range. But you see we have two bars here. These bars are representing health facilities where we have a 72.1% dropout rate and a 73.4% dropout rate. Now, what does this mean? This means that we have 72.1% and 73.4% of the children who received the BCG vaccine not receiving the measles vaccine. Now, obviously, this is a big problem. This is either a data quality mistake, or this is some kind of breakdown in clinical service delivery. We should never see a dropout rate this high. So this is telling this program manager that, hey, we've got problems in these two health facilities, in that the majority of children who are receiving the BCG vaccine are then not receiving the measles vaccine. And so I need to follow up. Hopefully, in this scenario, you find that it's a data quality problem. If it's not a data quality problem, then you have some serious problems with your measles one doses in these two health facilities. So those are the bars that are going above the acceptable maximum range. Let's look now, though, at these few bars that are going below the zero target line. So they're going down. These are negative values. And we see that there are two here that are really quite big. We have one that's negative 74.5 and one that's negative 271.4. What does this mean?
This essentially means that we have more children receiving the measles vaccine than we have receiving the BCG vaccine. Now, that is extremely unusual, basically impossible from a clinical perspective. This is almost certainly a data quality issue in these two health facilities where the bar is going lower than the target line. And you see that we actually have three others that health facilities that are reporting that the measles one is higher than the BCG. Now, there should be an acceptable range for this too. We appreciate that in many countries, you're reporting on two different cohorts of children between the BCG vaccine and the measles one vaccine. Maybe you have some kind of seasonal birth patterns or trends in your country. But a value of negative 74.5 and negative 271.4 should be considered extremely unacceptable no matter how much seasonality you have to your birth rates in your country. It's basically guaranteed that these are data quality problems that someone accidentally put in far too few BCG vaccines or far too many measles one doses. So these are definitely something that needs to be followed up on. And the point of this chart at the top here is showing you that you can see these very, very, very clearly these bars that are way too high and these bars that are way too low. And you're going to be able, if you're looking at this, if you're a program manager to immediately follow up on these, they should be as obvious as they possibly can be. Again, examples of outliers, especially that negative 271.4, an extreme outlier. All right, let's take a look at the next chart below this. This next chart below this is looking at new malaria cases under five versus malaria new malaria cases five and above. The under five are showing up here as the green portion of the stacked column, excuse me, this is a stacked bar chart. And the blue portion is showing the five and above proportion of the population in the stacked bar chart. 
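To make the dropout rate on the top chart concrete, here is a minimal sketch of the arithmetic. It assumes the common definition, BCG doses minus measles one doses, divided by BCG doses; the exact indicator formula is not shown in this video, and the facility counts below are hypothetical numbers picked only to reproduce values like the ones on the chart.

```python
def dropout_rate(bcg_doses: float, measles1_doses: float) -> float:
    """BCG to measles 1 dropout rate in percent (assumed formula)."""
    if bcg_doses <= 0:
        raise ValueError("BCG doses must be positive")
    return (bcg_doses - measles1_doses) / bcg_doses * 100


# Hypothetical facility counts, not taken from the demo database.
high = dropout_rate(1000, 279)       # far fewer measles 1 doses than BCG doses
negative = dropout_rate(1000, 3714)  # more measles 1 doses than BCG doses

print(round(high, 1))
print(round(negative, 1))
```

A large positive rate means many children got BCG but not measles one; a negative rate means more measles one doses than BCG doses were reported, which is why the bars below the zero line are so suspicious.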
And you see that we have set our range for this chart to be up to 100%. So here we have essentially a clear representation of the proportion of total new malaria cases being disaggregated by under five and over five. Now, we expect in most countries that have malaria to have about 30% of our new malaria cases to be under five. Okay, that's a very, very common trend that we see around the world where we have endemic malaria. So you see that at the 30% line, that's what we have. We have that that a black line going through 30% that says expected range under five minimum. Now, we're giving ourselves a little bit of a buffer because it can vary based upon your population distribution in terms of age, but we then give ourselves an acceptable upper range of 35%. And you see again at the 35% line, we have a strong black line cutting through that. And it's telling us that between 30 and 35%, we should see that under five population end and the over five population begin. Now, what you see here, if you look through the months, kind of going from top to bottom, as you scan through, you see that in November 2018, our new malaria cases under five are around 65%. You see that green at November 2018, you see that green line going way, way, way, way past the acceptable ranges and going far into what in the other months is just the blue, just the over five. And this is another clear indication of an outlier. So November 2018, there is clearly an outlier in your new malaria cases under five. Someone accidentally put in a value that was way too big. And you can see that it's throwing off our national statistics, it's throwing off our national distribution and disaggregation of our age breakdown of the new malaria cases. So if you were to look at this, you would say that somehow in November 2018, 65% of our malaria cases were under five population. Now, that's just not clinically possible. Unless you, I just, I can't even imagine a scenario in which that's clinically possible. 
So that means that you have an outlier, you have a data quality problem in November 2018. And if you look at the values that we have reported for our new under five cases, most months they're ranging between, yeah, about 100,000, maybe 150,000, something like that. But here in November 2018, we're at nearly 300,000 cases, whereas our over five is still in the acceptable average that we're seeing. And of course, you have to appreciate that malaria is seasonal. So what we're not seeing here is the seasonal curve. But we're still seeing that the proportion is way out of whack. It's just not possible. So again, another clear indication of an outlier. The point of these two charts really is to show that we can see outliers in a lot of different ways, right? In the first chart, we saw that we can see an outlier stick up in a kind of month to month trend. Here on the top chart, we're able to see outliers being illustrated by a relationship between two different data elements. So for example, here, a BCG to measles one dose dropout rate. So we're comparing these two data elements together to see outliers in one or the other. And then on this malaria example on the bottom chart, we're seeing outliers represented by disaggregations of our clinical data by age. You maybe also can do this by gender. And so these are just to show you that we can find outliers in a lot of ways, but this requires you to look at the dashboards, to look at your charts, to be routinely looking at them. Of course, people are not always routinely looking at their DHIS2 dashboards. And so the most effective way to make sure that people address these kinds of outliers that are throwing off national statistics is to configure DHIS2 to automatically detect them and to send a notification to the user who can correct them. Okay.
So just before we get into the crazy predictor stuff, let's do a quick overview of how we identify these outliers on a standard dashboard and isolate the actual outlier itself. Well, we have some steps for this. And I'm going to show these to you in just a second. The first step is that we want to isolate the period. Usually we just want to drill into one month. Then we want to change the chart type to a bar chart. Then we need to isolate the data element. Then we need to isolate the facility level to see where the data is coming from. And then we change the layout so that org units is in category, periods in filter, and data is in series. Now, I just did that really fast. Let me just show it to you in real life here. So I'm going to go to the demo site for the academies. Here we are. I'm going to log in as demo and then district one, hashtag. Yeah. Okay. So here we have a similar chart to what I just showed you. And I think even Bob mentioned this in the videos. Where is the outlier here? Well, there are probably two outliers. This point here back in May 2019, ANC first visits, and this one here, ANC first visits from January 2020. Now, how do we figure out where this data is coming from? Where's the facility that's causing this outlier? Now we need to do a quick recap of how to use the data visualizer app. So what I'm going to do is just show you the whole process here. I can't explore this on the dashboard. So I'm going to click the open in visualizer app button. And that's going to take me into the data visualizer application. So now we're in the application and we can start to drill into this data. What was my first step? My first step was to isolate the periods. So we know that the outlier is in January 2020. I am going to go to my period selection. I'm going to deselect all these periods. You see we have 13 months here selected. I'm going to deselect these. I'm going to go to fixed periods and I'm going to say January 2020.
Turn that one on. And here we go. Just some dots because, you know, we have a trend line, but we don't have a trend if it's just one period. Here's still my outlier way up here. Then we need to change the chart type to a bar chart. So I'm going to go over here to the chart types. Change it to a column. You could also change it to bar. I'm going to choose column. And there we go. So now we actually see all the data next to each other. And still here's our outlier. All right. Now we want to isolate that data element. The data element that has the outlier is ANC first visits. So I'm going to come to data. And I'm going to turn off everything that's not ANC first visits. Click update. Boom. Now we've just got one big bar. That's fine. Now let's figure out where this data is actually coming from. Now the easiest way to do that is to go into our organizational units. And you see right now I have the relative user organization unit tick box selected. I'm going to untick that. I'm going to select the national level, and then I'm going to select my level and I'm going to go ahead and go all the way down to facility level. Okay. So that's going to show me all of the facilities. Click update. Nothing changed. Why did nothing change? Because my layout is not correct. So the last step is to change the layout. We are going to move organizational units to category. We are going to move period to filter, which happened automatically for us when we swapped it with the org units. And we're going to leave data in series. Click update. Boom. Where is that outlier? Well, it is clear. It's right here. The outlier is at facility 147. And in one month they reported 35,888 ANC first visits. Okay. This is how you isolate the outliers going from a dashboard in through the data visualizer. A few steps here, right? You also have to know how to use the data visualizer. Now wouldn't it be nice if this notification was just sent directly to my email?
How do we do that? Well, that's what we're going to talk about now. How do we get this outlier sent directly to my email? I know I just went through that quite quickly. If you want to revisit how I just went through and used the applications and the dashboards to identify this outlier, please go back and rewatch this presentation. And you're certainly welcome to slow it down if necessary. All right. And again, I showed this example just before the break, but Rwanda has configured DHIS2 to do this, at least partially: to automatically detect the outliers and push them to people's emails. All right. So we have to compare the data that is entered against an outlier threshold. The way that we actually calculate that outlier threshold is using predictors. And we put that predicted value into a validation rule. Predictors have typically been the only way to do this, using data elements, in DHIS2 2.28 to 2.34. Starting in 2.35, we can actually start to do some of this analysis not using predictors, but using indicators. So with the release that we actually just put out yesterday, you can start to use this in standard indicators. But most of you, basically all of you, because we released it yesterday, are not using 2.35 yet. I'm going to show you how to do it in 2.28 to 2.34. So let's take a look at some example predictors. What are these calculations for? Predictors use previously reported values to calculate a new value. That's really what predictors are for. And a couple of examples of this. So the first example: I want to get the average malaria incidence over the last six months. You can't use a standard indicator for that. The only thing that you can use to calculate that value in DHIS2 is a predictor. Another example: I want to calculate the average ART consumption over the last three months.
Maybe you're using DHIS2 for some kind of supply chain monitoring. And using predictors, you would be able to calculate, say, average consumption. Next point: I want to see the average ANC visits plus three standard deviations for the last 12 months. Remember, we talked about what standard deviations were yesterday. This would produce an ANC 1 outlier threshold. This is really the actual one we're going to be building now. We want to see what the outlier threshold is. And the way we calculate that is we want three standard deviations from the average over the last 12 months. Predictors can also be used, and this is the fourth bullet point here, to count the number of facilities that report some value. This is a very, very highly requested indicator in DHIS2. And even though it's not related to data quality necessarily, I just want to make the point clear that using predictors, you can count the number of org units that did something. So I want to know how many org units recorded more than five malaria cases. Well, you can actually get that number using predictors from aggregate data. A couple of words of warning here. Predictors have to be scheduled jobs, just like we talked about scheduling jobs for validation rules. Predictors also require quite a lot of processing on your CPU, on your servers. So if you have a lot of predictors running quite often, you can really tax your server, just like we talked about with validation rules. So you have to be very careful about how many predictors you have, when they run, and how much data they actually generate, because these could all affect your server.
If you are a server expert or a server administrator out there, please feel free to talk to us about exactly how you should be considering how many predictors to run, when to run them, and how much data to store. Because if you just set these up without considering how your server is performing, you could really set yourself up for disaster. So how do we make a predictor? Well, the value that's generated through a predictor is actually stored as a data element in DHIS2. And the reason that this is useful is that that data element can then be used in indicators, it can be used in validation rules, it can be used anywhere that you use a data element. So again, predictors are very different from standard indicators. Standard indicators calculate values on the fly. They do not store those values. Every time you open an indicator, that value is calculated. Predictors actually calculate the value and then store that value as a data element so that you can go back and use it in other places. So how do we do this? We first make a data element, then we make the predictor, and we assign that predictor to store the value that it generates in the data element. Then, just like with indicators and data elements, we have to put the predictor in a predictor group. Then we schedule the predictor group to run, just like we did with the validation rules. And then finally we can put that predicted value into a validation rule to get the outlier detection going. So how do we make the outlier threshold? Well, the first step is that you have to make your data element. And so I'm going to quickly cover how to make data elements. This is not a DHIS2 configuration course, specifically, but you have to make the data element first. And this is the screen to make a data element. Again, you go to your maintenance app, you click data elements, you click the plus button, and then you're into making a data element.
You have to give your data element a name, of course. You also have to give it a short name. Typically the short name is the same as the long name. In this case, we're saying ANC 1 outlier threshold, hyphen, your name. And again, it's hyphen your name so that, in case you're doing this right now and following along, you're able to tell your outlier threshold apart from those of anyone else who's making one right now. Your domain type will be set to aggregate, your value type set to number, and your aggregation type set to sum. Then we need to go in and make the predictor. So again, we go to the maintenance app. We scroll all the way down to the bottom of the maintenance app. You'll see the predictors there. You'll click on predictor, and you can click that blue plus button again, and you'll go into making the predictor. The screen you see here is the screen you would build the predictor in. Again, first step, as always, give it a name. Of course, you can also give it a short name and a code and a description as well. But the second compulsory step is to define the output. So we just made a data element. We need to go in and find that data element we just made and assign that data element to this predictor. Just to repeat myself, the predictor will store the value that is generated in that data element. We also have to define a period type. So this is not the frequency with which the predictor runs, but how that value in the data element will be stored. We're going to say monthly here. And then the next step is absolutely required, but it does not have an asterisk next to it. And so please hear me. You have to select an organizational unit level for predictors. So here, the typical rule of thumb is to select the level at which the data used to calculate your predictor is captured. Select that same level.
So in this case, ANC data is coming in at facility level. So I want the value that is generated to also be stored at facility level. So I'm selecting facility here. So please, you have to select an organizational unit level. If you don't, the predictor will not work. The next step is to define your generator. The generator is where you actually put in the calculation for the predictor. And in the generator, you first have to define a missing value strategy. This is the same as in the validation rules. And it will just say: if there are values missing, how do I handle them in the calculation that you're providing? The default is skip if any values are missing. And the question is, is this usually good? And typically with predictors, it's not useful. We actually want the calculation even if there are some values missing. And the reason for that is that this value is referencing previously reported data. In this case, we're actually going to be looking at 12 previous months of data. So if we leave it at skip if any values are missing, then if there's one month where there is no data, we won't actually generate our predictor. And that's not what we want. We actually want it to generate the predicted value for the average of the last 12 months, even if one of those months doesn't have a value. Okay, again, I think this is something that's easier to show you than to talk about on a slide. So I am going to go into my maintenance app. I'm going to scroll all the way down to predictors. I'm going to go into my generator. All right. And remember, we're making a predictor that is going to produce an outlier threshold for ANC 1. So again, how do we calculate an outlier threshold?
Well, the outlier threshold is the average plus three standard deviations of the value. So how do we do that? Well, in predictors, we are allowed to type in various mathematical operators. If you're curious about what all of these operators are, I think I have a slide on it, but you can also just Google DHIS2 predictors, and there is clear documentation on all the various ones. But the one that we're going to use right now is AVG, and that's average. And then we're going to find ANC 1. I'm just going to say ANC 1 visits, and close my brackets. And you see my translator is already working. So I'm taking the average of ANC 1, plus, I'm going to do another open bracket, three times the standard deviation, and for standard deviation you type STDDEV, another open bracket. I'm going to then add ANC 1 visits again, and close. Right. So now it's actually translating properly. So we are taking the average plus three standard deviations from the average. Remember yesterday we talked about standard deviations and the bell curve. For data that roughly follows the bell curve, about 99.7% of values fall within three standard deviations of the average, so this threshold is only going to pick up values that are far above what is normal across all of the sampled values. Again, I have to give it a name. For the sake of time, I'm just going to copy and paste that in there. Submit. Okay, here it is again, I'm back in my PowerPoint. So the formula for the outlier threshold is AVG of the data element plus three times STDDEV of the data element, which here is ANC 1. You might be wondering if this is case sensitive. In older versions of DHIS2, it is case sensitive. In newer versions, it is not. So if you're using DHIS2 2.31, I believe, or older, and you're using predictors, you need to put these in all capital letters. If you're using 2.31 or newer, so 2.31, 2.32, 2.33 or 2.34, you can put these in lowercase letters like you see here.
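To see what that generator works out for one facility, here is a minimal Python sketch of the same arithmetic, average plus three standard deviations over the sampled months. It is not DHIS2 code. The monthly values are hypothetical, a missing month is written as None and simply left out of the sample (which is how this sketch mimics not skipping the prediction when a value is missing), and the use of the sample standard deviation is an assumption, since the video does not say which variant DHIS2 applies.

```python
from statistics import mean, stdev


def outlier_threshold(samples):
    """Average plus three standard deviations of the reported values.

    Months with no reported value (None) are left out of the sample
    instead of cancelling the whole calculation.
    """
    values = [v for v in samples if v is not None]
    if len(values) < 2:
        return None  # a standard deviation needs at least two values
    return mean(values) + 3 * stdev(values)


# Hypothetical ANC 1 visits for the previous 12 months at one facility;
# one month was never reported.
last_12_months = [100, 96, 104, 98, None, 102, 100, 97, 103, 99, 101, 100]

threshold = outlier_threshold(last_12_months)
print(round(threshold, 1))
```

Any newly reported ANC 1 value above this number for that facility would then be a candidate outlier.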
Yeah, so here are the other functions for predictors. You can see that there are quite a lot, like average, count, max, median, min, standard deviation, sum. These can be used in different situations. The next step is that we have to define our counts. Sorry for the spelling mistake here on count, but we have to define our counts. Now again, in this particular situation, looking at producing an outlier threshold, we want to reference the last 12 months of data. So we are going to say 12 for the sequential sample count here. A sequential sample count just asks: what was the last period, and how many of those periods do you want to count? So we're going to say our period type was monthly. We've already defined that elsewhere in the predictor. We're going to say 12. We want to look at the last 12 months of data. Our annual sample count, which is saying how many previous years you want to reference, we're going to leave at zero. And then we have a sequential skip count. The sequential skip count will allow you to say I want to skip some number of previous samples. In this case, we're not skipping anything, so we don't need a sequential skip count. We're not going to even add it in. So what does this actually look like? Let's pretend that we are in April right now. And the numbers that you see listed here, these are months of the year. So January through December and then one, two, three, four: January, February, March, April. Okay. So let's say that we are in April and we have a sequential sample count of 12. What will that do? Well, that will take the previous 12 months. So that would take from March back to April of last year. Annual sample count is zero, so we're looking at just those last 12 months, and our skip count is zero, so we're not skipping any values in the last 12 months. This can become more complex, of course, right?
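The April example can be sketched in a few lines of Python. This only illustrates which months a sequential sample count of 12 would reach back over, with annual sample count and skip count both left at zero as in the example; the function name and the year 2020 are made up for the illustration, and this is not how DHIS2 itself is implemented.

```python
def sampled_months(year, month, sequential_sample_count):
    """List the (year, month) periods immediately before the given month."""
    periods = []
    y, m = year, month
    for _ in range(sequential_sample_count):
        m -= 1
        if m == 0:  # step back from January into December of the prior year
            y, m = y - 1, 12
        periods.append((y, m))
    periods.reverse()  # oldest period first
    return periods


# Predicting for April (hypothetical year 2020) with a sample count of 12.
months = sampled_months(2020, 4, 12)
print(months[0], "to", months[-1])
```

The window starts in April of the previous year and ends in March, the month just before the one being predicted.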
So for seasonal data, you typically have to have some combination of annual sample count and sequential sample count to be able to factor in the epi curve of malaria, that bell curve that you have with seasonal malaria or any kind of seasonal disease. I won't go into it now for the sake of time, but there is clear documentation on our website and in our user manual on how to build out predictors using combinations of sequential sample counts and annual sample counts to factor in more seasonal data. So that is how you make the predictor that generates the outlier threshold. How do we then put that into our validation rules? Again, we want to have a validation rule that checks the ANC1 value against the ANC1 outlier threshold and says, if that ANC1 value is higher than that outlier threshold, send me an alert, send me a notification. Let me know that this value is more than three standard deviations above the average. Well, it's the exact same process that we just went through in building validation rules. Here, you can see, building out the validation rule, we give it a name, we can give it a description, again an instruction, and you can define the importance. Our period type again is monthly, our left side expression is ANC1 visits, our operator is less than, and our right side expression will be ANC1 outlier threshold. Again, validation rules need to be defined based on what we know to be true. In this case, we can interpret that to be: ANC1 visits should be less than the ANC1 outlier threshold. And again, that threshold value is a data element that is coming from a predictor. So what happens when we run this validation rule? We can see that we received a lot of validation notifications. And just looking at this one example here, we can see that facility 583 in October 2019 reported an ANC1 value of 126, but our outlier threshold is 117.9. So it's saying, hey, look, 126 is bigger than 117.9, it should be less than, here's an alert.
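The logic of that rule is small enough to sketch in Python, using the numbers from the notification just described. The function name is hypothetical and this is only an illustration of the comparison, not of how DHIS2 evaluates validation rules internally.

```python
def violates_rule(anc1_visits, anc1_outlier_threshold):
    """The rule states: ANC1 visits < ANC1 outlier threshold.

    Returns True when the rule is broken, i.e. an alert should be sent.
    """
    return not (anc1_visits < anc1_outlier_threshold)


# Facility 583, October 2019: reported 126 against a threshold of 117.9.
print(violates_rule(126, 117.9))
```

Because the rule is written as what should be true, the alert fires when the comparison fails.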
Before I cut over to revisiting some of the scheduling, I do want to quickly show you how you use predictors to count org units. And in this case, I'm counting the number of org units that have a stock out of some commodity. This is not related to data quality, it's just a highly requested thing to know. And so in predictors, we can also put if statements in our generators. And for that if statement here, if you look at the translator, where it says valid at the bottom of the screenshot, just directly above that, you have the translation. And this is saying: if the stock out of RDTs number of days is greater than zero, record a one; if not, record a zero. So what is that actually going to do? That's going to say that if you have one or more days of stock out of RDTs recorded in your monthly reporting form, then it's going to record a one. And that one will be saved as a data element against that facility. And you can use that value to then aggregate up the hierarchy and say, how many facilities had a stock out of RDTs last month? Or you can even put that value onto a map, and you can make a map showing specifically, these are the facilities that had a stock out last month. That's how predictors can be used to count health facilities. Again, it's a little bit more of an advanced concept. If you have questions, please send those on Slack and we'll be happy to come back to them. For the sake of time here, we can't go into more advanced functionalities with predictors. But there is a lot that you can do with them. Hopefully you appreciate that. All right. So one additional point that I need to make is that predictors also have to be scheduled in the scheduler app. So it's a different job than the monitoring job that we use to schedule validation rules. So we have our ANC outlier threshold and we are going to make the job type predictor. Again, you select the frequency that you want.
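The counting idea can be sketched like this in Python. The facility names and stock-out days are hypothetical, and the sketch only mirrors the logic of the if statement and the aggregation up the hierarchy, not the actual DHIS2 expression syntax.

```python
def stockout_flag(stockout_days):
    """Record 1 if one or more stock-out days were reported, otherwise 0."""
    return 1 if stockout_days is not None and stockout_days > 0 else 0


# Hypothetical days of RDT stock-out reported by each facility last month.
rdt_stockout_days = {
    "Facility A": 0,
    "Facility B": 3,
    "Facility C": 12,
    "Facility D": 0,
}

# The predictor would store one flag per facility as a data element value.
flags = {name: stockout_flag(days) for name, days in rdt_stockout_days.items()}

# Summing the flags up the hierarchy gives the number of facilities affected.
facilities_with_stockout = sum(flags.values())
print(facilities_with_stockout)
```

Because each facility stores either a one or a zero, a plain sum at district or national level becomes a count of facilities.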
Again, exactly as we did for the validation rules, you can select a predefined frequency, you can put in your own frequency through a cron expression, or you can say continuous execution. Again, I highly recommend that you do not use continuous execution unless you are in a disease surveillance scenario and you're only using one or maybe two validation rules. The job type will be predictor. This is again different. The job type for the validation rule was a monitoring job type. This is a predictor job type. But the parameters are the same. So we have to define our relative start and our relative end. Again, rule of thumb, relative start negative 60, maybe negative 30. You just want to cover a few previous months, right? Relative end being just one, which is tomorrow. And then we have to give our predictors to our scheduler. So you just type out the predictors here. You'll be able to select them and then you add the job. Now, let's say that you have some predictor jobs, then you have your analytics tables and your validation rule alerts. If you're a system administrator, I'm talking to you right now. If you're not a DHIS2 techie, you're probably not going to appreciate a lot of this. But the point that we made is that you have to schedule these jobs to run consecutively. You cannot schedule these jobs to run at the same time, and they need to run in a specific order for you to actually be able to generate your validation rule alerts. So the first thing is that you schedule the predictor jobs to run first. And again, you want to have all of these jobs run when the server usage is low, when there's not a lot of people on the database. The predictor jobs, for example, you see the example here on the right side of the screen, could run at say 11 o'clock at night, 23:00, on a Sunday. Very few people are going to be looking at DHIS2 at 11 o'clock at night on a Sunday. Then you have your analytics tables run after that.
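As a rough illustration of that ordering constraint, the Python sketch below checks that no job starts before the previous one is expected to finish. It is not a DHIS2 feature; the job names, the start times and especially the durations are illustrative assumptions you would replace with figures from your own server.

```python
from datetime import datetime, timedelta

# Illustrative schedule: predictors first, then analytics tables, then
# validation rule alerts. Durations are guesses, not measured values.
jobs = [
    {"name": "predictors", "start": datetime(2020, 11, 1, 23, 0), "duration": timedelta(minutes=30)},
    {"name": "analytics tables", "start": datetime(2020, 11, 2, 0, 0), "duration": timedelta(hours=2)},
    {"name": "validation rule alerts", "start": datetime(2020, 11, 2, 3, 0), "duration": timedelta(minutes=15)},
]


def run_order(schedule):
    """Return job names sorted by start time."""
    return [job["name"] for job in sorted(schedule, key=lambda job: job["start"])]


def find_overlaps(schedule):
    """Return (earlier, later) name pairs where a job starts before the previous one ends."""
    ordered = sorted(schedule, key=lambda job: job["start"])
    clashes = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later["start"] < earlier["start"] + earlier["duration"]:
            clashes.append((earlier["name"], later["name"]))
    return clashes


print(run_order(jobs))
print(find_overlaps(jobs))
```

If the overlap list is not empty, widen the gap between the two jobs it names.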
Again, the analytics tables are just a function that has to be performed in order to see the data on the dashboards or in the data visualizer, on maps, etc. So you probably want those to run right after the predictors. You can have that run maybe an hour later, at midnight on Sunday. Analytics tables, depending on the size of your database, may take a while. So give yourself a few extra hours of buffer, and say that you want your validation rule alert jobs to run at maybe 03:00, a few hours after your analytics tables. And if you run them in this order, that means that your predictors come first. Those predicted values are then able to be seen on your dashboards, if you have them there. And then they'll be able to be pushed out in your emails. If everything goes properly, then this will all happen when you're asleep. When you wake up in the morning, you'll see an email from DHIS2 saying, hey, look, these are the alerts that were generated last night. Okay. There are some questions already posted in the Community of Practice and on the Slack channel around how do I get email, or how do I configure DHIS2 to send an email? Well, the reality is that you have to get an email service from your service provider. DHIS2 does not have a built-in email host. This has to be provided by your service provider. So if you are a Ministry of Health, and you are using a telecommunications company in your country that is hosting your DHIS2 instance, or hosting, say, the email service for your ministry, then that host, that company, whoever it is, needs to provide an email host or an email endpoint for your DHIS2 instance to be able to send emails. Again, this is not something that the University of Oslo can provide you. This has to come from whoever is hosting your email service. Many of you are probably using Google as your default email service.
So if you have a Gmail account, for example, and you are using Google, then we've given you some guidance on how to configure this. But you have to provide a host name, a port number, a username, and a password. This is for your email service. This is not for your DHIS2. You also need to provide a TLS setting, an email sender address, and an address to send test emails to. Again, all of this has to be provided by your email service provider. We cannot provide this to you. This has to come from whoever is hosting your email service. Any telecommunications company that hosts email services should be able to provide this to you. These are fairly basic things, to be honest with you. The example that you see here on the left side of the screen is if you have your email hosted by, say, Amazon Web Services. Like we do here at DHIS2, we have our email hosted by Amazon Web Services. So that would be our host criteria. Of course, you don't have our password, so you can't use the same thing. Okay, so we have maybe just 10 more minutes for any burning questions. Nora, are there any terrible issues out there, or what's the most terrible issue out there? Yes, Scott, there were a few questions. I'm just going through them. Do you want to introduce yourself from derivative? Do predictors work with event and tracker data too? How does that differ from program indicators? That's a really good question. Yes, you can use event and tracker data in predictors. The difference from a program indicator is that the predictor value is actually saved as a data element. So you're able to then use that predicted value in other calculations, in validation rule notifications, anywhere that you can use a data element. And that's very unique in that you cannot do that with a program indicator. There are also different types of calculations that can be done in predictors that cannot be done in program indicators. Predictors can also incorporate values across multiple programs.
So that could be very useful to you if you have, like, a program for ANC, a program for immunization, a program for birth or follow-up or something like that, and you want to have a value across all of those different programs and use that to make a single calculation. You can do that in a predictor. You cannot do that in a program indicator. Any other questions? Yes, there was something about, can you use data elements across different data sets? Yes, of course, you can use any combination of data elements that you want to make a predictor. And then in the general Q&A, there is a comment from Fernando: the number of standard deviations to add to the average to set the threshold will vary depending on the value we're tracking, but a good rule of thumb is three. And I think that is what you were recommending. Yeah, I think he said it perfectly. I would totally agree. I think three is a good rule of thumb. We don't have the intelligence in DHIS2 yet to be a little bit more flexible there. We would like it to help the user define what would be an appropriate number of standard deviations. We are kind of starting to explore how to do that, but right now I think you're absolutely right. Just set it to three. Three is probably not going to steer you wrong in most cases. Then there's a question just above that from Pearl from Cameroon. Can you explain the significance of each term? Zero comma one, zero, zero comma one comma zero. I think that's a cron expression. That could be a cron expression. It could also be that he was referring to the if statement. So I could go into a little bit more detail on the cron expression. Let me just get back. Let me go to the scheduler app. Okay, so it looks like some folks have already done this. That's great. So the cron expression, if I just make a new job, I've got to define my job type. Again, the monitoring job is for validation rules. The predictor is for predictors. So I'm just going to make a validation rules job.
My cron expression, I have the ones that I can already choose. So like every day at midnight. And like, say, every week, and this would be every week at 3 a.m. So I'm just going to say test, add job. Yeah, okay. So that would be the first day of the week at 3 a.m. So let me kind of explain this to you a little bit more clearly. So each one of these values, so we have zero, zero, three, question mark, asterisk, and then MON. Well, hopefully you understand the MON. That's Monday. So each one of these actually represents a unit of time. So we have seconds, minutes, and hours: zero seconds, zero minutes, at the third hour. And then we have the day of the month, the month, and the day of the week. The question mark means no specific day of the month, the asterisk means every month, and the day of the week is defined as Monday. So we have this running every Monday at 3 a.m. Now, I know that is kind of confusing to folks. Please just Google it. So if I just Google cron expressions, there are free generators online for building cron expressions. So for example, if I want to say I want something to run every month, then, sorry, I want something to run every day starting on Sunday, then it automatically generates a cron expression for me. If I say I want it on the first day of every month, then you see it generates a cron expression. So if I copy this, go back to DHIS2. Scott, there was an update about that question from the user. Apparently it's about predictor configuration. Okay. Well, let me just finish this then. So I've just copied and pasted the cron expression for the first day of every month here. So you see that if I click save changes, then you can see when my next execution is. You can see my next execution is November 1st at midnight. Okay. So you see that there. So that's how you can do cron expressions. The actual question was on setting the org unit level for the predictors. Is that correct? It was about counting facilities with stock outs of drugs. Yeah.
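For anyone who wants to see the cron fields side by side, this small Python sketch labels each position of the weekly expression read out above. It only splits the text; it does not schedule anything, it is not part of DHIS2, and the field names are my own labels for the six positions.

```python
# Labels for the six positions of the expression, in order.
FIELD_NAMES = ("seconds", "minutes", "hours", "day_of_month", "month", "day_of_week")


def split_cron(expression):
    """Split a six-field cron expression into a dict of labelled parts."""
    parts = expression.split()
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(parts)}")
    return dict(zip(FIELD_NAMES, parts))


# The weekly example from the scheduler: every Monday at 03:00:00.
weekly = split_cron("0 0 3 ? * MON")
for name, value in weekly.items():
    print(f"{name:>12}: {value}")
```

Reading it left to right gives zero seconds, zero minutes, hour three, no specific day of the month, every month, on Monday.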
So this is not, you know, actually in two weeks, we have an entire academy on supply chain and building predictors for the supply chain. So I recommend that you attend that academy if you really want to specifically know how to do that. But very quickly here, let me just show you. If I want to count the number of facilities that have any specific value recorded, I go into my generator and I type if, open brackets. And I'm just going to say acute flaccid paralysis deaths under five, just for the example. And I want to say, if acute flaccid paralysis deaths are greater than, say, five, then record a one. Else, record a zero. Now what does this predictor do? You can see that it is considered valid. This will say, if that health facility that month recorded more than five acute flaccid paralysis deaths, then it will count one. So the way you translate these if statements is: if this, then this, else this. So if this is greater than five, count one, else count zero. All right. So this is basic, basic computer programming, basic if statements. But that's what's going on here. And so what will this do? This will say, oh, this facility had more than five, record a one value. And that one value again is saved as a data element. And that data element can be aggregated up. So you can say, you know, across the entire country, how many health facilities had more than five acute flaccid paralysis deaths this month, and you'll have a number there, because for every single one that does, it's counting a one. Okay. Now this is a very basic example. You can make these much, much more complex. And if you do come to the supply chain academy in two weeks, you'll actually see that we've made some that are much more complex than this. But this is a very basic one to be able to count the number of facilities that record any particular value.
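The same counting logic can be sketched in a few lines of Python. This is only an illustration of the if-then-else idea, not DHIS2 generator syntax; the facility names and the reported values are invented.

```python
def flag(value, cutoff):
    """Return 1 when the reported value is above the cutoff, otherwise 0."""
    return 1 if value > cutoff else 0


# Invented monthly values for acute flaccid paralysis deaths under five.
afp_deaths = {
    "Facility A": 7,
    "Facility B": 0,
    "Facility C": 6,
    "Facility D": 5,  # exactly five is not "greater than five"
}

# One flag per facility, like the predicted data element saved per org unit.
flags = {name: flag(value, 5) for name, value in afp_deaths.items()}

# Summing the flags mirrors aggregating the data element up the hierarchy.
national_count = sum(flags.values())
print(flags)
print(national_count)
```

The stock out example from earlier in the session is the same pattern with a cutoff of zero: any month with one or more stock out days is flagged with a one.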