Okay, so welcome everyone to this session entitled New Frontiers for DHIS2. I will share my screen to give a quick introduction of the two presenters we have here today. We have one hour and two presenters. First, we have Justine Faye from USAID, talking about the integration of climatic data for disease prediction. After that, we have Elmarie Klassen from HISP South Africa, talking about their work on data science methods. So, without further ado, I will leave the floor to Justine, and in the chat you will find the link to the Community of Practice, where you can post any questions you have. So, Justine, please.

I think, Justine, while you look at a potential... Is that better? We can give the word to Elmarie to present first, and then I think we can try some solutions, Justine. Okay, if it's okay, I think we can swap the order of presentations and give the word to Elmarie first.

Hi, Johan. Am I coming through clearly? Yes, thank you. Okay, I will start sharing my screen then. Just a sec. Okay, it should come up now. All good. Then hi, everyone. I'm happy to connect with you virtually today. I'm Elmarie Klassen from HISP South Africa, where I'm the manager for data science. I'm also joined by Jaco Fenter, the manager for product development. This application is a combined effort between our teams. Is data science in DHIS2 too scientific? Predicting trends or estimating targets used to be a challenging process, but not anymore with the emergence of data science. Coupled with DHIS2, you now have a powerful and informative tool at hand. Data science combines the technical disciplines that solve business problems through the extraction of knowledge from data. In other words, our presentation is going to show you how we have made solving complex analytical problems approachable and accessible by applying data science techniques.
In the past, we have become used to collecting data and processing it, turning it into information by illustrating it on graphs and tables. But very few reach the level of transforming that into knowledge, resulting in informed decision making and planning. As we embrace the fourth industrial revolution, we need to take advantage of technology to provide longer and healthier lives for all. Predictive analytics, machine learning and other data science methodologies, coupled with clinical insight and knowledge, will lead to a positive impact on client outcomes. This is the problem we are addressing with our solution. So our problem statement was that we needed to use predictive analytics to set targets. DHIS2 does have predictive functionality, but it requires the selection of a specific function which is applied to all health facilities' data. Instead, we want to enhance the predictive functionality in DHIS2 by providing a model that selects the best-fit curve, thus enhancing predictive accuracy. This is continually improved by training the model with every iteration using machine learning. Now, the methodology to get the best-fit algorithm. Not all data in a specific data set fit a single function. For example, it's tricky to select one function to apply in the same manner to all clinics, CHCs and hospitals, or even to all clinics in a given area. It is well known that there are different algorithms for predictions, depending on the variables that affect the data. We developed a process which runs through all of these algorithms and identifies which is the best fit for a specific set of data. This output demonstrates the outcome of all of the algorithms and identifies the one that best fits the trend of the data. Best fit is where the regression coefficient is closest to one. It runs through all six functions and selects the one with the best regression. In this example, the parabolic curve, number two, was the best fit.
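The selection step described here (fit several candidate functions, keep the one whose regression coefficient is closest to one) can be sketched roughly as follows. The talk mentions six functions but does not list them, so the candidate set below (linear, parabolic, cubic, exponential, logarithmic, power), the synthetic data, and all function names are illustrative assumptions, not the HISP app's actual implementation:

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination of a fitted curve against the data."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def best_fit(t, y):
    """Fit each candidate curve and return (name, r2, coeffs) of the best one."""
    candidates = {}
    # Polynomial families: linear, parabolic, cubic
    for name, deg in [("linear", 1), ("parabolic", 2), ("cubic", 3)]:
        c = np.polyfit(t, y, deg)
        candidates[name] = (r_squared(y, np.polyval(c, t)), c)
    # Non-polynomial families, fitted by linearizing with logs
    if np.all(y > 0):
        c = np.polyfit(t, np.log(y), 1)          # y = a * exp(b t)
        candidates["exponential"] = (r_squared(y, np.exp(np.polyval(c, t))), c)
    if np.all(t > 0):
        c = np.polyfit(np.log(t), y, 1)          # y = a + b ln t
        candidates["logarithmic"] = (r_squared(y, np.polyval(c, np.log(t))), c)
    if np.all(t > 0) and np.all(y > 0):
        c = np.polyfit(np.log(t), np.log(y), 1)  # y = a * t^b
        candidates["power"] = (r_squared(y, np.exp(np.polyval(c, np.log(t)))), c)
    name = max(candidates, key=lambda k: candidates[k][0])
    return name, candidates[name][0], candidates[name][1]

# Synthetic parabolic trend standing in for one facility's monthly values
t = np.arange(1, 25, dtype=float)
y = 3.0 * t**2 - 10.0 * t + 50.0
name, r2, _ = best_fit(t, y)
print(name, round(r2, 3))
```

Note that on in-sample R² alone a higher-degree polynomial never scores worse than a lower one, so a production version would likely penalize model complexity or validate on held-out periods before declaring a winner.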
The next best fit was the cubic curve. For reference, the best two outputs are displayed, as seen here. Since we store data in DHIS2, we decided to build a DHIS2 app. The interface should be able to select the source criteria, for example the organizational hierarchy, the data element, and the period of the source data. It should define the prediction criteria, for example the period of the prediction, whether to do a forward or backward prediction, and whether to store the data or do a dry run. It should define the output data element for storing the predicted values, and display the output results in a user-friendly format. So here we show the DHIS2 app that we have designed, with a toolbar on the left for selecting the three sets of parameters to work with, and the output on the right. The first is the source criteria: you select the data element and define the organizational unit hierarchy and the period for which to run the prediction. As each value is selected, it is stored on the right-hand side in the prediction selections. The predictions can be run at different levels in the organizational hierarchy by selecting the level at which it should be run, and then selecting the period for the source data. Next you select or add a data element name for the output and the number of periods for the prediction. Of course, the greater the amount of historic data used as input, the more accurate the prediction. Finally, you select whether to do a dry run or not; a dry run will not store the data. The prediction we're running here is to support the 95-95-95 targets, which state that 95% of people living with HIV should know their status, 95% of people who are known positive should be on treatment, and 95% of people on ART treatment should be virally suppressed.
We therefore monitor the number of people remaining on ART as a national indicator, which this prediction was run on for a district, and we're going to show you the outcome from a number of facility predictions. South Africa aims to increase the number of people on ART by 2 million in two years; therefore, most facilities should increase uptake of ART through appropriate interventions. To verify our model, we developed an option to compare the predicted data with real performance. In this, the input period for the prediction is reduced by the number of predicted periods. The input period we ran here is from March 2014 until August 2019. Then we predicted data forward for 12 months and compared that with real data for the same 12-month period. On these graphs, the blue line represents the actual data and the red line represents the predicted values. From 2017, in this particular graph, it is clear that the trend slowly decreased, with another increase and a subsequent decrease again. The prediction follows the same decreasing trend, though the real data shows a gradual increase with a sudden drop. The parabolic or cubic fit will smooth out the humps in the data and provide a reliable prediction. When considering the knowledge and insights one can gain from this data, you can now establish why this occurred through consultation with the health facility and implement appropriate interventions. Here are some more results from the prediction run, which are similar, even with extreme variances in the earlier data, as can be seen in this graph. The adaptive methodology selected a method resulting in a close match to the actual values, as seen here. The predicted values in this one are higher than the reality, as you can see, but it is clear from the graph that the prediction looks accurate. This allows investigation into the reasons for the actual performance deviating from the expected performance.
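The verification step described here, shortening the input period, predicting forward, and comparing against the held-out actuals, can be sketched as a simple back-test. Everything below is an illustrative assumption: the quadratic fit, the synthetic "remaining on ART" series, and the use of MAPE as the comparison metric are stand-ins, not the app's actual choices:

```python
import numpy as np

def holdout_validate(series, horizon=12, degree=2):
    """Back-test a curve fit: drop the last `horizon` points, fit on the rest,
    predict forward, and compare with the held-out actuals via MAPE (%)."""
    train, actual = series[:-horizon], series[-horizon:]
    t_train = np.arange(len(train), dtype=float)
    t_pred = np.arange(len(train), len(series), dtype=float)
    coeffs = np.polyfit(t_train, train, degree)       # parabolic fit assumed
    predicted = np.polyval(coeffs, t_pred)
    mape = float(np.mean(np.abs((actual - predicted) / actual))) * 100.0
    return predicted, mape

# Synthetic monthly counts: March 2014 .. August 2019 is roughly 66 months
t = np.arange(66, dtype=float)
series = 5000.0 + 40.0 * t - 0.2 * t**2
predicted, mape = holdout_validate(series)
print(round(mape, 2))
```

On real facility data the held-out error would of course be nonzero; a low MAPE over the 12 withheld months is what justifies trusting the forward prediction.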
We developed a 90-90-90 report to assist health facilities to monitor their performance towards facility targets, which were set using the methods available before this prediction methodology was developed. South Africa is still using 90% targets for health facilities, since most are not yet reaching the 90%, to push them towards 95%. Since we saw such a wide discrepancy in this prediction, we wanted to look at the 90-90-90 HTML report in DHIS2 for further insight. As you can see from this slide, a manager commented on this report, providing an interpretation of the information. She reflects that the HIV tests done are only 61.4% of the target, which has a knock-on effect on the other indicators. If more people are tested, more positive clients will be identified and more started on ART, resulting in those remaining on ART being closer to the target. The viral loads done at six months are below the target, which means the viral load suppressed values are also below the target. This analysis would assist the facility manager or health managers to identify reasons for the lower than expected performance, which results in the lower than predicted number of patients remaining on ART, as seen in the prediction graph. This process will provide insight when coupled with clinical knowledge and expertise on the external factors that influence the data. By studying best work practices and implementing corrective measures based on those, the health system can now gain wisdom on how to intervene appropriately in order to make an impact through improving health outcomes for the client, bettering the lives of all. You can also see in the graph on the right how this facility's performance is impacting the district's ability to meet their targets. The green bars are actual performance and the blue are the gap from the 90-90-90 targets. What you have seen today is just the beginning, and we don't plan to stop here.
Our upcoming plans include adding functionality for indicators and program indicators, refining accuracy through machine learning, releasing this as a DHIS2 app to the community to share with all of you, applying the insight gained to achieve impact for clients, and enhancing the AI capabilities through neural network integration. At HISP we believe in creating better lives for all citizens. We do this through our products and services, our partnerships, and our business pursuits that adapt with the changes in the environment. We pride ourselves on being a learning organization that continually innovates and collaborates, making use of novel technologies and creatively solving problems with the collective mindsets in our organization. To find out more about what we do and how you can get involved, connect with our managers and read more about us on our website. Thank you.

Thank you very much, Elmarie. So, Justine, can we try again with your presentation? Do we have Justine here? I think she might have left again. She was trying to test her internet connection, which is what I think was causing her audio problems. Okay, so while we're waiting for her to reconnect, let's see if there are any questions. There are no written questions yet. Let's see if we get Justine on now. In the meantime, I'd like to remind you that you can post your questions on the Community of Practice page. I see we have Justine about to join, or trying to join. Give her a second to see if we can get the connection here. There still seem to be some troubles joining. There we have Justine. Johan, if you can give Justine the co-host authority. Justine, if you want to try and talk again for us. Can you hear us, Justine? If so, please unmute and try to talk again. Okay, I think while we try to connect with Justine, we have a question written here for you, Elmarie, on the Community of Practice page, by Anna Tusheng. I can read it out for you all.
To what extent are the predictions you presented in use in South Africa? Who runs these predictions? Does this happen at district or facility level, or do they need technical assistance to do that?

Hi, and thanks for your question. We have only recently completed these predictions, so we are still presenting this to the ministry in terms of its use. But essentially, I think it's not really required for health facilities to run it. It would rather be run at a higher level, at times when they develop plans, performance plans, annual plans, to set realistic targets. And we also want to work with them on increasing the use of these predictions, using them not just as predictive functionality, but as prescriptive analytics. In other words, through machine learning, you can actually identify what activities or what best practices would result in higher performance than other practices. We would then feed that into our machine learning models and be able to assist the department to aim for higher targets, and then prescribe to users or to health facilities what activities would help them to achieve those higher targets. So the use for this is very broad in terms of where we can apply it, also in identifying where problem facilities are and assisting those facilities to address their issues. So it's still newly developed, but we foresee it being run mostly at higher levels.

Okay, thank you. I think we have lost Justine again, so we'll continue with more questions, and please feel free to write them down on the Community of Practice. I have a question, Elmarie; you alluded to it in the last slide, that you will make this available. Can you say something about the timeline here, and how much flexibility do you foresee in terms of adjusting it to different cases and different predictive analyses? Sorry, how much? What will be the aim in terms of flexibility of use cases, et cetera?
Will it be a custom-made app for very specific predictive analyses, or do you plan to have it more open for a greater variety of uses? I think it is open for a greater variety of uses. That's why we want to include program indicators and indicator values as well. And as long as the source data that will be used covers a fair period, we foresee it having quite a wide variety of uses. Therefore we do foresee it being something that can be used by the wider community.

Thank you. I see we now have Justine online with us. Can you try to unmute and talk a bit, Justine, to check the sound now? Can you write in the chat if you are able to hear us? In the meantime, I think we have one more question for you, Elmarie, and it's about the original request for this app. Where is the demand for this work coming from? Is it coming from specific health programs or more broadly, or is it something you have engaged with in South Africa more on your own?

Yes, thanks. The request initially came from the ministry, specifically the unit dealing with planning, to be able to assist them to set realistic targets. So we have produced this for that reason, but also so that it will assist many of our other clients as well. So the initial request came from the ministry.

Do you want us to try again, Justine? Okay, I think she will be joining by phone quite soon. So in the meantime, are there any more questions for Elmarie? Looks like we've got one hand up from Scott. Hey, thanks guys. I just actually wanted to say, Elmarie, what an incredible app, and I really can't wait until it is shared with the community. I think it'll really benefit a lot of folks. It's the second time I've been able to enjoy seeing the presentation, and I learned something new this time as well. One thing I was going to say: in your presentation, you mentioned that DHIS2 does have some kind of core, out-of-the-box predictive abilities. I think that you're being really nice when you say that, actually.
To be honest, DHIS2 has calculations that can forecast data based on some simple averages or standard deviations, really basic formulas, to generate data projected forward in time. But nothing anywhere close to as advanced as what you've been able to do with your predictive analytics, and certainly nothing with machine learning. I think this is really a pretty incredible first leap into machine learning in DHIS2. So as more use cases come online, I would love to stay close to them and hear about them and learn from them. That's all I had to say. Thanks.

Oh, thanks, Scott. You know, we investigated the functionality that is there, but for this specific use case that I was presenting, we found that different facility types behaved differently: some had a declining trend because of referrals, while others were increasing. So it just wasn't easy for us to pre-select a specific function, say an exponential curve or a standard deviation. So it's been coming out of need. And it does require quite a long run of data; because we had that, it was easy to develop. We have data from 2000, really, which helps a lot in developing these algorithms. But thanks, we would definitely like to work with the community on finding more use cases. Thanks.

Thank you. Do we have you here now, Justine? Yes, yes, I'm on. Very good. Excellent. The floor is yours.

All right. Thank you so much for this opportunity. I'm sorry about earlier, with my voice not coming through. I'm a strategic information specialist for USAID, talking at your conference about the research findings from my master's degree at the School of Public Health, Makerere University. I'm presenting to you the results we got from the integration of DHIS2 with climatic data, which I did for the purpose of disease prediction. I took as a study the occurrence of malaria in Gulu District, Uganda.
This research was done in collaboration with other people: Dr. Peter Navende, Dr. Simon Casasa, Dr. Nadsenyonga, Mary Nakafero and John Kisa. Their addresses are listed on that cover page. Next slide, please. Next slide. Go, sorry. Keep going. What was the problem area, and what motivated me to do this research? I realized that many researchers have discovered that prediction models are important, and that they can actually assist public health in evidence-based decision making. However, when I looked into what has been done in Uganda, I realized that we were only depending on disease burden estimates based on our DHIS2 reported data.

Justine, I'm losing you again. Yeah, I think we've lost her again. Okay, I think while we're waiting... Justine, I think there is something going on with your internet connection now; we can barely hear you again. Yes, can you hear me? Okay, so in Uganda we were only relying, even up to now we rely only on reported data, without considering weather changes. Can you still hear me? No? I'm up country. We can hear you now. Yes, please continue. You can hear me now? Yeah, that's better. All right. Sorry, I'm actually up country, and our up-country network sometimes doesn't work so well. But I had reached the point where I was saying that I thought we could still be having malaria as endemic in most of our regions because maybe we don't target our interventions at the right times, dependent on weather changes. So I set out to see if there is any relationship between malaria occurrence and climate or weather changes.
I went out and got data from DHIS2 and the Uganda National Meteorological Authority: locally collected data that is captured weekly, monthly, quarterly or yearly, though I relied on the weekly malaria cases and the weekly weather data. Because other people had integrated their data sources and found some relationships, I also started integrating our data to see if we could actually see a relationship between malaria cases and weather in our country, but this time looking only at Gulu District. So we decided to explore what approaches are currently being used by the National Malaria Control Program to predict malaria occurrence, and what control and prevention measures have been happening since the year 2013. Then we also wanted to establish the suitable combinations of the variables involved and their time lags for malaria case occurrence. And we also set out to provide some machine learning models and some predictive models. This research study used a mixed methods design, where we included qualitative work, and we did a retrospective cross-sectional data collection and analysis of our local...

Justine, I think we've lost you again. Okay, I think we have lost Justine again. Yeah, this is not going easy. So let's take the opportunity, while waiting a couple of minutes, to ask Elmarie any more questions or raise any issues related to this work and the topic of this session. You can use the ability to raise your hand so you'll be able to talk.

Yes. Hello. Can I continue? Can you hear me? You're back now. The connection is terrible. There is a slide of qualitative findings from our research that I could continue with, or should I go back to the research objectives? Okay, there.
Alright, so what we found was that our National Malaria Control Program currently relies on seasonality to determine control measures. So it can be like: the rainy season is about to start, then they start distributing mosquito nets. And at other times they will be like: I think there is a lot of stagnant water around some villages, let's start doing indoor residual spraying, IRS. But actually, as we shall see from the qualitative findings, we discovered that malaria cases do not increase before the rains; they actually increase some time after the rain has occurred. So for them, they just say: in September we are going to distribute mosquito nets, in January we are going to do IRS. But they are not backed by evidence of when actually is the right time to distribute the mosquito nets. Then we also found that Gulu District in particular uses different interventions, and they have adopted a combination prevention mechanism. However, we did not find any documentation of when the different interventions happened. They did not have actual dates, like: we distributed mosquito nets from this period to this period. So I couldn't determine whether the increase or the reduction in malaria cases as seen in DHIS2 was due to the distribution of mosquito nets, or whether it was due to the IPT that happened, because there was no documentation at district level to show that from this time to this time we were doing IPT. Then we realized that they massively distribute mosquito nets in September and October of every year, not necessarily guided by weather changes; it's just their routine program that every September and every October they distribute mosquito nets. Next slide, please. So what were our key findings from the quantitative research?
During that reporting time I used data from 2013 to 2017, because before 2013 our DHIS2 did not have weekly data, and the research started around 2018, so there wasn't much data for 2018. During our reporting period we realized that an average of 2,689 cases were seen every week in Gulu District, with a minimum of 468 and a maximum of 11,459 malaria cases for that reporting period. Rainfall was on average 28.15 millimeters per week, with a minimum of zero and a maximum of 163. The minimum temperature in that region for the reporting period averaged 19.36 degrees centigrade, with a minimum of 15.27 and a maximum of 22.62. The maximum temperature averaged 30.68 degrees centigrade, with a minimum of 26.72 and a maximum of 36.70. Then we had relative humidity at six hours: the average was 75.4, with our minimum being 34.57 and a maximum of 123.07. And we had relative humidity at twelve hours, with an average of 51.19, a minimum of 17.43 and a maximum of 151.29. So we set out to look at which of these different variables can actually cause a change in malaria cases, depending on how the cases were increasing. We continued merging and integrating; it was an iterative process of trying out: let us combine the cases with minimum temperature; let us combine the cases with maximum temperature; how about rainfall plus minimum temperature, then rainfall and maximum temperature, rainfall and relative humidity. So we were finally able to come up with a model of which we can actually say: if you used this model, you can accurately predict the likely cases that might happen in the next month, next quarter or next year. And we took our data into the Weka software to test the predictions, as you see in that image. Next slide, please; the next slide of the Weka test predictions for our targets.
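The iterative search over variable combinations and time lags described here can be sketched with a simple lagged-regression scan: regress weekly cases on a climate covariate shifted by each candidate lag, and keep the lag with the best fit. The data below is synthetic and the single-covariate linear fit is a deliberate simplification of the study's actual Weka-based modelling; all names and numbers are illustrative assumptions:

```python
import numpy as np

def lagged_r2(cases, covariate, lag):
    """R-squared of weekly cases regressed on a covariate lagged by `lag` weeks."""
    y = cases[lag:]
    x = covariate[:len(covariate) - lag] if lag else covariate
    coeffs = np.polyfit(x, y, 1)
    y_hat = np.polyval(coeffs, x)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
weeks = 260
rainfall = rng.gamma(2.0, 14.0, size=weeks)   # synthetic mm of rain per week
# Synthetic cases that respond to rainfall two weeks earlier, plus noise
cases = 2500.0 + 30.0 * np.concatenate(([0.0, 0.0], rainfall[:-2])) \
        + rng.normal(0.0, 50.0, size=weeks)

# Scan candidate lags of 0..5 weeks and keep the best-fitting one
best_lag = max(range(6), key=lambda k: lagged_r2(cases, rainfall, k))
print(best_lag)
```

The same scan would be repeated per covariate (rainfall, minimum and maximum temperature, humidity) and per combination to find the pairing of variables and lags that explains the case counts best.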
Sorry about that one. Thank you so much. Yeah, when we did the Weka test predictions for our targets, as you can see in that image, at the start, when the machine is just starting to learn, it hasn't learned much about the relationships in the data. When you look at the actual cases, in red, those are our actual cases from DHIS2; that's how the pattern is, sometimes going up, at times going down. So we told Weka to help predict the future of these cases. When you look closely at the beginning, you can see there is great variance between the red line and the blue line, meaning Weka hadn't yet mastered the pattern of the cases. But as we move along the way these cases and the climate have been fluctuating on this graph, you see that the blue line gets almost on top of the red line; now it can accurately predict for us the actual numbers that will occur. And when we actually calculated the differences, we realized that Weka can predict very accurately, meaning that if you have data from, say, the last 10 or 20 years, you can tell it to predict the likely cases for 2021, for example. Then... next slide, please. So what was the conclusion from this research? The conclusion was that the changes in malaria cases that we attribute to weather come after the rainfall has occurred and the maximum temperatures have risen. This can be explained by the fact that when it is raining, the temperatures get so low that it becomes very cold for the mosquitoes to breed. It is normally after about two weeks of rainfall, when the temperatures have gone high, that the mosquitoes start to breed, and then they will start going in and biting people and spreading the malaria from one person to another. Then we also think there should be proper documentation with clear dates for the different intervention strategies.
If you set out on a national rollout of mosquito nets, we should know that in District A we distributed mosquito nets from 19 January to 30 March, so that when we see a reduction in malaria cases in April or May, we can actually attribute the reduction to the mosquito net distribution. Because currently, the way we just distribute without documenting the actual dates, we cannot attribute the changes in malaria cases to the interventions that we carry out in the different regions of the country. Then another outlook is that we could include many other contributory parameters in future studies that could actually improve the prediction accuracy of the models. In this research I only considered climatic data; you could add in the vegetation cover and many other things to improve the accuracy of the prediction model. And we also only looked at Gulu District, because with master's research I was limited in how many districts I could look at; otherwise the Weka software could have analyzed the national data. But the limitations of a master's were like: you can't do a national data analysis. So I think if we could utilize the whole national data set, and given the fact that artificial neural networks are very data hungry, I believe that with a much bigger data set we could greatly improve the prediction accuracy. Because the one district that I fed into Weka is just like a small drop in the ocean, and that could explain why it wasn't so accurate right from the beginning of the testing. Then, how does the Meteorological Authority store its data nationally? It's data in Excel sheets; they don't have a clear national database like there is with the Ministry of Health.
And I think someone can take this up: since we now have DHIS2, someone can actually think of creating a national database, similar to DHIS2, for the Meteorological Authority, which they can use to collect and enter the data from their different sentinel stations across the country. Yeah, those were the key findings and the outlooks that can further this research. Next slide, please. The next slide is acknowledgments. I acknowledge the IST-Africa 2020 conference: they reviewed this work, it was presented there, and they not only published it in their conference proceedings but also gave me a second publication with IEEE. And there is the paper link, in case you are interested in reading extra details of the methodologies and further findings. I also appreciate the support of the Norwegian government, which offered me a scholarship, under Norad, at Makerere University in Information Technology. And I appreciate the School of Public Health for having grounded my public health concepts; they also supported my conference attendance at that time, both financially and of course academically. A special thanks to the Ministry of Health, Uganda, and the DHIS2 community and its partners, who make sure DHIS2 provides the platform for us; without the data that they had collected, this research would not have been possible. And I also thank the Uganda National Meteorological Authority for having given me access, without any restrictions, to the weather data they had collected over the years. Thank you so much. That's the end of my presentation.

Thank you very much, Justine, and thanks for all the efforts to get online and present your interesting work. We still have a few minutes if there are any questions for Justine and her work on predictions.
You can either write them on the Community of Practice page, or I believe you could also raise your hand to talk in this session room, or write in the chat if you feel like it. I think I have a question for you, Justine. This work on prediction of malaria cases: how far ahead can you predict? You said you need a couple of weeks, or? Sorry, can you repeat your question? For how long in advance can you predict outbreaks of malaria? Oh, yeah, you can predict for as long as you need to look into the future. For us, we normally have our cases weekly, monthly, quarterly and annually. So you can maybe say, I'm budgeting for the next week, then you predict those cases; or you could say, I want to know how many cases are likely to happen next month. But given that these interventions take some time, the idea would be to predict for the next three months, so that you intervene before the cases increase. You could also predict for the next financial year. Then we could say: in the next financial year, according to the predictions, we are likely to have very high case numbers. Then what do we do? We put in some interventions before that year starts, and then we shall actually not see those high cases that were predicted. Yes, does that answer your question? Yes, thank you. I was just wondering whether you need quite recent data to be able to make predictions. So with the current rainfall and temperature data, you can predict more or less with confidence a couple of weeks into the future; can you predict with confidence longer than that? Okay, thank you so much. You can predict with confidence both in the near future and much longer. Actually, as you'll see in my paper, when I completed the analysis using data up to 2018, I tested the prediction of malaria cases for 2019, and it gave me some numbers.
Then I got the actual DHIS2 figures for 2019, and the prediction was very accurate, meaning it can predict accurately both in the short term and further into the future. Okay, yeah, thank you. That answered my question. I think we are actually running out of time, so I'd like to thank both the speakers, Justine and Elmarie, for very interesting and good presentations. I'd like to thank everyone for showing up, and for your patience with us through some technical difficulties. Again, thank you everyone, and have a good day.