I'm going to kick off the session by giving you a bit of historical perspective on where South Africa was during the 1990s and early 2000s. What we had is, I think, a scenario very familiar to most of you attending this training; probably all countries experienced the same thing. We had multiple health authorities, these authorities used multiple forms for collecting data, and the data collection tools were not at all standardised. In fact, when we did research on this, we found upwards of 50 data collection tools in a single facility. It was largely a paper-based system, which obviously made analysis very difficult, and we found the data wasn't being used; really, whether you submitted data or not, it wasn't picked up very quickly.

I'll give you an example of this. There was a health authority in the Eastern Cape at that time that decided it really wanted a more coherent information system. It developed a very primitive, simple minimum data set with standardised data collection tools, and a method for analysing the data that was still paper-based but gave the facility managers the opportunity to see what their performance was, graph that performance, and take action accordingly. In the process, this health authority decided not to submit data to the province at that stage, and it took about a year and a half before this was even picked up and queried. That simplistic, primitive information system was also the beginning of what followed. A parallel process was happening through the University of the Western Cape in Khayelitsha, where DHIS-1, which was a non-web-based system, was being tested at that stage, and cooperation between the two parties led to the introduction of DHIS-1, first in the region where this health authority was situated and then, when the province saw that we were getting much more useful data to work with, escalated to the whole province. DHIS-1 essentially used the minimum data set and what was being collected in that authority, with some additions, and it was rolled out to the Eastern Cape province. Because the Eastern Cape implemented it so successfully, the national government became interested, and DHIS was eventually rolled out to the entire country.

As I said before, it was a computerised database, not web-based. It was based on MS Access and pivot tables, but it did have some very good data quality features, amongst which were data regression and interpolation. We included validation rules, both absolute and statistical, and, very importantly for us, it had a very user-friendly method of customising data sets per facility type, which allowed us to monitor the completeness of data element capturing across all those data sets. The data was then extracted into standardised pivot tables using a data mart functionality, and the data analysis and reporting were done from there; people in general were very comfortable with that system.

Together with what was in the database, we had a number of other initiatives to facilitate the implementation. The country developed a minimum data set with standardised data collection tools.
We had a District Health Management Information System (DHMIS) policy with a set of SOPs, ranging from facility level right up to provincial and national level, which set out what was required from the users of data. Also very importantly, the country began to employ information staff in both the districts and provinces to operationalise the whole DHIS-1 rollout. What we used to do in those days was, when a new form was introduced, data quality checks were done within three to four months to make sure that everyone was interpreting the definitions correctly, so that what was being captured in the system was consistent and coherent, everyone was collecting the same thing, and we could compare apples with apples.

We also undertook extensive training on data quality and on the use of information, to ensure that people understood the importance of their data quality and that the point of collecting data is to make sure it is used, not just collecting data for data's sake. Another initiative at that stage was that we introduced pre-submission data validation, and we developed a standardised template for doing the pre-submission checks. I'm going to say more about pre-submission later, so I won't give too much detail here. Finally, South Africa has a process where data is locked at the end of the financial year so that no further editing can be done. So in addition to pre-submission data validation, the country held data clean-up workshops in each province to make sure we picked up anything that would seriously skew the data before locking; then the data is locked and people prepare their annual reports based on it.

So why did we stick with DHIS-1 for so long, and why did we then decide to move to DHIS-2? Mainly the delay was because of the data quality and analytics features of the DHIS-1.4 we were using at that stage; DHIS-2 did not yet have all of the data quality features available in DHIS-1.4. However, the compelling shortcomings of DHIS-1.4 meant that we decided we really did need to move to DHIS-2, not least of which was having to update standalone computers at every capturing point when a new build was released, and the length of time it took for data to reach the national Department of Health. At that stage, data for a specific reporting period would only be available in the national database 60-plus days after the reporting period, which really didn't leave room for using the data; it was more just getting a historical perspective of what it looked like.

Okay, so now that we are on DHIS-2, we have certain data quality strategies that we follow in the country. Obviously we've gone for a minimum data set, because this impacts so heavily on data quality: the more data you collect, the more likely you are to have data quality issues. The minimum data set is revised every two years, and it goes through a great deal of consultation across all levels, from national to province to districts to sub-districts, and the programmes are given an opportunity to input into it. They also rigorously debate what was in the previous minimum data set, and anything that really wasn't used is kicked out. Once they've settled on the minimum data set for the next two-year period, it is signed off by the Minister of Health. Another strategy we have is daily data capturing, which we refer to as DDC; it takes place at service point level.
The reason we introduced daily data capturing is because of what we found when the auditors audited the information system in South Africa, which is part of the legal requirements in the country; they don't only look at finances, they look at your information as well. What they found was that the majority of the errors between the source documents and DHIS were related to the collation of data, much more so than to the capturing of data. So, to give you an example of daily data capturing: in a clinic, each of the consulting rooms where data is collected is set up at level six in the org unit hierarchy in DHIS, and the data capturer then captures the data for that service point on a daily basis. The data is not put into a summary form along the way; it is captured directly from the data collection tools into the database, and in this way we cut out the collation errors.

We also use standardised public dashboards, three of which are devoted to data quality; I'll talk to this a little later on. Another strategy is to apply legends to our indicators: we use traffic-light (robot) colours to show performance, and we also colour-code any unlikely value blue so that it stands out from a data quality point of view. For data quality we've also developed two HTML reports. One monitors timeliness and completeness prior to deadlines; I will show you an example of this report a little later. The other covers the element reporting rate for our community outreach services, the only service where we were able to really look in detail at the element reporting rate rather than finding other means of trying to monitor completeness. Then we have the pre-submission reports, as I mentioned earlier, in that standardised template; these pre-submission reports also meet the requirement from the DHMIS policy, which requires the higher level to give feedback to the lower level within five days of receiving the data. We do a monthly data quality report that is sent to the national Department of Health, which gives an overview of the data quality within the DHMIS for the reporting period. And another strategy is that we do rapid internal performance data audits, where we compare source data with the values in the DHMIS; this mimics what the Auditor-General of South Africa would be doing when they visit facilities to audit.

Okay, so what do we measure? We measure timeliness and completeness using the flag set by clicking on the Complete button after data entry of a particular data set. Basically, we take the number of data sets assigned to the facility, and when a data set has been completed in the data entry screen by the 10th of the month following the reporting month, it is considered timely. Completeness is the same measurement, but by the 30th of the month. Obviously these are crude proxies for timeliness and completeness, because they are dependent on people clicking the Complete button: they could do that without necessarily completing all the data within the data set, or alternatively they could forget to click the Complete button, so it looks as though they haven't captured their data when they have.
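To make that calculation concrete, here is a minimal sketch, in Python, of how the two rates could be derived from complete-data-set registrations pulled over the DHIS2 Web API. This is an illustration rather than South Africa's actual implementation: the base URL and credentials are placeholders, the `date` field on a registration can differ between DHIS2 versions, and the sketch assumes one registration per assigned data set.

```python
from datetime import date
import calendar

import requests

BASE = "https://dhis2.example.org/api"   # placeholder instance
AUTH = ("admin", "district")             # placeholder credentials

def completion_rates(data_set_uids, org_unit_uid, period="202005"):
    """Timeliness: share of assigned data sets marked Complete by the 10th of
    the month after the reporting month; completeness: the same by the 30th."""
    year, month = int(period[:4]), int(period[4:])
    nxt = date(year + month // 12, month % 12 + 1, 1)  # 1st of the next month
    timely_cutoff = nxt.replace(day=10)
    # The "30th" cutoff, clamped so a February deadline stays a valid date:
    complete_cutoff = nxt.replace(
        day=min(30, calendar.monthrange(nxt.year, nxt.month)[1])
    )

    timely = complete = 0
    for ds in data_set_uids:
        resp = requests.get(
            f"{BASE}/completeDataSetRegistrations",
            params={"dataSet": ds, "period": period, "orgUnit": org_unit_uid},
            auth=AUTH,
        )
        for reg in resp.json().get("completeDataSetRegistrations", []):
            completed_on = date.fromisoformat(reg["date"][:10])
            if completed_on <= timely_cutoff:
                timely += 1
            if completed_on <= complete_cutoff:
                complete += 1

    assigned = len(data_set_uids)  # data sets assigned to this facility
    return timely / assigned, complete / assigned
```

So a facility with ten assigned data sets, seven of them completed by the 10th and nine by the 30th, would score 70 percent timeliness and 90 percent completeness for that month.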
In addition to these two measures, we have programme completeness. For programme completeness, we have selected proxy data elements from each of the programmes in the data sets, and these are flagged. We also look at the number of facilities expected to report on those proxy data elements, and from this we calculate the programme completeness (a small sketch follows at the end of this passage). This is also not an ideal method of measuring completeness, but at least it gives an indication that at least one data element has been captured for each programme, and the assumption is that if they've captured that one, the likelihood is that they have been providing that service and are reporting on it. The ideal, of course, is to have an element reporting rate with customised data sets. At this stage we only do that for the community outreach services, because it is a defined data set on which every single outreach service is expected to report every single data element, as opposed to the programme reporting rate, where you don't have customised data sets and you may have a facility that does not necessarily need to report on all the data elements in the data sets assigned to it. We do missing and outlier analysis using the app, and we also run validation rule reports to pick up violations.

So, as I said earlier, I would talk to the data quality dashboards. Of the three that we have, one is called the data quality dashboard, and it is an overview of the various data quality measures we have in the system. The other two are specifically for use at the lowest level of the hierarchy: the daily data capturing monitoring dashboard and the data quality monitoring dashboard. The daily data capturing monitoring dashboard is there so that managers can assess whether the data at a daily data capturing site is actually being captured on a daily basis, and not left for a week, two weeks, or a month and then back-captured, because that would defeat the whole purpose of trying to improve data quality through daily capturing. The data quality monitoring dashboard is really just used to check on those types of data elements that are not suitable for checking via validation rules; they don't fit the setup of the validation rules.
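Returning to the programme-completeness measure flagged above, here is a minimal local sketch of the calculation: one proxy data element per programme, counted over the facilities expected to report it. The programme names and facility identifiers are made up for illustration; they are not South Africa's actual NIDS proxy elements.

```python
def programme_completeness(expected, reported):
    """expected: {programme: set of facility IDs expected to report its proxy element}
    reported: set of (programme, facility) pairs with a captured value this period."""
    rates = {}
    for prog, facilities in expected.items():
        hits = sum(1 for fac in facilities if (prog, fac) in reported)
        rates[prog] = 100.0 * hits / len(facilities) if facilities else None
    return rates

# Illustrative usage with hypothetical programmes and facilities:
expected = {"ANC": {"fac1", "fac2"}, "EPI": {"fac1", "fac2", "fac3"}}
reported = {("ANC", "fac1"), ("EPI", "fac1"), ("EPI", "fac3")}
print(programme_completeness(expected, reported))  # {'ANC': 50.0, 'EPI': 66.66...}
```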
Okay, so the data quality dashboard gives, as I said, an overview of the data quality, and this is an example. We have a favourite that monitors overall timeliness; this example is timeliness for the whole country, and a province would obviously see the same thing for their specific province. We have the same for completeness. We look at the country as a whole over a 12-month period, to monitor it over time, and we also look at each specific province in the country for the last three months. We read these two dashboards together: this is the raw data that makes up the timeliness rate and the raw data that makes up the completeness rate, and you can see we use category option combos to show which data sets were actually completed against which data sets were assigned in that particular month. We look at it over a six-month period, and you can see examples here of how we use legend sets even for our data quality indicators: green means we are happy that they are completing on time, orange means they need to improve, and red means they are critical and really need to pay attention (the legend logic is sketched after this passage).

I also said earlier that I would show you an example of the HTML report we use for timeliness and completeness. Once you've selected your period and your org unit, it gives you a summary of the reports that are expected, the number that are compliant, and the number that are non-compliant. You can then look at it in depth by clicking this button to hide the compliant reports and focus on those that are not compliant. You can also filter on any of these columns to refine your search, and you will see the expected reports for a given facility, a given data set, or even a given service point, and who actually completed, and completed on time. You can then follow up with these facilities before the due date: before the 10th of the month we focus on timeliness and highlight to the districts, sub-districts, and facilities which facilities have still not completed, and after the 10th we cut our losses and focus on the ones that haven't completed, to try to at least get them to complete before the 30th of the month.

This is an example of the programme reporting rate, which is on the data quality dashboard. You can see that the levels of programme reporting completeness are pretty high, as our standard is that we expect 99.5 percent completeness on that particular proxy data element; anyone below that is highlighted for action.

And this is the element reporting rate HTML report that is used by the outreach teams. Once you've generated the report, it gives you a summary of the compliance for the particular level you drew the report for; in this case it was the provincial level, and 94 percent of the data elements that should have been reported on were reported on. You can then go further and look at it by facility, to see which of your facilities stopped you from getting 100 percent. This was done over a three-month period, and you can see this facility did not submit everything in April, this one in May, and these two just didn't submit their reports at all for May. You can then also look at it per data element, to see which data elements were not completely captured; in this case, not all of the data elements were completely captured by every facility, as a result of these two zeros in May.
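The traffic-light legend logic mentioned above can be sketched as a simple threshold mapping. The 99.5 percent standard is the one from the presentation; the orange band and the plausibility range behind the blue "unlikely value" flag are assumptions made for illustration.

```python
def legend_colour(value, target=99.5, warn=90.0, plausible=(0.0, 100.0)):
    """Map an indicator value to the dashboard's traffic-light legend."""
    lo, hi = plausible
    if value < lo or value > hi:
        return "blue"    # unlikely value: flag for data quality follow-up
    if value >= target:
        return "green"   # on target
    if value >= warn:
        return "orange"  # needs to improve
    return "red"         # critical: needs attention

for v in (99.7, 94.0, 62.0, 112.0):
    print(v, legend_colour(v))  # green, orange, red, blue
```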
Okay, then the missing and outlier analysis. For our pre-submission data validation we do a comprehensive missing and outlier analysis, and that is put into the pre-submission template as an annexure and submitted to the lower level, for action on the missing data and to deal with the outliers. In the data quality report that we do for the national Department of Health, we have selected the priority indicators for the country, and we really just focus on the worst 15 facilities over the 12-month period under review, to give an indication of how that particular priority programme is performing and how bad the missing values and outliers are.

Then, the validation rule violations: these we really only check as part of our pre-submission data validation. This is an example extracted from the validation rule section of the pre-submission data validation template. Each of the areas we assess follows the same principle, but in this case it is for validation rules: you explain what the validation rule analysis is about and why we do it, then where it can be found, then how you run the report, and finally there is an instruction on how the report should be downloaded and saved as an annexure in the pre-submission template that goes back to the lower level for feedback (a small sketch of both the validation rule and outlier checks appears at the end of this passage).

For training and guidelines, we have an online training for daily data capturing that is fairly popular, with a learner manual explaining what it is about, why we do it, how we do it, and so on. We also have a data quality module in our DHIS2 Foundation course, with its own learner manual, and we have a specific guideline for dashboard maintenance, because it is so important that the dashboards are kept up to date in order to remain relevant for the use of information and for monitoring data quality. We like them to be current, so they are maintained on a monthly basis, and with every NIDS (National Indicator Data Set) review these guidelines are also reviewed, to add in any new data elements and indicators that have been selected and to remove anything that is no longer relevant.

I'm now going to show you a few slides on the progress we've made over the years since we introduced DHIS2, starting with daily data capturing. Obviously South Africa has the same limitations as many other countries in Africa, and perhaps some eastern countries, in that we have challenges with connectivity at facility level. Since we started daily capturing in the 2016/17 financial year we have made steady progress through the years, and we are currently at 44 percent for the year that has just ended, 2019/20. There is ongoing pressure to get more and more facilities capturing daily, and work is being done on trying to improve connectivity at facilities, but it remains a limiting factor. As for our progress with timeliness and completeness: we started at around 39 percent for timeliness and are currently at around 76 percent, and for completeness we started at around 52 percent and are now at just on 90 percent. For programme reporting, you can see that in 2017/18 we really had no one on target, it was a lot better in 2018/19, and in 2019/20 we had a lot more of the green.
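As promised above, here is a minimal sketch of the two kinds of pre-submission check described in this section: an absolute validation rule (a subtotal may not exceed its total) and a simple statistical outlier test against a facility's own history. Both the rule and the z-score threshold are illustrative assumptions, not South Africa's configured rules.

```python
from statistics import mean, stdev

def violates_rule(subtotal, total):
    """Absolute validation rule: a subtotal may not exceed its total."""
    return subtotal > total

def is_outlier(history, value, z_threshold=3.0):
    """Statistical check: flag a value more than z_threshold standard
    deviations away from the facility's historical mean."""
    if len(history) < 2:
        return False                 # not enough history to judge
    sd = stdev(history)
    if sd == 0:
        return value != history[0]   # flat history: any change stands out
    return abs(value - mean(history)) / sd > z_threshold

print(violates_rule(subtotal=130, total=120))      # True: rule violation
print(is_outlier([40, 45, 38, 42, 44, 41], 120))   # True: likely outlier
```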
So, just to conclude: we have obviously made considerable progress with improving data quality since we introduced DHIS, both DHIS-1 and DHIS-2, but maybe the take-home message is that it is very important to do monthly monitoring and reporting on data quality; otherwise it tends to stagnate at a certain level, or even regress. So we are constantly looking at our data quality, monitoring it on a monthly basis, and thinking of new ways to measure our data quality, or to improve the measures that we do have in the system. And that's it from South Africa. Thank you very much.

Thank you so much, Jenny, that was a really wonderful presentation, really clear. I would like to invite anyone from the participants…