Okay, so in this session, we're going to go over a couple of things. I'm going to start by giving a bit of background on this whole concept of metadata management and how we're defining it as a term going forward. Then Olaf will come up here and give some background on some of the guidance and documentation we have available, and Jason is going to go over an assessment tool that he's designed to help us actually figure out what some of these problems are. And then we're going to invite Sam and Elmarie to come up here, respectively, to talk about some of their use cases. So we have an experience sharing from Laos about what they've done with their system, which has been there for a very long time; Sam will cover the timeline with you. And from South Africa and Zambia, Melissa and Elmarie will come up and share their experience from there. Then we'll go through some Q&A. And once again, as I mentioned, anyone online, please feel free to join that link on the community of practice and post your questions there.

All right, so just so we can be on the same page in terms of what actually constitutes a metadata issue. We have the clear things that we associate with metadata: configuration challenges, duplications of category options, problems with geography, something wrong with an option set that causes it not to load. These are the classic examples we probably associate with the term. But then we also have procedural challenges that are not optimized: for example, naming conventions; problems with indicators, either duplicate indicators or indicators with the wrong formula; issues with user roles and sharing; incorrect use of production, development and training systems. A lot of these are linked to overall policies and procedures that are just not implemented very well, and that then lead to a number of subsequent problems in the system. We'll cover some examples as we go through this. But just so we can be on the same page: it's not just about the configuration, because often it's these procedures and principles that lead to the issues we see later on down the track, particularly when we're looking at systems that have been built over a very long span of time.

All right, so some brief background in terms of what caused us to focus on this a little bit more. We know it's very easy to customize DHIS2, either through the user interface or just using the API or something else. We can bring in metadata quite easily and modify the configuration. And that's by design, but then we should associate some procedures with that in order to contain it a little bit. We now see implementations increasing in size; implementations have been going on for many years, 10 years, 15 years or more, and not as much consideration is actually given to this long-term system management and maintenance over time. We also see that in some systems we work with, there are a lot of administrators, whether or not they should have that level of access; sometimes it becomes a little bit political who has that access. And then they're able to make these modifications in the system, and this can cause some challenges later on down the track.
And as a result, we see the configuration quality degrading over time and becoming very difficult to subsequently manage. So what can this cause? You probably have some idea based on your own implementations. I've listed three categories of users here, but you can think of some other issues as well. For the end user, this really causes havoc in actually reviewing the data, and that's the whole purpose of getting the data into the system. When we have these very poor configurations, we can't really create the outputs we need; sometimes users can't find what they need easily; sometimes they don't even have access to what they should have access to in order to create whatever charts and maps they're working with. There are a lot of challenges with data quality. In particular, as a configuration is updated over time, you might have old data sets or old tracker programs, or whatever it might be, and you're not able to get access to that legacy information for whatever reason; and you have problems with the consistency of your data, because there are multiple concepts represented by different things. It just becomes very difficult to actually use that data the way it's meant to be used.

For admins there's also another problem: they have all kinds of extra metadata to deal with. There are all kinds of challenges with upgrading DHIS2 versions; just in the other room, for anyone involved in the tracker-at-scale session, they were saying a DHIS2 upgrade should not be a big deal. But we all know this is a problem; it leads to many issues in practice when we're working with it.

So basically what we've done is separate out this work with metadata into two pieces. One is reactive, and that's probably what many of us are used to: we see a problem later on down the track, and we try to figure out what's going on and fix it. That's our most common case. And then we have the proactive piece, which is where we try to deal with this more procedurally, with some of the other issues identified through procedures, through training, through implementation support. We're seeing that come to fruition a little more as we go down the track, but we're really still seeing the first part being quite problematic.

So when we're looking at this reactive process, what are we actually reviewing? We're trying to review the quality of a specific implementation's configuration, and that's a very wide-ranging definition. The idea is to break down the process into smaller components so we can actually figure out what we need to do: what do we need to fix, what is wrong with our system? Do we have something we can quantify, measure or check, rather than just saying everything is broken and we need to fix the system?

So we have a couple of processes that we've introduced. One is the data integrity check, as it's called, for metadata. Another is the metadata assessment report, and Jason's going to introduce that; it's a new tool that he's helped us to produce. And the other is a manual review. So we do have some tools to help, but for some things, especially when it comes to policies and procedures, things like naming maybe,
some human intervention is required to really check this and make sure it's correct.

So, the data integrity check, for those of you who are familiar with this tool, runs through a number of different checks. Actually, Jason and Olaf and some of the team are working on refactoring it to make it perform a little bit better. For anyone who has worked with it: for some of you it might not even work in your system, for various reasons; it causes a lot of problems as it is right now. But hopefully this can be changed in the future and it will run a little bit better. Jason, I don't know if you have anything to add. Yeah, no, we'll get into this more. I would say that this functionality has been there for some time, but it's kind of misnamed: it's not really data integrity. What we're talking about here is metadata integrity, which is a little bit different. But we'll get into the nitty-gritty details of what we've done up to this point and where we're going in the future.

Okay, and then these are just some screenshots from the assessment tool that Jason has developed. It does use R at the moment; he will talk about it more, demonstrate it and give you a bit more background. We've been trying it out in some places, and shared it with some of the HISP groups to try out as well, and found some very interesting results, but Jason will discuss this in more detail.

And then we also have a lot of manual review processes. Some things are really hard to pick up via a script or some type of tool. I've listed one example here that might be difficult, just because the English spelling is completely different. These are category option combinations, our lovely mouthfuls of terms. You can see here we basically have two duplicates of "five years and above"; that type of thing is hard to detect because the spelling is completely different, and there are various reasons it might be hard to detect. So sometimes manual review is required, though a small script can help shortlist candidates, as in the sketch after this example.

So here's an example of an assessment, and I think this is actually from Olaf's work, focusing on the analytics portion, which is where you can run into some challenges from an end-user perspective. In this example we had so many dashboards, over 6,000 dashboards; nearly half of them had nothing on them, just blank dashboards, and about 90% of them were private, so no one could even see them other than the person who made them. You had a very small proportion shared with actual users, and their use was very limited. And then there were all kinds of other issues with the public favorites and with data element groups; despite there being so many, as you can see here, nearly 14,000 data elements and nearly 4,300 indicators, there's just an overwhelming amount of information to filter through. A lot of this could be solved by fixing sharing, for example, but as it is right now it's very hard to find what you need. How will you find the dashboard you need, how will you find the data element or indicator you need to actually create the outputs that you want? So we can see how this kind of configuration is not a best practice, particularly for analysis, which is really where we want to have some impact.
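Coming back to those near-duplicate names for a moment: a simple script can shortlist suspicious pairs for a human to review, even though it will never catch everything. Here is a minimal sketch in Python using only the standard library; the example names and the similarity threshold are made up for illustration and would need tuning on a real category option combination list.

```python
import itertools
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    # Collapse whitespace and case so trivial variants compare equal
    return " ".join(name.lower().split())

def near_duplicates(names, threshold=0.85):
    """Yield pairs of names whose normalized forms are suspiciously similar."""
    for a, b in itertools.combinations(names, 2):
        ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if ratio >= threshold:
            yield a, b, round(ratio, 2)

# Illustrative example: two spellings of the same category option combination
names = [
    "Male, 5 years and above",
    "Male, five years and above",
    "Female, under 5 years",
]
for a, b, score in near_duplicates(names):
    print(f"{score}: '{a}'  <->  '{b}'")
```

Anything the script flags still needs a human decision, which is exactly the point of the manual review: the script only narrows down where to look.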
And then a second example, this one more from the administrator side: just reducing the amount of metadata that has to be dealt with. In this scenario we actually just kind of blew up the database and started over, and reduced everything as much as we could, to the point where about 50% of everything was basically gone. This just made things a lot easier for users to work with, and also, from the administrator perspective, you're not then managing all this unnecessary metadata and having to figure out what to do with it. I've posted some resources here, and Olaf will come up and go over them in a moment.

So then we have the second process, our proactive process. And this has been a challenge, of course. When we talk about things like core teams and building capacity and things of that nature: at least for the last two years or so, where it's been remote, it's been particularly difficult to build configuration capacity, and I think we all have some sense that that's going on. But this is actually a bit beyond just how to configure DHIS2, because that has also created some trouble: it is so easy to train people to make changes inside of DHIS2, and we often give a lot of people permission to do that without really thinking about the long-term ramifications. Then we see some of the issues I'm mentioning here. So we get a lot of people who know how to add data elements, indicators, things of that nature, but then we don't have a lot of coordination, for example. We're working on forms, but a lot of administrators are not talking to each other, and you get a lot of duplicates in your category options. Things of that nature are much more on the procedural side, and are not really discussed in as much detail as they could be. So this is really what we're trying to improve on a bit. We have some examples available, we're really trying to work on this, and we're happy to receive feedback, particularly in this area, because we're still outlining how we can work together to do better on it.

This is, as I said, beyond just updating the configuration: you're assuming that people generally already have a strong command of working with DHIS2, either through the API or through the user interface, to add or configure metadata. What you're not assuming is that they're doing it correctly. They can do it, but what's actually the effect on the system itself? So there are a couple of resources here, examples that we're working on, SOPs and some training exercises and things that we've done with some countries, which you can access here. So I'm going to ask Olaf to come up and just go over some of the documentation that we have available.

So, I'll just give a very quick overview of what we currently have documented in this area before handing over to Jason, because I think it might not always be so easy to find what we put there. Hopefully you're familiar with the overall DHIS2 documentation page. Within the implementation section of the documentation, which we've reorganized over the last few months, we have been starting to improve the documentation around metadata integrity and quality: how do you assess, and how do you improve, metadata.
We have a lot of material which is almost ready to be put there, so in the coming month there will be more new content here. What we have so far is this overview, which is quite new, and which covers a lot of the things that were just discussed, on the reactive side for the moment. We have an introduction, and then we have started on some guidance around how you manually review your metadata configuration. What are some of the things you should look at: for example, naming; how you verify your indicator definitions; duplicate metadata objects; verifying your sharing settings, for example, which we have started documenting here. So that's the first part.

And then the next part is more like a user guide for what Jason will present in a few seconds with his new metadata assessment tool. On the documentation page we essentially have an introduction showing a bit how the tool is used and what it's used for; the installation part of it is on GitHub together with the tool. And then we have basically pulled out all the different metrics and recommendations that are embedded in the tool here as well, so that you could review these manually. In the different areas, like the aggregate data elements, we have the different checks: for example, looking at data elements that are not assigned to any org units, which is a sign that they're not actually being used for any data collection, and what the recommendation is, etc. All of that is documented here so that it can be used independently of the tool that Jason will be presenting. That was it. So if you're interested, keep an eye on this, and we'll be adding more material that we have almost ready in the coming weeks.

Yeah, thanks, Olaf. A quick show of hands: who has used the data integrity checks in DHIS2? Yeah, some people. Have you had problems with it? Yeah. The current checks are written in Java, so basically everything is in memory. That has advantages and disadvantages, the disadvantage being that on very large systems this can consume a huge amount of memory and really slow the machine down. There are a couple of different problems with it that we've tried to address in what I'm going to show you today.

Very briefly, I'll go through it; I've added these to the main slide deck. We've collected a lot of these kinds of problems over the years: things that were in the data integrity checks, and things that we had developed based on experience in the field, based on Jira tickets, based on what people have fed back to us about what the problems are. We've tried to combine a lot of that knowledge assembled over the years into a common set of checks, in addition to what's there, that would be applicable across all DHIS2 instances. We have rewritten most of these checks in SQL. One of the reasons DHIS2 was using these Java-based checks previously was that it was not bound to any specific database; now that we are only really supporting Postgres, it became a lot easier for us to implement these checks directly at the database level, as opposed to relying on Java to do the checking. One of the other issues that we tried to address, as you noticed from what Olaf presented, was some additional guidance as well. If you've used the data integrity checks, it tells you
what the problem is, but you may not really understand what it means, what the impact of the problem is, or, if you do have this problem, what to do about it and how to go about rectifying it. So we've also tried to include some descriptions and guidance as to what the actual impact is. Having a duplicate category option: is that something I should worry about? My geometry is invalid for an organisation unit: what is the actual impact of that? Similar to the previous version (or the current version, really; I'll get to that in a minute), we've included the ability to see the details. So when you get these duplicate organisation units, maybe ones that have the same name, you'd like to be able to see exactly which ones they are, so that you can go back into the system and rectify that.

Most of what I'll show you today has already been integrated into the core. We're missing a front end; we've got a kind of hacky front end at the moment that we've written in R Markdown. This is available from 2.37 or 2.38, I'm not exactly sure which version, but hopefully over the coming months we'll get a front end on top of what is available through the API. So what we have written, in R using R Markdown, gives you the ability to run this metadata assessment locally; you could even automate it on a server if you wanted to. It's not that difficult: if you have some experience using R Markdown, you can basically install everything on your laptop, include the credentials for a DHIS2 server, and then the R Markdown report will upload all the necessary metadata checks as SQL views, run through all the SQL views, and what you get back is essentially a report. The idea being that eventually all of this would be done through the native DHIS2 user interface, through an app, as opposed to this R app.

Yeah, and it's important to keep in mind that this was really meant to be a proof of principle. The implementation team, Olaf, myself and Nick, worked very closely with the back-end development team to implement the solution. They handled all of the wizardry of how to do it inside of Java, and then we tried to write all of the guidance and the SQL to identify these metadata problems.

In terms of the actual report, all of this is available on GitHub, and very little is really required. I won't go into the details here, but if you are familiar with R Markdown, you should be able to easily reconfigure this yourself. All we do basically is set a URL, username and password (these can be set inside a local profile), and then you execute the report; this can really be any arbitrary server. We've run it against, I don't know, half a dozen or so actual DHIS2 instances. There are some limitations on versions; that's one of the problems with SQL, that if there are database-level changes then some of the SQL we've written doesn't work. But we will work closely with the development team to keep all these checks up to date. Roughly 2.36 is where we started, and that's what this version of the report is based on; going forward, we will keep these up to date with the development team.

So, just to show you how it works: inside your local RStudio, or whatever you're using to create the report, we can just hit Knit, and then we should get the report; it runs relatively quickly.
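To make the mechanics a bit more concrete: what the report essentially does is upload each check as an SQL view, execute it, and collect the rows. Below is a rough, hedged sketch of one iteration of that loop in Python; the server URL, credentials and the example check are illustrative assumptions, and the SQL view endpoint shapes should be verified against the docs for your DHIS2 version.

```python
import requests

BASE = "https://play.dhis2.org/demo/api"   # assumed demo server
AUTH = ("admin", "district")               # demo credentials

# A hypothetical check: data elements that belong to no data element group
check = {
    "name": "Data elements with no group (example check)",
    "type": "QUERY",
    "sqlQuery": (
        "SELECT de.uid, de.name FROM dataelement de "
        "WHERE NOT EXISTS (SELECT 1 FROM dataelementgroupmembers m "
        "WHERE m.dataelementid = de.dataelementid)"
    ),
}

# Create the SQL view; the new object's UID comes back in the response
r = requests.post(f"{BASE}/sqlViews", json=check, auth=AUTH)
r.raise_for_status()
uid = r.json()["response"]["uid"]

# For QUERY-type views, fetching the data endpoint runs the query
rows = requests.get(f"{BASE}/sqlViews/{uid}/data.json", auth=AUTH).json()
print(rows)
```

Table names like dataelementgroupmembers match the Postgres schema in recent versions, but they can shift between releases, which is exactly the versioning limitation mentioned above.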
There are some checks that run very slowly, and I'll explain that in just a minute. What we get back, then, is a listing of all the potential problems, and we've tried to indicate a general theme, like categories, and then what the particular issue is: here we've got one, a category option with no categories. That's probably not really a big problem. We've also tried to indicate the level of criticality: a warning like this is probably not a big problem, but it's probably not something you want hanging around your system either. In addition to that, there's a relative percentage, just to give you an idea, out of the total number of category options, of how many violate this integrity check. And then we can even get down to the details: this will download a CSV file, which you can save to your local machine, and that allows you to go back and weed these out of the system, because you may not always be able to decide right away what to do with them. Some of these are a lot more difficult, like category options with more than one membership within a category; you can read the guidance to understand exactly what that is, it's there. What we've seen is that a lot of the time you have data attached to these pieces of metadata, so you can't resolve the problem right away; you may have to go back, consult with the programs, consult with other people, to determine how you're going to go about resolving it.

One of the other things we included in this report, and that we will hopefully include in the DHIS2 front-end app when we get it, is some profiles on users, because this is part of what Olaf was discussing in terms of standard operating procedures. As you can see here: the number of users who have superuser access. I'm not sure this figure is correct, because this is from the play server and we'll have to check the code, but we have seen this in other systems: even if you go through and fix all of these problems, if you've got 75 people with superuser access, then very likely you're going to revert to a state where you've got chaos in the database. So it's not enough just to fix it; you've also got to think about who actually has the ability to alter the metadata. We've included some other pieces too, particularly about disabled users: how often are people logging in, and how long since they last logged in.

In addition, and this is what Olaf showed you, but integrated into the report itself: we've got the severity level, the number of issues, and then a description with a recommendation about what to do. We haven't got to the point where we actually give you a solution, like how do I actually fix this. Categories without a category option: well, that one's relatively straightforward, probably, because you can go into the Maintenance app and delete the extraneous category options. But in many cases, like, what is the one... yeah, category option combinations with disjoint associations. That's a hard one to explain, but basically it happens when you create a category combination, you get category option combinations, and then you alter the category combination. So you maybe add another category: you end up in the system with, let's say, male under 15, and then you add another category called, I don't know, something else.
Yeah, HIV status, say, with options HIV positive and HIV negative, for instance. So then what happens is that the system creates new category option combinations, male / less than 15 / HIV positive and male / less than 15 / HIV negative, but the old one is still associated with the category combination. This can be very, very difficult to resolve, because you may have data attached to it. So we've tried to indicate in general what the recommendation is, but what we haven't given you is some magic wand to fix the problem. That's kind of the next step.

Lastly, I did also want to show you, if I can get rid of this, let me go back to the screen: Jan is the developer that worked with me on this, and he did a great job documenting it in this particular ticket, 10763. And just to show you that this has actually been integrated into the core: all of the checks that we ran through, the ones developed in SQL, have been integrated into the DHIS2 core. You can see we've got things like "legacy"; those are the Java-based rules, because we can't do everything in SQL. Determining whether a program rule is valid is probably not something for mere mortals to write SQL for, so some of these we have to do programmatically inside of Java. Some of them, however, are not, and this is where these are going to show up: inside this data integrity endpoint, basically. And then eventually, when the front-end team gets some time, they will put a nice little app around it. So we may have a mix between the SQL-based checks, which could also be run directly on your database, and these Java-based checks that are still going to be in the system. Great, I think that's all I had; any questions, or I'll turn it over to the next presenter?

How do you run it? You can just go to that endpoint, /api/dataIntegrity, if you just want to run it through the API. But I would definitely suggest, if you're running a later version of DHIS2, that you consult the documentation. One of the big advantages now is that you can run these checks independently; you don't have to run all of them. That was one of the problems with the previous version: you had to run absolutely every one, and that could put a big strain on your system. But you may only want to deal with one problem today, so you run that one check, try to fix the issue, go back and run the same check again, rather than having to run everything, the entire report basically.
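As an illustration, here is a hedged sketch of driving a single check through that endpoint from Python. The endpoint and parameter names follow my reading of the 2.38-era docs and may differ in your version, so treat them as assumptions and consult the documentation for the release you are running.

```python
import requests

BASE = "https://play.dhis2.org/demo/api"   # assumed demo server
AUTH = ("admin", "district")               # demo credentials

# 1. List the available checks (name, section, severity, description);
#    the response shape here is an assumption based on 2.38-era docs
checks = requests.get(f"{BASE}/dataIntegrity", auth=AUTH).json()
for c in checks[:5]:
    print(c.get("name"), "-", c.get("severity"))

# 2. Trigger one specific check instead of the whole battery
name = "categories_no_options"  # assumed check name; take it from the listing
requests.post(f"{BASE}/dataIntegrity/summary", params={"checks": name}, auth=AUTH)

# 3. Fetch the summary once the asynchronous run has completed
summary = requests.get(f"{BASE}/dataIntegrity/{name}/summary", auth=AUTH).json()
print(summary)
```

The run is asynchronous, so a real script would poll step 3 until the result appears rather than reading it immediately.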
On the criticality levels: you can blame me, it was kind of my call, quite honestly, and we can debate about it. Basically, "critical" means this is very likely going to cause an error in the system. As an example, one invalid organisation unit geometry can actually cause the analytics engine to fail. That can happen, and there can also be other situations where a metadata misconfiguration actually causes an error. Those really problematic things are: you need to fix this right now. It is somewhat arbitrary; the critical things we probably shouldn't debate about, those need to be fixed. Then there are other levels as you go down. We also have info: things that don't need to be fixed but are there for your own information; the number of organisation units, for instance, is not a problem per se. Warnings are meant to be: it might be an error, so you should take a look at it. I would take those levels of criticality with a grain of salt, but generally, things that are critical I would say definitely need to be fixed, and things that are warnings are: you had better take a look, but they may not necessarily lead to any problems. We need to go back and review that, and if people have strong feelings about it, we can certainly think about changing the classification.

Will it work on older versions? Probably not. I don't think this would be compatible with anything below 2.36, no. And the front-end app? Well, I don't even think 2.39, maybe; I haven't even spoken with the front-end team yet, so that's TBD, as they say, at this point. I'm going to maybe not take your question and allow the next presenter on. Can it be scheduled? I think it can be run through the API and scheduled; the piece that we're missing is the front end to be able to visualize it. That's the piece that's missing, but in terms of actually being able to run the checks and view them through the API, that part's done. Right. Yeah. Thanks for that.

Hello, my name is Sam. I work for HISP Vietnam, but I'm based in Laos, supporting Laos in the implementation of DHIS2 and coordinating with the HISP Vietnam team whenever they need implementation support, like training, or especially the cleaning or maintaining of the database. I will just share this experience, because I've been using DHIS2 for almost eight or nine years now. First, a bit of background so you understand where we're coming from and why we needed this. We started using DHIS2 in 2014, and basically we started with mostly just aggregate: we had MCH, IPD, OPD, and many programs like ANC, delivery, nutrition, EPI. In 2016, malaria also wanted to join, so they joined using aggregate and events, and we also expanded with HIV. As you can see, we were using aggregate for a couple of years, just for us to understand and get comfortable with DHIS2 first; we started simple with aggregate, and then once we were comfortable we introduced tracker.

So, maybe I'll just talk a little bit about this. In 2017, we introduced a second instance, because we figured out it's better to separate the tracker instance from the aggregate instances. And in 2020, we added another instance, basically for COVID, just for performance and so on. As you can see, we've been using this a lot, and there are many programs: for now we have almost nine health programs using it throughout the country, and the country's population is around 7 million people. With so many programs, the first mistake we made, or not really a mistake: we were really excited to have many programs and see data, so we made everything public. And we later figured out we should not do that. Not everyone wants to see everything, especially users at the lower levels who just want to see their own program, while maybe the higher, central level should be able to see across programs.
So that's the first thing we found: once we have many programs, we need to organize better. Then there are, for example, the data sets. The ones we created in 2014 were based on the paper forms, and once people get used to the tool, sometimes the forms should be updated to make use of the new technology. So we had several data sets that needed upgrading, especially MCH, which we moved to a 2.0 version, and also EPI, where we made a new data set specifically for EPI. In doing that we made many new categories, and some categories are not relevant anymore. We also needed to introduce new options: sometimes many programs need to use the same option, but we had created extra ones, especially things like gender or sex, and age groups. Those are the things we had created. And then, because new programs joined as well, we also tried to limit the sharing and the settings. There are other things too, like indicators: now we have shared indicators, and for some, like populations, we have many versions of the population, and so on and so forth. And the last one is that we also needed to upgrade the DHIS2 software version; we did not upgrade at every version, we just picked the right one and then did the upgrade.

I can say that the problem we have is more about trying to balance the design and the speed. Sometimes a program would like to roll out as fast as possible, so we didn't have enough time to test everything or to plan how we would like to build the program. It's kind of a dilemma: we want a good design, but we also need to roll out as fast as possible, and that's why we have so many things to clean up afterwards.

And this is the approach that we took. We didn't have the tool at that time (we'd be excited to try the tools now). At that time we just did the assessment ourselves, so we put everything into a blueprint: we downloaded all the schema, the structure, into the blueprint, an Excel file, with a different sheet for everything; for example, the metadata indicators in one sheet, and I'll show you another example in the next slide. Once we had the blueprint, we sat with the MOH and defined each item: for example, for users, which user should see what and which kind of data they should see; who should see which dashboard; what the admin level should be able to see. Once we had defined the grouping, the team converted that grouping into scripts they could apply to DHIS2. We also defined test cases, ran them, and validated whether it was what we wanted, and we also invited MOH staff to do the testing. All of this we did on a separate server; we had another server for development, so we did the tests, evaluated, adjusted the blueprint, and made sure the blueprint was okay before we applied it to production.

This is just an example of the grouping that we used. Column D is what we pulled out from the database: this is what we have and who can see it at the moment. Columns E to F are the new definitions: for example, this new user group, and which user groups should see this item. We defined that, and then used the script to turn it into the DHIS2 configuration.
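For anyone wanting to reproduce that first extraction step, here is a hedged Python sketch that pulls every data element with its current sharing into a spreadsheet-ready CSV for review with the MOH. The server details are placeholders, and the sharing field layout shown is the post-2.36 one; older versions expose publicAccess and userGroupAccesses instead.

```python
import csv
import requests

BASE = "https://play.dhis2.org/demo/api"   # assumed demo server
AUTH = ("admin", "district")               # demo credentials

resp = requests.get(
    f"{BASE}/dataElements.json",
    params={"fields": "id,name,sharing", "paging": "false"},
    auth=AUTH,
)
resp.raise_for_status()

with open("blueprint_data_elements.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["uid", "name", "public_access", "shared_with_groups"])
    for de in resp.json()["dataElements"]:
        sharing = de.get("sharing", {})
        # userGroups is a dict keyed by group UID in the new sharing model
        groups = ";".join(sharing.get("userGroups", {}).keys())
        writer.writerow([de["id"], de["name"], sharing.get("public"), groups])
```

The same pattern works for indicators, data sets or dashboards by swapping the endpoint, which is how a full blueprint with one sheet per object type could be assembled.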
Then, I think this is the last slide. We have done this a couple of times since 2014; we did one back when we were using 2.27. These are all the lessons we've learned so far. The first challenge is training: once we update, especially to a new version, or do anything with the database, we need to train, or if not training, we need to prepare documentation to send to the users, because especially at the district level, when they see something different from what they're used to, they will ask what happened. The second is about planning, like I mentioned: when we were preparing the grouping, we spent a lot of time defining everything and talking to the different programs to get it right. It's also labour-intensive: the developer team works hard to make the scripts, and once they've made a script there are bugs, and they have to fix those too. And the last one: even though we tested, and did a lot of testing, there were still bugs afterwards; once we applied it in production there were still some bugs, so you also need to be prepared to fix those.

So those are the challenges, but there are also lessons learned that we'd like to share. The first one is to do the documentation properly, documenting every step, so that once we've done the cleanup we can tell the users what we have done and what the new changes are. Then timing is also important: we need to make sure that right after we clean up, the data isn't needed for reporting, that no big reports are coming, because there will be some bugs you won't know about; so make sure that after you do the upgrade you have some time for testing and fixing. Also, take advantage of holidays: I'm not sure about other countries, but in Laos we have Lao New Year, which is quite long, so we usually take that opportunity to upgrade the server and do the cleanup. And the last one: from our experience, we didn't delete anything. We tried to keep everything, but we moved it to a different group and made it private so nobody sees it, or we added a prefix so it doesn't show up on top. And we also kept backups from before and after the cleanup. And I think that's all; back to the chair.

Great, thanks. So, I think this is a good illustration from a country implementation perspective of how to keep things clean. What we saw in the metadata integrity report is how to identify things when it's not clean, and here we saw some great examples of how to maintain the integrity of the database through standard operating procedures and discipline. Now, let me set up. Were there any questions for the previous speaker? Yep, one in the back; we can give you a mic.

Thank you. This is more of an implementation question: what is the timeline and budget for this, and how often do you recommend doing it? Are you talking about the review of the metadata, the cleaning up, how often to do it, and how to maintain consistency across instances? Yeah. Nick and I have some stories from a country that will go unnamed, where we spent a lot of time. Where's Olaf? He disappeared. You know, it's really hard to say; it depends on where you are in the implementation.
If you're in the early stages, implementing things like standard operating procedures, a dev/testing/staging/production type of setup, never doing anything through the user interface, scripting absolutely everything, some of the guidance that we've included: that's your best bet. If you've been running an instance for 10 or 15 years, that's probably going to be a very different situation. So I don't know if we can really answer that. You will not be able to solve all of these problems in one day, but you can start chipping away at it and determining what is most crucial and what is maybe not. Nick?

So yeah, like Jason said, when we've done this with ministries and presented the plan, they're quite taken aback by the length of time that we propose. Sam said he did it over a holiday, but I think there was probably a lot of background preparatory work. It takes a bit of time, and it's also a bit dependent on the country's capacity, because it really requires all those lists, all those blueprints, and ideally you need approval from them before you start fiddling with things. Sometimes we're not so good at getting that; we kind of say, ah, this is best, let's just do it, but ideally you would want some of those inputs. So it is a bit hard to say, but we could probably document it a bit more, because we do have quite a few countries where this has been done to some extent, and these presenters will share another example. Every case is different, but we could probably come up with some inputs that would allow you to make an estimate or some type of calculation. That's a great idea that we could pursue a bit more, so thank you. We can save some more time for discussion at the end. I'll turn it over to Elmarie.

All right, thank you, and just to acknowledge our attendees online as well. Hi, I'm Lisa Andrews, digital transformation manager for HISP South Africa, and I'll be presenting with Elmarie Claasen, who is our engineering manager for HISP South Africa. We're going to go through some lessons learned while implementing DHIS2 in Southern Africa. We'll be focusing on best practices, specifically the implementation in South Africa, listing those best practices and then focusing on the impact on Southern Africa and the surrounding countries. Elmarie will then go through the rapid assessment that we've conducted, as well as some of our experiences in Zambia. And as usual, there are always some recommendations.

Our best practices within South Africa are really a combined effort with the Ministry of Health, and I'd love to acknowledge both our members here today; thank you very much for being present. I'm sure that if there are any specific questions around implementation in South Africa, they would be more than willing to take them as well. So, looking at lessons learned and best practices, we always start with our minimum data set creation.
In South Africa, we've got a periodic update of our national indicator data set, which we normally shorten to NIDS, and it happens every three to five years. Our indicator-to-data-element ratio is quite healthy; it's about one to 1.8. We also work with a technical working group, and we involve quite a lot of partners, so it's not just one person making the decision about what would be important for us as a country. Through that technical working group you have customization which needs to happen, and that customization is normally done six months before implementation, the reason being so you can train people, but also make sure that your longitudinal data is kept up to date. During that six-month period we're looking at printing, and making sure that you're actually training your health staff. Within that period we're also preparing our registers, and you'd think that's a bit strange, but we know there are sometimes issues when you're printing registers, so we really do focus on those as well.

For us specifically, our population figures are provided by Statistics South Africa; I'm sure for every country represented here, you've got your national body that provides you with those levels of information. So we've mobilized our district populations. Previously, Stats South Africa was only able to provide us with local municipality data, and that was excluding metros, so in our last iteration we've been working a lot closer with Stats SA to provide us with that data. This is sort of coming online quite officially, where we use a crude methodology to establish the facility population, and we've also developed an app specifically to assist us with that in DHIS2.

Moving along to another important key principle, and I think we've already discussed this at length: metadata management and naming conventions. Making sure we all actually agree on naming conventions, keeping it standard, and, within that process, integrity checks: if you look at the picture on the other side there, normally those figures would be zero, and you'll notice that there are zeros there. One of the most important things, if you're in the M&E space or research space, is attention to detail; if we're not going to be very attentive to detail, then it's a bit of a lost cause. So I'll hand over to Elmarie to take us a bit further into the meat of things.

Yeah, thank you. Thank you, everyone; I hope everyone can hear me. So, to continue: what are good naming convention practices, really? For every country, we're not saying you must name certain things in a specific way, as long as you have a guideline for good naming practices, so that you name the same thing consistently and people understand what it is. Pay attention to spelling, because if you make spelling mistakes you often get duplicates; double spaces between words and those types of things are just small matters of attention to detail. And Nora's bugbear: never, ever use "number of" or "percentage of" in front of a data element or indicator name. I'm sure, Nora, you're very happy for us to say this. It's redundant: you are reporting aggregate data, you are reporting numbers, so you don't have to say it's a number of something or a percentage of something else. And if you do that, then on the analytics side everything is sorted starting with "number of, number of", instead of starting A, B, C, D; everything will start with N or P. So avoiding it just helps.
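That convention is easy to police with the standard metadata filter API. A small, hedged sketch follows; the demo URL and credentials are assumptions, and the same query works for data elements by swapping the endpoint. The $ilike operator matches the start of the name, case-insensitively.

```python
import requests

BASE = "https://play.dhis2.org/demo/api"   # assumed demo server
AUTH = ("admin", "district")               # demo credentials

for prefix in ("Number of", "Percentage of"):
    resp = requests.get(
        f"{BASE}/indicators.json",
        params={
            "fields": "id,name",
            "filter": f"name:$ilike:{prefix}",  # name starts with the prefix
            "paging": "false",
        },
        auth=AUTH,
    )
    resp.raise_for_status()
    for ind in resp.json().get("indicators", []):
        print(f"Rename candidate: {ind['id']}  {ind['name']}")
```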
We often get those things from programs, with "number of" in front, and we help them reword those names more appropriately. These things are also very important to ensure that we can match previous indicators with new indicators, so that you get that longitudinal indicator report.

Then, on master facility management: it's important to maintain a master facility list as the one point of truth for facilities, which the different systems draw from, and to use a standard naming convention there. Approval of any new facilities before they are used, and of any changes to the master facility list, is important. And again, attention to detail with spelling, the use of GIS coordinates, and reviewing that they are correct. In South Africa we managed the master facility list in DHIS for a while at first; it is now managed in another location, as a registry, but we still use one source of truth for the master facility list.

Then, on user management: we had long and hard headaches trying to understand the best way to manage our users. We decided to define user roles and user groups based on function and not on position. In other words, we don't say you're a data capturer and therefore you can enter data, clear your cache and see reports. We rather create multiple roles, like aggregate data entry, tracker data entry, clear cache, and so on, and we allocate multiple roles to one individual, and then also user groups. We find that works better for us, with less confusion about which user roles we need to have. We have a policy in place to discourage staff from saving passwords in their browser; we have a strong cybersecurity focus, and it's very easy for your database to be hacked if passwords are saved in a browser. It's not the easy way, but we enforce it at least for our HISP staff, and we're rolling it out to government as well. User access to DHIS2 is only granted through a registration form approved by the person's manager, and that registration form is also loaded into DHIS2 itself, so when we do audits we can actually retrieve the form and say why Melissa has a specific role and who approved it. We have the password expiry set to three months, and accounts not used for three months are automatically disabled; it's nice that there's now a task scheduler we can do that through. We previously just did it with a script that we ran, but we will use the scheduler. We also do monthly user reviews: the user form gets printed out, every district must review their users monthly and sign off that those users should still have access to the system, and they must disable users who have resigned from the service. So when the Auditor-General comes and audits the system, they would, for instance, ask HR who resigned, and ask us to prove that we actually disabled them shortly after, or on the day, they resigned. Those things are important so that you don't continue to give access to people who might actually be cross with you, given why they left your employment; you need to stop that access.
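The scripted, pre-scheduler version of that three-month rule might look something like the sketch below. The lastLogin filter path is an assumption; in some DHIS2 versions it sits under userCredentials, in newer ones directly on the user, so verify the field names against your schema before trusting the output.

```python
from datetime import datetime, timedelta
import requests

BASE = "https://play.dhis2.org/demo/api"   # assumed demo server
AUTH = ("admin", "district")               # demo credentials

# Users whose last login is more than 90 days ago
cutoff = (datetime.now() - timedelta(days=90)).strftime("%Y-%m-%d")

resp = requests.get(
    f"{BASE}/users.json",
    params={
        "fields": "id,name,userCredentials[username,lastLogin,disabled]",
        "filter": f"userCredentials.lastLogin:lt:{cutoff}",
        "paging": "false",
    },
    auth=AUTH,
)
resp.raise_for_status()

for user in resp.json().get("users", []):
    creds = user.get("userCredentials", {})
    if not creds.get("disabled"):
        print(user["id"], creds.get("username"), creds.get("lastLogin"))
        # A real script would disable the user here, after the district review
```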
Two-factor authentication is something we are trying to implement as a good authentication principle. One issue we have is that not all users have phones that can scan a QR code from the app, so we would really like to see whether we can perhaps have an option for email or app-based verification. The key part here is to have standard operating procedures for user access to health information systems; our Auditor-General strictly audits the department for this, and we really don't have findings on user management, so these are very good practices.

In terms of upgrades: we have developed an internal testing protocol, which looks a bit like the picture on the right. When there is a new version we want to implement, we have different sheets, as you can see, for the login page or the dashboard, and we try to understand what each of the functions must do: if you log in, what must happen. We write it all up. Olaf told us that something like this is actually available from Oslo, so we will be helping them with testing a little bit more. We then have a number of users, including some provincial users, testing these functionalities, and we don't just test the new functionality, we test existing functionality too; it's very important for us that whatever worked before does not break when we upgrade, and we would delay an upgrade if there is a serious bug in something we know worked before. Then we have quite a standard change management process for implementing those upgrades: we confirm that the testing has happened and what the changes are that will affect users, and then we create a quick user guide on the changes, comparing the previous version they were using with the new version.

Change management is also a key focus for us. We have three risk categories: low risk, medium risk and high risk, with specific criteria for each. If it is low risk, say it's a staging or development instance and we just need to replace it, we just create a ticket and it gets done. But any change to a production version we classify as either medium or high risk, and certain levels of approval are required for that, including a risk impact assessment and a rollback plan in case the change is not successful. You can see here a process flow of our change management; I'm not going to go through all of it, but essentially it's raising a ticket, testing, then creating a change request that gets approved; a change notification gets sent out and the change is effected; if it is successful, a change notification is issued, and if it's unsuccessful, we follow the rollback plan.

Incident management: I always think of the Swiss cheese model here. When you have an incident, it's not because one thing went wrong; the issue actually went through many, many holes that should not have lined up. So we have these processes, including our change management and other aspects: we obviously deal with the incident as it happens, but we also try to prevent those incidents.
On incident management, it's really things like a server being unresponsive or slow, or completely falling over. We immediately respond to that, by restarting it or something like that, but we also take it a step further: if this happened once, it's maybe not so serious, but if it happens every week, what are we doing to prevent it from continuing to happen? Do we need to increase resources on the server, and so on. Those are the more server-based incidents. Others are cybersecurity incidents or threats, identified vulnerabilities, or data breaches that could happen on our systems. If we had any of those, we would declare an incident, evaluate and investigate it, write an incident report, and write up strengthening mechanisms to prevent that incident from happening again. It's important for us to manage our server incidents so we have as much uptime as we can, and to ensure we have cybersecurity measures in place, because we consider it not a matter of if you get hacked but when you will have an incident, so we need to be vigilant.

Then we have a training program for DHIS2 experts, or superstars essentially; this is a two-year program that we have implemented for training super users. We found that a five-day course does not really train someone to the level of someone who has worked with DHIS2 for a year or two; you cannot build that capacity in a five-day academy training or something like that. So we enrolled 18 participants, from the nine provinces in South Africa and the national level, into this program. It includes face-to-face training, online training through a learning platform that we have, and assignments that we ask them to perform, which are graded; if they don't pass those assignments, we give them exercises on a training instance to help them practise more. We're basically teaching them the good principles described here: not just how to create a data element, but how to create it properly, for instance. They have the ability to redo those assignments, and they have a period of mentoring by the data manager, and then we hand that function over to them over time, so they can continue doing it while we continue to mentor and supervise them further.

So the lessons learned are really: defining good practices, testing what changes are required, change management, and incident management. Over the last few years we've been able to distribute these lessons to the countries around us, like Lesotho and Namibia. I know Maria is online on Zoom; hi, Maria. And also Zambia, who were going to be here, but I don't think they have arrived yet. And we've also been able to learn from them in South Africa in turn.

So now I'll just share a few slides on a rapid assessment that we did, for maximum value from DHIS2 in Zambia. We are working on a project called E4H with MSI in Zambia, and one aim of our work there is to help them strengthen their HMIS. From the rapid assessment there were a few recommendations: a cleanup based on the DHIS2 data integrity checks; implementing consistent naming conventions; aligning the HMIS with the minimum data set in the M&E framework; and effective utilization of DHIS2 artifacts, such as groups, that could make their reporting easier.
All of this is still in the process of being put into effect, but some of it has already been done; some dashboards were changed, and it resulted in a better-structured reporting framework for the country. They also agreed to implement some of the best practices. We developed guidelines for data management (it's just a few slides), user management principles, principles for defining data elements and indicators, and organisation unit guidelines, and we assisted them to implement a master facility list. In that, we applied the principles from the org unit guidelines: standardization of naming conventions, adding a systematic process for creating and closing facilities; and it's also being used as the master across all systems in use.

You can see some of the lessons learned: we were able to apply best practice to the unique environment, adjusting as required. The Ministry of Health has the internal capacity and skills to manage their own instance; they are committed to DHIS2 and proud of owning the systems. However, they do have a shortage of resources to give adequate attention to detail. Change management is dependent on buy-in and teamwork, and collaborative work continues to further strengthen the DHIS2 implementation in Zambia. Something we really did there was to use the bottleneck analysis and action tracker apps, which we can now take back to other countries, to enhance the planning cycle and feedback. So, just a few acknowledgements to the Ministries of Health and the funders in the countries mentioned in this presentation, and that is how you can contact us. Thanks.

Thanks, Elmarie, for a great presentation. It's not only about what's in the system; as you can see, a lot of it is about process and standard operating procedures, and I think HISP South Africa has really done a lot of great work there. I think we've all come a long way. We're not really in the days anymore of having root access to the database, and I won't name any names, but I do have a co-conspirator sitting in the room, where it was a little bit of the Wild West before: you went in, you created a data element, and you didn't really think a whole lot about the potential consequences. But 10 years down the road, you see that the decisions you made may have big, big consequences. And so with time, with maturity, with changes to DHIS2, we all see that there's really a need for better standard operating procedures, as well as better tools to help us maintain the quality of the metadata over time. We have about 10 minutes left, so we can take questions for any of the presenters. Let's see if we can grab a mic so that everybody on Zoom can hear. Anybody want to get us started?

I did want to come back to one of the questions asked earlier about running these metadata integrity checks on legacy databases, things that are from before 2.36. I think this is in the presentation from Olaf; I didn't go too much into the technical details of it. We have a repo on GitHub, and we have organized it according to theme, if you will. Each of these contains a YAML file; YAML is "YAML Ain't Markup Language" or "Yet Another Markup Language", depending on who you believe. YAML is basically a relatively human-readable form of JSON. So if you do have a legacy database, you might be able to run these: you kind of clip them out, put them into an SQL view, or run them against the database directly. Don't run them on production, obviously; use a testing server.
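A hedged sketch of what that clipping-out might look like in Python, against a copy of the database: the YAML keys (name, sql) are assumptions about the repo's file layout, and the file name is hypothetical, so open one of the actual files first to see the real structure.

```python
import psycopg2   # pip install psycopg2-binary
import yaml       # pip install pyyaml

# Hypothetical check file clipped out of the GitHub repo
with open("data_elements_no_groups.yaml") as f:
    check = yaml.safe_load(f)

# Point this at a test or staging copy, never the production database
conn = psycopg2.connect("dbname=dhis2_copy user=dhis host=localhost")
with conn, conn.cursor() as cur:
    cur.execute(check["sql"])   # assumed key holding the check's SQL
    rows = cur.fetchall()

print(f"{check['name']}: {len(rows)} violations")
for row in rows[:10]:
    print(row)
```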
So if you do have a legacy database, you might be able to run these. You could clip the SQL out, put it into an SQL view, or run it against the database directly. Don't run it on production, obviously; use a testing server. A lot of these will work, because there haven't really been any changes to those parts of the model over time. Some of them will work and some of them won't. Like I said, we started with 2.36, but we have run these against older databases. So a lot of these checks could work on legacy systems, and it could be a good idea to try. If you have some SQL skills, you could also probably modify the ones that are not working. One of the reasons you might want to do that is that, as you upgrade, the system may become more strict about what is valid and what is not. A good example of that is the organisation unit geometries. In previous versions you could put anything you wanted in the geometry field; it was just text, and it didn't really have any impact. Now it is a defined PostGIS geometry type, and if that geometry is not valid, you will not be able to upgrade your system. So if you're thinking about upgrading to a later version of DHIS2, then determining which of these checks are important and applying those against your legacy system might be smart. You can give it a try. They probably won't all work, but most likely many of them will.

Is there any kind of identifier for each of the tests that you are running?

Yeah, so we can share that. You mean an ID, like check 123 or whatever? I'm not sure if I understand the question. Oh, sorry, I'm on the wrong computer. So if you go to this GitHub ticket, which is really the best GitHub ticket I've seen in terms of documentation, it may be in here. With this new API endpoint, you could say, give me all the integrity checks for program rules, and there is also more complicated filtering, like give me this particular check. Is that what you mean?

I mean, for a given integrity check, is the ID the name, or do you have something different?

I would have to read the documentation, but I do think it is possible to run a single integrity check instead of running all of them. I'm pretty sure that was implemented, but we can double-check.

Is there a list in the documentation of all the parameters that we could use there?

I think so. If it's not in the documentation, then we're missing something, but I believe it's documented both in this ticket and in the documentation proper. Let us know if something is missing there or isn't clear.

One or two quick comments. Just with regard to integrity checks: in DHIS 1 we had user-definable integrity checks. I asked the DHIS2 team to implement that, I think five, six, seven years ago. It never happened. It should actually happen, and the ability to run individual ones should actually be there. But that was not my main point. My main point relates to the question of what to do if you have a database that has gone out of control.
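Before the discussion moves on, two short sketches may help pin down the earlier exchange. First, on the organisation unit geometry point: on versions where the geometry column already exists, a query along the following lines would surface rows that PostGIS considers invalid before you attempt an upgrade. Table and column names are from a typical DHIS2 schema; verify them against your own version.

```sql
-- Hedged sketch: find org units whose stored geometry PostGIS rejects.
-- Run against a COPY of the database, never against production.
SELECT ou.uid,
       ou.name,
       ST_IsValidReason(ou.geometry) AS reason
FROM organisationunit ou
WHERE ou.geometry IS NOT NULL
  AND NOT ST_IsValid(ou.geometry);
```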
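Second, on running an individual check by identifier: the sketch below shows how the newer data-integrity endpoint is generally used. The exact paths, the `checks` parameter, and the check name here are assumptions rather than something confirmed in the session, so check them against the official API documentation for your version.

```python
import requests

# Hypothetical instance and placeholder credentials; replace with your own.
BASE = "https://dhis2.example.org"
AUTH = ("admin", "district")

# List the available checks. In recent versions each descriptor carries a
# machine-readable name that can be used to address the check individually.
checks = requests.get(f"{BASE}/api/dataIntegrity", auth=AUTH).json()
for check in checks[:5]:
    print(check.get("name"), "-", check.get("severity"))

# Trigger one check by name, then read back its summary. The run is
# asynchronous, so a real script would poll until the result appears.
name = "data_elements_without_groups"  # hypothetical check name
requests.post(f"{BASE}/api/dataIntegrity/summary",
              params={"checks": name}, auth=AUTH)
summary = requests.get(f"{BASE}/api/dataIntegrity/summary",
                       params={"checks": name}, auth=AUTH)
print(summary.json())
```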
Now, South Africa is a special case, because we fully rolled out in 2000, and for the first 10 years I basically kept control of the implementation, because we were using Microsoft Access. We were distributing the databases, updates and whatever on DVDs, and that allowed the HISP group, particularly me, to ensure a reasonable degree of control. From about 2010, the National Department shifted. They had some new management in; they were not so focused on telemedicine and pie-in-the-sky ideas anymore. They were also realizing that these big, what I the other day called megalomaniac options, this idea of super-systems that you roll out anywhere, they had seen that that didn't work in the UK, in Canada, in Norway, and in many other places, right. So they started engaging much more actively. That's when the standard operating procedures came in. That's when the policy of two-yearly reviews, and of changes having to be signed off by the Director-General and so on, started coming in. So they took control of that change process, and that is the best way: don't let your databases get out of control.

I worked in another country the last four or five years, and it's taken me three years working with them to clean up their mess. As an example, again, of being preventative: Nora, my colleague, and I walked into the main office there in October 2019. Another techie guy from Oslo had been there assisting them in setting up a lot of new data sets, but he's a techie; he had no idea what the stuff was all about. Nora and I came in and were just horrified, because the desks were covered with these huge registers, and we said, you cannot implement this. They basically bought into that, and Nora then spent 14 days negotiating with all the health programs, cutting the number of data items they wanted to collect from 13,000 down to just over 1,000. That tells you: if they had gone the route of trying to collect 13,000, they would have had an even worse mess.

And while I've been cleaning up, I need to say this, because a lot of the problems come from so-called DHIS2 experts with a purely technical background who do not understand data. They have no experience with data management or with using data. I'm not saying they are bad people, but their understanding is technical: we're going to set this up in this or that way, et cetera. When I looked through, I could see who had been causing the mess over the years, and I knew all of them, from my friends on down the whole list. The problem is that each and every one of them is not a bad person, but they come in with their own way of doing things. So you have 14 days of this one, 14 days of that one. I'm just saying that it might take you two or three years to actually clean up a really bad database, but what you need to do is to ensure that you control the changes from then on. And for that, I think you need a review every two years.

Thank you, Calle. One last comment, then all of us have to go outside for the picture.

Hi, can you hear me? You mentioned, talking about setting up from scratch, that you kind of advised maybe using scripts and so on instead of the UI, and I just wondered whether that meant you think configuration as code is a good option to try, like an ideal to follow.

Yeah. I mean, those are kind of my feelings, right.
And I think it's going to depend on every single implementation. I don't think there's a problem with, and it's probably advisable, using the UI in many situations, but not in production. The constraints of the UI simplify a lot of things. But then moving that configuration to a testing instance, and from there to the production instance, I think that's what's recommended. Scripting everything really requires much more sophistication, and in many cases you lose the validation of the user interface, which is also not desirable.

We've got to go outside for the picture. Just to add to Jason's comment: yes, I agree that when you're in this kind of clean-up exercise, scripts and things can be very useful, but you don't want to minimize the contributions of others entirely. So once you get into that regular routine, hopefully where things are more controlled, you can maybe think about reverting back to using the user interface for some things; just think about that separation a little bit.

I'll add the couple of slides that I showed, and everything will be shared.
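As a closing footnote to that configuration-as-code exchange, here is a hedged sketch of the promote-between-instances flow being described: build and validate configuration through the UI on a development instance, then move it to test or production via the metadata API. The instance URLs and credentials are placeholders, and parameters such as importMode and importStrategy come from the standard metadata import API, so verify them for your version.

```python
import requests

# Placeholder instances and credentials; replace with your own.
DEV = ("https://dev.example.org", ("admin", "password"))
PROD = ("https://prod.example.org", ("admin", "password"))

# Export a slice of metadata from the dev instance, where it was
# built and validated through the UI.
url, auth = DEV
payload = requests.get(
    f"{url}/api/metadata",
    params={"dataElements": "true", "indicators": "true"},
    auth=auth,
).json()

# Dry-run the import against production first: importMode=VALIDATE
# reports what would happen without committing anything.
url, auth = PROD
report = requests.post(
    f"{url}/api/metadata",
    params={"importMode": "VALIDATE", "importStrategy": "CREATE_AND_UPDATE"},
    json=payload,
    auth=auth,
).json()
print(report.get("status"))

# Only commit once the validation report is clean:
# requests.post(f"{url}/api/metadata",
#               params={"importMode": "COMMIT",
#                       "importStrategy": "CREATE_AND_UPDATE"},
#               json=payload, auth=auth)
```

The design point mirrors the advice in the exchange above: the UI supplies validation while the configuration is being built, and the scripted step is confined to moving an already-validated configuration between instances.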