in developing semantic data models and biodiversity, he's going to be talking about the use of GitHub to create and manage semantic-enabled controlled vocabularies. Edmund, over to you. Hi. Thanks, Kiran. Is my audio coming through all right? Yes, it is. Oh, okay. Awesome. Let me share my slides. Is that coming through fine? That is coming through, Edmund. Okay. Thank you. Yeah, so thanks for having me today. I'm Edmund Chuck from TERN, and I guess my presentation today is slightly lower level compared to the other presentations. It's more about how we're managing and how we've implemented a set of controlled vocabularies using GitHub as our vocab system. So for the outline today, I'm going to briefly mention the project we're working on, then go through the set of requirements we've established for our vocab system, and then how those requirements stack up against the previous vocab systems we've worked with. And finally, how we've landed on GitHub and how GitHub meets our requirements. Yeah, so a quick intro to vocabs. I don't think we need this really, but they're just sets of code lists, taxonomies or classification schemes modelled with the SKOS data model. And with the project, TERN is currently contracted by the Department of Climate Change, Energy, the Environment and Water (DCCEEW) to work on a set of standardized field survey protocols to collect data across Australia. And in the TERN Data Services team, we're tasked to vocabularize those protocols by encoding them with the SKOS data model for things such as feature types, properties, methods, and categorical values. And when we started this project, we thought it was probably a good time for us to review our vocab system. We wanted to establish a set of requirements that we actually want in our ideal system. We wanted a way to collaborate with the community and external partners. And we also wanted an easy way to hand over the project once it's complete.
So our requirements fall into five key areas: content management, versioning, community, maintenance, and automation. Under content management, we wanted an easy way for our domain experts to edit and maintain the vocabs. And we also wanted a way to review each change that gets made in the vocab. With versioning, we wanted not just to version the vocabs at the whole vocab level, but we also wanted some way to track the history and changes at the individual term and concept level as well. Under the community section, we wanted a way to be able to collaborate with the community and not just within our team at TERN. And we also wanted a way to be able to have discussions around how the vocabs are being used and that kind of thing. With maintenance, our previous solutions involved maintaining servers and applications. So this time around, we wanted to see if there was a way where we could get started quickly and didn't need to actually maintain anything. And this includes backup management as well. With automation, we wanted to be able to make changes and have a set of tests run, as well as have automated deployment pipelines to deploy our vocabs to downstream services. So how do these requirements stack up against our previous systems? When we started vocabs at TERN back in 2019, we started using Git with Bitbucket, then we moved to PoolParty and then to VocBench, and now we've landed on GitHub. I've used a lot of emojis here to denote whether each system meets what we wanted: the green ticks mean it does, the yellow triangles mean the feature exists but we didn't use it at the time, and the red ones denote that the feature wasn't available. So with Git with Bitbucket, we started out managing our vocabs in a Git repository. But the issue that we had was we managed everything in a single file. And this became very problematic when we had multiple users editing the same file, causing Git merge conflicts. And it was just not a good experience for our editors.
So from there, we quickly moved on from Git to PoolParty. This is a PoolParty instance that's supplied to us by ARDC. And PoolParty provides a very nice way for users to edit the content. It also provided a way for us to do backups and so on automatically. But due to the opinionated view of how PoolParty thinks SKOS should be used, it didn't meet our own needs and we needed a bit more flexibility. So we moved to VocBench, which is, I guess, an open source alternative to PoolParty. But with this one, we did have to maintain and host it ourselves on our own server. So there was a bit of overhead. But we still maintained that easy editing experience for our users. And with the new project, we did decide to review this. We wanted to have green ticks for all the other parts of our requirements. So we revisited Git with GitHub and we managed to come up with a solution that essentially meets all of our requirements. So I'm going to go into the details of that now. One main difference with Git this time around is we separated each resource into its own individual file. So resources such as concepts, concept schemes and collections. And this actually solved that merge conflict issue when we had multiple users editing the vocabs. It's very unlikely that multiple users will edit the same concept at the same time. So separating resources into individual files really solved that issue for us. And at the same time it provided us with a lot of other benefits as well. With GitHub, we can actually have users who don't even know how to use Git contribute. They can just go to the website and edit the vocabs directly in the browser. Each change that's made to the vocab goes through a review process known as a pull request. And these pull requests are very detailed. They show exactly what file and what line was changed. So this has served us very well for reviewing things and implementing different governance rules.
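The one-file-per-resource layout described above can be sketched roughly as follows. This is a minimal illustration only, not the project's actual tooling: the directory names, the `.ttl` extension, the `example.org` IRIs and every function name here are assumptions made for the example.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple
from urllib.parse import urlparse

# Hypothetical layout: <root>/<kind>/<local-name>.ttl, one file per SKOS resource.
KINDS = ("concepts", "collections", "schemes")


def local_name(iri: str) -> str:
    """Return the fragment, or else the last path segment, of an IRI."""
    parsed = urlparse(iri)
    if parsed.fragment:
        return parsed.fragment
    return parsed.path.rstrip("/").rsplit("/", 1)[-1]


def resource_path(root: Path, kind: str, iri: str) -> Path:
    """Map one resource to its own file so edits to different resources never collide."""
    if kind not in KINDS:
        raise ValueError(f"unknown resource kind: {kind}")
    return root / kind / f"{local_name(iri)}.ttl"


def write_resources(root: Path, resources: List[Tuple[str, str, str]]) -> List[Path]:
    """Write (kind, iri, turtle_text) triples to individual files and return the paths."""
    written = []
    for kind, iri, turtle in resources:
        path = resource_path(root, kind, iri)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(turtle, encoding="utf-8")
        written.append(path)
    return written


def demo() -> List[str]:
    """Write two placeholder resources to a temporary directory."""
    resources = [
        ("schemes", "https://example.org/vocab/soil", "# concept scheme Turtle goes here\n"),
        ("concepts", "https://example.org/vocab/soil-ph", "# concept Turtle goes here\n"),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        written = write_resources(root, resources)
        return [p.relative_to(root).as_posix() for p in written]
```

With a layout like this, two editors working on different concepts touch different files, so Git can merge their changes without conflicts.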
For example, when Rob mentioned having different profiles of SKOS, we can have that validated through review by humans as well as through automated tests. With versioning, since each resource or each concept is mapped to a file, the Git history of that file is essentially now the history of that concept, which is really neat. You can actually see how a concept evolves over time. We can also use the Git blame feature, where we can go into each file and see line by line what was changed, when it was changed, by which commit, and by which user. We also utilize GitHub's release feature. So we release our vocabs with different versions. And if we wanted to, we could also use Zenodo and its integration with GitHub to automatically mint a DOI for each release that we make as well. There's another really powerful feature with releases in GitHub. You can compare the different release versions and see exactly which files were changed and what was changed in those files as well. We are developing in the open. So our repository is public. This fosters collaboration within TERN, as well as with the community. And the coolest thing with this is anyone with a GitHub account can contribute and make suggestions. They can collaborate with us, essentially. GitHub has an issue tracker and a discussions forum built in. So this, again, allows us to collaborate not just within TERN, but with the community and our external stakeholders. If people find issues, they can put them in the issue tracker, or if they just want to discuss how the vocabs should be used or make new proposals, those can all go in the discussions forum. And best of all, with this solution, we don't have servers or applications to maintain. And we don't have to maintain backups either. GitHub is large enough that it's unlikely that it's going to disappear. Sorry, I'm getting messages here. Yeah.
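As a rough sketch of the per-concept versioning described above, the standard Git commands involved can be assembled as shown here. The file path and tag names in the usage note are hypothetical, and the helper names are invented for this example; only the `git log`, `git blame` and `git diff` options themselves are standard Git.

```python
import subprocess
from typing import List


def concept_history_cmd(path: str) -> List[str]:
    """Commits that touched one concept file, following renames."""
    return ["git", "log", "--follow", "--date=short",
            "--format=%h %ad %an %s", "--", path]


def concept_blame_cmd(path: str) -> List[str]:
    """Line-by-line view of which commit and author last changed each line."""
    return ["git", "blame", "--date=short", "--", path]


def release_diff_cmd(old_tag: str, new_tag: str) -> List[str]:
    """Files added, modified or deleted between two release tags."""
    return ["git", "diff", "--name-status", old_tag, new_tag]


def run(cmd: List[str], repo_dir: str) -> str:
    """Run one of the commands above inside a local clone and return its output."""
    result = subprocess.run(cmd, cwd=repo_dir, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

For instance, `run(concept_history_cmd("concepts/soil-ph.ttl"), repo_dir)` would list the commits for a single hypothetical concept file, and `run(release_diff_cmd("v1.0.0", "v1.1.0"), repo_dir)` would list what changed between two hypothetical release tags.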
So GitHub's not going to disappear overnight, basically. With automations and integrations, we can use GitHub Actions to implement different SHACL constraints and run those tests every time content changes in the vocab. And we also have a bunch of other custom Python scripts that run tests and formatting and those kinds of things as well, just to ensure that the repository is in a good state and everything is well-defined, essentially. We also have GitHub Actions that automatically deploy our vocabs to downstream services, such as ARDC's Research Vocabularies Australia. And now going back to the DCCEEW project: once the project is finished, if DCCEEW wants to continue maintaining the contents on GitHub, we simply transfer ownership of the repository over. Otherwise, it's just a Git repository with a bunch of text files, so they can take it and move it to whatever other platform they choose. So our question, or I guess our message today, is: should we manage vocabularies in the same way we manage software? And through our experience with this project, our answer is yes. By managing it this way, we were able to have a bunch of automated tests ensuring that the vocab conforms to different kinds of profiles or standards. And we've also, I think, empowered our domain experts to get things done, move fast and be confident in their changes, because now there's a review process with tests, and other humans can review and validate that the contents make sense. So in summary, using GitHub, we now have an easier way to create, edit and publish controlled vocabularies. We have a way to collaborate not just within our team, but also with the community. We have a battle-tested review process, something that was missing in the other systems that we've used.
And we also have a way to manage releases and versioning of vocabs, not just at the top vocab level, but also at the individual concept level as well. It's very good that we now have a way to run tests and deployments automatically. And best of all, we can get started without needing to maintain our own servers and software. So if your project or your organization has similar requirements, then perhaps this is a good solution for you as well. And if you're doing something similar already, please let us know. We would love to see what you're doing in this vocab management area as well. That's all from me. Thank you for listening.
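To give a feel for the kind of custom Python check that could run on every change, as mentioned in the talk, here is a minimal sketch. It is a plain-text stand-in and not SHACL validation, and the required properties, the `.ttl` extension and all function names are assumptions made for illustration rather than the project's actual rules.

```python
import tempfile
from pathlib import Path
from typing import List

# Assumed rule set for illustration: every resource file must mention these properties.
REQUIRED = ("skos:prefLabel", "skos:definition")


def check_file(path: Path) -> List[str]:
    """Return one problem message per required property missing from the file text."""
    text = path.read_text(encoding="utf-8")
    return [f"{path.name}: missing {prop}" for prop in REQUIRED if prop not in text]


def check_repository(root: Path) -> List[str]:
    """Check every .ttl file under root, in a stable order."""
    problems = []
    for path in sorted(root.rglob("*.ttl")):
        problems.extend(check_file(path))
    return problems


def main(root: str = ".") -> int:
    """Print problems and return a non-zero status so a CI job would fail."""
    problems = check_repository(Path(root))
    for problem in problems:
        print(problem)
    return 1 if problems else 0


def demo() -> List[str]:
    """Run the check over two placeholder concept files in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        concepts = Path(tmp) / "concepts"
        concepts.mkdir()
        (concepts / "good.ttl").write_text(
            'skos:prefLabel "Soil pH"@en ; skos:definition "Placeholder."@en .\n',
            encoding="utf-8")
        (concepts / "bad.ttl").write_text(
            'skos:prefLabel "Soil colour"@en .\n', encoding="utf-8")
        return check_repository(Path(tmp))
```

In a CI step, the value returned by `main` would typically be passed to `sys.exit`, so that any reported problem marks the run as failed and blocks the change until it is fixed.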