Okay, welcome everyone to this webinar on provenance. My name is Tom Honeyman. So, provenance, or lineage, sometimes known as pedigree, is a description of the entities and processes involved in producing and delivering or otherwise influencing a resource. I'm paraphrasing the W3C Provenance Incubator Group's description slightly here. Provenance provides a critical foundation for assessing authenticity, enabling trust and allowing reproducibility. Provenance assertions are a form of contextual metadata and can themselves become important records with their own provenance. So today we are joined by Karl Monnik from the Bureau of Meteorology and Matthew Miles from the Department of Environment, Water and Natural Resources in South Australia. Karl, could you expand a little bit? Sure, thank you. Thanks for the opportunity to be part of this webinar. So I work for the Bureau of Meteorology. I've been here for about 15 years and my work has been in the observations field, meteorological observations and maintaining information related to that. We call it observations metadata, and that's important information to help us understand the history and the traceability, and to gain trust in the observations that we have in the Bureau. I've been working in this meteorology field, I guess, my entire career, first in South Africa and more recently in Australia. These observations are used for the climate record, they're used for meteorological operations and for public information and so on. I'm also part of the World Meteorological Organization task team on what's called WIGOS metadata, and we've produced a metadata standard for observations metadata to help ensure the provenance of the observations that are collected around the world. And I can talk a little bit more about that at a later stage. 
This standard was published last year and it's available through the World Meteorological Organization. It's a standard for all types of observations, whether it's surface-based, a normal weather station measuring temperature and rainfall, or drifting buoys on the oceans, for satellites, for radars and all those kinds of things. So it's a broad-scale standard, and it's something that has been recognized as really important. And then my specific job is in observing strategy, so we look at how meteorological observing networks should be changed and adapted to be future fit. So yeah, that's about me. Thanks, Karl. And Matthew? Thanks, Tom. Again, thanks for the opportunity to present here and take part in this. It's great to have a conversation like this happening. As mentioned, I'm with the South Australian Department for Environment and Water. I'm a Principal Advisor for our Environmental Information Unit, so I lead a team of spatial analysts and data managers who curate environment, water and natural resources data in the realms of ecosystem science: biodiversity observations, habitat mapping, soils, coastal information, native vegetation management and the like. These cover the whole of South Australia, from the arid regions through agricultural regions to the Murray-Darling Basin, Marine Parks and everything in between, really. The team supports policy officers, legislative requirements, ecologists and project managers, and on-ground works and management across the landscape, essentially making sure they have the right data at the right time in the right format. So that's the task of my team. Clearly, understanding where our data comes from, how we process it and where it is is really important, so we've come up with some provenance tools to help us lock that down. OK, thank you, Matthew. So the format today will be a panel. Unfortunately, this panel is a little bit reduced in number. 
Due to unforeseen circumstances, David Lasinski from Geoscience Australia has had to pull out, and he sends his apologies. So the format today will be the panel members answering some questions provided in advance. Feel free to ask questions at any time using the question box, and I will insert those questions as we go along. I think we can get straight into it. Karl, if I could begin with you: could you tell us why provenance is important in your field? How do data consumers use the information, or why do they need it? OK, yeah, that's an important consideration when we think about data and its use. In the meteorological field, we need to understand the nature of the measurements and what standards our observations meet. I think most people would intuitively know that if we measure temperature in different situations, even on the same day, there will be different impacts, whether you measure it on top of a car park full of asphalt versus out in an area that's grass-covered, and so on. So having some information about the nature of that measurement is really important. Now, our climate data users are really interested in the time series, the homogeneity of that measurement over decades, up to 100 or more years. So having information about the source of that measurement, and its characteristics throughout that time series, is really important. And often they go into great detail analyzing those time series, looking for what they consider inhomogeneities, and then they'll go back into the metadata, the provenance data, to see what the potential cause was. So the information that we collect regarding the measurement is always historical. It needs to be dated, and we need to understand at what point it was relevant and how it changed over a period of time. 
In our field, we also change our measurement technology over time; take the temperature record, for example. We started off measuring with mercury-in-glass thermometers and had different models of those, and then over time we've changed to electronic devices, resistance thermometers or thermistors and so on. Those have some impact on the measurement characteristics, so information on all of those kinds of things is really important. It's also important to have information regarding the siting and exposure of where we do our measurements. You know, trees may grow, buildings may be put up and land surfaces may change, and all those kinds of things have an impact on the nature of the measurement. So that's key to us. And we collect this information in Australia, but also globally: we share meteorological information through the World Meteorological Organization. So it's important that we have such information not just for our own data within this country, but also globally as well. The WMO (if I can use the shorter form; it's hard to keep meteorology rolling off the tongue) has two different classes of metadata. There's what they call Discovery metadata, which helps people discover a data series and find where it's measured and what is measured. And then there's what they call Interpretation, or Description, or Observations metadata, which talks more about the nature of the measurement. These standards have been established globally, and all countries are required, or encouraged, to meet them. And there's now a global database available where countries can upload their metadata for the observations that they collect and share internationally. 
And I guess the Global Climate Observing System has a statement: "the details and histories of local conditions, instruments, operating procedures, data processing algorithms, and other factors pertinent to interpreting the data should be documented and treated with the same care as the data themselves." That encapsulates the types of things that go into our metadata, our provenance. And then, beyond the measurement itself, there's the processing of the data as it moves on, the quality control processes that are applied to it, and any other processes to adjust for inhomogeneities and so on. All that information needs to be understood and be part of the data set as a whole. So yes, to us it's incredibly important, and I won't say the Australian Bureau of Meteorology leads the world, but it's one of the great examples in terms of comprehensive metadata, at least since the late 90s, when we established our comprehensive digital metadata database. So, yeah, thank you. Thanks. So Matthew, the same question: why is provenance important in your field? How do data consumers use the information and why do they need it? So, in our field of ecology, or landscape science, the science and information that we produce, particularly from a government agency point of view, has to be transparent and defensible. And so we do collect metadata about a number of our data sets, particularly the ones where we're supplying authoritative data, whether that's mapping data or coastlines and things that are highly technical in nature. The difference with our provenance tools is that it's the processing of information through our projects which we're capturing the provenance of, when multidisciplinary data sets get combined to produce answers or advice relating to specific policy questions or management questions. 
So, what we've found is that our project-based approach to collecting and analyzing information really hampers that transparency. Over the years, poor data management practices, staff turnover and those kinds of issues make it a real challenge to transparently record what's happened to data that have ended up in reports or advice. So these provenance records are really needed to counter that. We need to be able to defend the creation of the data, we need to be able to add to it, we need to be able to reproduce it at times, and to on-supply it and represent it faithfully. So that's why we need to do it. How do we use the information? Interestingly, the major consumers are project managers and others outside the initial project, if you like. It's kind of like paying it forward. The current project officers, the people who are doing the actual science, doing the work, know where the data is and how it's been put together to generate a product, but the next person doesn't. The manager might not, the executive might not. So we need to understand where these things come from and be able to document that for the project managers. Another key issue is that project proponents need to understand the resourcing required to manage data properly. If you've got a large multidisciplinary project, you might need to put dedicated resources into it to do your data management, to maintain these provenance systems and to maintain the metadata associated with all of the data and the reports being produced. So that was another reason for these provenance tools: to be able, up front at the beginning of the project, to forecast whether you're going to need a little or a lot of resourcing to cover the data management requirements. They don't just happen by themselves, as people listening to this will understand. Okay. 
Before we move on, did you have any questions for each other, or would we like to move on to the next question? No, that's fine. Okay, so, Karl, how did you decide what provenance information to capture, and how and when, particularly in your workflows, do you capture it? Okay, yeah, that's a good question, because there is so much that can be collected, and one needs to make sure that you collect what's appropriate and useful. Part of our work with the World Meteorological Organization has been to set a standard set of metadata, which we've divided up into 10 categories. This was done by a cross-disciplinary group in the broad meteorological field: it's had hydrologists, marine scientists, satellite folk, radar people and all those kinds of specialists, which we call comprehensive, but it's obviously just a little slice of the greater environmental area. The first of the 10 categories in our metadata is information about the observed variable itself: its nature, its measurement characteristics, and so on. Second is the purpose of the observation: why have you collected it? That often gives you information on the nature of the measurement and how it's been used. Third is information regarding the measurement location: the station, the platform, and things relating to that, whether it's a fixed station, a mobile station or some remote sensing platform, which might be an aircraft or a satellite and so on. And the fourth category is to do with the environment, the geographical environment where the observation is made. There's all kinds of information that's useful there in terms of topography, land use, and so on. 
Fifth, we have the instruments and methods of observation, which is important because we know different instruments can record variables in different ways and have different characteristics; and also the methods, because whether it's an electronic method, a human-based method or some type of chemical analysis, all those kinds of things are important characteristics to record. The sixth category is sampling: how are samples collected? Again, there's such a range of information regarding sampling. Number seven is the data processing and reporting characteristics. How is this done? How are the statistics calculated? What data is thrown out, and what is transmitted through to the final product, and so on? Eighth is data quality: what quality and traceability information is provided with the observation? What checks are done? Are the measurements compared against some international standard through a traceability chain, and so on? The ninth category is ownership and data policy: who owns the data, and what rights are there relating to it and how it can be used? And then the last category is to do with contact: how do you get hold of those who collected the data if you want to find more information? So I guess that's one way of deciding what information to capture. And then the second part is how we do it. We've got processes in place for our field staff who go out to observation sites, and they have a certain set of metadata that they need to collect and update, much like the list I just provided. Generally in our field we are collecting data continuously, 24/7, and so our metadata is collected when we have some type of change or a visit to a site, for example. There are a number of things that need to be done: verification of the site characteristics, photographs, and checking that information such as equipment serial numbers is updated, and so on. 
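The ten categories Karl walks through could be sketched as a simple record structure. This is a hypothetical illustration only: the field names and example values below are invented for this sketch and are not the official element names of the WIGOS Metadata Standard.

```python
from dataclasses import dataclass

# Illustrative sketch of the ten WIGOS-style metadata categories described
# above. Field names are invented for this example; they are not the
# official WIGOS Metadata Standard element names.
@dataclass
class ObservationMetadata:
    observed_variable: str        # 1. the variable and its measurement characteristics
    purpose: str                  # 2. why the observation is collected
    station_platform: str         # 3. fixed station, ship, aircraft, satellite...
    environment: str              # 4. topography, land use around the site
    instrument_method: str        # 5. instrument model and observing method
    sampling: str                 # 6. how samples are taken
    processing_reporting: str     # 7. statistics, filtering, what is transmitted
    data_quality: str             # 8. QC checks, traceability to standards
    ownership_policy: str         # 9. who owns the data, usage rights
    contact: str                  # 10. who to ask for more information

# A hypothetical record for a single temperature observation series.
record = ObservationMetadata(
    observed_variable="air temperature, 1.2 m, degrees Celsius",
    purpose="climate record",
    station_platform="fixed land station",
    environment="grass-covered clearing",
    instrument_method="platinum resistance thermometer in Stevenson screen",
    sampling="1-minute samples, 10-minute average",
    processing_reporting="daily max/min derived on site",
    data_quality="range and step checks; traceable to national standard",
    ownership_policy="Bureau of Meteorology, open licence",
    contact="observations@example.org",
)
```

A record like this makes it easy to see at a glance which of the ten categories a station's metadata actually covers.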
Any incident that may impact an observation is recorded too: we may become aware of a building that's been put up nearby, or there may have been storm damage, and those kinds of things are collected and put into what we call our SitesDB database, which is the source of all of our metadata; there are huge amounts and many different types of information collected there. So this information is collected in the field and transferred into the database, and then it's QC'd to some extent: a manager will look through and make sure that it's appropriate information, that they haven't made a mistake in selecting the wrong station or field to apply it to. And then it's available right across our organization for people to query and view and so on. So that's the type of information that we collect. One of the things we've discovered is that photographs are often invaluable. Today we may realize we need some piece of information, but we haven't had the wisdom to collect it prior to this point, and going back into photographs from the past often solves some of those questions. You can see things that weren't documented, or you can tell from its characteristics what model of equipment it was, or something like that. So those are the kinds of things that we collect. Something as simple as the units that we measure in is important: we noticed changes in temperature when we moved from Fahrenheit to degrees Celsius. We can see differences in maximum temperatures if we change our observing time from 8 a.m. to 9 a.m., and that has happened at times in our past as well. Positions of sites are also a real challenge. They may have been located on a 1:250,000 map. And in database conversions, the way degrees, minutes, and seconds are recorded can be misinterpreted: values may be read as decimal degrees instead of minutes, or something like that. 
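The coordinate pitfall Karl describes is easy to demonstrate with a little arithmetic. In this sketch, a latitude recorded as 34 degrees 30 minutes is mistakenly read as though the minutes were already a decimal fraction; the numbers are invented for illustration.

```python
def dms_to_decimal(deg, minutes, seconds):
    """Correct conversion: degrees, minutes, seconds to decimal degrees."""
    return deg + minutes / 60 + seconds / 3600

# The error described above: a position stored as 34 degrees 30 minutes
# being read as if "30" were a decimal fraction, i.e. 34.30 instead of 34.5.
correct = dms_to_decimal(34, 30, 0)   # 34.5
mistaken = 34 + 30 / 100              # 34.3: minutes treated as hundredths

# The discrepancy is 0.2 degrees, roughly 22 km of latitude on the ground,
# using ~111 km per degree of latitude.
error_km = abs(correct - mistaken) * 111
```

A fifth of a degree is a very large siting error for a climate station, which is why these conversion mistakes matter when database migrations happen.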
So there are these errors that can creep into databases due to the way data is managed. Those are all the kinds of things that we collect, and in terms of our workflows, we try to collect them at source, bring them in and make them available to our staff. Thank you. Yes, so that's interesting, because of course with the nature of your data there is that long tail going back. I was wondering if I could get you to talk a little bit more about instances where you might be updating the provenance of a record. You mentioned that scenario with a photo, where maybe some event triggers you to realize that there might be some errors in the data and you want to go back. Now, presumably that applies all the way back into the history of the records held by the Bureau. Yes. Yes, sorry, did you want to finish? No, no, go for it. Yes, so we have had some projects in the past, which we call back-seeding, because we have a lot of information from today, or the last two decades, that is electronic, but prior to that it was all paper records. So for important climate sites and important climate records, we've been going back to the paper files in order to document some of that information, so we can get a more continuous record of changes, because that's important. It's a really challenging area just in terms of database management. For example, you may say that in 1933 a thermometer was broken and replaced, and it was this model of thermometer. So in your database you put in a date and time and say that happened then, and then the next event you may discover is in the 1960s. So there's this assumption that that thermometer was in place from 1933 to 1965, for example. And if there is a record missing in the paper file where another change was made, that can lead to wrong understandings or assumptions. 
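The record-splicing problem Karl raises, where metadata events are assumed valid until the next known event and a newly discovered change must close off the previous record's assumed lifetime, can be sketched like this. The data structure, dates and instrument names are all invented for illustration.

```python
from datetime import date

# Minimal sketch of event-based instrument history: each event says what
# was installed from a given date, and is assumed to hold until the next
# known event. All names and dates here are hypothetical.

def insert_event(history, event_date, instrument):
    """Insert a change event; keeping the list sorted implicitly ends the
    validity of the preceding record at this date."""
    history.append({"from": event_date, "instrument": instrument})
    history.sort(key=lambda e: e["from"])
    return history

def instrument_on(history, when):
    """The instrument assumed in place on a date: the latest event at or
    before that date."""
    current = None
    for event in history:
        if event["from"] <= when:
            current = event["instrument"]
    return current

history = [
    {"from": date(1933, 5, 1), "instrument": "mercury-in-glass, model A"},
    {"from": date(1965, 3, 1), "instrument": "mercury-in-glass, model C"},
]

# With only these two records, we assume model A was in place for 32 years.
assert instrument_on(history, date(1950, 1, 1)) == "mercury-in-glass, model A"

# A paper file then reveals a 1948 replacement, which corrects that span.
insert_event(history, date(1948, 7, 1), "mercury-in-glass, model B")
assert instrument_on(history, date(1950, 1, 1)) == "mercury-in-glass, model B"
```

The sketch shows why staff training matters here: inserting the 1948 event silently changes what the database claims about every date between 1948 and 1965.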
And when you discover a new piece of information and you want to insert it into that database record, you also have to end the life of the previous record, the instrument, for example, put in the new model and start it again. So there is important training that needs to be given to staff who do that. And we do that because there is so much importance put on our temperature record, and our rainfall records and such things. As we've gone through those exercises, we've also realized that there's additional metadata we need to put in, and so we've added to our repertoire of metadata, but we're not always able to go back and fill it in. I've got a quick question from the audience too. Catherine Brady asks: are these 10 categories of information captured manually, semi-manually or automatically? Good question. I guess a large number are collected semi-manually in our case. A lot of the information is collected on site and either loaded into a laptop at that time, or entered when staff get back to their office. These days we have more and more intelligent equipment that is able to report its serial number and any changes that happen, and so we are moving to a point where more information is automated, but I would say the bulk of it is manually assisted, or manual, in its collection. Matthew, let's turn to the same question for you: how did you decide what provenance information to capture, and how and when in your workflows do you capture it? So thanks, Tom. Yeah, interestingly, the level that we work at here is slightly different to the sort of foundational, if you like, direct measurements from the BoM. We had to work our way through what was important to understand when you want to resurface data, or you want to access project-based data or the products of evaluated analyses. 
So here's a report, and it's made up of some maps and some photographs and some text; where did that come from? How can we roll that back and understand what went into it? Interestingly, a lot of the data that goes into those reports is of the nature that Karl is talking about. Sometimes it's climate information, or it's direct observations from instruments. Other data types that we deal with are human observations, for example animal and plant sightings, and we have standards around the kinds of information that's collected and the metadata that goes along with those. What we quickly recognized was that in our line of work we have IT systems that hold metadata for a given format of information, but it sits within that system. So whether it's a spatial database or a tabular system, whether it's our surface water data system or our groundwater database or our soils information or our photo libraries, for example, they each have metadata associated with them, but it sits in that IT solution, if you like, in that database. What we needed was metadata for a project that said: okay, where did you get that data from, what did you combine it with, and then the thing that you produced, where did you put it? Next was to say: okay, how has that product been evaluated, who has done the evaluation, who's done the approval or a review of the evaluation, and then ultimately who's authorized the release of the product down the line? We have many people working on these projects, and as I say they're often multidisciplinary, so we need to understand the people involved and the decisions that are made as much as the actual data, the granular metadata about the data itself, like what Karl is talking about. So we needed a new system. It was really a metadata system that sits across the top of our other systems and provides metadata at the level of a project. 
So we have a project management framework that guides the way that the agency gets money and undertakes projects: planning the project, then delivering the project, then reviewing the outcomes of the project, and then closure of the project. So it's your classic PRINCE2 method for running through a project; what was missing was really the data management components. So, getting to the second part of the question, how do we capture it? We tried to work it into that project management approach and produce a series of tools that slot into each of the project management phases. In the planning phase of the project, we produced a template that records, basically: what are going to be your data inputs and outputs? What are you dealing with? Which IT systems are you going to be interfacing with or using in your project, and who's making decisions in the governance of the project? Really, it's just letting the project managers get a focus on the kinds of information, the amount of information, who the audiences are and who's making the decisions along the way. So interestingly, it's a little bit different to metadata about observations: this is metadata about the process of value-adding in an analytical or science-based way. So we worked them into that project management framework. We recognized that there was a need to capture broad-level information during project planning, to get the resourcing right when you can't really go down into the detail at the inception of the project, but then to gather more detail as each workflow in the project is executed. So there's potential for change as you go through the project as well. 
So we produced a series of templates, including that planning form; some charts, which we call data charts, which are a one-page summary of a workflow that captures the provenance of that workflow and its governance, as I've mentioned; and, associated with that, what we call the catalog, which is a metadata system that allows you to capture some of that more traditional metadata: where has it come from, its lineage, its location in space and time, and all of those kinds of things, but across formats. So you could say, at a project level, here's a list of all of the information resources that were used in this particular workflow, irrespective of whether they sit in an Oracle database, an Esri ArcGIS database, an image library, or indeed a herbarium where we've got samples, or water samples. The provision of those templates is the first part of the process. We also found that we needed some human support to guide people in using the templates. It's one thing to just give them a template, but there was really a need for us to hold hands, if you like, and help people through filling these things out; not because it's a difficult process, but because it's really about learning and understanding the process of extracting that information and recognizing the kinds of things that need to be captured. Ultimately, and probably even more importantly, as much as we needed to collect information for the data to be transparent and defensible, at the end of the day we're actually trying to raise and improve data management culture right across our agency. So we really wanted project managers and project officers to understand that it's not that difficult to write these things down. And indeed, if you plan it into your project and put some resources into it, then it makes it much easier to find this information later on down the track. And as I mentioned earlier, it really is about paying it forward here. 
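Matthew's one-page data chart is a PowerPoint template in practice, but a structured equivalent of one workflow might look something like the sketch below: inputs, a processing step, outputs, and the governance trail. All names and values here are hypothetical, not taken from the department's actual templates.

```python
# Hypothetical structured equivalent of the one-page "data chart" described
# above. Every field name and value is invented for illustration.
workflow_chart = {
    "project": "Coastal habitat condition assessment",
    "inputs": [
        {"name": "habitat mapping", "system": "ArcGIS geodatabase"},
        {"name": "water quality samples", "system": "Oracle database"},
        {"name": "site photographs", "system": "photo library"},
    ],
    "process": "overlay and summarise condition scores by marine park zone",
    "outputs": [
        {"name": "condition map (map 3 in report)",
         "classification": "public",
         "licence": "CC-BY"},
    ],
    # Governance trail: who evaluated, reviewed, and authorized release.
    "governance": {
        "evaluated_by": "project ecologist",
        "reviewed_by": "principal advisor",
        "release_authorised_by": "unit director",
    },
}
```

Even as a flat record like this, the chart answers the two questions raised above: which systems the data came from, and who signed off on what went out.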
The project officer will often defend themselves and say: I know where all my data is, I know where it's come from, I know where it's gone. But once that person's left, or a new officer comes in, or indeed a data manager or a data provider needs it, things change. If a member of the public rings up and says, map three in this report here, which has been made public, can I get the data from that and do something else with it, and the people aren't around who know where it is, then we need some records around that. So we're really trying to produce these templates and tools that enable the project staff to learn how to do that, and then see the value in those products, which, ironically, often doesn't present itself until a bit down the track. It's not until the project's been finished, or nearly finished, and somebody wants the data, that you can go to these records and really see that value. A really good example: we often have a project that might be completed, and as I say, the final output is a report. These days, with electronic communications and open data, people are wanting the data that goes into that report as well. So it's really important for us to know which element of the workflow we can release, in the data sense. Do we want to release the raw data, the kind of stuff that we might be getting from Karl's area, or is it the value-add, where we've done some combination, some summarizing, some analysis, and produced a graph, for example? So when someone comes to us and says, well, that project that finished last year, or was published last week, can I have the data please? We really needed a way to investigate which bit of data they're actually talking about. Is it the map? Is it a layer in the map? Is it the graph? Is it the table under the graph? 
And so we're trying to improve, I guess, the language around data management, data workflows and analytical workflows, so that we make sure that the data we're releasing is actually the data that's been quality assured through our processes. The quality assurance and approvals are already embedded in the way that we run our projects, but we weren't doing such a good job of capturing that, so that if somebody who wasn't involved in the creation of the project comes along and wants to find out that information, we can pass it on, or, as I say, transparently look into it. That was really what we needed to gather. So, helping people to use these templates, and then recognizing that you do it at the beginning of the project, you might review it in the middle of the project, and then you need to review it at the end of the project, to make sure that, as things change through the project, you end up with a decent record at the end of it. As important as having the templates, though, is also having, and this is where my team comes into it, the support that allows the projects to recognize the value of these products as soon as they can. Following on from that, for those people in my team who provide that support to the projects, we get them together on a monthly basis, and they talk amongst each other about any patterns that they're seeing, any products that we need to improve, or extra and new products we might want to develop. That allows us to complete the circle, if you like, and keep a finger on what information we need to capture next time. Are we getting it right? In that way, we're trying to embed this system of improvement in terms of data management, and the language and the culture around how we can make our science transparent and defensible at the end of the day. Thanks, Matthew. 
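The "which bit of data are they actually asking for?" look-up Matthew describes could, in a structured catalog, be a simple walk from a published element back to the dataset that produced it and its release status. Everything below is an invented illustration, not the department's actual system.

```python
# Hypothetical project catalog: each published element (a map, a graph)
# points back to the workflow step and dataset that produced it.
catalog = {
    "map 3": {"produced_by": "zonal summary",
              "dataset": "condition_scores_2016"},
    "graph 2": {"produced_by": "trend analysis",
                "dataset": "condition_trends_2016"},
}

# Release status per dataset, as recorded through the QA/approval trail.
approvals = {
    "condition_scores_2016": "approved for public release",
    "condition_trends_2016": "internal use only",
}

def releasable(element):
    """Which dataset sits behind a published element, and its clearance."""
    entry = catalog[element]
    return entry["dataset"], approvals[entry["dataset"]]

# A member of the public asks for "the data behind map 3 in the report":
dataset, status = releasable("map 3")
```

The point of the sketch is that the answer no longer depends on the original project officer being around: the record itself says which dataset is behind the map, and whether it is cleared to go out.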
We've actually got a few questions from the audience, mostly around the templates. Mingfang Wu asks: is the metadata a standard, or is it an in-house developed one? It's an in-house developed one. I looked into things like PROV and recognized that, at the end of the day, they were much more machine-to-machine oriented, if you like. They really enable some very clever technical communication, what's the word, discoverability and accessibility at that machine-to-machine end. As I said, the focus for us was really about raising our awareness of data management. So we wanted something that was more human-to-human, if you like, and we didn't have a lot of resources to develop templates with a flash interface on the computer screen. So we simply got PowerPoint templates going. It was really just about making sure we've got something simple and accessible: most project managers can drive PowerPoint. We give them templates to put icons on the page that tell you whether it's a spatial layer, a table of information, a database, a video or a photo, and then put in the processes in between those that capture how the information was combined and what it produced. And, as I say, who did the quality assurance and who approved the product coming out at the other end. So no, I couldn't find a standard for that particular process, so it was developed in-house. Mingfang also asks: does the metadata change from project to project, or is it a standard template that applies across the board? Yeah, that's interesting. What we've found is that it's evolving. At a base level, some of the things that we wanted to capture are: is the data classified, and what classification does it have? Inside government, we classify things by whether they have any sensitivities: whether they're for official use only, whether they're cabinet sensitive or whether they're public data. So it's got to be classified. 
So we capture the classification, any licensing, whether it's openly licensed or not, and then who's done the authorization and who's done the evaluation. In that sense, those fields are pretty well locked in. The way that we capture them, though, has evolved. Whatever the type of project, we're finding the template very applicable to any of the different projects that we have going, so we're not rigid about the kind of information that we're collecting. As I say, almost more than the data that we're collecting at a corporate level, the value in this is a business area getting to understand its own data management needs and the way that it stores and value-adds information through its workflows. Okay, so thanks for that. We've got 14 minutes remaining, and I think we've actually got a strong sense of what the main challenges are in capturing provenance information in both your organizations. But I wanted to give you a chance to raise any other challenges, starting with you, Carl. Sure, yeah, we do have a few challenges that I wanted to highlight. One is consistency of information, and so, where possible, we use tables of values that staff can select from instead of free entry. What's also a challenge is communicating the value of metadata to decision makers, because often they don't see the value of it in the short term, but we recognize the value over the longer term. Privacy is an important aspect, because recording who entered the metadata matters; sometimes there can be a pattern of "this person always does it this way" or something like that, and it helps if you can see that, but we don't necessarily want to record their name and so on. And I guess ongoing motivation of staff is also important, because they keep doing the same thing at every site that they go to, and so in the beginning they can be well motivated and comprehensive in their metadata.
And then over time, when they're on a two-week trip driving around remote parts of the Northern Territory or something, their enthusiasm starts to lag a little bit. So ways to encourage them and to monitor that are important. Then there's also identifying what's missing, something that's not there; that's very difficult to identify remotely. How do you have a process in place to document those things that somebody may just not see? We have a range of skills among our staff: we have those who have a great meteorological understanding, and then those who are maybe more engineering-technical, and they might not understand the things that influence the measurement of one of our meteorological variables. So they may not recognize the impact that, say, a new building next to our observing site might have, with radiation reflecting off the building and increasing the heat load on the Stevenson screen. And one last thing is database changes. There's always a need to update databases and move them to new software systems, and sometimes in those changes assumptions can be made about certain fields which change the nature of them. So those are the challenges that we face in the Bureau of Meteorology. Thanks. Thank you, Carl. Matthew, do you want to highlight any other challenges in capturing provenance information? Sure. On convincing staff of the value, I concur with Carl completely: providing some demonstration that there is value in doing this extra task, as small as it may be. The value always comes later in the project or even further down the track, when you need to understand the data products more deeply, or from an external viewpoint, or you need to access them. So one of our challenges is definitely that ongoing demonstration of the value proposition. Having said that, we are finding that the value does present itself.
When we start working with projects, some people get it straight away, but it only takes one or two instances for somebody who wasn't quite sold at the start, when a certain request comes in and they need to understand something. We've had managers who've needed to make a rapid decision about whether or not to release a certain component of an analysis, and our charts very much enable that: rather than spending time deliberating over where this came from, who did the evaluation, whether it has good quality, that's all readily demonstrable through these charts. Two more challenges, I guess. One is the ongoing development of the tools. We knew that we needed a framework to manage these governance processes and workflows, and we were able to put some time into that. It took us a good couple of years to really work our way through it and get to some reasonably mature templates, but we are finding that they could do with further development, as with any kind of information system, as we learn slightly better ways of doing things. There is definitely a challenge in keeping that level of effort up as other priorities come to bear. The other thing, and these are good challenges in a sense because they show the system is working, is for our officers and project participants to understand some of the standards around classification, review and quality assurance, and indeed the standards around where we store our data, or where we store different types of data. There's a feedback loop of people recognizing that there is a corporately supported way, or certain structured databases that you can engage with. So I think that's a good challenge that comes out of this: people actually upskilling, so that our data management is better right across the board. I think those are our main challenges in trying to keep this system going and growing. Oh, thank you.
Matt, we've got a question from Avert Blaise in the audience. He wants to know who is responsible for collecting and collating the governance metadata — who governs the governance, if you like. We're trying to encourage the projects themselves to own that task. When projects are initiated inside government, there's a clear line of governance one way or another. That might simply be a line of management, where the person at the top of that line gets to sign off on the final products. What we try to do here is embed the understanding with the project officers and the middle managers in particular, so they're able to describe that governance; we really just provide support for them to document it. But the responsibility itself we put into the project, so all of these are project resources. I guess another challenge we're looking at is: if we've got six or eight or ten projects that are all collecting this information, how do we compile that to get a deeper understanding at a corporate level, or at a meta level, of what's going on? That's not really our focus at the moment; I think that would take a whole lot more resourcing. At the moment, as I say, we're pushing this back to be of value to the projects themselves, to help them execute and maintain transparent and defensible workflows. So we try to keep that governance in the project itself. We support it from outside, but we want them to own the need for, and the value of, collecting that governance information. Mingfang has a comment to summarize the panel discussion: basically that good data management practice and training, and anticipating downstream data use, are the two top keys for provenance. I think I can agree with that for sure. But I think we've also had a lovely contrast here.
Obviously, authenticity and trust are paramount in capturing this provenance information, but the challenges between the two organizations vary from data quality issues to, I guess, an organizational or culture-change angle. Thank you. Did you have any questions for each other? And this is also your chance, if you're in the audience, with four minutes remaining, to throw in any more questions. Well, maybe one for Matthew: you talked about incorporating different data sets to make a product while keeping some traceability of what changes happened to that data and how it was incorporated. If you could just very briefly explain that, that would be interesting to me. Sure, thanks Carl. So on these charts we have an icon: say we've gone out to some location, we've taken some photos, we've written some observations down in a table, and maybe we've taken some video. We bring those back, and in the chart we simply have a box that says analysis or compilation or derivation or summarization, and then just a few dot points underneath of what actually happened. One of our really good examples is our underwater dive surveys for our marine parks monitoring. The guys go out with the boat and tow video cameras behind it, and they collect fantastic footage of fish feeding on bait. They bring that back into the office and run it through some software that can recognize different species of fish, count them, and put the counts into tables. So we can say: we collected the video and recorded the date and the time, et cetera, and then we processed that and extracted, or distilled, from it a table of species counts, for example.
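The dive-survey chain Matthew walks through (video, then species-recognition software, then a species-count table) is exactly the kind of derivation chain that provenance records capture. As a hedged sketch, with all product names invented for illustration, tracing lineage back through such a chain might look like:

```python
# Hypothetical derivation chain: each product maps to the products it was derived from.
derived_from = {
    "species_count_table": ["dive_video"],
    "species_richness_summary": ["species_count_table"],
}

def lineage(product, graph):
    """Walk derived-from links back to the original source products."""
    sources = []
    for parent in graph.get(product, []):
        sources.append(parent)
        sources.extend(lineage(parent, graph))
    return sources

print(lineage("species_richness_summary", derived_from))
# → ['species_count_table', 'dive_video']
```

This is the query the PowerPoint charts answer by eye: an outsider can follow the chain back and see that the summary rests on a count table, which rests on the original video.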
And then that species count data, that table, goes into our biological observations system, and from there it can either be supplied onwards or go into some kind of understanding of the species richness over time in that particular location. So to capture the processes in there, we simply say compilation or analysis, whatever makes sense for the project, and then a couple of dot points about what they actually did: so they might record, you know, "we used the MacSense software" and "we put the data into such-and-such a database". Again, it's really about whatever makes sense to understand that part of the project, so that somebody external can come and look at this thing and say, oh, I see what's happened: you've collected some video, you've run it through some software, and this is the table that's been output. Again, it's a very human-oriented and user-oriented thing rather than machine-to-machine. You certainly couldn't get a machine to read these PowerPoint charts and then go and access the information, but maybe one day down the track we could get closer to that. So in the remaining seconds, Matt, we have a question from the audience: are those templates that you use available? Are they open source? Yep, they will be. We'll make them available with the recordings from this webinar. They're certainly not polished, and they are internal resources that we use within the agency, so the links to our internal systems obviously won't work, but we're very happy to share the format of them and hope that people can make use of them. Okay, so my apologies for the questions that I didn't get to, but we've run out of time. Thank you very much to Matthew Miles and Carl Monic for joining us today. Thank you also, in the background, to Susanna Bacon, Kerry Leavitt, Julia Martin and Mingfang Wu for helping set up this provenance panel, and thank you to everyone for attending today.