Are we going to be thrown out at six? You have 15 minutes. OK, no problem. Good. Thanks for staying on.

This is a progress report on a project that I also presented at the previous EAA in Maastricht. Today my co-author and I would like to present the status quo of a study that began in 2016 with a pilot project, supported by the Netherlands Royal Academy of Sciences, aimed at improving the quality of archived archaeological field-walking or survey data sets, so that they become reusable in practice and not just in theory, as I will try to convince you in a later slide. Although this work was mainly concerned with archaeological field survey data, particularly from the Mediterranean area, the topic has much broader relevance, which we want to present and discuss here.

Firstly, we all have a moral obligation to ensure that our research data remain available and reusable by other researchers in the long run. So we cannot dump them into some repository in such a form that nobody can actually do anything with them. Secondly, we have research funding bodies, as well as our own employers, the universities, who require that we take measures to safeguard the data that we produce and that we deposit them somewhere where they can be checked for truth, I guess, if necessary. This refers to something that happened in the Netherlands, where we had a major case of fraud and the government decided that research data now have to be put somewhere where somebody can check that they are not based on some fraudulent action by the researcher.

Unlike in some of the hard sciences, we archaeologists tend to have quite individual approaches to producing and storing data, and this leads to great problems if we want to share and even merge data sets.
If we don't properly understand the structure and meaning of other researchers' data sets, then we are unable to join these data and query them in any interesting way. Skip that one.

So I'm probably preaching to the converted in this room, but I want to use the next few slides to remind you of the corner that we have painted ourselves into as field survey archaeologists. We are unable to effectively share and merge our data sets. Thinking about archiving data sets has crystallized around the acronym FAIR: your digital archives must be findable, accessible, interoperable, and reusable. The first two of these only ensure that the data can be found somewhere and accessed. They don't ensure that we can do anything useful with the data. The latter two, interoperability and reusability, are supposed to ensure that those data will make sense to other people, and that there are no technical obstacles preventing them from being reused.

Just to illustrate this, I'm using some current archiving practices. Currently, the Dutch institutional digital repository for archaeologists is with DANS, Data Archiving and Networked Services. That's an institute run by the Royal Academy that's responsible for the management and sustainable maintenance of digital data archives. Within DANS, there is the so-called e-Depot for Dutch Archaeology, EDNA, and here we have a look inside EDNA. This is the EDNA starting page for one of the Italian data sets deposited by my institute, with just a general description of the project. Here's the description, which uses the 15 Dublin Core metadata descriptors, of that project. And finally, here we have about 725 individual data files produced by that project. This is what the archive is composed of. So this particular archive is findable through the website of EDNA. The data are accessible if you have the proper rights as an academic scientist.
But the interesting thing is that nobody has ever requested this data set to do anything useful with it, except us, because we wanted to run a test on it. So why has nobody requested it? That is, I think, because other people's data are a pain to work with. I think you gave an example earlier on. Basically, you give up after a while because it's too difficult. When we requested another survey archive from EDNA to investigate whether the information supplied would allow us to understand the data set, we very quickly had to give up. So current best practice, as shown by this EDNA system, can be described as fulfilling the FA principles, but not the IR principles.

If we look at this internationally, we can point to, for example, something called Fasti Online, which is a service built on what used to be called MAGIS. Many other groups and organizations have produced survey data in the last 50 years, and it would seem logical that those data should somehow be mergeable and reusable for large-scale analysis. This Fasti Online database shows the status quo of archiving of field surveys in the Mediterranean area. Here we see Italy, for which some 120 projects have been recorded since the 1960s. That is quite a large number, but many of these were never or only partially published. So the only thing we actually have here is a pointer to the fact that the project exists. Digital survey archives are very rare: some of our own surveys and some other ones that I won't mention here. So there's only a small number of actual archives available through this portal. Here, the data appear to be findable, but mostly not accessible, let alone interoperable and reusable. So as a discipline or a subdiscipline, we are doing very poorly.

The original CIDOC CRM from the museum world produced a relatively small set of concepts of general applicability. We've seen this diagram before. For example, actors and events.
Domain experts have extended this system using more specialized concepts. In this diagram, we have the visualization by the EU project ARIADNE, which includes various extensions that are of interest to archaeologists, like inference making, geographical concepts, building architecture and excavation. And of course, if we want to do something with field survey, it should come in somewhere over there, preferably inside the CRMarchaeo extension. So then we have to make a set of concepts about survey archaeology.

We started out with some test databases containing field data and ceramic data gathered during recent field surveys by our own institute at the University of Groningen in central Italy. These surveys have been ongoing since the late 1980s, and the extent and intensity of this type of research has increased considerably since the late 1990s, so that our department now owns several different extensive data sets containing survey records. As you saw earlier, we don't believe that the DANS archives contain de facto interoperable and reusable data sets. Luckily, we received some funding to investigate steps in the direction of creating an ontology for survey data. I'll skip through this more quickly because you've seen similar material in earlier presentations already.

We started out by trying to analyze what it actually is that we're doing as survey archaeologists, so that we can identify the different types of activities that we undertake, the different types of actors that do these things, the places in which we do these things, and other types of concepts that seem to be necessary in order to describe field-walking surveys. We visualized this in this kind of process model, which shows that we have three stages of research. There is the actual field survey, the field work, which is the central part. That is followed immediately, or even accompanied at the same time, by artifact studies: the artifacts that come out of the field are being processed by specialists.
And then, also at the same time or later on, there's a lot of data interpretation going on, very poorly controlled. As one example, it could be that sites are defined afterwards on the basis of the data collected in the field survey. So that was our understanding of what we were doing.

The next step would then be to somehow link the concepts that we have just found for ourselves to existing concepts in the CIDOC CRM or its extensions. We don't want to invent all-new concepts if there are existing concepts to use. So here's an example: a hierarchy of concepts in the CRM, and here we have a few of our own concepts that we try to link to one or more of these existing concepts. I won't go into the details of this slide; it is just to show you the step in the process. Sometimes this works nicely and you can immediately understand, OK, this is the concept that we need in order to describe this part of our own data. But in a lot of cases we have doubts, or we don't know at all which concepts to use, or we think we need completely new concepts to describe parts of our data. So there are a number of issues to resolve, and these were discussed briefly in a meeting earlier this year of the Special Interest Group for CRM.

In view of the time, I cannot go through these in detail; I'll just do one or two of them. We have the problem that if you do field walking, you don't just make observations about the field, you are also collecting things at the same time. So you cannot say that field walking is a type of observation; it's more complicated than that. We also have a lot of problems with the fact that we make temporary groups of, for example, finds. Let's say I have here 12 amphora sherds that come out of a bag of sherds. I actually make that group very temporarily when I describe all the finds in this bag, and when I finish describing them I put the sherds back into the finds bag. So they are no longer a physically separate group.
So we have temporary aggregates that exist very briefly, but we also have data about these temporary aggregates that we need to model somehow. We don't know how to deal with these, so we put the question in the form of a questionnaire. OK, so I'll skip the rest of these issues. I thought seven was a nice number to stop at.

Supposing that we can resolve all of this with the help of the Special Interest Group, who answer all of our questions, then the next step would be to map our own and other people's databases to this set of concepts. We haven't got very far with this yet, but during this very conference I'm going to speak to three different survey data set owners and see how far we can get with this mapping. So I guess that means that at the next EAA I should be able to tell you more.

So what does it mean to map? Well, basically you look at what is inside your own database and you define how that maps to existing CRM concepts. For example, here's one of our own data tables, and in the red ellipse you see some detail. Let me see if I can find it quickly. The first three fields of this table that I show here contain information about the survey unit identifier, the type of survey unit, and the administrator responsible for taking notes. We would like to find the appropriate concepts for being able to share these data with colleagues. Our mapping document then defines how the database field content translates to appropriate CRM classes and specifies the relationships between these things. Thus, the unit ID in our table called units is a so-called Identifier (E42), which identifies (P1) a particular survey unit, which itself is a Declarative Place (SP6). That is the process of mapping. Of course, I don't want to say this is the correct mapping; it is an example of how you would map a little part of a survey database.

Now, because I'm running out of time, we go to the outlook. Let's do it like this.
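As a rough illustration of the mapping step described above, and not the project's actual mapping document, the unit-ID example could be sketched in Python as follows. The base URI, the column name `unit_id`, the sample value and the `crm:`/`crmgeo:` prefixes are all assumptions made for this sketch. Only the first field is mapped, because the talk does not say which classes the unit type and the note-taker would map to. The property is written in its forward direction, P1 "is identified by", which is the inverse of the "identifies" reading used in the talk.

```python
from typing import Dict, List, Tuple

# A triple is (subject, predicate, object), all as plain strings.
Triple = Tuple[str, str, str]

# Hypothetical base URI, for illustration only.
BASE = "http://example.org/survey/"


def map_unit_row(row: Dict[str, str]) -> List[Triple]:
    """Translate one row of a 'units' table into CRM-style triples.

    Assumes the row has a 'unit_id' column (an assumed column name).
    """
    unit_uri = f"{BASE}unit/{row['unit_id']}"
    id_uri = f"{unit_uri}/identifier"
    return [
        # The survey unit itself is typed as a Declarative Place (SP6).
        (unit_uri, "rdf:type", "crmgeo:SP6_Declarative_Place"),
        # The content of the unit ID field is typed as an Identifier (E42).
        (id_uri, "rdf:type", "crm:E42_Identifier"),
        # P1 links the survey unit to the identifier that identifies it.
        (unit_uri, "crm:P1_is_identified_by", id_uri),
        # Keep the raw field content as a human-readable label.
        (id_uri, "rdfs:label", row["unit_id"]),
    ]


if __name__ == "__main__":
    # "U-0042" is an invented sample value.
    for triple in map_unit_row({"unit_id": "U-0042"}):
        print(triple)
```

Keeping the mapping in one small function per table makes it easy to compare against another project's table: if both produce the same classes and properties, their output can be merged and queried together.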
So as I said, we're currently testing our ideas on other people's data sets, because we need to know whether they're sufficiently robust. If they work for us, we cannot yet be sure that they work for other people's data sets, so we have to test this. Also, when we've done this, we still haven't done anything useful. We have to be able to share the data with other people, so we need to take further steps to convert our own database into linked open data that can be accessed over the internet and merged with other people's data. Thirdly, if we can do what we want to do and make a good CRMarchaeo conceptual model, then that could contribute to better archiving practices. The current archives don't use any ontology, and maybe we can sell this idea of an ontology to digital archives like DANS and EDNA, and they can apply it to their own archiving.

Lastly, as I already proposed last year in Maastricht, if we want to be successful in asking for more funding for developing this conceptual reference model, then it might be useful to obtain the political support of the EAA. So I've been trying to get this support by mailing with the president and some of the secretaries. It hasn't got anywhere so far, but I'm hoping to use this opportunity here to get one step further. Thanks very much.

That's all we have time for.