So, this is of course what we're trying to build, and I'm assuming you've seen a number of these slides before. Data capture used to be a jigsaw puzzle; it's now this. You don't want to say the data commons is a jigsaw puzzle. What I really want to talk about is what happens with the research data, and I'm assuming Claire is going to talk about the two brown things in the right and top quadrants. You've seen this one? That's a jigsaw puzzle. That's another jigsaw puzzle. So what I'm hearing is that you're deconstructing the things I've been working on, but that's probably good.

The point is that in the curation continuum, data capture really happens down here, in what back then I called the private research domain. It's the stuff that's coming out of instruments and laboratory information management systems and getting captured into there. I want to structure the few minutes I've got this afternoon around the data sharing verbs. Is that a jigsaw? No. Excellent. Next time, in March. It could be.

So, how many people here have heard of the ANDS data sharing verbs? Wow. Okay. Excellent. I'm not going to talk about the data sharing verbs much, but they are one of the ways we think about the things that need to happen in order for data to be reused, and the names of some of our services are modelled on the verbs: Identify My Data, Register My Data, Describe My Data and so on. Everything outside the verbs, these green things in the middle, is planning. I'm contractually required (Margaret is sitting on my right) to talk about planning. The planning has to happen in order for this to work. And down the end we've got the Preserve verb, which we haven't talked much about yet. We've been taking the cop-out of "oh, we have to finish by the middle of next year, we haven't got time to worry about preservation."
I don't think that's any longer a defensible argument, so we are going to have to start worrying about preserving.

So, creating or capturing: increasingly, of course, the way data is generated is by capturing it from something rather than creating it out of thin air. Ross and I, in a paper we wrote for eScience a couple of years back, made the argument (I don't think we were the first to make it) that the closer you can get to the point of capture when capturing the metadata, the cheaper it is to do. In other words, pushing it further down the track makes it harder and more expensive. So one of the things we've been trying to emphasise in the data capture program is putting in place infrastructure to make that possible, and as you know we're investing $12 million directly with the institutions through the expression of interest, and another $5 million that we're proposing to spend in a variety of thematic ways.

Another way of thinking about it is this, which I suspect is also not a jigsaw puzzle. Good, excellent. This is the diagram we worked up for that eScience paper, and essentially it looks at how you might go about reducing the cost associated with metadata. You have an experiment, it generates some data, that data goes into a laboratory information management system, you add some experiment metadata, you perhaps migrate it across to a collaborative space (this is the shared domain in terms of that curation continuum), which makes some context explicit, and you finally deposit it into some kind of public repository, where it can now be discovered. Down the bottom here, you try to auto-generate your technical metadata rather than asking the researcher for it.
Here you might try to extract some experiment metadata from the laboratory information management system, here you might insert some context that you've already got, and here you might provide some discovery metadata. And you can see the idea we were talking about earlier this afternoon: part of that discovery metadata might be a link to the publication. In terms of the kinds of metadata I'll talk about briefly in a little while, and more tomorrow: this is what you might call information for reuse, the stuff that would let you reuse the data; that is information for determination of value, the rich context around the data; and the discovery metadata is obviously information for discovery.

So we've got the Create verb. The Store verb: we don't do storage, but we care that it happens. We've got the advantage in Australia of the Australian Code for the Responsible Conduct of Research, which places obligations on researchers and on institutions, and clearly we're interested here in working with institutional and national data stores and in integrating with the new national data fabric, whatever that looks like. Florian has talked about the data fabric environment we have now; there's this extra $50 million that Ross talked about, which is going towards new national data storage infrastructure, and ANDS clearly cares about how that's going to pan out.

Once you've stored something, you need to describe it. I'm going to spend more time on metadata stores tomorrow, so I won't say much about this now. There are the four kinds of information I referred to before: discovery, determination of value, access and reuse. You'll note that some of those might apply most obviously at collection level, and some might apply at both collection and object level; again, that's a theme I'll return to tomorrow.
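The enrichment pipeline just described, with technical metadata auto-generated at the point of capture, experiment metadata extracted from the LIMS, context made explicit in the shared domain, and discovery metadata (including a publication link) added at deposit, can be sketched roughly as follows. All the field names, instrument IDs and identifiers here are made-up illustrations, not ANDS schemas:

```python
import hashlib
import os
import tempfile

def technical_metadata(path):
    """Auto-generate technical metadata from the file itself,
    rather than asking the researcher for it."""
    with open(path, "rb") as f:
        data = f.read()
    return {"size_bytes": len(data),
            "checksum_sha256": hashlib.sha256(data).hexdigest()}

def enrich(record, stage, extra):
    """Attach one stage's metadata to the accumulating record."""
    merged = dict(record)
    merged[stage] = extra
    return merged

# Stage 1: instrument output is captured; technical metadata is generated
# automatically from the data, not requested from the researcher.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"raw instrument readings")
    raw_path = f.name
record = {"technical": technical_metadata(raw_path)}
os.unlink(raw_path)

# Stage 2: experiment metadata extracted from the LIMS (hypothetical values).
record = enrich(record, "experiment",
                {"instrument": "NMR-01", "operator": "jsmith"})

# Stage 3: migration to the shared domain makes context explicit.
record = enrich(record, "context",
                {"project": "Example Project", "eoi": "EOI-042"})

# Stage 4: deposit into a public repository adds discovery metadata,
# including a link to the related publication.
record = enrich(record, "discovery",
                {"title": "Example dataset",
                 "publication_doi": "10.0000/example"})

print(sorted(record))  # ['context', 'discovery', 'experiment', 'technical']
```

The point of the sketch is only that each stage adds its metadata where it is cheapest to obtain, so the record arriving at the public repository already carries the technical, experiment, context and discovery information.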
You also need to identify things, and of course we have the persistent identifier services, the Identify My Data service, which gives you a level of indirection to assist with long-term access: you provide a handle or a DOI, and you can then repoint it as the object moves around. The handle version of Identify My Data is available now, with both machine and human interfaces. ANDS, thanks to the sterling efforts of Adrian Burton, who is going to be up late tonight talking to the DataCite conference happening in Germany at the moment, was a foundation member of the DataCite consortium, which is looking to produce DOIs for data objects. Obviously, the fact that we're going to be able to produce DOIs for data is going to assist with that linkage between publications and data that we talked about before: DOIs for publications and DOIs for data together will help with that.

Those are the only verbs I want to cover; the remaining verbs are not directly relevant to data capture and will probably get covered by other people.

I did want to talk a little about the data capture EOI process, because I understand we could perhaps have been clearer about what the steps were for getting from the EOI to a signed contract. The letters went out in November, and most of the institutions got a visit, typically from Ross and me, and sometimes some other people. The institutions then worked with their assigned ANDS BA to refine the project description, and that's an iterative process. Typically there's been a first draft, which was sent either to the BA, or to me, or to some of my staff, to comment on; we've sent it back with some suggested changes, so there have been a few goes around that loop. Andrew Williams in particular bears the scars of some of those goes around the loop. But the point is that when you fall out of that loop, you have a project description that's in reasonably good shape.
I would like to think that, you know, I added a tiny amount of value, Andrew. At that point we send it off for independent assessment, and the reason for that is that we need to be able to say to the auditors that we weren't just handing money out willy-nilly to our friends; there was an independent assessment process. That's typically done by a couple of my staff, who will look at it independently. One of them in particular is very, very good at what she describes as being a hole-puncher: she's good at finding problems with things, which is a rare skill, and she's very good at it. As a result, the assessment that comes out has in every case so far identified problems. There has not been one data capture project that has got through this phase unscathed. So if you get told that there are problems, it's not that we're picking on you; we're picking on everybody. I'm sure that makes you feel better, Andrew. Frankie?

"I have a quick question. Because there are so many loops, how do you know whether you're at the loop stage, where you're still working with the ANDS BA, or at the independent assessment stage?"

If you've got to the stage where whoever you're working with has said "I think this is ready to go off to be assessed", that's how you would know, I guess. There is a tracking spreadsheet that we maintain internally that says where we're up to on those.

"Can you follow that up and find out?"

Yeah. According to the current iteration of the process, and it has been through almost as many iterations as the project descriptions, possibly more, you are supposed to get a letter from ANDS telling you that we're about to do the independent review.

"A formal letter?"

There's an email from ANDS. Right. Okay.
It's not a letter with a proper signature and letterhead, but it is supposed to be an email from the BA saying "we think we've agreed; we're passing it on for independent assessment". No problem, mine came. But it may also not have come, because it's a relatively recent step in the process; people were having exactly that confusion. I was being asked "I don't know where I am in the process", and we were making the lives of people like Andrew difficult by not letting people know that that was happening. And because it was fun. Well, you know, that's part of what being the Director of ANDS is all about, making people's lives difficult, but on the whole we try to do that only to our own staff. Yeah, that's true.

So it goes through the assessment process. The assessments then come to me, and I don't always agree with them. Sometimes the people doing the assessment will say "we can't fund this" and I'll say "well, no, we can". So sometimes there's a loop there where I'll bounce it back to them. Finally, what falls out of this, once the project description has been improved, or hopefully improved, as a result of all that, is that the BA drafts an approval recommendation, which I sign and send off to Ross. At least sometimes at that point the answer is no: we have an instance at the moment where something has gone to Ross and he has said "well, I'm not entirely happy about this", which is why there's a loop on the end. Once Ross signs off and says "yes, I'm prepared to approve the expenditure of $875,000 with this institution on these things", or whatever the amount is, we then move into contract processing and finance processing. I can't give you times for any of the steps with loops on them, obviously, because it depends how many times around the loop you go. Nor can I give you times for these last steps.
The reason I can't give you a time for this last step is that people's ability to lose contracts is truly astonishing. The current record is held by a nameless institution that managed to lose their contract three times, which I think is, you know, just trying too hard.

"So what was the mean time for this step? Was it measured in, what, 90 days, 95 days?"

Something like 90 days, because of the back and forth. It depends, first, on whether people lose their contract, and that does happen surprisingly often. Then on whether they manage to lose it within the institution: so someone gets a copy, says "yes, we've got it", sends it on to the lawyer or whoever needs to look at it, and it gets lost there. And then on how much value the lawyer at the institution decides to add. Some of them are quite happy; some of them decide they want to go right back to first principles and question the contract. We have had people question Monash's contract with DIISR, which is the overarching contract under which all of these are funded: "we think you, Monash, shouldn't have signed this deal with the department that gave you $48 million". Well, okay, fine. Thanks for that. Probably not. So it is a slow process, and we are finding that it takes a lot of probing at our end to actually get the contracts back from wherever they have gone. I say this not to depress you, but simply to point out: do not assume that once Ross signs the letter, the money will flow immediately. I'm just managing expectations down.

Is this my last slide? Yep, sure. Okay. So, status update, second-last slide. Twenty-three institutions got data capture funding. There will not be a second EOI round. Let me just say that again: any of you thinking that there will be a second EOI round? There won't be. There's some money that we haven't spent yet, but it's not going to be an EOI. Over twenty have started discussions about what they want to do.
Over ten have near-final proposals; two, I think, have signed a contract. And as Ross has indicated already, over the long term, and by the long term I mean a year out, data capture is going to shift from commissioning to adapting, adopting and deploying. So the kinds of things we're looking to do in data capture this year are not what we're going to be doing in the two out-years, and that's going to be an interesting shift. That's why we're going to be pushing hard, not to give you grief, but pushing hard the idea of reusable software: not something that's specific just to your particular case.

And of course, this is why we're doing it all: because we care about researchers. This is serendipitously stolen from today's First Dog on the Moon. I'm assured that the titles at the top are actual research that the Department of Innovation is currently funding: "Temperature-specific adaptations of Antarctic octopus venoms". But I'm assuming that the gloss at the bottom is all First Dog on the Moon's, and bears no relationship to reality. And I'll stop there.