I guess from my point of view, first it's exciting to see the range of stuff that's happening, and it's even more exciting to realise that this is just the stuff that's happening in Victoria. Of course there are similarly exciting things happening around this vast brown sunburned land of ours, so that was actually really fun. What I want to do now is provide some context. Now for those people sitting at the back, can you hear me over the air conditioning? Because it is a bit of a rumble. Good. I want to provide some context. You will have heard on a number of occasions this morning people saying we want this to end up in the Australian Research Data Commons. And when we talk to people we detect some confusion about the relationship between the Australian Research Data Commons, the Collections Registry and Research Data Australia. Possibly in the past that may have been because we were confused ourselves about the fine distinctions between them. No, no, no, I'm getting head shaking. Alright, possibly it's because I was confused or because the message was a little bit too nuanced. So I'm going to have another go at explaining the difference. Hopefully this time the new improved message will work a bit better. If it doesn't, bail me up at morning tea and say no, that still doesn't work for me. So I want to talk about a thing that we're starting to refer to internally as the four transformations, which sounds a little bit, you know, China 1960s cultural revolutionary, but hopefully won't involve too many struggle sessions. I want to talk about the Australian Research Data Commons. I want to talk specifically about the role of the Collections Registry, and then finish off by talking a little bit about Research Data Australia, but not too much, because otherwise Sally, who's going to have a longer slot later on in the morning to talk about it, will kill me.
So the way that we've just started talking about what we're trying to do in ANDS is in terms of transformations. So part of the challenge for us is boiling down this, you know, huge complex swathe of activity into a series of very simple messages. So these messages are sufficiently simple that they've been tested by the steering committee, who all say yep, that works for us. And so you're going to see this set of transformations appearing more and more in the kind of ANDS branding. So what we want to do is move from the world on the left to the world on the right: from things that are unmanaged and disconnected and largely invisible, and that only get used once, typically by the researcher who created them, to things that are managed and connected and findable and reusable. I should parenthetically insert here, by the way, that I lost the argument inside ANDS to have the opposite of reusable being re-useless, but I haven't given up. I'm continuing to try and get this up as a meme. It just hasn't happened yet. You'll notice, by the way, that on the left we're talking about data and on the right we're talking about structured collections. So the emphasis for ANDS is on coherent collections of data, and we've got some exercises later on this morning that are looking specifically at how you might define what a collection is. And why are we doing all of this? We're doing it so that Australian researchers can discover and access and use or reuse the data. So that's, if you like, the very high level message on what it is we're shooting for in ANDS. So what does that mean in terms of these components that we've been talking about? So the phrase that people kept using this morning, and that was great, I'm thrilled that people are using it, was the Australian Research Data Commons, and the current version of the business plan, which is going to be out real soon now, I hope, says that the Australian Research Data Commons is the combination of a set of things.
It's shareable research collections, some held by institutions that you work for, some held by other national agencies. It's the descriptions of those collections. It's the relationships between not just the data but the context around the data, and I'll talk more about that this afternoon: the researchers that produced it and the projects they worked on and the instruments that they're using and the institutions they work for, and you'll have seen all of those components already in the talks this morning. And it's the infrastructure that's needed to make all this work. So all of that is what we mean when we talk about the Australian Research Data Commons. It's not just the particular bits of software that ANDS is providing. It's not just the data that you're feeding in. It's this whole rich context, and one way to think about that is to think about the situation as it exists now at most institutions. So within a lot of the institutions that you work for and that I used to work for there is a lot of locally managed data and some institutionally managed data, and in fact the size of the balloons is probably inversely proportional to the reality. At most places there is much more locally managed data than institutionally managed data, and by locally managed I mean the kinds of things that give Gavin McCarthy conniptions, you know, floppy disks in drawers, stuff on yellowing printouts, USB drives, things on CDs and DVDs without any labels on them, stuff on the laptop of the PhD student who left last year, and so on and so on and so on. Is that an archivist's dream or an archivist's nightmare? An archivist's reality, okay. So there's a lot of that stuff happening up on the left. There is some stuff that is institutionally managed, and that's growing. Some of the things we're funding are contributing to that, and so you've got data that's going into an institutional data store or multiple institutional data stores.
There's some context, again institutionally managed context, information about the research grants at that institution or the researchers at that institution, and that typically either ends up in an institutional portal or ends up being made visible in an institutional portal, not very often. In some cases it gets fed out to discipline portals, so particular disciplines have got well organized systems for exposing data. So if you work in genomics you know that you have to put your stuff in GenBank. If you work in protein crystallography you know you have to put your stuff in the PDB, but those are the exceptions. For many, many disciplines there isn't this kind of well established structure. So if that's the world that we currently have, what's the world we're trying to move to? What we're hoping to do in ANDS is add a series of components that will get us towards those four transformations that I talked about. The first of those components is a metadata store to manage that context better, and you'll notice that's now in the institutionally managed box, and I'm going to talk this afternoon about our goals for metadata stores. We're adding the notion of collections descriptions and we're adding this thing called Research Data Australia, which if you like is a kind of an ANDS portal. I should say, by the way, I've just realized that this is a slightly old version of this diagram. Ross Wilkinson, the Executive Director of ANDS, and I were having an argument about how this diagram in my view oversimplifies the world, and he said no, you idiot, that's precisely the point, it's about making it simple. So I said look, can we just add one more line, please let me add one more line, and I managed to win that argument, and of course I've forgotten to put it on this diagram. The missing line is the line between the ANDS portal and discipline portals, because we want to take people from our discovery environment to those discipline-specific discovery environments where that makes sense.
So the IMOS portal, the Australian Ocean Data Network portal, the AuScope portal and so on and so on and so on. So there is actually a link between those, you know, if you can just draw a line between those on your printouts, and I'll try and make sure that the version we upload has that line added. Some of you may have seen this version of the diagram. By the way, Ross likes that one and doesn't like this one, but I'm going to use it anyway because it enables me to talk about some things that that diagram doesn't talk about. That one is very much an institutionally focused way of viewing the world. I mean it very much says these are the things that need to happen inside research producing institutions to get this transformation. But ANDS is not only concerned with research producing institutions, we're concerned about a range of other groups. So down the bottom left, well, the bottom and the left, the stuff in green, those are if you like the sweet spot that most people think of when they think of ANDS. Research data down the bottom that's being produced by researchers and research institutions, and on the left hand side these big discipline investments like the synchrotron and [unclear] that you've heard about this morning. So those things are, if you like, research outputs. But in addition, on the top and on the right hand side are what we talk about as research inputs. That is, things that are being produced by organizations that researchers could use if they only knew about them and knew how to get to them. So on the right hand side, public sector operational data, stuff that's being held by government departments for their own operational purposes, but that researchers could use. So the Department of Primary Industries in Victoria goes out, works with farmers on soil salinity.
They measure salinity in particular paddocks, make recommendations to the farmers that, okay, in this particular paddock, you've got a salt buildup, you should sow this variety of wheat rather than that variety of wheat. That data at the moment is held on desktops within various DPI offices, but it could be useful to researchers if it was made visible and available. And up the top, the cultural collections agencies, galleries, libraries, archives, museums, again, data that they hold, data collections that they hold and manage that researchers could make more use of if they could get access to them and knew about them. And so we're interested in making all of that stuff available for people to discover. So if that's the Australian Research Data Commons, what's the role of the Collections Registry? Well, the Collections Registry is a particular bit of infrastructure that we run centrally, that holds information about collections that we get fed by you and by those other groups. And some information about associated entities, I'll talk more about the entities. In fact, Sally, when she talks about ISO 2146, will talk more about the entities. It's mostly populated by data feeds, and you've heard a number of references to feeds this morning. And its primary purpose is to enable discovery so that people can find and access and reuse data. The way it's populated, and I'll talk more about this this afternoon when I talk about metadata, is via feeds using a protocol called OAI-PMH, which I'll be talking about. And the preferred payload is this thing called RIF-CS, the Registry Interchange Format - Collections and Services. But we can also do an HTTP GET rather than OAI-PMH. PMH is preferable for reasons that I will talk about. And we can also take XML in a variety of formats and transform it. But those top two are the preferred mechanisms. What we're doing with this, and you're not meant to be able to read the small print on this, don't worry.
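[Editor's sketch, not part of the talk.] The feed mechanism described above can be pictured roughly as follows. This is a minimal sketch of the harvesting side of OAI-PMH, not the registry's actual harvester. The `ListRecords` verb and the `metadataPrefix` and `resumptionToken` arguments come from the OAI-PMH 2.0 protocol; the base URL, the record identifiers, the sample response and the prefix value `rif` are assumptions made up for illustration.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url: str, metadata_prefix: str = "rif",
                     resumption_token: Optional[str] = None) -> str:
    """Build a ListRecords request URL for a provider's OAI-PMH endpoint.

    The default prefix "rif" is an assumption; each provider advertises
    the prefixes it supports.
    """
    if resumption_token:
        # In OAI-PMH, resumptionToken is exclusive: it replaces metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_page(xml_text: str) -> Tuple[List[str], Optional[str]]:
    """Return the record identifiers on one response page and the
    resumption token for the next page (None when the list is complete)."""
    root = ET.fromstring(xml_text)
    identifiers = [el.text for el in root.iter(OAI_NS + "identifier")]
    token_el = root.find(".//" + OAI_NS + "resumptionToken")
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Made-up response page, trimmed to the parts this sketch reads.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.edu.au:collection-1</identifier></header></record>
    <record><header><identifier>oai:example.edu.au:collection-2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    ids, token = parse_page(SAMPLE_PAGE)
    print(ids, token)
    print(list_records_url("https://example.edu.au/oai", resumption_token=token))
```

A harvester would fetch the first URL, parse the page, and keep requesting with the returned token until none comes back. A plain HTTP GET feed, by contrast, returns the whole document in one response with no paging.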
What we're doing with this is we're using the Collections Registry to build a series of web pages that we call Research Data Australia to enable discovery. So it's from feeds off on the left into the Collections Registry, building this set of interconnected web pages to enable a variety of discovery mechanisms. And those interconnected web pages are for researchers and research projects and collections and organizations and services, all linked together to build this rich, complex, interconnected mesh.
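[Editor's sketch, not part of the talk.] One way to picture the interconnected mesh is as a graph: each registry record becomes a page, and each relationship becomes a link that can be followed in either direction. The sketch below is purely illustrative and does not reflect how Research Data Australia is built; every key, title and relationship in it is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical records: key -> (kind, title).
RECORDS: Dict[str, Tuple[str, str]] = {
    "coll-1": ("collection", "Soil salinity measurements"),
    "party-1": ("researcher", "A. Researcher"),
    "proj-1": ("project", "Salinity mapping project"),
    "org-1": ("organization", "Example University"),
}

# Hypothetical relationships between records, stored once each.
LINKS: List[Tuple[str, str]] = [
    ("coll-1", "party-1"),
    ("coll-1", "proj-1"),
    ("party-1", "org-1"),
]


def build_mesh(links: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Turn one-way relationship pairs into a two-way adjacency map,
    so every page can link back to the pages that point at it."""
    mesh: Dict[str, Set[str]] = defaultdict(set)
    for a, b in links:
        mesh[a].add(b)
        mesh[b].add(a)
    return mesh


def page_links(key: str, records: Dict[str, Tuple[str, str]],
               mesh: Dict[str, Set[str]]) -> List[str]:
    """Titles of the pages linked from the page for `key`, sorted."""
    return sorted(records[k][1] for k in mesh[key])


if __name__ == "__main__":
    mesh = build_mesh(LINKS)
    print(page_links("party-1", RECORDS, mesh))
```

Because the links run both ways, someone who lands on a researcher's page can reach that researcher's collections and organization, and someone who lands on a collection can reach the people and projects behind it.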