Thank you. What I want to do is talk about the fact that this thing called the Research Data Alliance is in existence and is becoming quite important. Just six months ago the Research Data Alliance was launched in Gothenburg in Sweden, and we started something about sharing data without barriers around the world. No mean feat, and yet incredibly important and completely in alignment with what we're doing here in Australia. Australia was one of the three founding members of RDA, along with the European Union and the US through the National Science Foundation. So it had some pretty heavy hitters behind it, and people turned up to the first meeting with a degree of enthusiasm, looking forward to what might be. From Australia, Stephanie Kethers, Andrew Treloar and myself have been involved in helping to form RDA. But really the thing about RDA is what it's going to do, and I'll get to that in a moment. What it's intended to do is to enable data to be shared more easily and rapidly. And the reason for that, of course, is that the data environment we're in is changing really quite quickly. Australia's made substantial investments in its research data through investments like IMOS and TERN, but also through more generic investments such as RDSI and the investments that many of the organizations sitting around this table have made. CSIRO has made some substantial investments, but many of the universities have as well, and right around the traps we're seeing that the need to treat data differently has arisen. So what I wanted to do today was to give you an update on the second plenary. This was an interesting thing. I have used Fran Berman's slides as the basis for this presentation. Fran is one of the co-chairs of the Research Data Alliance council, and I spend many evenings talking about RDA with Fran.
But the notion of three days of peace, love and data is reminiscent of the first meeting and the notion of the Woodstock of research data occurring. And it was very interesting. So we had 368 participants from 22 countries and all sectors. Huge amount of interest in this. We had quite a lot of significant country interest in participating in RDA at the funder level. We had interest from Germany, South Africa, France, England, China, South Korea, New Zealand, Taiwan, and the list does in fact go on. One of the interesting things about RDA is that it clearly has momentum, and so DataCite convened a data citation summit over the following two days, and those of us with energy managed to attend that. So the nature of the meeting was essentially a working meeting, but it started out with some keynotes. Tom Kalil from the Office of Science and Technology Policy at the White House spoke. The significance of that is that it's saying that right at the very top of government throughout the world we have people interested in these issues. So the first plenary was opened by the deputy commissioner of the European Commission, whose name briefly escaped me. We then had Tom Kalil speak, really backing up the comments made by Neelie Kroes, which is the name that was escaping me briefly. So we had the US speaking after its major announcements about publicly funded data being publicly available, whether it's from the research sphere or from the government sphere. We then had, as per usual, an inspiring talk by John Wilbanks, who is known for his work around the Creative Commons, really driving the agenda that says that it's in everybody's interest, including providers, to make that data available, and then Carole Palmer providing an insight into some of the issues around making that data available from a library's perspective.
We had Claire McLaughlin, listed as from the Australian Embassy but most recently responsible for getting the interest to be approved through government. We had a bunch of people talking about why they saw it was of interest to work in participation and cooperation with RDA, from CODATA, ECF, W3C, DataCite, etc. So what this slide says is that there's a significant degree of interest at the funder level in RDA. Now this one says that there's a lot of people involved. 50 countries have participation in RDA: academics, research sector, private sector, public, and there's a few people who are not known in that space. But the key is really that RDA is building momentum. So RDA is intended to enable work to be done and delivered rapidly. I guess we use the metaphor of building bridges between data locations, enabling people and data to connect through the bridges. Now sometimes those bridges are going to be essentially technical in nature, and sometimes they're going to be more social in nature. You know, how do you in fact encourage those sorts of things? But one of the things that is the hallmark of RDA is that the work needs to happen fast. RDA is not set up to be a 50-year initiative. We have the notion of a 10-year initiative, and what we're trying to achieve is to get consensus and data to be shared amongst those groups quickly. What we want to try and do then is to put together, in particular, working groups who will deliver results within 18 months. So what we're trying to do is follow, roughly speaking, the IETF model of groups formed as a result of people coming together and saying we need a problem resolved, working on that problem, coming up with a resolution, and then delivery. So there were lots of birds of a feather sessions established there.
There were groups starting to look at how they were going against the delivery of promises that they made six months ago, because that's one-third of the way through to the end of the group, using the model that we're talking about. There was a whole lot of stuff that came out of that. One is to ensure that the IP that is developed in the interest and working groups is protected so that it is able to be used by all, and also how do we in fact enable the RDA activities to be delivered. But the pictures on the right are the heart of what RDA is. It's not about people sitting in a big auditorium listening to people talk. It's about people doing, and so these were the 360 people turning up at the meeting, doing. So RDA is intended to be a doing group. So what is it doing? This. So these are the things that emerged during the meeting, prior to the meeting and at the last meeting, of areas where there was a decision made by people who practically came along and said, yes, I want to put some effort into thinking about or working on this, to come up with what might happen. What I want to do now, before I go into the detail of these, is distinguish between the various bits. So the working group is intended to be a short, sharp piece of effort that tackles a problem and comes up with a solution that is adopted. Adoption is the key. So what we wanted to see was work that was done and delivered and then used. So from an Australian perspective we needed to know whether the work that was being done in this space was delivering. So we'll come back to that in a little while. The interest groups were where perhaps there were a number of pieces of work that could be done, but there was an interest in the longer term in these areas and there was work needed to be done in this space. So there was, if you like, an area for conversation, with the aim to take those pieces of conversation and spin off working groups that would deliver.
Now for some of the activities that might make sense, let me give you a good example with the community capability model, which is really describing how ready areas are to engage in that space, where in some sense the output is less a specific example of data being shared and more an example of ideas being shared and the mechanism that would enable those things to occur. But I think you can see from this that there are plenty of groups gathering together to discuss related issues. There were a bunch of people who'd been through the hoop that is establishing a working group, which is a well-described statement of work and, importantly, a notion of what will be delivered and then, even more importantly, who will take that up. There was a much wider group of interest groups around the spaces, and I'll talk about that a little bit, and then, as indicated, there was a large amount of effort put into the data citation activities in particular, and then a data citation meeting right at the end by DataCite. Absolutely all it requires is going to the RDA website, the Research Data Alliance site, and then diving in there and expressing interest, and there are chairs of all of those different groups who will enable you to participate. The intention of RDA is that the plenary meetings are a small part of the work of RDA; the bulk of the work of RDA is occurring through these discussion groups. It is, I think, important that Australia starts to get more involved in this. Australia's got heaps to offer, but more importantly Australia's got heaps to benefit, because Australia has been so actively involved in doing. One of the things this gives us is an opportunity to implement things that are internationally agreed, and secondly it gives us a chance to test the proposition that the approach being used at a particular institution will stand up in the international environment. Overall, these things give you an indication of where the interest is around the world.
One of the things I really would say is that there's not only an opportunity to get involved in the current things, but if there's something that's important around research data that isn't well represented currently, you should create a group. One of the things about that is that ANDS will work with you along those lines to support your involvement and engagement in that. I've indicated that Andrew, Stephanie and I have been involved in the organisation of RDA, but we're equally interested in supporting the work in the conduct of RDA in that regard. It is the case that all of the meeting proceedings are available online through the RDA forum, and you can click on the relevant bits, and let's skip to the next slide. The only thing that's worth noting here is the rapid growth in this space. If you look at all of the items in the list marked as new since the last plenary, it says that there's a big thing going on. I thought I'd give you a brief update on where things are going with the working group progress, since they're the ones that are perhaps furthest along the path. The data type registries group was really agreeing that there would be registries enabling descriptions. What sort of data do you have? How do you describe that? How do you know if it's interoperable? You need to have some type registry that is put in place so that machine-to-machine capability is applied. This has been led by Larry Lannom at CNRI. They've got an approximate vocabulary agreed upon, with early prototypes expected soon. For me, this is almost top of the pops: something that's very practical, needed if you're actually going to do machine-to-machine data interoperability, and likely to be delivered. The second one, metadata standards, started out being, well, we want to talk about metadata, and that was not really a working group. What spun out of that was a metadata interest group.
This particular area morphed into: if we're going to enable metadata to be used in the different places, well, we need to know what are the metadata standards that are floating around. The need for a registry of metadata standards is clear. There are several around. The work has transpired to be based on the DCC metadata standards registry, because that's the most advanced in the world that has been identified by that group. I've heard in Australia a number of times people saying, okay, well, we've got to use some metadata. What standards should we apply? Here's the space for it. We know that there are gazillions of metadata standards. What this is looking for is some metadata standards that are commonly in use in the relevant space. The next one is practical policy. This is how do you have, sitting in front of your data, a policy regime that is machine-enforced. This might be an access protocol. This might be a protocol that says, okay, this is how you can engage the data. This might be a policy enactment that looks at the licensing regime and then applies appropriate combination policies to enable that to work. There were something like 13 different policies from around the traps being looked at. Reagan Moore is leading this, and he's a driven person. Something will come out of this that will deliver. They are looking for more policies, but the aim of this is to enable people who want to run a data store, with appropriate policy in front that can be machine-enacted, to put in place policies that are actually going to stand the test of time. The next one is persistent identifier types, which is looking at, well, what are the types that are being used for PIDs? Not unrelated to the work of the data type registry, but this is more specifically looking at how do you get the relevant types for PIDs, gathering that information and putting that together. The work in that is progressing.
One of the things you'll notice about this particular set of working groups is that they have degrees of overlap and arguably not the world's best titles. It requires a degree of cleverness to work out what they're really doing, and so it's probably necessary to have a look at their paragraph descriptions, which are again available on the RDA website. The next one is perhaps the one where the computer scientists will be most comfortable, which is data foundations and terminology, which is looking at an abstract data organization model for interoperability. So this is looking at it again from a bunch of different areas where there are already data organization models, and then looking at, well, what can be generalized out of that, and so they've come up with a relatively complex model already. I suspect the model will get even more complex, but sitting underneath there is going to be something that's relatively machine implementable, that doesn't turn into a grand unified scheme of everything, but something that's actually able to be used by a number of the other areas. Now the first working group that is in fact based on a specific domain is based on linguistics, not that you'd be able to tell it from the title, but it is actually seeking data categories and codes for natural languages. One of the interesting things is that it's based on an ISO standard, but the ISO standard only takes you so far when you're looking at how do you describe languages and language variations, and how do you look at languages that are perhaps morphing out of different languages, or languages that are changing, etc. So this is a very practical, straightforward description where there'll be a shared namespace that's agreed. So if you skip back a slide, back to this, you'll see that agricultural data, marine data and toxicogenomics data are emerging.
So there are a variety of areas that are emerging, but if you look at this slide it says that the bulk of the work started early on in just working in the kind of underlying areas. Now my view is that RDA really needs these specifics to come forth relatively soon. I then wanted to look at interest groups of ANDS and Australian relevance, because there's a lot, and yet we have some engagement but not an enormous amount. We've got work in some of these areas. Baden mentioned that there's work in the legal interoperability area, where the Australian contribution is already significant. It's not just ANDS contributions as stated, but there are exemplars there. So if I can run through them: brokering. So this is about saying, if you've got data in one location and data in another location and you need to get access to that data, do you have to go off to the first location, do the negotiation, get the data, go off to the second location, do the negotiation, get the data, bring the data together and then do something interesting, or can you ask a question of a broker, which will then use machine engagements with those different bits to allow that to occur? Now of course in Australia we've been talking a lot about enabling data reuse, and that's not just simply using data from a single location again. This is saying, well, here is the need, where is that data located, what are the mechanisms that enable us to get out that data? So I think there's a real need for that, and we've not really been engaged in that. Big data is of course big in terms of space. Lots of people are talking about big data.
Sometimes big data is associated with high performance computing, because you need to have high performance computing next to that data, and the data isn't so much transferable as you need to put services over the data where it lies, and the data needs to live in something that's used to dealing with petabytes. And so there may well be an opportunity to look at a group arising out of this that may set up a test bed in this context for doing big data issues, and it may well be the case that we could have an Australian host on one of the RDSI facilities, for example. There is no business model currently; part of the group will need to look at that. Skipping over to the right, you'll see there's a data economics interest group that is emerging, which is really just asking the question, who pays? So who's going to pay for all of this infrastructure? Who's going to pay for making data available? How is that occurring? Australia has a model that has essentially been around upfront investment through TERN, IMOS and the initiatives at the various institutions, putting in place mechanisms that enable all to access. That model isn't being used internationally quite so much, and is also sometimes arising out of discipline areas. How do you impact and enable those things is the topic of that discussion. The brokering discussion is really around, well, what do you need to have in place? Of course there are already answers to some of these questions within disciplines, and for me we will succeed in RDA if we learn from discipline approaches, but not if we learn from a single discipline approach, because part of what we need to do in this space is integrate across those areas. So, trusted repositories: at the moment there is no requirement to put your data in any particular form of repository. You can put it anywhere.
So sometimes it might sit on a publisher site, sometimes it might sit on an institutional site, sometimes it might sit on an RDSI site, sometimes it might sit on a data store next to an HPC facility, and what we're trying to do is avoid the data sitting on a memory stick. It could well be the case that funders say, if you generate data in your research project you should make that available, and a possible threshold is that the repository you put it in is trusted. So what does it mean to have a trusted repository? What are the characteristics of a trusted repository? Now the Dutch have been thinking about this problem quite a lot and quite hard, coming up with certification regimes around that space. Equally, the World Data System is looking at, you know, what does it mean for a data system to be certified as part of that space. So there are a number of approaches floating around. I think this is going to become a hot issue, because I do think funders are going to say, I want to know that the data is held in a place where there's essentially a trusted operator. If so, how will you do that? Is it a sort of gold stamp, an elephant stamp, that's needed for a repository, or are there descriptions of various methods of making the repository a trusted one? I think that will be relevant to us. Capability: how do you know if you're data ready, what are the things you should be thinking about, how far along the journey are you? It uses Liz Lyon's capability model that she's developed in the context of the DCC, and it's not unrelated to the ANDS data maturity model. What this one highlights is the fact that it's not simply technical data interchange mechanisms that are needed here; what's also needed are the social ingredients of this space. So how do you know how you're going? How do you know where you're up to? Now ANDS has been putting a lot of emphasis on data publishing.
There are lots of issues associated with that, because there are a lot of people starting to say data publishing is important and meaning different things by that. So what are the workflows that sit behind publishing your data? How do you know that your data is published, and the influences of that? You know, what are the bibliometrics associated with that? What is the nature of data publication services, and who's going to pay the data publication costs, going over to that data economics one? Data citation I won't spend much time on today, since data citation has been the subject of other discussions in this forum. Suffice to say that there's a lot of engagement in this space and we think it's very important. As Baden mentioned, legal interoperability is important. How do you know that you can use somebody else's data? How do you know whether you can combine the data? How do you know whether you can make the data available? And how do you do that in an international context, most importantly? Because we know that research data does not respect boundaries, but the law does. How do we make sure we get that right? So having people from many jurisdictions involved in that discussion is important. So we have Baden from Australia, Paul Uhlir from the US, and a guy from Spain whose name has briefly escaped me, three people who've been significantly involved in that discussion. The marine data group is a great example from an Australian perspective, in that marine data interoperability is relatively mature in the space. The IMOS engagement with the European and US counterparts is deep. They are planning to set up the Southern Ocean Observing System mechanism to exchange data, and it is being hosted in Australia. So there are a lot of reasons why this area is one where Australia ought to be very seriously engaged, not just at IMOS but also the agencies who have an interest in marine data.
There was a recent paper published by my mate Fran and Vint Cerf, who was listed as one of the founders of the internet, asking the question, who pays for research data? Because we know that data costs. We have a model in Australia; we have a completely different model in the US, where there is no such thing really as infrastructure investment in research. It all kind of comes through the research investment. So when you want to put in place a new synchrotron you seek an NSF grant to do so. Now that doesn't really fit the Australian model, and there are really significant implications of that. So data scientists are incredibly important in the US, because you have to justify yourself in terms of a science agency, whereas we pay at least as much attention to data management, data librarianship, data technologists, etc., which are all terms that we use in this context. So the models that sit behind this are going to need to work internationally, even though different economic models are going to be applying locally. Metadata is clearly big, and it's more than directories; there is a wide discussion in that space. And finally I wanted to draw attention to preservation, because we think data preservation is very important. However, that group has really not yet got going to the extent that one might imagine. I think exactly how that group will work is still a little unclear, but I do think there would be value in having Australian engagement. Moving on to how you get engaged: you can get engaged as an individual. You can get engaged as an organisation. There's an organisational advisory board which will govern the organisation assembly. There are a lot of organisations that are thinking of joining. We think the Australian Antarctic Data Centre is a really good target in that list. Intersect would be good, TERN, etc. So there are lots of ways in which those discussions can occur.
What this really means is that it provides a small amount of funding, but the opportunity is that it gives institutions that see this as important a way of getting early access to the outputs of RDA and, equally, of influencing the thinking of RDA as to what are the things that we need to have taken up. It also provides the group with future proofing: if you're going to adopt a particular approach, how does that sit internationally? Another relationship is organisational affiliation, where there are a bunch of organisations which don't really want to be a member of RDA, and it doesn't really make sense for RDA to be a member of them, but nevertheless there's a relationship that exists in that space. So we will be creating a legal entity which will enable organisations to join and have something to join; by the next plenary that will be in place. So next plenary, here's where it is. It's in Dublin and it's hosted by Australia. Australia and Ireland are cooperating over holding that event there. Plenary four will be in the Netherlands, probably Amsterdam, and plenary five or six is likely to be back. So we're holding them twice a year; September and March are roughly the dates, pencil them in. So back to how it's all coming together. There's the RDA colloquium, which is the people who give us money, sitting on the top with no actual connection to the rest other than money flows. So the RDA council, I'm involved in that along with six other people from around the globe. We have somebody from England, Germany, France, two from the US, one from Botswana and one from Australia. The technical advisory board is looking at ensuring we're on the right track technically. It's not saying what we will do, because groups will say what we do; it's simply there to enable us to achieve maximum impact in that space, and Andrew Treloar is involved in that. We are recruiting for an expanded technical advisory board.
The secretary general is a person who runs the secretariat, works with council, works with the colloquium, is a doer, is a community activist, and doesn't yet exist. So we're recruiting for that position soon. We hope to have that position filled by Christmas. And then we've got an organizational advisory board and assembly, which is doing that work that I just talked about. But the heart of the work is done by the working groups, the interest groups, people participating from the membership in all of these different discussion groups, saying, I think we should do this, and this is why; I'm not quite so sure about that, because of this. So the picture is not right, because there's a little bit of structure and a big bit of working groups and RDA membership there. That's the key to the space. So those are the people involved. Here's where it'll be next, and that's the end. I'd like to say thanks to everybody for participating. It's, I think, an important issue, and I'm glad you took the time.