What we're hoping to do is provide both an introduction for people who are new to metadata for Research Data Australia and an update for those who are a bit more familiar with some of the discussions today. So this is broadly what Melanie and I will be covering in this session. Peppered throughout the presentation will be references to additional information that you can access on the ANDS website, and a list of these is provided at the end of the presentation so that they'll be easy for you to follow up on. So there's no need to scribble things down as we go along. During the session we'll also touch on some new features and functionality that are currently in development for the ANDS registry and Research Data Australia. And as we have a mixed audience today, we hope that everyone will be able to take away something new. So we'll be having a quick look at metadata, Research Data Australia and ANDS. With an introduction to RIF-CS, we'll talk a bit about metadata to support institutional goals and some of the metadata pathways to Research Data Australia, including the manual entry option and harvesting of records. And towards the end we'll talk about some sources of information and support. I'll be starting off with the first half of the presentation, then I'll hand over to Melanie Barlow in the Canberra office to talk through some of the options around harvesting of metadata. But let's first start by having a look at some key concepts that we'll be looking at in a bit more detail as the session unfolds. Many of you are probably very familiar with metadata, but it's important to start off with a little bit of a definition. We often summarise metadata as being data about data. In this case it's really descriptions of data collections that enable and enhance discovery, reuse and citation of data. Metadata describes the data and the attributes associated with it.
So you're able to describe things like rights information, who created the data set, some of the background and context of the data set, and some of the conditions around things like access and reuse. One of the things that we really want to emphasise today is that when you're planning your processes for collecting metadata, do think about how the data will be reused and by whom. And really make the metadata you create work for you and for the people you envisage may be interested in the data you're describing. We'll talk a little bit about metadata standards. They are important because they provide us with a common way of describing information resources and they facilitate the exchange of data between systems. And they must be great, there's so many of them. Wouldn't it be nice if we had one to rule them all? But today we will be focusing on RIF-CS, which is based on the ISO 2146 standard, and it's the metadata format that we use in the ANDS registry and for display in Research Data Australia. So what is Research Data Australia, or RDA as we commonly call it? Well, it's essentially a catalogue of metadata records that describe data collections produced by or relevant to Australian researchers. It's designed to promote the visibility of research data collections in search engines such as Google and Yahoo and to encourage the reuse of those collections. It's essentially the face of the ANDS Register My Data service, or the ANDS registry, and the ANDS registry could be regarded as the back end of Research Data Australia. It's where records for Research Data Australia can be manually created and it's also the destination for records that are harvested from institutional repositories or metadata stores. But Research Data Australia is where users come to discover data and hopefully access and reuse data. This is an example of a metadata record in Research Data Australia.
You can see it describes a number of characteristics of the data, including such things as geographic coverage, so there's a nice little map there. It's got good information about the license conditions, which is something Karen referred to in terms of this actual presentation. In this case it's licensed under Creative Commons. It's quite clear who owns the rights in this material, which in this case is Geoscience Australia. It has citation information, so it's quite clear too how anybody reusing this information should be citing it. So well-described collections are easy for users to discover, assess, reuse and cite, and we'll come back to some of these things in more detail shortly. But let's just take a step back and look briefly at the approach ANDS has taken to metadata for Research Data Australia. With our earlier ANDS projects, such as Seeding the Commons and Metadata Stores, the focus for metadata was often around assessment, compliance and project milestones. Moving forward, though, we're shifting the focus away from compliance to ANDS staff being a source of support and advice, to assist providers to use metadata and Research Data Australia to achieve institutional goals around data publishing and management. We've always had extensive documentation and we've always provided guidance and support to institutions. That will certainly continue, and we will certainly be continuing with harvesting and manual entry options for creating metadata. So as I mentioned, the focus now really is on providing metadata support and advice rather than assessment processes, and we really do want to work with you and your institution to identify and help you achieve some institutional goals around publishing in Research Data Australia. To do that it's of course useful to plan ahead and look at the outcomes that you're interested in achieving, and not just what the inputs are. Something else that's changed is the harvester options for providers.
So in the past we were only able to work with the RIF-CS schema, but now we are in a position to work with providers who are using other schemas through our new harvester options, and Melanie will be talking about that in a lot more detail a little bit later on. This is a little bit of background for people who may be quite new to working with ANDS and providing records for Research Data Australia, to understand what we're hoping we can help you to achieve, but also for people who've perhaps already done a few projects with us, to understand that our focus is changing as we move forward. But RIF-CS is really what we're here to talk about today, so we'll start by just having a brief overview of RIF-CS, which will be an introduction for those who are new to RIF-CS and a refresher for our more experienced audience members. So RIF-CS is a metadata exchange format that populates the records that users see in Research Data Australia, and here you can see that there are four classes of records described in Research Data Australia: collections, parties, activities and services. This concept of the four classes of records is a key differentiating feature between RIF-CS and many other metadata schemas. So we'll look at that in a little bit more detail. In summary, the ISO standard 2146, which is the standard for directories and resource registries, has been adapted by ANDS to suit our requirements. It caters for parties, which can be people or organisations, for activities, which are generally projects or programs of work, and for services, which are likely to be services that actually do something with or provide access to data. As well as that, of course, we describe the collections themselves. So to actually implement this standard we needed to work out what attributes we wish to describe for each of these classes of objects and also to have a means of moving those descriptions around. And so the method for doing this is RIF-CS.
It describes the objects and provides a method to exchange metadata. RIF-CS stands for the Registry Interchange Format for Collections and Services, and the RIF-CS documents are in an XML format. They carry metadata from institutional metadata stores to the collections registry, where it can be stored and searched via Research Data Australia. It is a living and evolving standard and it's subject to annual review by the RIF-CS advisory board, and that board is coordinated by ANDS. Any changes approved by the RIF-CS advisory board are usually implemented at the end of the year, and we have our next version of RIF-CS due for release next month. We always welcome people's ideas around improvements to RIF-CS. So RIF-CS supports this registry service that contains descriptive and administrative metadata for collections, services, parties and activities, and it also supports the expression of relationships between those entities. Now this diagram is available on the ANDS website. It's probably quite hard for people to see, but you're able to express the relationship between, say, a party and a collection. So a party may be a collector of, a funder of or a principal investigator of a collection, for example. So it's really useful to be able to talk about the different roles that different entities play within this information model. It's worth noting too that ANDS is increasingly moving towards a more linked data approach to RIF-CS, whereby related entities or information may not always be described as one of those registry objects or classes of record but may be referenced by an identifier. For example, instead of creating a party record for an entity, you can now use an identifier such as an ORCID iD to link out to an authoritative source of information rather than create a new record for Research Data Australia.
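As a rough illustration of the two linking styles just described, the sketch below builds a minimal collection record in Python. It is only a sketch under stated assumptions: the namespace and the element, attribute and relation names (`relatedObject`, `relatedInfo`, `hasCollector` and so on) are written as recalled from the published RIF-CS schema and should be checked against the current schema guidelines, and the `build_collection` helper and its parameters are hypothetical names introduced here.

```python
import xml.etree.ElementTree as ET

# Namespace for RIF-CS documents, as recalled from the published schema.
# Assumption: confirm against the current schema guidelines before use.
RIF_NS = "http://ands.org.au/standards/rif-cs/registryObjects"
ET.register_namespace("", RIF_NS)


def q(tag):
    """Return a namespace-qualified tag name."""
    return f"{{{RIF_NS}}}{tag}"


def build_collection(key, group, source, title, party_key, orcid):
    """Build a minimal collection record (hypothetical helper).

    The record links to a party in two ways: by the key of a party
    record in the registry, and by an external identifier.
    """
    root = ET.Element(q("registryObjects"))
    obj = ET.SubElement(root, q("registryObject"), {"group": group})
    ET.SubElement(obj, q("key")).text = key
    ET.SubElement(obj, q("originatingSource")).text = source

    coll = ET.SubElement(obj, q("collection"), {"type": "dataset"})
    name = ET.SubElement(coll, q("name"), {"type": "primary"})
    ET.SubElement(name, q("namePart")).text = title

    # Style 1: relate the collection to a party record held in the registry.
    related = ET.SubElement(coll, q("relatedObject"))
    ET.SubElement(related, q("key")).text = party_key
    ET.SubElement(related, q("relation"), {"type": "hasCollector"})

    # Style 2: reference the person by identifier, with no party record.
    info = ET.SubElement(coll, q("relatedInfo"), {"type": "party"})
    ET.SubElement(info, q("identifier"), {"type": "orcid"}).text = orcid
    return root
```

Calling `ET.tostring(build_collection(...), encoding="unicode")` with illustrative values gives an XML document that can be compared with the examples in the RIF-CS documentation. A real record would normally carry many more elements, such as descriptions, subjects, rights and location.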
In a similar way, you may use a DOI, for example, to reference information about a related publication rather than include the full details within a record. So this is an approach that you may have some interest in following up on, and there's certainly information on the website about that. Now, as I said, throughout the presentation we want to highlight that there is quite a lot of information available to you. There's a lot about RIF-CS itself: background information about the schema, the schema guidelines, things like the controlled vocabularies that are associated with it, a whole bunch of useful information, and this is just an example of the type of information that's available to you. So that was a very quick run through a little bit of an introduction to RIF-CS, so we can move on to really looking at how you can use RIF-CS and Research Data Australia to help achieve institutional objectives. As I mentioned earlier, ANDS really wants to encourage institutions to think about what they want to achieve by publishing via Research Data Australia. In this session we'll just look at a few goals that we expect might be quite common. So one goal might be that we want our data to be highly visible and easy to find, so that would be a goal around discovery. Another would be that we want to know who is reusing our data and how often, so that's the goal around citation and citation metrics. Or another would be that we just want our data to be widely shared and reused across the research community, so that's really wanting to maximise the reuse of the data. And we'll look at each of these three goals in a little bit more detail. This isn't to say these are the only goals you might have, but these are some that we thought might be quite common. So goal one is around discovery. Well, multiple access points aid discovery, and this isn't news to the librarians with us today.
Multiple access points means there are many ways for people to search for and discover your records. People can search by name, by title, by subject, by institution, by geographic area. So in terms of maximising discovery, when thinking about your metadata, think about the many ways you expect other people may search for things. There's no one magical discovery point, so it's wise to think about the multiple ways people access and look for information. ANDS further widens the discovery net by having Google, for instance, crawl Research Data Australia and by having records in Research Data Australia syndicated to other aggregators. So it's not necessary even for somebody to know that the data set is described in Research Data Australia. They may well come across the data set through a Google search and be directed back to Research Data Australia for more information. So that's good news for discovery. There are other things that you might want to think about here as well. We do have in Research Data Australia a number of theme pages. Some of them you can see in the graphic there, with titles like urban settlements, tropical research, population health research platforms and astronomy. These are a great way to highlight data that fall into these categories, and it's a way for people to be able to browse data by broad themes. So again, different ways for people to discover information. We also have the contributor pages. These really have great potential to highlight your institution and its key research areas. They're essentially pages that describe your institution and then provide links off to all the related records associated with your institution. These are the records that you've provided to Research Data Australia. So it's a nice way to aggregate information associated with your institution.
Across all of these mechanisms, the key message here is that the best chance for maximising discovery of your records is to have those high-quality descriptions, to think about the many ways people will look for information, and to think about how you can maximise the discoverability of your data by perhaps being included in things like theme pages or by having a contributor page. Another goal might be the goal around citation. Now, citation or referencing of data is rapidly becoming standard scholarly practice. This essentially means that authors include a formal citation to data that they've used in their research in the reference list of related publications, just like authors cite other articles and papers that they've used as input to a publication. And there are a number of ways that you can use metadata to facilitate this and also to track some of the reuse of your collection. The first one there on the left is citation metadata. Now, in RIF-CS there are two options for providing citation information. One is where you provide a citation as just a single string. The other, and what we normally recommend, is the use of the citation metadata option. And this is where you provide the elements of the citation separately. So each component of the citation is provided as a separate sub-element rather than as a single string. This provides the greatest flexibility when it comes to manipulation and reuse of the citation itself, for example if you want to be able to have citations imported into EndNote or have them harvested to the Data Citation Index. So if you are keen on the idea of facilitating citation and citation metrics, one of the things we would strongly recommend is that you look at the citation metadata element in RIF-CS. Another thing to think about is whether you wish to consider assigning DOIs to your data collections in Research Data Australia. DOIs are used with data in much the same way as they are with journal articles.
They're used to permanently identify and provide access to data. They're certainly not mandatory for citation, but they are considered best practice, and ANDS does offer a DOI minting service, which is currently machine-to-machine only. But in our next release, which is only next month, we will be releasing the ability for providers to manually mint a DOI. So that may be of interest to people who expect to be very low volume in terms of the DOIs they may have to mint, or as a short-term option while a machine-to-machine service is established. And there'll be more on that coming out associated with the next release. Something else to consider, if you are interested in the citation aspect, is whether you might want to have the records in your data source included in the Thomson Reuters Data Citation Index, which I alluded to a little bit earlier. ANDS now has the capability to export records in a data source from Research Data Australia to the Data Citation Index. That offers an additional discovery point for the data, but it also allows for the capture of citation metrics. And at the moment, the Data Citation Index is the only commercial service offering in this area. Those of you who have been involved in providing performance reporting, perhaps for ERA or other purposes, would be aware of the importance of citation metrics and some of the citation indices that are available for publications. But the Data Citation Index is one that's specifically around data. So if your institution is likely to be interested in being a participant in this, it's worth having a look at the information on the ANDS website to see what's involved in establishing this. And again, the encoding of your RIF-CS records can make this very easy for you to achieve, should that be one of your goals. And of course, citation of data is closely linked to the reuse aspect. If data has been reused, then what we would expect is that it's actually cited.
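The flexibility argument for separate citation elements over a single citation string can be sketched in a few lines of Python. This is an illustrative sketch only: the `Citation` class, its field names and the sample values (including the DOI) are hypothetical and do not reproduce the RIF-CS citation metadata element itself. The point is simply that once the components are held separately, the same citation can be rendered as a display string or as an RIS record of the kind reference managers such as EndNote can import, whereas a single string would have to be parsed first.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    """Citation components held separately (field names are illustrative)."""
    contributors: list[str]
    year: str
    title: str
    publisher: str
    doi: str


def as_display_string(c: Citation) -> str:
    """Render one possible human-readable citation string."""
    authors = "; ".join(c.contributors)
    return f"{authors} ({c.year}): {c.title}. {c.publisher}. https://doi.org/{c.doi}"


def as_ris(c: Citation) -> str:
    """Render the same components as an RIS record for reference managers."""
    lines = ["TY  - DATA"]  # RIS reference type for a data set
    lines += [f"AU  - {name}" for name in c.contributors]
    lines += [
        f"PY  - {c.year}",
        f"TI  - {c.title}",
        f"PB  - {c.publisher}",
        f"DO  - {c.doi}",
        "ER  - ",
    ]
    return "\n".join(lines)


# Hypothetical example values, not a real data collection.
example = Citation(
    contributors=["Smith, J.", "Lee, K."],
    year="2014",
    title="Example Survey Data",
    publisher="Example University",
    doi="10.1234/example",
)
print(as_display_string(example))
print(as_ris(example))
```

Going the other way, from a free-text citation string back to reliable author, title and year fields, is much harder to do consistently, which is one practical reason for preferring the structured option.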
So this was our third goal that we mentioned earlier, which was the reuse aspect. There are some things to consider here if your institution wants to see its data being reused, like the type of license that you might assign to the data. If you want it to be easy for people to reuse your data, then assign an open license if you can, such as CC BY. This means that people are freely able to reuse your data as long as they attribute you. And that's the ideal scenario. In order to facilitate reuse, it's also important, if you can, to provide some provenance and reuse information. The example here in the little snapshot shows where one of our records in Research Data Australia provides quite a bit of information about how the data was processed, what sort of format the data is in, and what sort of software anybody reusing the data might need in order to actually open it up and use it. So that's really useful information that promotes reuse and enables that reuse to happen quite readily. And of course, something else to think about in terms of reuse is providing direct access to the data. In some cases, people choose to go through a mediated process to provide access to the data, where perhaps you might need to contact somebody who would then send you a copy of the data. But if you're actually able to provide direct download access to the data, that is going to encourage greater use than where somebody knows that what they need to do is contact someone by phone or email and wait for a USB to arrive in the post. So do think about where you can provide direct access to data. And with the next release of Research Data Australia and RIF-CS, we're going to be providing better visibility of data that is freely available, and of data that is connected to services and tools that allow people to reuse that data quite readily.
And I'll just show you a little sneak peek here, if I can, of what that's going to look like, because that's actually very nice. So this is a mock-up and it may not look like this in the final release, but this is what's under development at the moment. What you can see here is that it is a data set description, but here there's direct access to download the data and actually to use this particular tool, TissueStack, to visualise the data. So it really does make it very, very easy for people to reuse the data that you're describing, and that, of course, is going to encourage not only the reuse but the citation of the data as well. Look out for some of this functionality that's going to be cropping up in our next release and think about how you can use some of these options to really highlight your collections and encourage the reuse of your collections. And of course, again, we've got plenty of documentation to help you, and we're here to help you as well. It's not just documentation. There are people behind it. We've got documentation that can not only explain RIF-CS, but also how to put it into practice. We've got information about metadata requirements, step-by-step guides, and information about practice. So let's move on now to look at some of the options available to you for getting your metadata into the ANDS registry and making it discoverable through Research Data Australia. Option one is to manually create records in the ANDS registry. A number of you may have created records in this fashion, perhaps for Seeding the Commons projects, and it still may be an option that you might choose if you're going to be creating very few records or are not currently in a position to establish a metadata store or institutional repository. This slide just shows step one of manually creating a record in the ANDS registry, where you're prompted to decide which type of registry object you wish to create.
So are we adding a collection, party, service, or activity record? And the process steps for creating records are essentially the same regardless of the object type being described. Now, this is another step in the process, after you've decided you're going to be creating, in this case, a collection record. I'm not going to go through all the steps today, but what I wanted to do was just to highlight that there is quite a lot of on-screen help available to you if you are using the ANDS registry to create your records. You can see here on the left-hand side all the elements, or fields, associated with this record. And there's an entry screen here where you can type in the information associated with those elements. But you can also see that there's a lot of information and support available to you. There are help files. There's a video tutorial where you can take a short tour of the steps involved in creating a record. There are links off to the content provider's guide and the metadata content requirements. So there's quite a lot of on-screen help available. And when you're ready to save and validate your records, as you can see on the blue button there, you're also provided with on-screen feedback, prompting you to make mandatory or optional changes to your record. Now, as I said, we're not going to step through this process today, but more detailed training is certainly available and can be arranged through your outreach officer, or you can contact us after today. Access to the ANDS registry is via ANDS Online Services, which, as you can see, is tucked away here at the bottom of the homepage of Research Data Australia. And something I wanted to alert people to is the availability of the ANDS demo or sandbox environment. So this is an opportunity to learn as you do, without the risk of messing up a production system, which makes many of us nervous. So this is essentially a clone of the production environment of the ANDS registry.
And it's a great place for you to practise creating records. But even if you're not intending to use the option to manually create records, you can still use this as a learning tool to understand how RIF-CS registry objects relate to each other, to look at things like the different sorts of types and relationships and vocabularies, and actually build your understanding of the RIF-CS environment. You can see how records will appear in Research Data Australia, so you get to actually see it as the end user will see it. And you can also use this environment to familiarise yourself with the ANDS registry functions that you would use regardless of whether you're manually creating records or having them harvested. That's things like the Manage My Data function and other aspects of looking after the records in your data source and your harvest. If you have a look at what's in the developer's toolbox, we've got a number of web widgets and services that might be of interest to you, and you can see those in the demo environment as well. So I encourage you to think about using that as a learning tool. If you're interested, contact your outreach officer or email services@ands.org.au to arrange access to the demo environment. And look, again, more documentation. We have a fabulous tab-by-tab guide to manually creating RIF-CS records. So if you are working in the demo or production environment of the registry, you can literally go through step by step and be walked through the process of creating a record.