Hello and welcome, everyone, to the Top 10 FAIR Data Things Global Sprint webinar. The actual sprint is on the 29th and 30th of November 2018, everywhere around the world. Today's webinar is basically to give you a bit of an introduction and overview of the sprint and to talk about how you can participate in it. It's brought to you by Library Carpentry, the ARDC and the Research Data Alliance Libraries for Research Data Interest Group. Okay, so before we get started, here's a bit of an overview of some of the things we're going to cover. It's only a short half-hour webinar, so we're going to talk fairly quickly through these things, but you will have a chance to ask questions. First, just a bit of an introduction to the speakers today. I'm Natasha Simons. I'm the Associate Director for a program called Skilled Workforce with the Australian Research Data Commons and I'm based at the University of Queensland in Brisbane. The ARDC was established this year and it's a combination of the Australian National Data Service, or ANDS, Nectar and RDS, the Research Data Services, to meet the needs of data-intensive, cross-disciplinary and global collaborative research. And the ARDC is supported by the Australian government through the National Collaborative Research Infrastructure Strategy. Over to you, Keith.
Thank you. My name's Keith Russell, Manager of Engagements within the Australian Research Data Commons. Chris.
I am Chris Erdmann and I'm the Library Carpentry Community and Development Director. And Liz.
Hi, I'm Liz Stokes. I'm formerly from UTS and I've just joined the Skilled Workforce team at the Australian Research Data Commons.
Excellent, thank you everyone. OK, we'll go to the next... Oh, there's a link to the slide deck there if anyone wants to load them. So the first topic is the who, what and why of the global sprint.
OK, so the Top 10 FAIR Data Things Global Sprint is organised by Library Carpentry, the ARDC and the Research Data Alliance Libraries for Research Data Interest Group. It basically came about because there's a need to develop some educational resources in FAIR data for different research disciplines. There's a lot available that covers FAIR in general, but we're actually missing quite a lot of information in a disciplinary sense, which is where FAIR really comes to light. So we've organised this sprint in collaboration with a number of global partners: FOSTER Open Science, OpenAIRE, the Research Data Alliance in Europe, the Data Management Training Clearinghouse, which comes out of ESIP, the California Digital Library, the Dryad repository, AARNet, DANS and the Centre for Digital Scholarship at Leiden University Library. And if you want to get an overview of the sprint, there's a link there to the Library Carpentry blog where the sprint was announced. OK, so the purpose of the sprint, as I mentioned, is to create a wide range of top ten FAIR data things by research discipline, and we also have themes as another way that you could slice and dice this particular sprint. So what actually is a top ten FAIR data things resource? Well, hopefully a number of you listening today have actually done or been involved in the ANDS or Research Data Alliance 23 Research Data Things. If you've been through that program, you will have noticed that things is a neat concept for creating packaged content on any topic that you choose. And each thing is basically a self-directed learning activity for anyone who wants to know more about that particular topic. In this case, it's about FAIR research data.
So the idea is that the resources we create during the sprint can be used by the research community to understand FAIR in those different discipline and theme contexts, as well as providing people with some initial steps that they might want to consider to make their data FAIR. So here's an example of the top ten things. In this case, it's not actually a top ten FAIR things specifically, but it is a top ten health and medical things related to research data. This was created by ANDS before we joined the ARDC, so we don't have the branding on that poster. However, you get the idea of what is covered in this particular resource. The example I've pulled out there is thing two, which is issues in research data management. And with these particular top ten things, you can pick different activity levels that you want to engage in. Some of the activities are like: read something here, think about it and then make a comment on it. Or it might be: go and try this new tool and see what it does for you in the data management area. Or it might be critiquing something. So there are different types of activities that people can do that are more than just a sort of passive reading of materials. The ten things is about doing something. So you read something, you start to understand it, and then you reflect on what you've read and you make a comment on it, or you try out some new tools or some new ways of doing things. So that's the whole essence, really, of the ten things. And there's a link there so you can explore those. These resources are then available for anybody to use, whether you're in health and medical or not. Anyone can actually go along and do those top ten things whenever they'd like to.
So if this is new to you, which it probably is, particularly the creation of the resources, we created a primer, which is basically a set of instructions that will help guide you through the creation of your top ten FAIR data resources for different disciplines. I'll put the link to the tiny URL there, but it's basically just a few pages that tell you how you'd go about this, which is essentially not to start with another top ten, although it's good to see that example, but to actually start with the 23 Things and some bigger ideas, and then to skim down from there to how you might want to get to ten and whether you want to do different levels within each of the ten. So you might have one thing and you might have three different levels in there, beginner, intermediate or advanced, or just different types of activities that people can choose from. Because the one thing we found through the 23 Things is that people really like choices and they like to be able to do things and they like to be able to talk about them. So just consider those things when you're creating your top ten resource.
So Natasha asked me to speak very briefly on the FAIR data principles. Now I'm guessing most people here will have already seen the FAIR data principles and already know them really well, so I'm going to keep this short and try to focus on what this means and how you can think about it in the context of the ten research data things. So the FAIR data principles were drafted in a workshop in Leiden, at the Lorentz Center, in 2015, and subsequently the original authors and a few other people wrote an article in Nature about this, and that received a lot of international recognition and uptake by all sorts of different organisations and funders and policy bodies and journals, all across the board. And I think there are a few factors that contributed to the success of the FAIR data principles.
I think one thing was the angle in which they talked about not only making data usable for humans, but also for machines, and how to actually prepare data in a way that machines can pick up the data and use it in data-intensive science or data-intensive research. The FAIR data principles are formulated in a way that is technology agnostic. They're not based around one specific technology solution. They address elements regarding both the data and the metadata. And I think that broader perspective is really helpful and useful. And they were set up in a way that was discipline independent, so not focused on one discipline specifically. However, because they're discipline independent, there is not a lot of guidance in the principles, strictly, about what you need to do as a researcher to make your data FAIR in a specific discipline. So that's one of the gaps that has since appeared, and we're getting quite a few questions from different groups, different research communities, saying: well, yes, we'd like to make our data FAIR, but how do we do it? What are the things we need to do in our discipline? What I've tried to do here is mention, for each of the FAIR data principles, very briefly, things you could talk about when you're thinking about making data FAIR in a specific discipline. So, for example, when you're talking about making data findable, on the right-hand side you'll see the actual original principles, and on the left you can see a few things you can think about. So, for example, in a specific discipline or a specific domain, are there types of persistent identifiers that are commonly used? Is there specific discovery metadata that's common and relevant? So, for example, in geosciences, you might want to have discovery metadata that describes the area the data set is about. So what are the geographical boundaries of the data set?
And if you're talking about making the data findable, are there discipline-specific repositories or registries that the data should either be deposited in or be findable through? When thinking about accessible, there are different ways of thinking about it. Basically, the principles just talk about protocols to get access to the data. There are also questions you can think about: how open can the data be made? What sort of considerations and protocols are there in that discipline around making data available? Can it be made very open? Are there procedures around sensitive data that can be copied, can be used, etc.? Are there platforms or solutions out there that researchers can use to provide access to sensitive data? And in some cases it's more valuable not just to have the data as a download, but to provide the data through data services which can be interrogated by machines or humans. If that is the case, what sort of data services are common, are standard in that specific discipline? Interoperable is always a complicated one. When talking about interoperable, it's a lot about the actual content of the data, and to a certain extent also the metadata, and what language is used to represent that data. Things you can think about there: are there standard file formats? Are there standard ways of making data available in that discipline? Are there standard vocabularies and ontologies that can be used either to reference the data or to use when creating the data or the metadata? And where can those vocabularies be found? Are there places where those vocabularies are deposited? And when looking at relating data to other information and other outputs out there, are there identifiers that are common in that discipline to point off to related information?
For example, identifiers for projects, identifiers for samples, identifiers for authors, so that you can actually create links and qualified references between bits of information that are out there. The fourth letter, reusable, covers a range of other elements which are useful on top of making data findable, accessible and interoperable, because you need to do that anyway to make it reusable. But on top of that, there are a few other aspects you can also think about. So around licensing: is there a standard? Is there a practice in the discipline around licensing data? One of the other elements there is provenance, attaching provenance information to your data. So is there a discipline-specific approach around provenance? Is there a standard that people use? This is something you could point people to. And finally, are there community standards for the data and the metadata that researchers can adhere to, can use, which will increase the reusability of that data or that resource? So those are all aspects broken down by the four letters in the FAIR data principles. However, there may well also be other elements that you might want to pull in for your 10 FAIR research data things, and those can be more general disciplinary context that would be useful to incorporate. So, for example, are there relevant policies that funders or journals or associations or societies have in the discipline? Are they starting to push for FAIR? Have they got guidance in that space? Is that something you want to point the researchers to? And, for example, are there other standard approaches, templates or tools or materials out there that researchers should be aware of, that they can use and pick up just to make it easier for them to make their data FAIR? So these are just a few things to trigger thinking.
When you're thinking, "I want to do 10 FAIR research data things", these are elements you could, for example, bring on board. So that's my perspective and I'm happy to hand over.
Thanks, Keith. So how will this work? To start, we'll be using the Monash Zoom. The Zoom link will be open throughout the sprint, and we'll have people in the sprint there to help guide different groups and individuals that are trying to tackle a different theme or discipline. On the hour, every hour, we have check-ins, and people can come in and discuss what they're working on and ask questions, get feedback or just have general conversations. It's a good way for people to check in and see what others are working on and get a sense of what they might want to tackle as well. It's a great check-in tool. But if you're unable to make it into the Zoom, or if you actually need an answer quickly, we have a Gitter channel. So we have a chat room, where the link is also listed there, where you can just chat with individuals across the world that are working on different aspects, and you could ask them about something that you're facing and just connect with them as well. One way the chat works is you can use the at symbol to talk to certain people directly. There are already about 20 people that have signed up, and you can sign up with your GitHub account or Twitter account. So you can communicate with people that way and check in throughout the sprint. The other method that we're using to indicate that people are working on the sprint and working on disciplines or themes is a registration spreadsheet, where the link is there. It's a spreadsheet where people have listed their names, their group names, their contact information.
And they're also linking to the collaborative document that they're working on. So we already have two groups that have started to include their collaborative documents in the folder that I've listed there. And the folder also contains all the other resources that we have mentioned, the announcement, the instructions. So that's another way to connect. And one other thing to mention, too, is we have a code of conduct. Really, what this says is that we want to create a welcoming environment, a welcoming place for people to discuss the 10 FAIR data things. Please be kind and professional. We want this to be an enjoyable experience for everyone. And then another way to indicate that you're working on particular things is to use the Top 10 FAIR hashtag on Twitter. I included a tweet that we had sent out, which also includes our collaborators and partners. So if you also wanted to include them, their handles are included. And we might be adding more collaborators and partners as we get closer to the sprint. So check back in and see if you have all the handles, if you want to include everyone, all the different groups working on this. And the last thing to mention is that at the conclusion of all this, once we've created these 10 FAIR data things resources, we'll move all these resources from Google, from the collaborative docs that people are using, to a repository where people can continue to collaborate and reuse the material in various ways. So we'll be moving that to a Library Carpentry repository. So this concludes the sprint logistics part. So we'll move on to Liz.
Thanks, Chris. OK, so some of you are probably thinking: OK, that's all very nice and well, but where to actually start? What would actually make sense? What would FAIR actually look like for you in your discipline?
So I'm thinking of it this way: it might be something about actually working out what FAIR might look like for you in your particular neck of the woods. OK, so are you looking for the people who are all kitted out with their FAIR stuff, like an exemplar, a Kombi van with the best camping setup and all the little things that fit together wonderfully and take them to amazing places? I mean, I'm hoping that that kind of mental image might inspire you on to searching out some kinds of examples or the good kinds of resources that are actually quite useful in communicating FAIR. Or alternatively, it might be that you don't really have a discipline. You're involved in research support. You might be a librarian, and actually what you want is some really good, useful examples of what FAIR looks like in practice, so you can actually begin a discussion with researchers in your area or at your campus about all of this data publication. So some examples of those: obviously we've talked a lot about these discipline-focused resources, but I always like a good pop culture reference, so it could be repositories that are really, really ridiculously good looking, OK, or useful metadata standards. You might be demystifying those identifiers when you're looking for examples of FAIR in practice. It might be validating a vocabulary or being able to talk about a particularly relevant vocabulary for someone's area of research. And I just wanted to say that we're going to have some sprint driver revival stations. OK, look, to be honest, I was a little bit unsure about using some very un-FAIR type GIFs in this, but in Australia we have these driver revival stations around. So we're tacking on to that. And to me, driver revival also involves some kind of reviving conceptually as well. So we'll be listing on the ARDC website shortly those locations where you can drop in.
Of course, you can also talk to people on the Gitter channel as well. And one last thing. Oh, sorry, I do need to disambiguate: when I went looking for other examples of FAIR data, I found that there's also this other fair data thing, but that's not us. I don't know, there's something about fair and data that just makes people want to group things in lists and make them in things of 10. But that's actually another group. So I just want to make sure that everyone is clear about that particular disambiguation. I just wanted to make a little plug for our next webinar, on the 27th of November, about making data count, which might be of interest to some of you on this line. OK, looks like I'm the only one here at the moment.
No, you're not. This is great. We're all just lurking in the background.
Wonderful. OK, so that's the end of what I've got to say. Over to you, Natasha.
OK, so I hope this has given you a good idea of what's coming up with the sprint. And I hope it's enthused you to participate and register and get an idea together. And it's actually time for questions.
There's a question. A question I had: can we already do a sneak peek of some of the driver revival stations that are already mentioned, or is that still secret at this point? Which driver revival stations have we got, and where do we still have gaps for people that might want to provide a driver revival station?
Liz, do you want to announce where yours is?
OK, so ours will be at UTS, likely in the Tower building, but I should have the final details of that today. So there will be a fairly central Sydney one at the University of Technology Sydney, for those of you who'd like to drop in on us. And there'll be one either at the ARDC office or at ANU in Canberra. We just haven't got the final word on that one yet.
And the rest are all sort of in play, which is why they're not actually announced yet, but we will put them on the website in the next few days. But if anyone wants to host one, please do. And we're happy to provide the cake. That's an incentive for you. And Chris, any driver revival station out your way?
I was about to list them. I think there are about 18 in total that have been registered so far. I was just checking. I think it's 17 in total, and there are three University of California schools involved. I myself am in North Carolina, but I work for the University of California system. So I guess you can say, yes, locally there are three University of California schools. And actually, I don't know if this was the question, but there are two resources that people have shared in the folder where we're sharing them at the moment. So if that's what that question was about, just taking a peek at what's been done, then they're in the folder, which maybe we can share a link to. I can't see the questions or comments, so if you want to...
Sorry, there's a comment from Chris McAvaney about the great overview. He has to go now, but it looks great, and he's going to be chatting with his library staff about this. And there are a couple of other people just saying thanks for that. So maybe just to give people an idea of what people have already volunteered to create: if you go to that registrations list link, which you can get to from the Library Carpentry blog or from the ARDC website, you can actually have a look at what people have suggested they want to create a top 10 resource in. There are actually quite a few different social science ones, which is going to be interesting. And then there are some more specific ones, like history. Someone has volunteered to do one in software. I haven't seen it on the registrations page yet, but it's been talked about in Gitter.
What else, Chris? Can you remember any of the others that people volunteered for?
Sorry, I'm clicking all these different buttons. I think that covers it. There are a number of other institutions that have contacted us that haven't registered. But I know for certain that the National Library of Medicine is interested, and they're interested in just the medical thing, but there's also a group that was interested in a thing in biosciences. I'm just blanking on the name of it. But yeah, there are still several that have not listed their projects on there, so whether they need to get full approval, or I just need to prod them again.
Yeah, so there's one as well on Australian government data. There's another one that came in yesterday. So there are some interesting different FAIR topics that you can get involved in and contribute to. Here's a question from Fiona Bradley: should everyone interested in a discipline try and team up? Do you want to answer that, Chris?
Yeah, I think in general it's good to have at least one other person to work on a thing with, but you don't have to. And another way you could do this is to reach out through the Gitter channel or, in general, through Twitter and see if anyone else wants to join you. But I think it's generally good to have just one other person you can bounce ideas off. But you don't have to. You can work on it individually.
Yeah, and if the other part of that question is whether the, I think, three different social science suggestions should all come together, my feeling is no. If there are three different teams that want to work on this, that's OK, as long as they all know about each other doing this, because it is a global sprint and these teams might come up with different types of examples to use, and maybe later on there could be an effort to combine them or something like that.
But I think for the moment, it's pretty much a free for all. Just dive in there and find someone to work with on this. And if someone else is working on it in a different area of the world, then I think that's OK. You can either team up or just do it in parallel. And I think it'll work just as nicely doing it that way. So I think we've come to the end of the questions and the end of the time. So thank you, everyone, for joining us for the webinar today. And we really hope to see you at the sprint and just go to that Library Carpentry blog to find the links there to register and join the Gitter channel as well. OK, thank you. Bye. And thanks to all our presenters too. You guys. Thanks, everyone.