Okay, great. Good morning, everyone. Well, good morning from me, that is, because I am in Perth, Western Australia. I would like to start by acknowledging the traditional owners of the land on which we all are, and for me in Perth, that is the people of the Whadjuk Noongar Nation. And I'd like to pay my respects to the Elders past and present, and indeed pay my respects to all First Nations people who might be in this meeting today. Very sorry about the delay, a bit of a technical snafu, but I am confident that we'll be able to get through all of the content in the time remaining. We do have quite a few people, so I'm not sure if we'll be able to have much of a verbal dialogue on this topic, but I do welcome comments and questions in the chat window. This session will also be recorded for posterity, so you'll be able to review it. And I also have a link to my slides available there. Actually, I will enter that quickly in the chat window to make it easy to click on. That's probably the worst thing about having URLs in slides when screen sharing: you can't actually click on them. So now, not many of you may know me. My name is Matthias Liffers. I am a research software skills specialist here at the ARDC, and my job is to work on national programs to help develop software capability among Australian researchers and research support professionals. By way of background, I studied computer science a long time ago. I'm also a librarian. So this idea of software citation is a bit of a melding of those two domains. Alright, so first up, software. It is, well, the reason why we're all here. Of course, you as ARDC partners might be writing software, creating software as a part of our partnerships. Or you have probably already been developing software for quite some time as a part of your work.
Now, research software in particular is starting to get an increased focus, in the same way that over the past decade research data started to get a bit more notice and attention. And now, to be honest, you can't really find a journal that doesn't have some kind of data availability requirements. So there are many reasons why we would like to strongly, strongly encourage as many people as possible to make their research software available and citable and licensed. Although I think I should actually quickly define, or at least attempt to define, what research software is. Now, there are a lot of very smart people around the world who are working on what this definition of research software is. Until they have come up with a really nice, pithy definition that I can use in my presentations, I define research software as any software that is used to come up with the results of research: software used in the analysis, in data management, in data collection. So really anything that lets you come up with the results in a paper and justify those results. Now, I'm specifically excluding productivity tools that you might use to write the paper itself, for example. So if you use Microsoft Word to write your paper, I mean, you are using Word in your research process, but Word would not be research software in this case. On the other hand, if you have written a script in Python, for example, and you use that script to analyze your data, then that is absolutely research software. Now, we get into a bit muddier territory when it comes to things like research platforms, because some parts of the platform might not actually have much to do with the data analysis itself, whereas other parts of the platform absolutely do. And look, to be honest, we could probably spend an entire day discussing what is and isn't research software, but we don't have time.
So let's run with the idea that research software is the software used to create the output of a research project, and let's keep moving. So why is it that we, certainly the ARDC, are interested in making research software citable and published and licensed? There are lots of different reasons. And to be quite honest, a lot of these reasons apply to other research outputs as well, such as data and publications and what have you. So first and foremost, there is this idea of research software sustainability. There are some research software packages that are very, very heavily used and incredibly popular, and there is a lot of research around the world that depends on these research software packages. The problem is that a lot of these packages might be developed or maintained by only one person. And if that person were to suddenly no longer be available to work on that project, then anybody who depends on this research software will be left in the lurch. The software will no longer be updated for new methods, bugs won't be fixed, all sorts of different problems like that. Now, around this time last year, a little less than a year ago, I hosted a presentation delivered by Dan Katz from NCSA at Illinois. I've got a link to the YouTube video there and I strongly recommend that you watch it. He goes through all sorts of different topics, things to think about when it comes to research software sustainability and how we can make sure that software that is being used today can continue to be used in the future. And I wouldn't be able to do it justice by trying to summarize all of those right now. But in short, there is a bit of a problem with sustainability of research software, and making the software a little more rigorously available and licensed does take some steps, I suppose, towards making that software a little more sustainable.
Next up, another reason why we like the idea of software being made available is that, to be honest, a lot of publishers are now starting to require it. Or, in fact, they have actually required it for quite some time now but haven't necessarily been enforcing it until more recently. And so if you go to any journal that has a data policy, you might find that software is snuck in there as something that they consider research data. Now, of course, data and software are fundamentally different things, but they do work together. And so it makes sense that if the data used to create the results presented in a paper must be made available, then the software used to analyze that data should also be made available, so that it can be inspected by either peer reviewers or readers to check the veracity of that software: that it really is reliable, implements algorithms correctly, is bug free, things like that. So it's not just the American Geophysical Union that has a policy on this, but also Nature, possibly the most influential journal in the world. Is it Science or is it Nature? One of the two. They also reserve the right to decline a paper if important code is unavailable. Now, I do welcome any comments or questions in the chat at any time. I've got an eye on the chat window here and I will try and answer them. Now, here is an example of why having research software available to others to inspect is good for science. There was a Python script that was used in computational chemistry, and this particular script was used in literally hundreds of different computational chemistry research studies. The developer of this software didn't realize something about a particular operating system call: the script relies on the host operating system to provide a list of files in a particular order.
The developers didn't realize that the function call, which is called the same thing on different operating systems, behaves differently on different operating systems. So the order of the files that you got under Linux would be different to the order of the files that you got under Windows, which would be different again to the order of the files that you got on different versions of OS X. And so this resulted in an admittedly small but still significant error in the analyzed data, because files would be mismatched ever so slightly, so that unless you were really looking very closely at the results you wouldn't notice that there was a bug. But certainly the bug was big enough that it really affected the output of the papers. And there has been a prediction that literally hundreds of research studies will need to go back and reanalyze their data and re-verify the results of their analyses. Now, another reason why we would like to make software truly citable, and by citable I mean citable in the same way that you would cite a research paper, is to help track kudos. Now, we unfortunately live in a world where citations are king, and I have railed against the system for a very long time. But for many researchers, when it comes to academic promotion, when it comes to getting grants, when it comes to real reputation, kudos, advancement in their career, they rely very heavily on citations. We have an established system for citing papers; that system is quite mature, and tracking citations is quite mature. It's still imprecise, but the way that this is all handled is now very accepted, and it is quite straightforward for a researcher to know how many times each of their papers has been cited and to use those citation counts as evidence in academic promotion or performance appraisals and things like that. Now, the problem is that we don't have an accepted standard for how contributions to research software are measured.
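To make the file-ordering bug described above concrete: the talk does not show the affected script, so what follows is only a minimal sketch of the general failure mode and one common way to guard against it. The function names, the `.in`/`.out` suffixes, and the file names are all hypothetical. It relies on the documented fact that Python's `os.listdir()` returns entries in arbitrary order, so any code that depends on ordering should sort explicitly, and code that pairs files is safer matching by name than by position.

```python
import os
import tempfile


def list_data_files(directory, suffix):
    """Return the names of files ending in `suffix`, in sorted order."""
    # os.listdir() makes no promise about ordering; what comes back can
    # differ between operating systems and file systems. Sorting makes
    # the result the same everywhere.
    names = [name for name in os.listdir(directory) if name.endswith(suffix)]
    return sorted(names)


def pair_inputs_with_outputs(directory):
    """Pair each .in file with the .out file that shares its base name."""
    inputs = list_data_files(directory, ".in")
    outputs = set(list_data_files(directory, ".out"))
    pairs = []
    for input_name in inputs:
        expected = os.path.splitext(input_name)[0] + ".out"
        # Match by name, not by list position, so a missing or extra
        # file raises an error instead of silently shifting the pairs.
        if expected not in outputs:
            raise FileNotFoundError(f"no output file found for {input_name}")
        pairs.append((input_name, expected))
    return pairs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical file names, created in a deliberately jumbled order.
        for name in ("run2.in", "run1.out", "run1.in", "run2.out"):
            open(os.path.join(workdir, name), "w").close()
        print(pair_inputs_with_outputs(workdir))
```

A bug of this kind is hard to spot from the results alone, which is part of the argument for making the code itself available for inspection.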
Writing a paper takes a lot of time, but there is a measurable gain that researchers get from writing papers. Writing software is also very time consuming, but there is no formal method for tracking the use of that software and the citations of that software, and for providing researchers with the evidence that the software they're writing is having an impact in the rest of the world or is influencing research in the rest of the world. Now, I will show you shortly that some projects are working on methods by which citations of software can be tracked, but in order for software citation to be enabled, the people who develop the research software need to take some steps to make sure their software can be cited in the first place, and I will be going through those. So in terms of kudos, for example, this is a tweet. Look, it is from a few years ago already, but this piece of software was developed by CSIRO, and we have a researcher who is thanking CSIRO for producing the software. Now, it would be fantastic if Stephanie Watts-Williams, when publishing her own paper, also cited that original software, so that the people who developed that software knew that it was used, and that use is formally tracked and can be used for KPIs. Alright, I have a question: do I think containers are a must for research software to be publishable, to avoid the issue of different behaviors from the underlying OS? Now, containers unfortunately can still be problematic with those kinds of different behaviors in underlying OSs and things like that. They are very, very useful, and look, as another comment comes up here, container portability and cross-platform usability are very important, but containers aren't the only way to manage that. And if you are relying on containers to provide that kind of portability and cross-platform usability, you still need to be very careful in how you construct your container.
Oh, okay, is my screen sharing not changing for everyone? I am now on a slide of a tweet showing some grains in a Petri dish. Well, look, click on the link as suggested by Marco and you can follow along on my slides. I'm currently on slide 8. I think containers are a great way to help manage portability and cross-platform usability, but they aren't... There is a bit of a, what's the word, caveat emptor: you need to be careful when building your containers or making those container images available to others. Alright, so here is an example, and what I might actually do... No, I've got an example I'll be able to show much better a little later on. So as I said earlier, there are groups around the world working on trying to solve this software citation problem by building databases that track citations of software and provide software authors with counts of those citations. So for example, Zenodo: they've been working on this for a number of years now, and it's still unfortunately experimental. They have partnered with ADS, the Astrophysics Data System, to try and track the citations of software that occur in astronomy papers. And so our example here is a Python library, and it has been cited a number of times in different astronomical papers. Now, unfortunately the scope of that is still quite small, because of course ADS only has papers from astronomy and doesn't necessarily cover bioinformatics, computational chemistry, what have you. But the reason why this is possible is that the software authors have already taken steps to make this software citable, and the authors of the papers in ADS have cited that software in the same way that they would cite a research article or a conference paper or a data set. So actually, here is a good example link. I'm always a little hesitant to do live demos during presentations in case they don't work. So here is a slightly better view of this software.
So you'll see a few things here already. For example, there are different versions of the software represented here in Zenodo. Sorry, I'll scroll back up again. So here is the title of the particular Python library. There is some information about lmfit. They've applied a license, which I'll be talking about later. There is a tarball of the source code available. And here on the right you'll see that each individual version of lmfit has been given its own unique DOI, something I'll go into in a little more detail later as well. And here on the left, still marked as beta after a couple of years, probably a bit like Google's perpetual beta for everything, we have a list of all of the papers in ADS that have cited this particular version of this software, and we'll see that there are 280 citations on that version. So the authors of this package can say to their supervisors, or to the academic promotions board, "I have written this piece of software that has underpinned the results in 280 papers." Now, to me, 280 is a pretty good number of citations, and I'd be pretty proud if any software I had written had been used in 280 papers. All right, back to it, let's keep moving. So hopefully I've built a bit of a case for making software citable, and what I'd like to do now is go through an example process for making the software that you develop citable, or more easily citable, by anybody who will be using it. There are three prerequisites and two steps. I did originally have five steps, but then I realized that three of the steps were "go and get this thing". So I've decided to call them prerequisites instead, or rather colleagues suggested that I call them prerequisites instead. So the first prerequisite is that your software source code is managed in a formal code repository, and even better if it's a publicly available code repository.
I will admit GitHub is probably the one doing the best job of linking up with different services to make software citable. But if you are more comfortable using other services like Bitbucket or GitLab, or your organization might even have its own hosted code repository, you could look into ways to link that up with other services like Zenodo or FigShare, which I'll be discussing shortly, and build a link between them so that whenever you do release a new... Look, sorry, I'm getting ahead of myself. Let's stick to this first: if you have not already, please put your code in a code repository, so you can do formal version control and possibly save yourself in case of catastrophe. Next up, an ORCID. I mean, this is just for yourself. An ORCID, an Open Researcher and Contributor ID, is a 16-character identifier, mostly digits although the last character can be the letter X, that uniquely identifies a researcher or indeed anybody who contributes to research projects. Now look, not everybody in this meeting is necessarily a researcher; we'll have research software engineers and other project staff. But in my mind, anybody who is contributing towards the development of the software should be linked to that research software and be given the kudos for developing it. So when I say an ORCID, I mean everybody on the software development team should get an ORCID for themselves. An ORCID belongs to the person who created it and follows them through their whole career. It doesn't belong to your institution, although your institution might have a system that lets you link up with their research management software and write records to your ORCID on your behalf. Talk to your research office or library about it. So after you've put your code in a code repository, and all of your devs and other project team members have ORCIDs, the third prerequisite is a license. Now, I'm not in a position to recommend any particular software license to any of you.
However, there is a resource that was created by GitHub, which is now owned by Microsoft. So there's your warning there, but it does give a good overview of the most commonly used software licenses and what each license provides. And licenses aren't just about making your software openly available or freely available to everyone. A software license can also provide you with protections and limit your liability and things like that. I'm not a lawyer, so I would also strongly, strongly advise that you talk to your institution's IP team. It might be that your institution already has a particular license that they prefer to use, and you should absolutely use that rather than whatever license I might suggest. Okay, great: you've got your code repository, you've got your ORCID and you've got your license. And so, step one. Sorry, slide 15 now, if you can't see my slides being shared. Link your code repository to Zenodo or FigShare. Now, there are online guides available on how to do this, and the reason why we are creating this link is so that whenever you make a new release of your code in GitHub, or whatever other code management system you're using, Zenodo or FigShare will recognize that there is a new version, copy across an archival version of your code, create a new record for that new version, and create a new DOI that can be used in a citation as well. Now, for those not that familiar with DOIs, a DOI, a digital object identifier, is a unique identifier that is traditionally given to research articles, conference papers, publications, journals. In the last 10 years it has been applied to research data as well, and then much more recently to research software too. A DOI can be expressed in the form of a URL, and it will always unambiguously link through to the article, the data, or the software that it refers to. And the DOI is also the primary mechanism by which citation tracking takes place.
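Both identifiers mentioned so far have a simple, checkable structure, which the following sketch illustrates. It is not part of the talk: the function names are hypothetical, and the checksum shown is the ISO 7064 MOD 11-2 scheme that ORCID documents for the final character of an ORCID iD (which is why that character is occasionally `X`). The DOI helper only normalizes a DOI into the `https://doi.org/` URL form; it does not check that the DOI is actually registered.

```python
import re

# Four groups of four characters; only the very last may be 'X'.
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_character(first_15_digits):
    """Compute the ISO 7064 MOD 11-2 check character for an ORCID iD."""
    total = 0
    for digit in first_15_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if `orcid` is well formed and its checksum matches."""
    if not ORCID_PATTERN.match(orcid):
        return False
    characters = orcid.replace("-", "")
    return orcid_check_character(characters[:15]) == characters[15]


def doi_to_url(doi):
    """Express a DOI as a resolvable https://doi.org/ URL."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI begins with the directory indicator "10.".
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return "https://doi.org/" + doi


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
    print(is_valid_orcid("0000-0002-1825-0097"))
    # 10.1234/example is a made-up DOI used purely for illustration.
    print(doi_to_url("doi:10.1234/example"))
```

A check like this catches typos in a contributor list before the metadata is published, but it says nothing about whether the identifier belongs to the right person or object.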
So that's Zenodo or FigShare, or it could be that your institution has its own data and software publication platform. So again, please talk to your research office or your library. And the reason why we would like those data publication platforms to know that there is a new release of your software is that the behavior of the software might change from version to version. I mean, hopefully your software's behavior changes from version to version as you fix bugs and introduce new features. But to maintain the rigor of science, it's very important to know which version of a piece of software was used in the analysis, so that you can potentially take changes in algorithms or bugs into account. So look, in version 1.0 there was an issue, and version 1.1 fixes that issue, but for everybody who still cites version 1.0 in their papers, okay, you might have to take this bug into account. Or version 1.1 uses a new algorithm, which possibly changes the data outputs. So we created a link from your code management repository through to Zenodo or FigShare. And then step two... oh no, sorry, step zero: talk to your librarian or research office first, before you start using Zenodo or FigShare, because it could be, as I said earlier, that your institution has its own way of doing things. So for example, CSIRO has their Data Access Portal, and they like to put... you know, I might be getting ahead of myself; I haven't spoken to somebody from CSIRO for a little while. But certainly for data, for example, they prefer data to go into the DAP, the Data Access Portal, rather than into Zenodo or FigShare, and they might have a system for software as well. Okay, so thanks Tom for answering that question.
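The point about knowing which version of a piece of software was used also applies on the user's side: an analysis can record the versions it ran with, so that the right release can be cited later. The talk does not prescribe a method; this is just one possible sketch using Python's standard `importlib.metadata` module, with a hypothetical function name and hypothetical package names.

```python
import json
import platform
import sys
from importlib import metadata


def software_provenance(package_names):
    """Record the interpreter, platform and installed package versions."""
    record = {
        "python": platform.python_version(),
        "platform": sys.platform,
        "packages": {},
    }
    for name in package_names:
        try:
            record["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Keep the gap visible rather than silently dropping the name.
            record["packages"][name] = None
    return record


if __name__ == "__main__":
    # Hypothetical package names; list whatever the analysis imports.
    print(json.dumps(software_provenance(["numpy", "lmfit"]), indent=2))
```

Saving a record like this alongside the results makes it possible to say, for example, that a figure was produced with version 1.0 rather than 1.1.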
Yes, so in Zenodo, the way they handle the DOI versioning of software is that there is a single head DOI, or an overarching DOI, that refers to the software generally and all versions of the software, and would generally resolve to the latest version of the software. But then each version of the software will get its own additional DOI. So you will have a number of DOIs equal to the number of versions you've released, plus one. And in fact, [inaudible]. Sorry, thanks. So I can show you here, for example, we have this DOI up here for the software, zenodo.11813. I'm looking to my right because this screen is to the right of me. And then each version of the software will have its own unique DOI as well, and those DOIs are automatically created when a new version of the software is released. Okay, so you've linked your repository through to FigShare or Zenodo or your institution's service. The next step is to make it really, really easy for third parties to cite your software. And that is: write a little statement that says, "When using our software, please cite it like this." When I say like this, I mean something like this. So this will look very familiar to anybody who's written any kind of academic paper. It is an entry in a reference list. However, this is a reference list entry for a piece of software. It mentions the first three authors of the software. Look, originally it did actually have all of the authors, but I thought that was quite long, so I did add the "et al." there. It includes the date when that version of that software was published. We have the name of the software, the Advanced Terrestrial Simulator, as well as the exact version itself, version 0.88. Then a little indicator that we are talking about computer software. Then the database where that version of that software can be found, Zenodo.
And finally, the URL, or the DOI in the form of a URL, that allows the reader to visit the record of that software, see all of the description, all of the metadata, and download the exact source code that was used in the paper that cites this. Now, yes, sorry, Tom. I do unfortunately use the words release and version interchangeably sometimes. Not all versions are released. And yes, so there is a DOI for each release of the software; this release is version 0.88 of the software. All right, now, you provide this and you include it in the documentation for your software. Maybe it's in the README in your GitHub repository, where you've said "please cite this software as" and included this information. Or perhaps you are writing your own paper about your software and introducing your software. This is getting to be a relatively common practice, but rather than encouraging people to cite the paper, encourage them to cite the software directly. Okay, I've got a question. Is this citation part of some citation convention like Chicago or APA? It probably is. As a librarian, I actually have a particular contempt for different citation conventions because they're all terrible. But this contains most of the features that are common to most citation standards or citation conventions. It looks quite a bit like APA, but I'm pretty sure that it doesn't adhere perfectly to APA fifth or sixth or whatever version we might be up to. Whoever it is that is citing you will need to make sure that their own citation formats are correct and adhere to journal standards and things like that. But if you provide all of the elements and an example citation, they can at least pull together a correctly formatted citation based on the information that you have given in your example citation. Now, this is also a really, really good opportunity to include a little bit of documentation about your software and how others can get it working.
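The elements of the example citation described above (up to three authors followed by "et al.", the publication year, the software name, the exact version, a "computer software" indicator, the repository, and the DOI as a URL) can be assembled mechanically. The following is a rough sketch of that assembly, not an implementation of APA or any other formal style. The class and function names are hypothetical, and the authors, year, title and DOI in the usage example are placeholders rather than the real record shown in the talk.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SoftwareRelease:
    """Metadata for one released version of a piece of research software."""
    authors: List[str]   # already formatted, e.g. "Surname, A."
    year: int
    title: str
    version: str
    repository: str      # where the archived copy lives, e.g. "Zenodo"
    doi: str             # DOI of this specific release


def format_citation(release, max_authors=3):
    """Build a reference-list style entry for a software release."""
    author_part = ", ".join(release.authors[:max_authors])
    if len(release.authors) > max_authors:
        author_part += ", et al."
    return (
        f"{author_part} ({release.year}). {release.title} "
        f"(Version {release.version}) [Computer software]. "
        f"{release.repository}. https://doi.org/{release.doi}"
    )


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    example = SoftwareRelease(
        authors=["Author, A.", "Author, B.", "Author, C.", "Author, D."],
        year=2020,
        title="Example Simulator",
        version="0.88",
        repository="Zenodo",
        doi="10.1234/example.5678",
    )
    print(format_citation(example))
```

The output of something like this could be pasted into a README under a "please cite this software as" heading; people citing it would still need to adapt it to their journal's style.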
I'm not entirely sure, you know, is it a proper presentation if you don't have an XKCD comic? But we were talking earlier about portability and cross-platform usability. And so if you make your software a little more reusable by others, you also then encourage that reuse. The number of times I've tried to run something that supposedly requires only minimal configuration and tweaking and have failed miserably... Sorry, Bill, we will get back to your... actually, you know what, I might be able to answer that now. How to cite software within a convention. So that is actually a real problem: a lot of citation standards don't yet have a prescribed way of citing software. Some do, but when they talk about citing software, then to be honest, they're mostly talking about citing big commercial software packages like Microsoft Word or EndNote or MATLAB or something like that. They don't necessarily provide a template for building a citation to some research software that is hosted in a place like Zenodo or FigShare. To be honest, if I come across this situation myself, I try to build a citation that looks like it fits within the prescribed style and then see what the editor says. Because to be honest, it is down to the editor to check all the things and make sure that all of your citations follow the prescribed style and things like that. Or you could see if other people have cited software within the particular journal that you've chosen. Okay, another question. How can we choose between Zenodo, FigShare or OSF? Yes, sorry, I haven't mentioned OSF at all in this presentation. I do apologize. I cannot actually provide advice on which platform to use for publishing your software and getting a DOI. I do have a personal preference, but it is, again, the case that different organizations have different ways of doing things.
It might be that you have joined a team that already uses a particular thing, or it could be that you are looking to publish in a journal, and that journal requires using a particular database or a particular service like Zenodo or FigShare or OSF. It's incredibly subjective and there's no hard and fast rule. But look, I'd be happy to have a longer conversation with anyone about this, bearing in mind we now have only 10 minutes left. Better get through my presentation. Okay, now I have talked about this: updated software requires a new DOI, but hopefully you are able to use an automated method, in which whenever you do create a new release of your software, the system that is providing you with DOIs knows about that, or can monitor that, and automatically creates a new DOI for you. Now, unfortunately, not all publication systems can have that automated link, and so you might, unfortunately, have to do this manually. So if you are unable to have an automated system, bear in mind that when you create a new release you'll also have to manually create a new metadata record and a new DOI, because a new release of a piece of software, a new version, is different and will potentially have different results to previous releases. Okay, that's it from me. And I do have in my slides more links and things like that. Now, I don't know whether WebEx lets people who have just joined a meeting see previous messages; I know Zoom doesn't. So there's a link to my slides again. I do have links in my slides through to guides and in fact further reading. So this slide here, I think in the notes for this slide, there is a link to an excellent paper that goes through many, many, many different options for making software citable and then constructing software citation statements. I think that's the one that Tom has linked to as well.
It's a very, very good paper. We just don't have the time to read through it or look at different options right now. Were there any more questions or comments? If you don't have time to put them in the chat just now, please either email Matthias directly or go through the ARDC help desk and we'll be happy to address them. Yes, thank you Julia. So there's my email address there, or you can also email the central contact at contact@ardc.edu.au, and I'm more than happy to have a chat about this at any time. Look, I've already talked about it for 40 minutes and I can keep going. Ivan has a question. Yes, so the ARDC does have a DOI minting service. It can be applied to data and software, and I don't have time right now to tell you how to access it and how to register for it, but if you are interested in the ARDC DOI minting service, I strongly encourage you to get in touch with us at contact@ardc.edu.au. Thanks for that comment about chatting with your librarians. It's not always the library in your institution that will be knowledgeable about this. In some institutions there is an e-research office that handles these kinds of inquiries. If you're not sure who to talk to within your institution, I strongly encourage you to get in touch with us, because we have lots of networks around the country and we know who to talk to in each institution. So yeah, please get in touch with us if you're struggling to find the right person in your own institution. Yes, so if you mint a DOI with the ARDC, assume you use that DOI. Oh, hang on. In that instance you can relate them. Yep, okay, great. Sorry, I was momentarily unsure whether Zenodo will accept a DOI from another system. It's a related object. Yep, okay, thanks Julia. Okay, if there aren't any more questions or comments at this time, again, please get in touch with us if you have any more. I bid you all good morning. There's still four minutes of it left for me.
Thank you very much for coming and I do apologise again for the delays in getting this session started. I am off to have some lunch. See you later.