All right. Good afternoon, everyone. Welcome to this webinar on version control with the OSF. My name is Courtney Soderberg from the Center for Open Science, and I'm going to be leading this webinar today.
Just to go over a couple of quick logistical things before we get started with the actual content: please feel free to ask questions during the webinar, and I'll also make sure to leave time for questions at the end. I just wanted to quickly go over how you can ask questions. There's a Q&A function, which you should be seeing on your Zoom dashboard. If you click on that, you can ask questions, and I will be able to see when you ask them and mark when I'm answering them. If anything goes wrong during the webinar (you think I'm not sharing the right screen, my audio goes out, something like that), you can also notify me of that by using the chat functionality. I'll see that and we'll hopefully be able to fix the problem.
You can also ask questions after the webinar. If I go over something and you have a question afterwards, you can always feel free to contact us, and there are a couple of options for that. For anything with the OSF that I go over, there are a couple of links to help resources. The first three are the link to the OSF, the link to our basic help documentation, and specifically a link to the help documentation about version control, which we're going to go over today. If you have a specific question that is not covered in the help documentation, you can always send us an email at contact@cos.io.
All right, so now that we have the logistics worked out, I'm going to go ahead and start talking about version control and how the OSF can help you out with that. For those of you who are less familiar with the OSF, OSF stands for Open Science Framework.
The Open Science Framework is a free, open-source collaborative platform that helps researchers track, document, and potentially share the entirety of their research workflow. What I'm going to go through today is what version control is and what features are built into the OSF to help researchers have better version control and better documentation. I'm going to be showing an example project that I've had from before called Gender and Political Affiliation.
When we talk about version control, there are many different ways to go about it, but oftentimes what this means is just: how do I keep track of and manage the different versions there are of a file? How do I keep track of what's changed? How do I keep track of which version of the file is later than the other version of the file? How do I keep track of who is making those changes? Different people will have different systems for this.
One thing that I know I used to do, and that is pretty common, is a kind of homegrown version control where you append something to the end of a file name, oftentimes a number or someone's initials or a date or something like that. Sometimes this will go pretty well for the first one or two versions of the file, but you may eventually end up with something like this, where I have analyses, analyses two, analyses three, and we're kind of going, okay, maybe this is going well. And then all of a sudden we get final analyses, final analyses two, final analyses four, no idea where three went. And then you get something called actual final analyses and actually actual final analyses two. These file names are actually very similar to the names of some analysis files from way back when I was a first-year graduate student.
So this is pretty indicative of what I used to do, and what I think many people will often do, where they'll start out with a naming convention that is supposed to take care of the file versions. For one reason or another that naming convention will quickly get out of hand, and they're left with all of these files with different names, final and non-final versions with numbers, and it's really unclear who was making those changes and what the actual order is. So it gets complicated to try to recreate how those files changed over time and to figure out what the current version is at any moment in time.
So I'm going to talk about the ways that the OSF helps us with this version control. Rather than uploading files with different names when we have to upload a new version of a file, what are some other ways we could do this? I'm actually going to delete all these other files and we'll just start from scratch with one analyses file.
There are a couple of ways to deal with versioning on the OSF. For files that can be edited in a text editor (for example an R file, but this could also be a CSV file, a .txt file, or things like SPSS, SAS, or Stata scripts), I can edit them directly on the OSF if I want to. If I open this R script, you'll see I have an edit button. This will bring up an editor and I can actually make some changes. For example, I could say "these are the required libraries," and when I save those changes, there we go, you'll see that the view of the file has been updated, and if I click on this revisions tab, a new version has automatically been created for me. It has the date stamp and it has the name of the user who changed the file. Now this second version of the file is the one that will automatically show up when I go to view the document.
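To make the idea of automatic versions concrete, here is a minimal sketch in Python of what was just described: every save appends a new numbered version with a timestamp and a user name, the newest one is what you see by default, and older ones stay retrievable. This is only an illustration of the concept, not how the OSF is implemented; the class names, the file name `analyses.R`, the user name, and the file contents are all made up for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Version:
    number: int          # 1 for the first save, 2 for the next, and so on
    content: str
    user: str            # who saved this version
    saved_at: datetime   # when it was saved


@dataclass
class VersionedFile:
    """Hypothetical stand-in for a file with an automatic revision history."""
    name: str
    versions: list = field(default_factory=list)

    def save(self, content, user):
        # Append a new version instead of overwriting the previous one.
        version = Version(
            number=len(self.versions) + 1,
            content=content,
            user=user,
            saved_at=datetime.now(timezone.utc),
        )
        self.versions.append(version)
        return version

    def current(self):
        # The most recent version is the one shown by default.
        return self.versions[-1]

    def get(self, number):
        # Older versions remain available by their version number.
        if not 1 <= number <= len(self.versions):
            raise KeyError(f"{self.name} has no version {number}")
        return self.versions[number - 1]


script = VersionedFile("analyses.R")
script.save("library(ggplot2)\n", user="courtney")
script.save("# These are the required libraries\nlibrary(ggplot2)\n", user="courtney")

print(script.current().number)   # the newest version number
print(script.get(1).content)     # the first version is still there
```

The point of the design is that there is one file name and one ordered list of versions, so the order and the author of each change never have to be guessed from the file name.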
Through the revision history, though, by clicking on those old version IDs, I can go back and view those old versions, and from the revisions tab I can also download those previous versions. By having these versions automatically created, I have one clean line of versions going forward. I know that this one was created after the previous one, I know that I created it, and I can always go back and look at those old versions.
Now when we think about a project, if I'm the only one working on this file, I may just have one linear line. However, if I'm working on this in collaboration with somebody, I might worry: what if I'm making changes and they're also making changes at the same time? Could I accidentally get two versions that are out of sync with each other? To guard against that, there's this checkout button. If I click on this checkout button, it means that the other contributors to this project, the other people who have read/write or administrator access to this section of the project, will be able to view this file but will not be able to edit it. They won't be able to upload a new version. They won't be able to use the edit tab on it. The file is basically locked for them, and that's a way to make sure that versions cannot get out of sync. So one of my collaborators could be viewing this R script, but they wouldn't be able to make edits until I had checked the file back in.
So we have two questions. Malika asked, can the versions have comments, like commits in Git? That's a great question. Currently you cannot append a comment directly to a particular version of a file. That is something that we've had a couple of user requests about, so it is something we're looking into. However, you can comment on a file in general. For example, there's this commenting pane right here.
So I can click on comments and I could say "second version has updated documentation" and make that comment. Those comments are stored with the files, but there's no way to connect that comment specifically to a particular version of the file.
I can also show you how this would work with a Word document, for example, or a different R script that has another change. If I go into another section of my project, for example the questionnaire section, I have a questionnaire file. This is a Word .docx file. You can see that I have some text here that says "make some changes," and it has two versions currently. It has a checkout function, so I can keep somebody from interacting with this if I know I need to make some changes on that questionnaire file, but it doesn't have an edit button. What that means is I can't edit it natively on the OSF. So in order to deal with that file, what I want to do is open it up on my own computer, and then I can make whatever changes I need to make to that document there. I'm just going to change some basic font colors and resave it on my personal machine with that same name, so resave it as questionnaire. Then, as I did with the R script, I want to go ahead and upload that file with the same name.
When I go and open that file, you'll see that it now says version three and that a new version has been created. So now the text of the header is in green and I have highlighted the background in dark blue. Not a very important change, but a change that's really easy to pick out. And just like with the R script that I edited natively on the OSF, you'll see that even though this was not edited on the OSF, when I uploaded that new version with the same name, the system automatically created a new version with a new timestamp and the username.
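The two behaviors shown with the questionnaire file, uploading under the same name to get a new version and checking a file out so nobody else can change it, can be sketched together in a few lines of Python. This is a toy model under stated assumptions, not the OSF's code: `ProjectFile`, `CheckedOutError`, the user names, and the file contents are hypothetical, and the sketch assumes the person holding the checkout can still upload while everyone else is blocked.

```python
class CheckedOutError(Exception):
    """Raised when someone other than the lock holder tries to change a file."""


class ProjectFile:
    """Hypothetical model of a project file with versions and a checkout lock."""

    def __init__(self, name):
        self.name = name
        self.versions = []          # (user, content) pairs, oldest first
        self.checked_out_by = None  # None means no one holds the lock

    def check_out(self, user):
        if self.checked_out_by not in (None, user):
            raise CheckedOutError(f"{self.name} is checked out by {self.checked_out_by}")
        self.checked_out_by = user

    def check_in(self, user):
        if self.checked_out_by != user:
            raise CheckedOutError(f"{self.name} is not checked out by {user}")
        self.checked_out_by = None

    def upload(self, user, content):
        # Uploading under the same name adds a version; it never overwrites.
        if self.checked_out_by not in (None, user):
            raise CheckedOutError(f"{self.name} is checked out by {self.checked_out_by}")
        self.versions.append((user, content))
        return len(self.versions)   # the new version number

    def view(self):
        # Viewing is always allowed and shows the most recent version.
        return self.versions[-1][1]


questionnaire = ProjectFile("questionnaire.docx")
questionnaire.upload("courtney", "original questionnaire")
questionnaire.check_out("courtney")

try:
    questionnaire.upload("brian", "brian's edits")
    blocked = False
except CheckedOutError:
    blocked = True          # collaborators cannot upload while it is checked out

questionnaire.upload("courtney", "green header text")    # lock holder can still upload
questionnaire.check_in("courtney")
latest = questionnaire.upload("brian", "brian's edits")  # allowed again after check-in

print(blocked, latest, questionnaire.view())
```

Because every accepted upload goes onto the same list, two people's changes end up in one ordered stream, and the lock simply decides who may add to that stream at a given moment.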
So if I was collaborating on this with somebody, if, for example, my boss, Brian, when he looked at the questionnaire document, decided that he wanted to make some changes and uploaded a document called questionnaire.docx that was a newer version, this would have a new timestamp, a new version number, and it would say Brian. So Brian's and my changes are all going forward in one stream, and when we view the page we're each going to be seeing the most recent version. Brian doesn't accidentally edit an old version of the questionnaire document that he happens to pick up from his email, because he'll be looking at the OSF, seeing, okay, what is the version that is currently up there? I know that has to be the most current version, and I'm going to work with that one.
So that's how the OSF deals with versioning of documents that you have uploaded, but some people may already have version control systems that work well for them. I mentioned the kind of homegrown version control of appending a number or initials or a timestamp to the end of a document name, but one popular thing that some people use to version files, especially code or analysis scripts, is GitHub. GitHub is actually what we use at the Center for the development of all the code related to the OSF.
So let's say that my analysis scripts didn't actually exist within the OSF; they existed in a GitHub repository. But since they're related to my project, they're related to this data, they're related to these methods and materials, maybe I want to have those analysis scripts appear in my OSF project just so I can link up the different pieces and parts of my workflow. Rather than having to download them from GitHub and upload them to the OSF, I can do something a little bit fancier.
If I go into the analysis scripts component and click on settings, I can connect certain add-ons to the OSF. Some of them are storage services like Amazon S3, Dataverse, Box, Dropbox, or figshare, but GitHub is one of the options. So I can check GitHub, and then it's going to ask me to basically import my access token. If I'd never done this before, it would normally ask me to input my password. I have done this before, so it knows I am who I say I am, and then I get an option of which repository I want to connect to this project. I want to connect my test repository, so if I save that and look at the project now, you'll see that the contents of this GitHub repository are appearing in my project.
When I click on any of these files, the most current version in GitHub is what appears, and if I click on the revisions tab, it shows me all the different version IDs that are stored in GitHub, so I can get that information or I can download them.
How this works is a two-way door. I can still interact with this file through GitHub, and any change I make to the file on GitHub will show up in the OSF. I can view the file on GitHub if I want, but maybe the collaborator on this project is somebody who doesn't use GitHub. Maybe they don't really want to interact with GitHub or view the file through there. This allows them to look at the file through the OSF even though it's actually coming from GitHub. If you want to, you can give people the ability to interact with this file from the OSF, and it will push commits to GitHub. You can keep them from doing that if you want by adding them as read-only contributors to this section of the project. That would mean that they would be able to view the contents of that GitHub repository but would not be able to make any changes. If you add them as read/write, technically they have the ability to push commits to GitHub. There is this edit functionality where I can say, you know, make even more changes,
since this is a text-editable document, and then that change will be pushed to GitHub. But you can keep people from doing that if you want by making them read only. And so then if I look back at the GitHub history of this file, you can see how that commit is put into GitHub. It'll say it was updated via the Open Science Framework and that the commit was made by me. If the commit is made by another collaborator on the project, it will say their name. So, I believe, right, this was a commit made via the Open Science Framework, but it was made by a contributor on the project, Jolene Esposito.
So we have a question from Jade: does the OSF track versions in the same way for documents that have been registered on the OSF? I believe Jade is asking about the registration functionality of the OSF, and please correct me if I'm wrong about this. What that means is, this is a living project. I can make as many changes to these files as I want, but there might be certain points in the life of a project that I think are important to keep a read-only version of: for example, right before I start my data collection, or what my study looked like when I submitted it for publication, or what my project looked like at a certain point during data collection. I can create a snapshot of the project at that point in time. It'll be read only, it can never be changed, and that is what a registration is.
So if I go to my project, I have this registration option. I'm not going to register this, but I'll show you what a registration looks like. All right, so this is a registration of a project I actually was working on. You can see there's this read-only watermark going along. Here's an R script, for example. If I look at this R script, I don't have an edit button anymore. It will show me the versions that existed up to that point in time, so I can go back and look at those versions, but none of these can be edited because the registration is view only. However,
this registration is connected to the living project. It says this project is a registration of this project. If I go into the living project and click on the registration tab, I can see how many registrations were made and when they were made, and if I wanted to, I could continue to edit these files. So a registration will have the versions of the files that are in OSF Storage, and the current version of the files in the add-ons, at the time when the registration was made. But since the registration is read only, versions that are made after the registration will not be added to the registration; those will be made in the project. Does that make sense, Jade? Hopefully, yeah. You can think of the project as one flow and the registrations as just snapshots that are saved separately but connected to the project. There's always a pointer, but the registration is what the project looked like at this point in time, so it's not going to track what happens in the future, but it will track what happened in the past.
All right, so I quickly wanted to go over the last way that the OSF deals with version control. We've talked about version control of files, both text-editable files and other files like Word documents. We've talked about adding in other tools like GitHub that may have internal version control, and how you can still see those previous versions in the OSF. But I did want to mention really quickly one other place on the OSF that has version control that I haven't talked about, which is the wiki.
So this is the wiki. It's a real-time collaborative editor, which different people use differently. I tend to use it as a way to keep a current description of the state of my project. Some people use it like a notepad; other people use it for abstracts and things. Just like that R script, there's an edit button and I can make changes. So for example, I might at some point change my hypothesis and say women
will be more conservative than men. You can see that it's automatically updating over here, and I save that wiki. Now if I click on the compare here, the wiki will actually give me a diff of different versions of my wiki. So just like files, it will tell me who made the change and when it was made, but it has the additional feature of allowing me to look at different versions and say, okay, what's different between the current version and version four, for example? And it will actually highlight things that have changed between the two versions.
All right, so those were the different places and the different ways that the OSF deals with version control, both allowing you to connect in version control systems that you may already use and automatically version controlling files, so you don't have to have a bunch of files with different names trying to track those versions.
We have a couple of minutes, so if any of you have questions about anything I went over, we do have time to answer them inside the webinar, or, as I mentioned, you can always email us with questions after the fact at contact@cos.io.
All right, so it looks like nobody has any more questions. Thank you so much for attending the webinar this afternoon, and have a good day.
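As a footnote to the wiki comparison shown in the webinar: the idea of a diff between two versions can be tried out with Python's standard `difflib` module. This is only a sketch of what comparing two versions means, not the OSF's own comparison code. The new hypothesis wording comes from the demo; the older wiki text and the version labels are assumptions made up for the example.

```python
import difflib

# Hypothetical earlier wiki text (the webinar does not show the old wording).
version_4 = [
    "Hypothesis",
    "Men will be more conservative than women.",
]

# The edited wiki text, using the changed hypothesis from the demo.
current = [
    "Hypothesis",
    "Women will be more conservative than men.",
]

# unified_diff marks removed lines with "-" and added lines with "+".
diff = list(
    difflib.unified_diff(
        version_4,
        current,
        fromfile="wiki version 4",
        tofile="wiki current version",
        lineterm="",
    )
)

for line in diff:
    print(line)
```

Lines that are the same in both versions are printed with a leading space, so the changed hypothesis stands out in the same way the highlighted text does in the wiki's compare view.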