Welcome back, whether you're watching this on YouTube or you're here on Twitch. Thanks for still being here. Today I'm going to teach you guys how to make an R package.

Making an R package is useful because R packages are, more or less, the thing that you sell as a bioinformatician. It's the software that you write: you bundle it up and then give it to other people so that they can use the fancy new method that you developed. R packages provide encapsulation, example and testing data, documentation, examples, and tests.

During this roughly 25-minute lecture we are going to create a new R package called yourpackagename. The nice thing is that it's all about structure. There's a 60-to-70-page document on how you have to write R packages, and I hope that by just watching this video you don't have to read those pages, because I kind of summarize them for you so that you can do it quickly.

To create a new R package, you first need to install two things. You need the R compiler, called Rtools, which you can download from here; the link will also be down in the description. Make sure that it matches the version of R that you have installed: if you have R 3.0 installed, you need Rtools 3.1, and if you have R 3.2 or later, you need Rtools 3.3. Every version of R has its own compiler. Besides that, you also need MiKTeX. MiKTeX is needed to create the PDFs, which are the things that have the documentation, the examples, and these kinds of things in them. You can get it from miktex.org; just go and download it.

After you've installed these two tools, create a new directory on your desktop, or wherever you want to store it, of course. This directory will hold your entire package. The structure of the folders and files needs to match the official guidelines. If you click this link and go to
the guidelines, you can see that it's a 60-to-70-page document that you would have to read, but we're not going to do that. We're just going to create the package.

To create an R package you have to run R from the command line. To start the command line in Windows, you run cmd.exe: press your Windows key, type cmd, press Enter, and it will open up the terminal for you. It will look like this if you're on Windows 7 or Windows 10. If you're on Linux, you open a terminal slightly differently, but if you're using Linux you already know how to do that, right? So that should not be an issue. Of course it's always a bit complex, because a lot of people don't use the command line or don't use bash, but it's a good thing to learn.

First things first: we created an empty folder, and this empty folder is called yourpackagename. That's just the name of my package; you will probably want to use a different name. In Windows, once we open the command line, we first have to go to the desktop. That means change directory: cd Desktop/. That just moves you from where you are to the desktop folder. Then you execute this command: R CMD in capitals, check in small letters, and then the name of the package. What will happen is that R tells you that this is not an official package. It says okay, I'm going to check it, blah blah blah, and then, while checking for the file, it sees that the file is missing: there's no DESCRIPTION file. The official guidelines say you have to have a DESCRIPTION file, so let's create one.
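A minimal sketch in R of this first step, under a few assumptions: the package name is an assumed spelling of the name used in the lecture, a temporary directory stands in for the desktop, and check_package is a hypothetical helper that is not part of R. It only wraps the same R CMD check call that you would otherwise type on the command line.

```r
# Sketch only: "yourpackagename" is an assumed spelling, and tempdir()
# stands in for the desktop folder used in the lecture.
pkg_name <- "yourpackagename"
pkg_dir  <- file.path(tempdir(), pkg_name)

# Create the empty package folder.
dir.create(pkg_dir, showWarnings = FALSE)

# An empty folder has no DESCRIPTION file yet, which is the file that
# R CMD check reports as missing.
has_description <- file.exists(file.path(pkg_dir, "DESCRIPTION"))
print(has_description)

# Hypothetical helper: runs "R CMD check <folder>" with the R binary
# that belongs to the current R session.
check_package <- function(dir) {
  r_bin <- file.path(R.home("bin"), "R")
  system2(r_bin, c("CMD", "check", shQuote(dir)))
}
```

Calling check_package(pkg_dir) at this point should stop early and complain about the missing DESCRIPTION file, which matches what the lecture shows.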
So we're going to create a new file called DESCRIPTION, in all capitals. It's not allowed to have .txt at the end, so when you're on Windows make sure that Windows shows file extensions and that the file you create does not have a .txt extension. Inside of the file we're going to add the following: Package, then a colon, then the name of the package; then the version, the date, the title, the author, the maintainer, and the depends. In our case we're only going to have R code in there, so you need R version 3 or later. Then a description, which is a description of your package, and then a license, which you can choose from a list of licenses. Be aware that this has to be an open-source license if you want to publish your code on CRAN.

Next step: we have now added the DESCRIPTION file, so we check the package again. We execute the same command as before, R CMD check yourpackagename, and we get a note: packages without R code can be installed without a NAMESPACE file, but it is cleaner to add an empty one. So it did make a package; we already have an R package, it's just that it doesn't have any R code. And we got a note, which is not really a warning, but it's still annoying, so let's fix it.

We add this NAMESPACE file. Again it has no .txt at the end, and again it's written in all capitals. Why is this file there? It is there to load external dynamic libraries and to export functions to the user. For example, I might need to use an SQL database and connect to sql.dll or something like that. But also, for every R function that I write, I need to specify here that I want to give this function to the user. So we do the same as before and create this empty file. Inside of the file:
We're just going to say that we don't use any dynamically linked libraries: no DLLs, no C++ code or whatever. I'm just going to export a function, which is called myfirstpackagefunction. I have now promised that the package I'm making provides an R function called myfirstpackagefunction, so I have to make it.

I now have to create an R folder inside of the yourpackagename folder. This R folder will hold all the R code files, and all code in these files should be functions, because you are not allowed to give the user anything else. You can only give the user functions that they can call and work with. I normally follow the strategy of one file, one function.

After this, my package looks like this: I have my NAMESPACE file, my DESCRIPTION file, and my R folder. And of course I have to make a code file to provide this function. The way I do it is that I open up a new file and write myfirstpackagefunction in comments, and then when it was first written and when it was modified. Always add a header to your files. Then I'm just going to make a function which does nothing: I say myfirstpackagefunction, assign to it a function with no parameters, and inside of the function:
There's nothing. This thing just does nothing, but that's the minimal function that I can write. Of course I then have to save this file as myfirstpackagefunction.R into the R folder. I always make sure that the file name is the same as the function name, just so that I can find it back easily.

Again, recheck your package. We execute the command again, R CMD check yourpackagename, and we get a warning. The warning says: undocumented code objects; all user-level objects in a package should have documentation entries. So R forces you to write documentation, which is really, really good, because a lot of programming languages do not force you to, meaning that these programming languages often end up without proper documentation. So let's add some documentation.

Documentation goes into a folder called man. For some strange reason this one doesn't need to be capitalized; it needs to be lowercase. This folder will hold all of the manual files that you create. Again the same structure: one file documents one function. So inside the man folder I create myfirstpackagefunction.Rd, an R documentation file. Now my folder looks like this: NAMESPACE, DESCRIPTION, the R folder with myfirstpackagefunction.R in there, and then the manual folder called man, where the documentation is.

I just copy-paste this in. This is just a skeleton, an empty documentation object. It's written in a LaTeX kind of language, so if you're interested in that, then learn LaTeX. What it does is give a name, an alias, a title, and a description, then how to use it, arguments, details, value, and examples. There has to be an example. And don't forget that at the end you have to say \keyword{methods}, because this is the description of a function.

Again, recheck your package, and done. We've created an R package. We have made code.
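The four files described so far can be sketched by writing them from R, which keeps the example self-contained. This is only an illustration: the package and function names are assumed spellings of the spoken names, the version, date, title, author, maintainer, and license values are placeholders, the package is written to a temporary directory, and the \arguments section is left out because the function takes no parameters. The doubled backslashes are R string escapes; the files on disk contain single backslashes.

```r
# Sketch only: names are assumed spellings, field values are placeholders.
pkg_dir <- file.path(tempdir(), "yourpackagename")
dir.create(file.path(pkg_dir, "R"),   recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg_dir, "man"), recursive = TRUE, showWarnings = FALSE)

# DESCRIPTION: one "Field: value" entry per line, no file extension.
writeLines(c(
  "Package: yourpackagename",
  "Version: 0.1.0",
  "Date: 2020-01-01",
  "Title: A Minimal Example Package",
  "Author: Your Name",
  "Maintainer: Your Name <you@example.com>",
  "Depends: R (>= 3.0.0)",
  "Description: A minimal package that exports a single function which does nothing.",
  "License: GPL-3"
), file.path(pkg_dir, "DESCRIPTION"))

# NAMESPACE: no dynamic libraries, just export the one promised function.
writeLines("export(myfirstpackagefunction)", file.path(pkg_dir, "NAMESPACE"))

# R/myfirstpackagefunction.R: a header comment, then a function with no
# parameters and an empty body.
writeLines(c(
  "# myfirstpackagefunction",
  "# first written: (fill in a date)",
  "myfirstpackagefunction <- function() {",
  "}"
), file.path(pkg_dir, "R", "myfirstpackagefunction.R"))

# man/myfirstpackagefunction.Rd: the documentation skeleton.
writeLines(c(
  "\\name{myfirstpackagefunction}",
  "\\alias{myfirstpackagefunction}",
  "\\title{My first package function}",
  "\\description{A function that does nothing.}",
  "\\usage{myfirstpackagefunction()}",
  "\\details{This function takes no arguments and has an empty body.}",
  "\\value{\\code{NULL}.}",
  "\\examples{",
  "myfirstpackagefunction()",
  "}",
  "\\keyword{methods}"
), file.path(pkg_dir, "man", "myfirstpackagefunction.Rd"))

# Show the resulting folder structure.
print(list.files(pkg_dir, recursive = TRUE))
```

Writing the files by hand in a text editor, as in the lecture, gives the same result; the point here is only the content and location of each file.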
This package can be submitted to CRAN. We go to the website, we search for how to submit a package to CRAN, and you're done. It works. So if you want to give someone a single function, this is all that you have to do. Step one: learn how to build an R package. Step two: don't know exactly. But step three is going to be profit.

Of course, because we have now made this package, we want to make sure that it installs correctly. So we also run the command to install our newly made package into R: R CMD INSTALL yourpackagename. It will show you that it's installing it to the user's folder, blah blah blah, and then it's done. Since it's a very simple package, there are no errors here. Of course, make sure that you check it: start up R after installing your package, use the library function to load your package, execute your function, and also type ?myfirstpackagefunction to look at the help file that has been created. The help file here is more or less empty, but it does have a little example.

So that's your first R package. It's that simple. So why are there 60 to 70 pages of documentation on how to write R packages? That is because you can add a lot more to your package. For example, you can add data. Imagine that I'm writing a new function which does a Mendelian randomization analysis. Then of course to test this algorithm, to test this function, I need some data to test it on. In our case we don't have any data, and our function doesn't do anything. But if you wanted to add data, you have to create a folder called data; again, it's all lowercase. For example, I open up R and just make a random data matrix: a matrix with a thousand random numbers in it, a hundred rows and ten columns. I assign it to a variable and then I save this variable to a file in the data folder called randommatrix.RData, right?
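As a rough sketch of the data step: the variable and file names below are assumed spellings, the package folder again lives in a temporary directory, and drawing the numbers with rnorm and fixing the seed are assumptions, since the lecture only says "random numbers".

```r
# Sketch only: names are assumed spellings; rnorm() and the seed are assumptions.
pkg_dir  <- file.path(tempdir(), "yourpackagename")
data_dir <- file.path(pkg_dir, "data")
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

# A matrix with 1000 random numbers: 100 rows, 10 columns.
set.seed(1)
randommatrix <- matrix(rnorm(1000), nrow = 100, ncol = 10)

# Save the variable into the package's data folder.
data_file <- file.path(data_dir, "randommatrix.RData")
save(randommatrix, file = data_file)

# Load it back into a separate environment to confirm what was stored.
env <- new.env()
load(data_file, envir = env)
print(dim(env$randommatrix))
```

The object name inside the file is the variable name that was passed to save, so that is the name users get when they load the data.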
RData is the extension for data in R. Remember that data also needs to be documented. So besides making this data file, I also have to add a randommatrix.Rd file, a data description file, to the manual folder. What does the data description look like? Well, it looks very similar: name, alias. In this type of file I have to give it a doc type, so I say \docType{data}, and the keyword at the end is now datasets. But for the rest it has the exact same structure and the exact same entries. And of course these are just the entries that go into the help file.

Of course we want to provide testing as well. We now have our package, our data, and our function, but we need to make sure that the function does what we expect it to do. Besides the examples in the manual files, we can have more tests. The examples are actually run when you check your package: R CMD check looks at all of the examples and executes all of them. So that's already a first layer of protection that you get from R. But we can also explicitly add more testing. To do that, we create a folder called tests, and all of the R files that you put there will be executed automatically. So when you run R CMD check, but also when you... no, not when you run R CMD INSTALL. When you run R CMD check, it will run all of the different files in there.

I generally follow a system where I have test_001.R, then another file called test_002.R, and so on. These are just tests: they give some input to the function that I provide to the user and then test if the output is what I expect it to be. Here is a very basic test, which is a very poor test in this case, because this is a test that just randomly fails, right?
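The two files from this part, the data description and a first test file, might look roughly like the sketch below. The file and object names are assumed spellings, the wording inside the Rd file is placeholder text, and the test is a deliberately poor one that stops whenever a random draw exceeds 0.8. The files are only written here, not executed.

```r
# Sketch only: names are assumed spellings, texts are placeholders.
pkg_dir <- file.path(tempdir(), "yourpackagename")
dir.create(file.path(pkg_dir, "man"),   recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg_dir, "tests"), recursive = TRUE, showWarnings = FALSE)

# man/randommatrix.Rd: same structure as a function's Rd file, plus
# \docType{data}, and with \keyword{datasets} at the end.
rd_file <- file.path(pkg_dir, "man", "randommatrix.Rd")
writeLines(c(
  "\\name{randommatrix}",
  "\\alias{randommatrix}",
  "\\docType{data}",
  "\\title{Random example matrix}",
  "\\description{A matrix with 100 rows and 10 columns of random numbers.}",
  "\\usage{data(randommatrix)}",
  "\\details{Only meant as example and testing data.}",
  "\\examples{",
  "data(randommatrix)",
  "dim(randommatrix)",
  "}",
  "\\keyword{datasets}"
), rd_file)

# tests/test_001.R: a header, then a test that fails at random.
test_file <- file.path(pkg_dir, "tests", "test_001.R")
writeLines(c(
  "# test_001.R",
  "# A poor test: it fails whenever the random draw is above 0.8.",
  "if (runif(1) > 0.8) {",
  "  stop('This test failed at random')",
  "}"
), test_file)
```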
Again, every file that I create has a header. But here I'm just saying: draw a random number, and if the random number is larger than 0.8, then stop. So 20% of the time, checking the package will now give me an error. It's not a good test, because it fails randomly, but hey, this is how you build tests. You build a test by just saying: if executing my function leads to a value which is not equal to what I expect, stop and report that the test was not successful.

Now some common mistakes when you make an R package. When you install a package into R, make sure that R is not running. Since you're installing it via the command line, you could still have an R window open, and especially if you're working with multiple monitors you might not notice that R is running. If R is running, it cannot update the package code. Always check your package before installing: run R CMD check first and then run R CMD INSTALL. And make sure that you add enough testing. Use the documentation for quick and very simple tests, and use the tests directory for more thorough tests. If you want to run something like a thousand times, that should not go into the documentation; it should go into the tests folder. In the documentation you just show: well, when I do this, then the output is 15. In the tests folder you test 10 or 50 or 100 different possibilities, so different inputs and their expected outputs.

All right, that was it. Thank you guys for being at the lecture.
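A deterministic test in the same spirit could be sketched as follows. Several things here are assumptions: check_result is a hypothetical helper, the function is defined locally as a stand-in (a real file in the tests folder would instead start by loading the package with library), and base R's sum is used only to illustrate a table of inputs and expected outputs, because the example package's own function does nothing.

```r
# Stand-in for the packaged function; in tests/test_002.R you would load
# the installed package with library() instead of defining it here.
myfirstpackagefunction <- function() {
}

# Hypothetical helper: stop with a message when the result is not what
# we expect, otherwise return TRUE invisibly.
check_result <- function(actual, expected, label) {
  if (!identical(actual, expected)) {
    stop("Test failed: ", label)
  }
  invisible(TRUE)
}

# A function with an empty body returns NULL.
check_result(myfirstpackagefunction(), NULL, "myfirstpackagefunction returns NULL")

# Several inputs with their expected outputs, using sum() as a stand-in
# for a function that actually computes something.
cases <- list(
  list(input = c(1, 2, 3), expected = 6),
  list(input = c(10, 5),   expected = 15),
  list(input = numeric(0), expected = 0)
)
for (case in cases) {
  check_result(sum(case$input), case$expected,
               paste("sum of", toString(case$input)))
}
```

Because the script stops on the first mismatch, R CMD check would report an error as soon as any of the cases fails, and it passes silently otherwise.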