Hey folks, welcome back for another episode of Code Club. I'm really excited about this next series of videos that I want to work through with you all. I am a microbial ecologist, and I'm really interested in bioinformatics, which is why I make these videos. I know there are a lot of people out there who maybe don't care at all about bioinformatics or microbiology. Please, please, please stay with us through this series, because I think you'll learn a lot even though the application is bioinformatics. What I intend to do is create an R package, and hopefully host it on a website, that will allow us to replace a much cherished and beloved website that went down in the last year. What is that website? Well, that's the Ribosomal Database Project. They had a classifier that I know a lot of undergraduates, and dare I say faculty at certain large universities, might use to take a sequence and classify it using an algorithm they published a long time ago called the naive Bayesian classifier, which was used to classify 16S rRNA gene sequences. Even though it has this kind of scary name, trust me, it's not that complicated. But it's really powerful. A lot of people over the years have tried to improve upon it and failed, which is kind of surprising because it's kind of a weird algorithm, as we'll get into over the course of the videos. The problem, however, is that I don't know anything about building packages. I've never built a package myself. There are people in my lab who are starting to make packages, and so I feel this compulsion to learn how to make them. Also, I've been using and writing R long enough that I kind of feel like I should have a couple of packages in my list of contributions back to the R community. So that's what I want to do with you all. But I need to learn how to do that. What I'm going to do is teach you, and as I teach you, hopefully I will learn.
There is a great resource that Hadley Wickham and Jenny Bryan wrote called R Packages. You can get it from Amazon; I've got a link down below in the show notes where you can go and get your own copy. There's also a website that has the same content, maybe even a slightly more recently refreshed version of the book, that you can access for free. I'm going to use that as my tutorial to learn how to make a package in R. What I'm going to do in today's episode isn't to start building this classifier; I want to start by building a package. If you look at chapter one of the R Packages book, you'll see that it's called The Whole Game. In there, they step you through creating what you might think of as a toy package, a very simple package that lets us see a lot of the functionality of the tools that are available for building packages in R. Building packages in R, I know, is fraught with challenges. I've seen numerous horror stories of people trying to submit a package to CRAN and getting nastygrams back because they screwed something up. Well, Hadley Wickham and his crack band of developers have created a lot of great tooling around the workflow of producing and maintaining packages, and that's what we're going to use. Like I said, I really encourage you to read that first chapter of the book as well as the second chapter. They're pretty short. I'm going to go through that first chapter with you with my own color commentary. I am here in RStudio. You'll see that I'm using a relatively recent version of R, version 4.3.3, which came out at the end of February, on Leap Day 2024. We're going to be using a package called devtools. I'll come over here to the Packages tab and see if I've got it installed already. Sure enough, I've got devtools here.
It's shortening the name a bit, but that's devtools, right? If you don't have it installed, you would go to Install, type devtools, and then click Install, and that will install it for you. I think I already have the most recent version, which looks like it's version 2.4.5. To load devtools into my active session, I'll go ahead and do library(devtools). The toy package that we're going to create is called regexcite. Regex is a commonly used abbreviation for regular expression, so regexcite will be the name of this toy package. There are a lot of other regular expression tools already in R. There's stuff in base R, there's stringi, there's stringr, and there are a few others. I use stringr from the tidyverse a lot. We're actually going to be creating a function in our regexcite package that already exists in stringr. It's going to take a string of characters that are separated by, say, a comma or a period or something like that, and return the pieces of that string as a vector. So you can imagine having "a,b,c"; the return would be a vector with the values a, b, and c. Okay, so pretty straightforward. Yes, the functionality is already baked into a lot of tools in R, but it'll be good for demonstration purposes. To get going, I'm going to use dir.create(). You could do this in your Finder, but I'm already here in R, so I'll use this. I give it the directory path that I want, which is going to be off my home directory: ~/Desktop/regexcite. This is going to work on a Mac or Unix computer. The alternative, if you're using Windows or you don't want to enter the command at the console, is to use the file browser that's built into RStudio. You can go to Files, navigate to Desktop or wherever you want it to go, click the button to create a new folder, and then type regexcite.
Okay, but of course that already exists, so it's going to complain at me. We now have that directory created, and now we want to create the package. We'll do create_package() and give it the same path that we used for the directory. The name of the directory is going to be the name of the package, so it's really important that you name your directory what you want the package name to be. Again, regexcite works. We'll go ahead and put in the path ending in regexcite. This then does a whole bunch of stuff, and RStudio and R have restarted. One of the things it does is create an RStudio project, and you may know from your previous experience with RStudio projects that when you create a new project, RStudio relaunches. We see that we have now launched into the working directory of regexcite, which again is off my Desktop. If you look at the top left corner of my console, you'll see I'm in ~/Desktop/regexcite, and you'll see the contents of that directory. It leaves the previous session of RStudio open, and you'll see that there my home directory was my working directory, so you'll see a change there. You'll also see that when we ran create_package(), it changed the active project directory, created an R directory, and created a DESCRIPTION file, which we'll see contains all this information. Then it tells us some of the other things that it did. I'm going to go ahead and close this RStudio window and switch to the other one. So again, we have these files and directories that RStudio created. We can look at the DESCRIPTION file, which has all sorts of great stuff in it, although it's pretty generic at this point. As we go through the rest of the episode, we'll fill that out. There's a .gitignore file. Yes, we will be using Git and GitHub to keep track of our project, and this file tells Git which things to ignore.
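A minimal sketch of the two console commands from this step, assuming the same ~/Desktop/regexcite location used in the episode (swap in whatever path you prefer; the last path component becomes the package name):

```r
library(devtools)  # also attaches usethis, which provides create_package()

# Location used in the episode; adjust for your own machine.
pkg_path <- "~/Desktop/regexcite"

# Optional: create_package() will also create the directory if it is missing.
# dir.create() only warns (and returns FALSE) if the directory already exists.
dir.create(pkg_path)

# Sets up DESCRIPTION, NAMESPACE, R/, .Rbuildignore, .gitignore, and an
# RStudio project file, then opens the new project in RStudio.
create_package(pkg_path)
```

Because this opens a fresh RStudio session, anything attached with library() in the old session has to be loaded again in the new one.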
Let's see, there's also a NAMESPACE file. This document is read-only; we should not be typing in this file at all. It says it's generated by roxygen2, which we'll talk about later when we talk about documenting our code. What else do we have? We have this R directory, which is currently empty. That's where our R code will go. There's also an .Rbuildignore file. Ultimately, what happens is that we build this package, and then we want to bundle all of the data, code, and documentation together to submit to CRAN so that it's available for everyone to use. We're not going to do the CRAN submission today, but that's the idea. The .Rbuildignore file tells that packaging process which files to ignore. It's going to ignore my .Rproj file, because that's not part of the package; it's a tool that I'm using for developing the package. Same with .Rproj.user. All right. And then there is this .Rproj file, which stores all of the different settings that I'm going to use for my project. I'm going to leave everything the way it is. Great. I'll go ahead and close these files. The first thing that I want to do is set up Git. One of my favorite things about starting a new project is git init, starting a Git repository for my project. The usethis package, which comes with devtools, has a use_git() function that we can use to set up Git in our project. So I'll go ahead and run use_git(). And of course it's complaining that it couldn't find the function use_git, because I forgot to load devtools. We'll do library(devtools), and then I can up-arrow here in RStudio to run use_git() again. It then gives me this dialogue. It tells us what's uncommitted, kind of like git status, and then it asks, is it okay to commit them?
What I really love about devtools and these dialogues is that they generally give you two no options and one yes option, but they put them in a different order every time, so you have to be mindful of what you're doing. You can't just dumbly go through clicking yes, yes, yes over and over. It really forces you to think about it. I'm going to go ahead and enter 1 for yes. So then: adding files, making a commit with the message "Initial commit". A restart of RStudio is required. Are you ready to restart now? Absolutely. That's number 2 here. See what I mean? They change the order and the language. Again, this is going to restart R and RStudio. I'm back in a new session of R, so I need to do library(devtools). Running library(devtools) every time I go into R might get a little bit tedious, so one of the suggestions they make in chapter two is to use an .Rprofile file. .Rprofile files are run as R is starting, so what we could do is create this file and put library(devtools) as a line in it. Generally, we don't want to do that when we're doing data analysis, because it's not immediately clear to the user what tools you have loaded. But if we're developing a package, then it's pretty clear that we're going to have devtools installed anyway, so we'll be in good shape. I'll go ahead and create a new file. I'm going to make this a text file and call it .Rprofile. Click OK. Then I'll put library(devtools) in here and save that. Notice I did this as a text file rather than as an R script. The main difference is that as an R script, it's going to put .R at the end of the file name. So we should be good there. We now see that we have that .Rprofile file, so the next time we start R in RStudio we shouldn't have to worry about running library(devtools) again.
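For reference, the project-level .Rprofile created here holds a single line. The sketch below shows that line plus, commented out, a more defensive variant guarded by interactive(); the variant is an assumption on my part, not something typed in the episode.

```r
# .Rprofile (in the top-level directory of the regexcite project)
# R runs this file at startup, so devtools is attached in every new session.
library(devtools)

# Hypothetical alternative (not from the episode): only attach devtools in
# interactive sessions and keep the startup messages quiet.
# if (interactive()) {
#   suppressMessages(require(devtools))
# }
```

Keeping this file inside the project, rather than in the home directory, limits the automatic loading to package development work.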
You'll recall that before that tangent of making the .Rprofile file, we used Git to do an initial commit. Normally there's a Git tab up here, but I don't see it. Let me see. Yeah, it's been hidden because I've got everything zoomed in so far. I can go ahead and click on this clock, and this will show me the commits I've made. The very first commit was my initial commit, and we see that there. Since then, I have added .Rprofile, so I'm going to commit this by clicking the Staged checkbox. Again, because I have things zoomed in, it looks a bit weird. Here in my message, I'll say "Automatically load devtools at start of R session". I can't see the left side of my screen here, which is really annoying, so I'm just going to have to trust that I've got it right. I'll go ahead and commit that. It looks like I spelled it right, so I'll close this. Now everything is good in my Git window. Because I have that tab missing off the right side of the screen, I'm going to modify the appearance of my window. I can do that in the main settings, the Global Options. If I go to Pane Layout, let's see, VCS is version control. I want to keep Build and VCS. I don't care about Tutorial or Connections or History, so I'm going to work with just those three tabs. If I click Apply and OK, I now see that I've got those three tabs, so that'll make my life a lot better. All right, let's write some R code. I'm going to open a new R script. Again, what I want to do is create a function that will take a character string that's got some type of delimiter in it and split the values in that string apart by the delimiter. I'm going to create a test string of values, so I'll do alpha, beta, Charlie, and delta.
I'm writing these without spaces. Again, the four Greek letters, the four call signs, whatever they are, with commas in between them. I can run that by having my cursor on the line and pressing Command-Enter, and if you type x, you'll see it. What I want as output is a vector where alpha, beta, Charlie, and delta are four separate values. To do that, I can use strsplit(). This is a base R function. I give it x, that character string, and then I tell it to split on the comma. The output is more or less what I hoped it would be. One thing you will notice, though, is that instead of starting with [1], like [1] alpha, beta, Charlie, delta, kind of like we had up here, we have the double square bracket notation, [[1]]. That's because the output of this is a list. I can pass this to str() to see the structure, and it says it's a list of 1, right? There are a couple of ways around this. One would be to call unlist() on this, and that will make it not a list. The problem with unlist(), however, shows up if I actually had two values in my x vector. Say I had this string, alpha, beta, Charlie, delta, and then I had a, b, c, d as a separate value. If I did unlist(), it would give me a vector that's eight things long rather than four things long. It turns the list into a vector by concatenating all the vectors together, so that's not so great. The alternative that we'll actually use is [[1]]. This returns the first element of the list. So if there were two values in x, this would only return the result for the first value. So this is the body of my function. I'm going to wrap this in a function called strsplit1, and it's going to take x and split as arguments. Then we'll put the body in curly braces. I have other episodes where I've talked a fair bit about writing functions, so I'll save that discussion for there.
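The difference between unlist() and [[1]] is easier to see when run. This is a small sketch using base R only; the exact test string (all lowercase, no spaces) and the variable names other than x and strsplit1 are assumptions for illustration.

```r
x <- "alpha,beta,charlie,delta"

# strsplit() returns a list with one element per input string.
pieces <- strsplit(x, split = ",")
str(pieces)

# With two input strings, unlist() flattens everything into one long vector...
y <- c(x, "a,b,c,d")
flattened <- unlist(strsplit(y, split = ","))
length(flattened)

# ...whereas [[1]] keeps only the pieces of the first string.
first_only <- strsplit(y, split = ",")[[1]]
first_only

# Wrapping the [[1]] approach in a function, as in the episode.
strsplit1 <- function(x, split) {
  strsplit(x, split = split)[[1]]
}

strsplit1(x, split = ",")
```

Using [[1]] quietly drops any input beyond the first string, which is why the later refactoring adds an explicit check on the length of the input.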
I can go ahead and replace this comma with the split argument. So now I've got my function written. Normally what I might do is load this into my R session, so I'll go ahead and do that. It's there now: if I do strsplit1() on x with split equal to a comma, I get back those four values. I don't want to do it this way, though, because I want to create a package. Over here in the Environment tab, you'll see that I've got my function loaded. I can clean out my environment so that it's gone now, right? And of course, x is also gone, so I want to go ahead and reload x. Now I can't rerun strsplit1(). I want to create an R script that will have my strsplit1 function in it. To do that, I'm going to use another use function called use_r(), and I'm going to give it the name strsplit1. This creates a file called strsplit1.R that lives in my R directory. Now I'm going to copy the function into strsplit1.R and save that. I'll go ahead and close out Untitled1. Again, I don't have strsplit1 loaded in my environment. To make it available in my console, I will use load_all(). load_all() loads everything that's in my R directory into my R session. Now I can do strsplit1() on x with split equal to a comma, and it does exactly what I'd hoped it would do. Okay, so load_all() will be really helpful as we're creating R scripts, editing them, and then wanting to load them into our working session so we can work with them as we go through our development process. I'm going to do another iteration of committing things. I'll click that Staged checkbox, and for the message I will say "Create strsplit1 function" and commit. That's been done, so we'll close that out, and Git is up to date. Now what I'd like to do is check my progress and see how I'm doing at building my package.
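As a rough sketch, the edit-and-reload loop from this step looks like the following at the console. It assumes the working directory is the root of the regexcite package and that the strsplit1() definition has been pasted into the file that use_r() opens.

```r
library(devtools)

# Create (or open) R/strsplit1.R; paste the function definition into it.
use_r("strsplit1")

# After saving the file, load_all() makes everything defined under R/
# available in the current session, much as if the package were installed.
load_all()

x <- "alpha,beta,charlie,delta"
strsplit1(x, split = ",")
```

Rerunning load_all() after each edit keeps the session in sync with the files, without leaving stray copies of the function in the global environment.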
From the console here, I can run check() as a function with no arguments. It's asking me about installing the build tools, so I'll say yes. I'll go ahead and run the package build command that it suggests. This is complaining again. It's sending me through a loop, and I don't know what's going on. I do have my Xcode tools installed, so that shouldn't be a problem. I'm going to move on for now and worry about this later. The check() function makes sure that the package is all together and in good shape. For our next episode, I'll make sure that this problem doesn't persist; I'll do some digging to figure out what's going on. Let's go ahead and look at some of the metadata in this DESCRIPTION file. It was created as a template, and we can modify some things. I'll modify the title, so I'll put exciting package for working with strings. Then for the authors, I can put in my name, so I'll put Pat Schloss, and for my email address, I'll use pdschloss@gmail.com. I'll leave the rest of that as it is. I do have an ORCID iD, but if you don't have one, you can go ahead and remove that field. I'm going to leave it for now because maybe I'll come back and change this in the future. Then the description is a paragraph on what the package does: string manipulation package that goes above and beyond what other packages do in an exciting way. Right? Because we're so excited about regexcite. All right. I'm going to hold off on modifying the license by hand. I'll go ahead and save my DESCRIPTION, and then back in my console, I'm going to use an MIT license, so we'll run use_mit_license(). We see that it creates a variety of other things, and now the License line in my DESCRIPTION says MIT + file LICENSE.
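The two console calls from this step, as a short sketch. It assumes the working directory is the package root; check() may prompt about build tools, as it did in the episode.

```r
library(devtools)

# Run R CMD check on the package in the working directory. In the episode this
# step stalled on a build-tools prompt, so some troubleshooting may be needed.
check()

# Add an MIT license: writes LICENSE and LICENSE.md and sets the License field
# in DESCRIPTION to "MIT + file LICENSE". With no argument, the copyright
# holder defaults to "<package name> authors".
use_mit_license()
```

Letting use_mit_license() edit the License field avoids typos in a line that check() validates strictly.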
Over here in my Files tab, I have LICENSE, which is the license file that will go into the package, and LICENSE.md, which is the version that is readable on GitHub once we get this posted. This all looks good. I could imagine modifying "regexcite authors" to be me, but there might be other authors that come along, and "regexcite authors" would include everybody who's contributing to the project. Okay, so I'll close these and then do a commit, and I'll say "Modify DESCRIPTION and add license". Okay, commit that. Now I'd like to add some documentation to my R function. You know that if you type a question mark and then read_csv, you'll get a help page for that function, or if you go over to the Help tab, you'll find the help documentation there as well. We want to provide that for our code, and this is where roxygen2 comes in handy. roxygen2 is a really handy way to comment our code that R can then process to make the documentation. We will leverage RStudio here by going up to the Code menu and coming down to Insert Roxygen Skeleton. There's of course a keyboard shortcut that you can use to do that. This inserts a template. In here I can put a title, which is going to be "Split a string". Then to the right of my two parameters, I can put a description. For x, I'll say "A character vector with one element", and for split, "What to split on". Great. Then for the examples, I can put some R code in here. I can assign x to be alpha, beta, Charlie, delta, and then as an example call strsplit1() on x with split being the comma. So that's my example. Now I can save that, and then back in my console, I'll run document(). It's complaining, saying @return requires a value, in strsplit1.R at line six. The return here should be a character vector.
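Put together, the documented function file might look roughly like this. The wording of the title, parameter descriptions, and return value follows what is said above; the exact layout of the skeleton and the lowercase example string are assumptions.

```r
#' Split a string
#'
#' @param x A character vector with one element.
#' @param split What to split on.
#'
#' @return A character vector.
#' @export
#'
#' @examples
#' x <- "alpha,beta,charlie,delta"
#' strsplit1(x, split = ",")
strsplit1 <- function(x, split) {
  strsplit(x, split = split)[[1]]
}
```

The lines starting with #' are ordinary comments as far as R is concerned; it is document() that reads them and writes the help file under man/.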
Okay, I'll save that, and now we'll try document() again. It goes through without any errors and updates the regexcite documentation. Now if I do ?strsplit1, I find over in my Help tab that I get a preview of what the help will look like. That's pretty awesome, huh? So that's where the help comes from, if you've ever wondered. We will have to write the help and the documentation for each of our functions. One other thing that I see in Git has changed is the NAMESPACE file. If I go to Files and open NAMESPACE, I see it now has export(strsplit1) in it. Again, I don't want to modify the NAMESPACE file by hand. I notice that in strsplit1.R, the documentation above the function has the @export tag, and that tells R to export this function so that other people can use it. That's great. I'll stage these changes, say "Add documentation for strsplit1", commit, and close. The book then says to do another check(). I think this is going to give us the same error that it had before. Yeah. I'll say try it again, but it's still complaining, so we'll just move on from that. The next thing to do is install(). This installs my regexcite package. I bet if I come over to Packages and search for regexcite, it is loaded into my session. It actually looks like it's already been attached. Normally I would then do library(regexcite), which attaches it so that I can use strsplit1() as we've already done: I can call strsplit1() with x and the comma as the separator. It's starting to look like a package we might use in a normal R session, albeit a very limited package at this point.
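A brief sketch of the install-and-use step, assuming the working directory is still the package root and that document() has already written export(strsplit1) to NAMESPACE:

```r
library(devtools)

# Build the package and install it into the default library.
install()

# Attach the installed package like any other and call the exported function.
library(regexcite)
strsplit1("a,b,c", split = ",")
```

Unlike load_all(), which works from the source files, this exercises the package the way an ordinary user would after installing it.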
As we continue to develop more functions and more complicated functions, we're likely going to want to be able to test our code. There's a package in R called testthat that allows you to automate the testing of your functions. This is important because while we might all test our functions by hand, like I've done here, we don't typically automate those tests. The value of automating them is that I might make a change on this side of the code base and not realize that it changes something on that side. A change over here could cause something over there to break. By automating our tests and having a good set of them, we make sure nothing is breaking when we make changes. So we're going to use testthat, and we'll run use_testthat(), which initializes things. One thing to notice over here in Files is that I now have a tests directory. I notice I also have a man directory, which has the documentation for strsplit1. In tests, there is a testthat.R script. This file is part of the standard setup. Don't touch it. What we're interested in is what's in the testthat directory. Our scripts for testing will be stored in testthat. If you're not familiar with the testthat package, it has a lot of really powerful tools that I really encourage you to go look at. We'll be using testthat a lot as we go through the development of our own package. To create a test, I'm going to run use_test() with the name of the R script. Kind of like before, when I did use_r() with strsplit1, now we're going to do use_test() with strsplit1. This creates a file in our tests/testthat directory called test-strsplit1.R. What we'll find is that these files all start with test-, followed by the name of the R script that we're testing. It opens with some sample test code.
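For orientation, here is a self-contained sketch of the setup steps above, the placeholder test that use_test() drops into the new file, and a test of the kind one might write for strsplit1(). The stand-in function definition is only there so the block runs outside the package.

```r
# One-time setup, run from the package root (requires devtools/usethis):
# use_testthat()         # creates tests/testthat/ and tests/testthat.R
# use_test("strsplit1")  # creates tests/testthat/test-strsplit1.R

library(testthat)

# Stand-in definition so the sketch runs on its own; inside the package the
# function would come from R/strsplit1.R via load_all().
strsplit1 <- function(x, split) {
  strsplit(x, split = split)[[1]]
}

# The placeholder test that the new test file opens with.
test_that("multiplication works", {
  expect_equal(2 * 2, 4)
})

# A test for the package's own function.
test_that("strsplit1() splits a string", {
  expect_equal(strsplit1("a,b,c", split = ","), c("a", "b", "c"))
})
```

Within the package, running test() or clicking the Test button in the Build tab executes every file in tests/testthat.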
This gives us a sense of the general syntax for our tests. We'll have test_that(), then a string that is the title of the test, a comma, and then curly braces. Inside the curly braces we'll have some statement like expect_ something. This one is going to expect that 2 * 2 and 4 are equal, so it should pass. If I save this and then go to the Build tab, I can click on the Test button, and this runs through and says everything passes, which is good. Alternatively, I could run test() with no arguments, and everything passes there as well. This isn't the test I want, though. Instead, I'll write test_that() and give it the title "strsplit1() splits a string". Then, between the closing quote and the closing parenthesis, I'm going to put in a comma and curly braces and do expect_equal(). Inside that I'll call strsplit1(), and the argument I'll give it is "a,b,c" with split equal to the comma. I don't really need the split = part, but whatever. What I expect as output is a vector of a, b, and c, so that should pass. If I go ahead and run the tests, sure enough, that passes and we're good to go. Anytime I run test(), this test and all the other tests I write will get executed, and if something fails, it'll let me know. I'm going to commit this change, and we'll say "Add test for strsplit1". Great. I like to do a commit for any kind of meaningful change, but I want the change to be cohesive. We're committing pretty frequently, probably more frequently than I normally would, but I wouldn't want to have a test and the documentation for something totally different in the same commit. I prefer to have those commits handle different sets of changes. Okay, so speaking of change, what I now would like to do is change my code so that instead of being based on strsplit(), the base R version of this function, I instead
want to use the stringr version, which is str_split(). We're going to change the name of the function, and we're going to bring in functionality from another package. Many, I'd say nearly all, R packages are built on other R packages, so we need a mechanism to bring what we want from those other packages into our own. To get going on this, we'll use the use_package() function, and I'll put in stringr. It's adding stringr to the Imports field in DESCRIPTION, and if I look at my DESCRIPTION, under Imports it lists stringr. I also notice, and I didn't see this before, that it added this information about testthat when I ran use_testthat() and use_test() previously. Then it says that if we're going to use a function from stringr, we need to write it this way: stringr:: and then the name of the function. We'll need to be sure to do that. Now I'm going to refactor my code. I will rename the function to str_split_one. This format of the name is more in line with the stringr way of doing it. As the book mentions, this might be a preferred way to name things, and I think it is, but writing it this way also makes it a little clearer that this is based on the stringr package. We'll also change the arguments to be string, then pattern, and then n = Inf, which is how many pieces we want to return. To modify the code, we'll add some type checking to make sure that the input is properly formatted. We'll call stopifnot() with is.character(string), so if it's not a character vector we're going to bail out of here and complain, and then with length(string) <= 1. What this says is that if our string variable isn't a character type, or if it's longer than one element, then it's going to complain. Again, because we're worried about the problem with lists that we
described here, we want to make sure that string only has one element. Cool. The other thing we'll do is, if length(string) is 1, then we're going to do the good stuff. This is the happy path. I will grab this code, because that's going to go in there, and let's get some white space in here. Again, we want the stringr package and then str_split(), so stringr::str_split(), and the arguments are going to be string = string and then pattern = pattern. I notice I misspelled string here; it should have said something. I'll remove that, and then we'll do n = n. All right. If the length of string isn't equal to one, which in this case means it would be zero, we'll go to an else, and there we'll return an empty character vector with character(). It'll be a character vector, but it's going to be empty. If we ran this in the console, it would look like this. Okay, great. The function will return the last value that it computes, which hopefully will be the same thing that we had before. I'm going to save this, and we need to rename the file to reflect the new function name. To do that, there is a handy-dandy helper function in devtools that we can use, rename_files(). We give it the current name, strsplit1, and change that to str_split_one. It's saying that this file that I have open has been deleted; do I want to close the file now? Yes. This other file also doesn't exist anymore, so we'll close that now too. RStudio sometimes does this funky thing where it lists the file name twice in the Files pane. I find that if I hit this refresh button, it updates. If I come down to the R directory, I now see that strsplit1.R has been renamed to str_split_one.R. We also need to update the parameters in the documentation, and the examples. The param x will become string, and split will become pattern. But instead of writing out pattern, instead of kind of
putting in n, what we can do is have roxygen2 import the argument descriptions from the function that we're basing things on. What we can do instead is use @inheritParams, and then we'll put in stringr::str_split. I don't want that space there. Then I need to update my examples to be str_split_one() with x and pattern equal to the comma. Let's add another example in here. I don't want the comment there. We'll do str_split_one() with x, the pattern, and then n = 2. We'll save that, and I think that's all good. We also need to update our test, so we'll go down to tests, testthat, and open the test file. We'll modify this to call str_split_one(). I guess I should update the test name as well. And this argument should then be pattern. That should be good. Let me run this to see that the test works: Build, Test. It's complaining that objects are listed as exports but not present in the namespace: strsplit1. So I try load_all(), and it gives us a warning message: objects listed as exports, but not present in namespace: strsplit1. Back in my NAMESPACE, I see that it is still exporting strsplit1. What I need to do is run document(). Now I see that NAMESPACE exports str_split_one, and I can then run load_all(), and that works. Okay, so we're in good shape to add additional tests. I'll go ahead and type them in, and I will be right back. All right. In the book they had a couple of extra tests: checking that there would be errors from that stopifnot() if the input length was longer than one, and making sure that str_split_one() exposes features from the stringr version, str_split(). We see that with n = 2: if we give it a string with three elements in it, it's going to break it into a vector of length two, so you'll get a and then
b,c. Also, if we do str_split_one(), we should be able to give it patterns. This one is matching a period, and so it's going to split a.b into a and b. We've saved this; we'll go ahead and test it. Running that, I see I had two failed tests and one passed. Not good. I see in my first test, that's on line 2 here, object 'patern', with one t, not found. I think that's actually over here. Yep, right here on line 18 I have one t. So I'll go ahead and save that and then load_all() again, and then I'll redo my test, and now everything passed. Cool. So again, that test is going to be really helpful for debugging our code as well as making sure that our code doesn't break other functions in our package.
So we've made this package, and I would like to get it up onto GitHub now. We will use a function called use_github(). To use GitHub you have to have a GitHub account, so I would encourage you to set that up. I'm going to put this on the riffomonas GitHub account. You should be able to run use_github(), open-close parentheses, without doing anything. I'm going to go ahead and use organisation = "riffomonas". You don't do that; I'm doing that, so you would drop that out and again have use_github() with no argument.
This is giving us a dialog: there are uncommitted changes and we're about to create and push to a new GitHub repository. Do you want to proceed anyway? No. So I'm going to go ahead and say no there, and I'm going to come back to Git, and I have all these changes. We'll go ahead and do this. This was all from our refactoring to use stringr instead of base R, and so we'll go ahead and commit: "Use stringr rather than base R to do splitting." Okay, I'll go ahead and commit, close. That's great. Now I will rerun this command. All right, so it's saying creating the repository, setting the remote origin to this address, adding that URL, creating issues,
there's one uncommitted file, DESCRIPTION. Is it OK to commit it? I will say "I agree", so number two. This was adding the GitHub links to my DESCRIPTION. Then it pushes the main branch to GitHub and sets origin/main as the upstream branch. Now I come and see that, sure enough, I have my GitHub repository for this project, so that's pretty exciting. Now everything is here, and as I make a series of commits (maybe I get to the end of the day, or I've made a big set of changes) I can then push it up to GitHub for other people to get access to, which is pretty awesome.
One thing we'll see on here is that my README file... well, I don't have a README file, so I need to add one, and we can write a README file in R Markdown. To do that we'll use the helper function use_readme_rmd(). We see that that creates a README.Rmd file and adds it to our .Rbuildignore, because we don't want that to go into the overall package. It's then creating a hook, an action that will run when I commit: if I commit README.Rmd without committing the Markdown file, or maybe it's the other way around, whatever, Git is going to do something special, so I don't have to worry about that, which is great.
So this is some default text that it throws in here. This isn't what I want to use, so I'm going to go ahead and do a Command-A to highlight everything and delete it. On the book's website they have a link, which I'll put up here, for regexcite's README.Rmd. I'm going to go ahead and copy all this code so we don't have to type 69 lines of code, because that's not really what's important in this session. I'll go ahead and save that, and now in my console I'll do build_readme(), and it will go ahead and render the README.Rmd file as a Markdown file. I'll go ahead and commit my README: commit that, close, close, and push. When I come back to my version of the repository and refresh, I now see that I have the README file. Okay, so again, there are a lot of these great helper functions
for building the packages and working with other workflows. So we're using use_readme_rmd() and use_github() and all these things. use_github(), use_readme_rmd(), use_testthat(): they work well within packages, but you don't have to use them within packages. You could also have used those with a normal data analysis script outside of a package, but at least in our case it really helps to make the workflow of building out our package a lot smoother.
So at this point they recommend running check() again. I think we're going to get the same problem that we had before. Anyway, we could then go ahead and do install(). The check is checking things; it's not necessarily installing anything, whereas install() will go ahead and do kind of like we would with install.packages(). And we now have that.
One thing I want to point out is that we have this package up on GitHub. It's not on CRAN, but it's almost as good. CRAN has a whole bunch of other tests that they run to make sure everything's on the up and up. If I had a package up on GitHub, or say I make some changes and it's not been updated on CRAN yet, you could still install it even if you don't know me, even if you're not connected with my project. You can do that with install_github(), and then you could give it "riffomonas/regexcite". This will then download my version of regexcite from GitHub and build it out. So there are a couple of different ways to do it. Clearly, you could clone the repository and do install(), which is basically what we do when we're building it, but you could also do install_github() to install the package from the repository that's up on GitHub.
So this has been a long video. I'm sorry for that. It's good for me, though, because I need to go through all this to start learning how to do it, and eventually we'll be applying it to our own package and doing that type of workflow. One of the things to keep in mind is, like, this just seems like so
much to remember, right? But there are a number of functions that we've used that we probably would really only need to use once. That'd be, like, create_package(), use_git(), use_mit_license(), use_testthat(), use_github(), use_readme_rmd(). We run those once at the beginning of creating the package, and then it's done. In general, I don't remember these things. I know that I can come back to this chapter in the book when I make my next package to do it again. There have also been subtle hints along the way: when we made the DESCRIPTION, it tells you to use use_mit_license() to create the MIT license. So that's pretty helpful. Also, a lot of these functions have names that are really helpful, like use_github(). It's pretty easy; even I would probably remember that.
There are also functions that we're going to use on a regular basis, like use_r(), use_test(), use_package(). Again, use_r() to create an R script for our tooling; use_test() will create a test based on that script; and then use_package() if we want to bring in functionality from other packages, like we did with stringr.
And then there are functions that we're going to use a lot more. So, load_all(): if I make changes to my R script, load_all() will load those changes. document(): if I change the documentation, it will then load that documentation and update the NAMESPACE and the help files and things like that. test() to run my tests. And then check(). I need to get check() to work, but check() will then make sure that the package has everything that it needs to be functional.
All right, so hopefully as we go through developing our own package we'll get pretty comfortable with using these different tools. I'm sure we will because, you know, practice makes perfect. At least for me, I need to screw things up a few times, as you've already seen in this episode, before I really understand things and can move them forward. Many thanks to Hadley Wickham and Jenny Bryan for creating these wonderful tools
and creating this great resource of R Packages. I really encourage you to get a hold of this or to go to their website, where you can kind of read along and see a little bit of where we're going to go. I'm not going to be following this book super strictly as we go through. I probably will be consulting back to it again as I'm trying to learn things and learn how we can use their tooling to make our development process a little bit easier. So anyway, I hope you check that out. Please also check out the paper that I've got down in the description that describes this naive Bayesian classifier, and I'll talk to you next time for another episode of Code Club.
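For anyone reading along without the video, here is a minimal sketch of the refactored str_split_one() function talked through in this episode, followed by the three kinds of calls the tests exercise. It assumes the stringr package is installed. The n = Inf default and the is.character() check are assumptions (borrowed from stringr's own default for str_split() and from the R Packages book), since the exact code on screen isn't spelled out in the audio.

```r
# Sketch of the refactored helper; assumes stringr is installed.
# The n = Inf default and the is.character() check are assumptions.
str_split_one <- function(string, pattern, n = Inf) {
  # Guard: only a single input string (or none at all) is supported
  stopifnot(is.character(string), length(string) <= 1)

  if (length(string) == 1) {
    # Happy path: str_split() returns a list, so take its only element
    stringr::str_split(string = string, pattern = pattern, n = n)[[1]]
  } else {
    # Zero-length input: return an empty character vector
    character()
  }
}

# Basic split on a comma
str_split_one("a,b,c", ",")
# "a" "b" "c"

# n is passed through to stringr, so at most two pieces come back
str_split_one("a,b,c", ",", n = 2)
# "a" "b,c"

# stringr pattern helpers work too: match a literal period
str_split_one("a.b", stringr::fixed("."))
# "a" "b"
```

Passing a vector with more than one element, such as c("a,b", "c,d"), trips the stopifnot() and raises an error, which is the behavior the second test checks for.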