If you've spent any amount of time in R, you've no doubt seen things like the addition sign and the subtraction sign. Perhaps you've seen c(). Perhaps you've seen read_csv() or read_tsv(). Perhaps you've used library(), and if so, perhaps you've used it with ggplot2, which lets you use geom_line(), geom_point(), theme(), or theme_classic(). Well, those various terms that I've just listed off are all called functions. You know that something is a function in R if it has an open parenthesis next to its name. Now, I mentioned addition and subtraction. Those don't generally have parentheses with them, but they are also a special kind of function. Heck, even the arrow operator that lets you assign a value to a variable is a function. Even the keyword function is a function.

In today's episode, I'm going to show you how we can take the code we've been working on to read a PHYLIP-formatted distance matrix and turn it into a function. If you've been watching the previous episodes as we've been developing this code, you know that I've rerun about 25 lines of code every time I change the file or change what's going on in that body of code. That is not very reproducible, and it's not really sustainable. Functions allow us to keep our code DRY. DRY is an acronym that I've used before for "don't repeat yourself." If you take the same code chunk and repeat it many, many times, it's not DRY, because you're repeating yourself. But if we put that chunk of code into a function, our code can be DRY, because I can call that function by name wherever I want to, say, read in a PHYLIP-formatted distance matrix, and I will get that functionality.
So I want to spend a little more time in today's episode helping you develop some more concepts and understanding of how functions work and how you can write your own functions to keep your code as DRY as possible. Before we convert our code for reading in a distance matrix to a function, I want to start with a simpler example to help relay some of the concepts we need for creating our own functions.

The first thing that functions generally need is a name. I'm going to create a function that converts degrees Celsius into Fahrenheit. I'm in the United States, and we're like the only place in the world that uses Fahrenheit. I still don't have a really good handle on what Celsius means, so such a function might actually be helpful to me. I'll call it c_to_f. I like my function names to be descriptive. This style is called snake case, where everything is lowercase, and if you want a space, you insert an underscore. You don't want hyphens, because a hyphen also means minus. If I had c-to-f, R would think I was subtracting to from c and then f from that, which is not what I want. I want it to be c_to_f, and that is the name of my function.

I then use the assignment arrow, because I'm going to assign the output of a function to that variable. And what is the name of that function? Well, it's function. We define a function using the function keyword, so that keyword is really important. Inside of those curved braces, the parentheses, we put an argument list. What we might put in here would be celsius. That would be my temperature, the input in Celsius, and we're going to output Fahrenheit. The next part of a function is the body of the function, where we give the special sauce, as I like to call it, of what's going on.
The conversion would be nine divided by five, times celsius, plus 32. To use this, I load it into my R session. If I look at the environment in the upper right corner of my RStudio session, I see that I now have a function loaded, c_to_f. It's a function that takes in celsius. To call it, I can do c_to_f() and give it a temperature. So I could say c_to_f(celsius = 20), and that would tell me it's 68 degrees Fahrenheit outside.

Now, there are a few nice things to know about running functions. I don't actually need to say celsius = 20. I could do c_to_f(20), and it'll output 68. R is smart enough to know that if there's only one argument, celsius, and I give it a value, that value goes with celsius.

Well, what happens if we have a couple of arguments? This is a little bit silly, but say I want to add freezing, and that would be 32. I could add freezing as another variable in my function. So I could then do c_to_f(20, 32). Oops. I just fell into a trap that I commonly fall into, where I changed the function but forgot to reload it. That's an important point: if we change the body of our function or change its arguments, we need to reload it so that we can use the new functionality we've put into the function. Sure enough, now that I've reloaded the code, c_to_f(20, 32) gives me 68.

You'll notice that, again, I didn't put in the argument names. That's because R is being smart. If I give it two arguments, it knows the first goes with celsius and the second goes with freezing. What it's not smart enough to do, though, is figure out what I mean if I do c_to_f(32, 20). It's not going to give me 68, because again, it's assuming the first value is celsius and the second is freezing. That's not the case.
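As a rough sketch of the function at this stage, the steps above might look like the following. The spellings c_to_f, celsius, and freezing are a transcription of what is spoken in the video, not copied from the original script.

```r
# One-argument version: convert degrees Celsius to Fahrenheit.
c_to_f <- function(celsius) 9 / 5 * celsius + 32

c_to_f(celsius = 20)  # named argument: 68
c_to_f(20)            # positional argument: 68

# After editing a function, the definition has to be run again ("reloaded")
# before the session sees the change. Here a second argument is added.
c_to_f <- function(celsius, freezing) 9 / 5 * celsius + freezing

c_to_f(20, 32)  # first value matches celsius, second matches freezing: 68
c_to_f(32, 20)  # positions swapped, so 32 is treated as celsius: 77.6
```

Unnamed values are matched to arguments strictly by position, which is why swapping them changes the result.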
If I wanted to give the values in the order 32, 20, I could do c_to_f(freezing = 32, celsius = 20). That way, I'm telling R, okay, I don't know the order of the arguments, but I want freezing to be 32 and celsius to be 20. You can get a little bit silly with some of this. Say I wanted freezing to be 32 and celsius to be 20. If I know there are only two arguments, I can say freezing = 32 and then not label the 20 for celsius, and I'll get back 68. So again, R is smart enough to figure this all out, thankfully.

Now, of course, freezing is always going to be 32 in my case, so I might like to use a default value for freezing. In the argument list I could say freezing = 32. If I reload this, I can again do c_to_f(20), and it should be 68. Sure enough, it is, because it's using the default value of freezing to do the calculation. Of course, I could come back and use freezing = 32 and celsius = 20 like before, and I get the same thing. And say you want to do something a little weird and do c_to_f(20, freezing = 15). I would then get back 51.

Again, this is a very simple one-line function. Let me show you what we would do if the function were a little more complicated. If we had more than one line in the body of our function, we would need to use a set of curly braces. These are the keys to the right of the P on your keyboard when you hold Shift. The curly braces define the body of your function. Again, I can run all this, and if I do c_to_f(20), we get back 68, our good old friend. What this allows us to do is have multiple lines in the body of our function. I'd be pretty hard pressed to come up with an example where I don't use the curly braces to define the body of a function. Generally, my functions are eventually going to be more complicated than a single line.
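The calls described above can be sketched as follows, assuming the same transcribed names (c_to_f, celsius, freezing) and with the body already wrapped in curly braces:

```r
# `freezing` now has a default value; the curly braces delimit the body.
c_to_f <- function(celsius, freezing = 32) {
  9 / 5 * celsius + freezing
}

c_to_f(20)                           # default freezing = 32 is used: 68
c_to_f(freezing = 32, celsius = 20)  # named arguments in any order: 68
c_to_f(freezing = 32, 20)            # unnamed value fills the remaining argument: 68
c_to_f(20, freezing = 15)            # overriding the default: 51
```

Named arguments are matched first, and any unnamed values then fill the remaining arguments in order.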
Even if they don't start out that way, my functions eventually get to be multiple lines, so it's nice to define the body of the function with curly braces from the outset.

To help flesh out what this might look like as things get more complicated, let me break this equation into two steps. Here we'll define multiplication as nine fifths times celsius, and then we'll do multiplication plus freezing. I'll highlight and run all that, and we'll again do c_to_f(20) and get back 68. So again, we can have multiple lines in here.

Generally, I start assigning values to variables as the code gets more complicated. I could assign multiplication plus freezing to the variable f. Now if I do c_to_f(20), I don't get any output. Up to this point, we hadn't been assigning the output within our function to a variable name, and the function returns the last evaluated value. What we did here on line three is take multiplication plus freezing and assign it to f, so that assignment is the last executed expression inside our function. An assignment returns its value invisibly, so nothing gets printed to the main R session. What we could do is add a line that just says f. Now the last evaluated thing in the function will be the variable f, so if I do c_to_f(20), I get back that 68.

Alternatively, instead of listing the variable by itself, what we could do, which is a little more elegant, is say return(f). This is telling R to return f. Now when I run this and do c_to_f(20), I get back that 68. One thing that's important to know about the return function, however, is that if you put it anywhere but the end, everything after it will not get run. So of course, we'd want return(f) to be the last executed code in our function, because we know that when R runs return(), it kicks out of the function.
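A minimal sketch of these three behaviors follows. The names c_to_f_silent and early_exit are hypothetical, introduced here only to keep the variants side by side; multiplication and f follow the names spoken in the video.

```r
# Variant 1: the last expression is an assignment, so the value is
# returned invisibly and nothing prints at the console.
c_to_f_silent <- function(celsius, freezing = 32) {
  multiplication <- 9 / 5 * celsius
  f <- multiplication + freezing
}

# Variant 2: an explicit return() makes the output obvious.
c_to_f <- function(celsius, freezing = 32) {
  multiplication <- 9 / 5 * celsius
  f <- multiplication + freezing
  return(f)
}

c_to_f_silent(20)             # prints nothing
hidden <- c_to_f_silent(20)   # the value is still returned and can be captured
hidden                        # 68
c_to_f(20)                    # 68

# Code placed after return() is never executed.
early_exit <- function() {
  return("first")
  "second"  # unreachable
}
early_exit()                  # "first"
```

Capturing the result of c_to_f_silent() shows that the value is still there, just not printed, which is why spelling out return(f) is the less surprising choice.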
Okay, that's enough to cover here in talking about functions. Let's head over to the code that we've been working on and build out that function for reading in a PHYLIP-formatted distance matrix.

Let's look at the code we've been developing for reading in a PHYLIP-formatted distance matrix, here in read_matrix.R. If you want to get this code, there is a link down below where you can get this file as well as the data that I am working with. Again, this is kind of stream-of-consciousness code. As I said before, I could change the distance matrix that I'm reading in by changing which file I am reading in. I want to make this DRY, so that I'm not repeating this every time I have a different distance matrix to read in. So I want to turn this into a function.

We need a function name, and what I will call this is read_matrix. We again use the assignment operator and then the function keyword. We're going to give it one argument, and that is going to be file_name. Again, I like to use descriptive variable names as well as function names. My variable names are generally nouns, and my function names are generally verbs, like what is happening.

I have 31 lines of code, so I need to insert curly braces to define the body of my function. In RStudio, if I put in the open curly brace, it automatically gives me the closing curly brace. So for now I'll delete that closing curly brace, come to the end, and put in the closing curly brace there. One of the things I also like to do in functions is have everything in the body indented over a notch. I can highlight all the code, hit Tab, and it'll go over two spaces, or whatever you have a tab defined as in your RStudio settings. So great, we have a function.
But as I said, I want to replace the hard-coded file name with the variable file_name, because as a user, when I call read_matrix(), I'm going to give it my mice.simple.braycurtis.dist or whatever distance matrix I want to use. So I will put in file_name, and maybe I'll go ahead and clean up my tabs here. That all looks good. I will save this and run it so that I now have read_matrix in my environment.

Now I can do read_matrix() and give it my mice.simple.braycurtis.dist. I'll run that, and I don't get any output. Why don't I get any output? Well, perhaps you've noticed that the last two lines of my function are the rownames() and colnames() calls that take the value of samples and assign it to the row names and column names of the distance matrix. This is what we covered in the last episode. That's not generating any output to the screen, so nothing visible is getting returned from the function. Again, I could put dist_matrix on the last line, but I'd rather do return(dist_matrix) to make it more explicit what's happening here in the function.

Because I've modified the code, I need to highlight everything and reload it into R. Now I can do read_matrix() with mice.simple.braycurtis.dist, run that, and I get out my nice distance matrix. I could very easily change the file name that I want. Instead of mice.simple, I could do mice.braycurtis.dist. It's omitting a bunch of the rows because this distance matrix is much larger. But you can see that I can read in these different distance matrices by running the same function and giving it different file names. Here I've created distance matrices from three different files without having to rerun those 30-some lines of code. All I had to do was run read_matrix(), with the name of the distance matrix file that I want to read in inside the parentheses. Of course, I could then also assign this to a variable.
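The overall shape of the wrapped function might look like the sketch below. This is not the code from the video: the parsing body is a stand-in written for illustration, and it assumes a lower-triangular PHYLIP layout (first line is the number of samples, then one line per sample with its name followed by distances to the preceding samples). Only file_name, samples, dist_matrix, and the closing rownames(), colnames(), and return() steps come from the description above. The other names and the toy input file are hypothetical.

```r
# Sketch of read_matrix(): wrap the parsing code in a function that takes
# the file name as its only argument and returns the full distance matrix.
read_matrix <- function(file_name) {
  lines <- readLines(file_name)
  n_samples <- as.integer(trimws(lines[1]))

  # Split each sample line into its name and its distances.
  fields <- strsplit(trimws(lines[-1]), "[[:space:]]+")[seq_len(n_samples)]
  samples <- vapply(fields, function(x) x[1], character(1))

  # Fill in both triangles so the matrix is symmetric with a zero diagonal.
  dist_matrix <- matrix(0, nrow = n_samples, ncol = n_samples)
  for (i in seq_len(n_samples)) {
    values <- as.numeric(fields[[i]][-1])
    if (length(values) > 0) {
      dist_matrix[i, seq_along(values)] <- values
      dist_matrix[seq_along(values), i] <- values
    }
  }

  rownames(dist_matrix) <- samples
  colnames(dist_matrix) <- samples

  return(dist_matrix)
}

# Toy three-sample file (hypothetical data) to exercise the function.
toy_file <- tempfile(fileext = ".dist")
writeLines(c("3", "A", "B\t0.1", "C\t0.2\t0.3"), toy_file)

dist_matrix <- read_matrix(toy_file)
dist_matrix
```

Because the file name is an argument rather than a hard-coded string, reading a different matrix only requires calling read_matrix() again with another path.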
So I could then say dist_matrix equals read_matrix() with the file name. Now dist_matrix contains that distance matrix, and I can go forth and do all sorts of cool things that will happen in the next episodes of Code Club.

So that you know what happens in those next episodes of Code Club, I'd love it if you made sure that you've subscribed to this channel and clicked the bell so you get notifications when things happen. Really, we are on a journey here. If you're just joining us, welcome. Please be sure to check out the playlist that I have up here. I encourage you to go back and see how we've gotten to this point in the development of this current series. Moving forward, we're going to talk about a few more things having to do with base R, but eventually we're going to move on and talk about how we can analyze this distance matrix and how we can visualize its values. So again, you want to be subscribed so you know when all that good stuff happens.