It's LinkedIn Learning author Monica Wahee with today's data science makeover. Watch while Monica Wahee demonstrates how to replace values with variables and vectors in a plot in R.
Hi everyone, in today's data science makeover we are really just going to focus on a makeover of your code. You will see in this video that we make a box plot using R and the ggplot2 package, but that is not the point of this video, believe it or not. If you want to learn how to use ggplot2 to make a box plot, please watch my other video. I'll put a link in the description. The purpose of this video is actually to show you a trick you can use in R, which is where you replace values in actual code with variables and with vectors holding sets of values. Now why would you want to do that? Well, let's do it together and then it will probably become clearer to you why you would want to do that, but I'll also give you a hint as to why I think you might want to do that.
Let's first start with making a bunch of assumptions about our current situation. See this code up here? This is code that you just need to accept as our demonstration code that I'm going to operate on. I reused a lot of this code from my box plot video. In that video we read in a little data set called city compare. In that data set each row represents a hospital in either Boston or the Minneapolis-Saint Paul area. Each hospital had a value for a continuous variable, staff beds, so I used R to demonstrate how to make a box plot of that variable, and I could show how to separate the box plots by city by using the hosp city variable in the data. So that's all that is happening with this code. I'm reading in the data, calling up ggplot2, and making a box plot. Let's highlight and run. Okay, there's our box plot. It's got labels on it and one pink box and one gold box. That's what we expected. Let's close this. Now let's look more carefully at the code we used to make this plot.
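The on-screen code is not captured in this transcript. A minimal sketch of what the hard-coded version might look like is below. It is an assumption, not the video's actual code: the data frame is a small made-up stand-in for the city compare data, the column names `staff_beds` and `hosp_city` are guesses based on how the variables are spoken, and the label capitalization is invented.

```r
library(ggplot2)

# Hypothetical stand-in for the "city compare" data set:
# one row per hospital, with made-up bed counts.
city_compare <- data.frame(
  hosp_city  = rep(c("Boston", "Minneapolis-Saint Paul"), each = 5),
  staff_beds = c(120, 250, 400, 675, 900, 80, 150, 310, 520, 1100)
)

# Every value is typed directly into the plotting code.
hard_coded_plot <- ggplot(city_compare, aes(x = staff_beds, y = hosp_city)) +
  geom_boxplot(fill = c("pink", "gold")) +  # one color per city
  xlab("Staff Beds") +
  ylab("Compare Cities") +
  xlim(20, 1500)
```

Printing `hard_coded_plot` draws the plot. Passing two fill colors directly works here because there are exactly two boxes.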
As you can see by the way I formatted the code, each line of code in this block has a value hard-coded into it. Either it is talking about one value, like for the x label and y label, see where it says staff beds and compare cities, or it is talking about a set of values, like the fill in the geom_boxplot option saying pink and gold, and the x limits of 20 and 1500 in the xlim option. So let's offload those values into variables and vectors. That's what I do down here.
Okay, you can see what I'm doing here. I named a vector fill_colors and I put pink and gold in it. Then I created the variables x_label and y_label and filled them up with the values staff beds and compare cities, respectively. Then I created the numeric vector x_limits and put the x limits in it. So now I can replace all the hard-coded values with these variables in the ggplot code, which is what I do down here. See how instead of fill equals pink, gold, I now say fill equals fill_colors. Since I set the fill_colors vector equal to pink, gold, R will now substitute those values wherever that vector appears. And the same thing will happen for our variables x_label and y_label, and for our numeric vector x_limits for the x limits.
So I'm going to run this code now just to prove to you that you get the same thing when you use these variables and vectors as when you hard-code the values. But I can't forget to run the variable and vector code up here first. Let's highlight all of this and run it. Okay, there, I proved it to you. It worked.
So that was not really a data science makeover, mainly because that plot is still pretty ugly. It was more of a code makeover. So why did I just do all that variable and vector replacement in my code? Obviously, to demonstrate it to you, so I guess the right question is why would I ever do all that replacement in my code?
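The replacement described above can be sketched as follows. The names `fill_colors`, `x_label`, `y_label`, and `x_limits` come from the narration; the data frame, its column names `staff_beds` and `hosp_city`, and the label capitalization are assumptions made only so the example runs on its own.

```r
library(ggplot2)

# Hypothetical stand-in for the "city compare" data set (made-up values).
city_compare <- data.frame(
  hosp_city  = rep(c("Boston", "Minneapolis-Saint Paul"), each = 5),
  staff_beds = c(120, 250, 400, 675, 900, 80, 150, 310, 520, 1100)
)

# Step 1: offload the values into variables and vectors.
fill_colors <- c("pink", "gold")   # character vector, one color per box
x_label     <- "Staff Beds"        # single character value
y_label     <- "Compare Cities"    # single character value
x_limits    <- c(20, 1500)         # numeric vector: lower and upper x limit

# Step 2: the plot code refers only to those names.
box_plot <- ggplot(city_compare, aes(x = staff_beds, y = hosp_city)) +
  geom_boxplot(fill = fill_colors) +
  xlab(x_label) +
  ylab(y_label) +
  xlim(x_limits)
```

The variable and vector assignments have to run before the plot code, otherwise R reports that the objects are not found. To try different colors, only the `fill_colors` line needs editing before rerunning.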
Well, I can think of three reasons right off the bat. First, let's say I wanted to automate this code. Like I was going to put this in a dashboard and have the user be able to select which colors or x limits they want. The variables and vectors I made would act as placeholders for those values once they were set by the user. Secondly, having a set of vectors and variables setting all your values is nice if you want to try out different values. Let's say I hadn't settled on pink and gold. Then I could just keep editing this vector and rerunning the plot until I liked the colors I chose. That way I don't have to edit the actual plot code each time and risk messing it up. Finally, if you use these techniques, it's easier to standardize output across different analyses, because you can keep calling up the same values by using these variables and vectors. For example, let's say I wanted to do more box plots of other continuous variables from these hospitals, like patient days or patient revenue. I could just keep using the fill_colors vector, and that would ensure that each plot had the same colors even though the variable being plotted changed.
So in the end, these are good techniques to use if you want to modularize your code, which is a good makeover. In data science, you don't just want to make over your data; your code deserves some attention too. So for this data science makeover, make your code gorgeous.
Thank you for watching this data science makeover with LinkedIn Learning author Monica Wahee. Remember to check out Monica's data science courses on LinkedIn Learning. Click on the link in the description. Thank you for watching. If you liked the video, then please give it a like. Also, I invite you to look around my channel and if you like what you see, subscribe. I hope you are enjoying your day.
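The third reason, standardizing colors across several box plots, can be sketched like this. It is only an illustration under stated assumptions: the helper `make_city_boxplot()` is hypothetical and does not appear in the video, and the data frame and its columns `hosp_city`, `patient_days`, and `patient_revenue` are made up.

```r
library(ggplot2)

# Hypothetical hospital data with two more continuous variables (made-up values).
hospitals <- data.frame(
  hosp_city       = rep(c("Boston", "Minneapolis-Saint Paul"), each = 4),
  patient_days    = c(21000, 48000, 95000, 130000, 15000, 39000, 72000, 110000),
  patient_revenue = c(310, 640, 1200, 2100, 220, 510, 980, 1700)
)

# Shared across every plot, so the colors stay consistent.
fill_colors <- c("pink", "gold")

# Hypothetical helper: draws a by-city box plot of any numeric column.
make_city_boxplot <- function(data, value_col, x_label, fill = fill_colors) {
  ggplot(data, aes(x = .data[[value_col]], y = .data[["hosp_city"]])) +
    geom_boxplot(fill = fill) +
    xlab(x_label) +
    ylab("Compare Cities")
}

days_plot    <- make_city_boxplot(hospitals, "patient_days", "Patient Days")
revenue_plot <- make_city_boxplot(hospitals, "patient_revenue", "Patient Revenue")
```

Changing `fill_colors` once and rerunning updates both plots. The same idea would apply if a dashboard supplied the colors, since the plotting code never mentions a specific color.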