 Hey folks, I'm Pat Schloss and this is Code Club. One of the things we've been focusing on in recent episodes is how we can deviate from the default appearance of our figure. We've spent a lot of time talking about various arguments that you can use in the theme function to get the overall appearance of your figure to suit your own personal aesthetics. Another area that often needs customization is the labels on our x and y axes. We can have different types of data represented on our axes. We could have discrete or categorical data, things like different months, races, or sexes. We can also have continuous data, things like weights or the percent of people who agree to, say, receive the COVID-19 vaccine. So the question is, how do we get those labels on our axes to look the way we want them? Well, that's exactly what I'm going to show you how to do in today's episode of Code Club. I'm here in RStudio. If you watched the last episode, which I encourage you to do, you will know that we are on the highlight branch, which I have over here on the right side in my Git tab. You can get the highlight branch and the main branch and all the other commits from this project. If you go down below in the description for this episode, you'll find a link to a blog post that has all the information you need to get caught up to where we are for today's episode, so you can play along. So again, here is the code that we have. It's basically the original slope plot that I made several episodes ago with a few little tweaks. Let's go ahead and run this and see what it looks like. Again, this is our basic slope plot that we've been modifying to make it look a little more attractive than a figure with 15 lines and 15 colors for 15 countries. That's just way too much information. 
So we're going to go on a different path, a different branch, if you will, to see if we can improve this. One of the places I want to start, as I said, is the labels that we have on our x and y axes. I'm not talking about the axis titles. I'm talking about the actual values here on the axis. So we have august and october. Those are in lowercase. I'd perhaps want it to be capital-A August and then, say, '20 for 2020. Also on the y axis, maybe I want it to go from 50% to 100%. Maybe I want these values to come out every 5 percentage points or maybe every 20 percentage points, right? There's a lot we can do here to manipulate the appearance of our axes to, again, give a slightly different appearance to our figure. The first thing that I want to modify is the appearance of those months. Now, a while back, we had an episode with the same data where I used factors to change the label of the month on the x axis. I told you at the time that that isn't the way I typically do it when I'm doing data visualization for my own data. So I'm going to show you what I do, and we'll do that right now. I'm going to come down to my code for building out the figure, and after labs but before theme, I'll add scale_x_discrete. Now, what I'm going to be showing you today are a couple of functions named scale_something_something, where that first something is either x or y. So I'm going to show you scale_x_discrete, but know that if you had categorical data on the y axis, you could use scale_y_discrete. X or y, you use what's relevant for your application. I'm going to use scale_x_discrete because I have those discrete or categorical values on the x axis. And so I will set breaks, and I'll give this a vector with the c function, where I'll say "august" and then "october". 
And then for my labels, I'll do the same thing, where I'll do August, but with '20 for August of 2020, and then October '20 to make it clear that that's also of 2020. And I realize I misspelled labels here, so we'll go ahead and fix that. And then I'll go ahead and add a plus sign to the end of this. So this gives me my stylized x axis labels of the capitalized month with the '20 for 2020. Many episodes ago, I think five episodes ago, I showed you how we could use factors to set the order of our months. Here the order works out because august and then october is alphabetical, so it's not something we really need to worry about. But factors are a surefire way to get those months in the right order. Say we had september and october. By default, we wouldn't get the right order. One of the things I did in that factors episode was use the labels argument, and that allowed me to map august, all lowercase, to August '20. That was convenient for things like this, where we're trying to show the month and the year on an axis. But it was very inconvenient if I wanted to calculate the difference between, say, October and August, because I would have to type quote October space '20 quote, minus quote, and so forth. It gets kind of tedious because you need those quotes, since we've got spaces in the name and punctuation like this tick mark. That's just a pain. So my preference would be to leave it as lowercase august and lowercase october. Simple, right? Because then I can do october minus august and we're good. And then I use scale_x_discrete, like we have here, to get the right appearance. Now, what about the y axis? I perhaps want this to go from 50 up to 100%. Well, the function we're going to use is very similar to scale_x_discrete, but it's going to be scale_y_continuous. And so here we can say limits equals, and then we'll give it a vector. And so I'll do 50 to 100. 
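As a rough sketch of what the two scale calls described so far might look like on screen: the data frame below is a made-up stand-in for the survey data, and its column names (country, month, percent) are assumptions for illustration, not the names used in the episode's project.

```r
library(ggplot2)

# Hypothetical stand-in data: two countries, two months.
# Column names are assumptions, not taken from the episode's code.
toy <- data.frame(
  country = rep(c("A", "B"), each = 2),
  month   = rep(c("august", "october"), times = 2),
  percent = c(80, 72, 65, 58)
)

p <- ggplot(toy, aes(x = month, y = percent, group = country, color = country)) +
  geom_line() +
  # The data stay lowercase; only the displayed labels change.
  scale_x_discrete(
    breaks = c("august", "october"),
    labels = c("August '20", "October '20")
  ) +
  # Limits wide enough to contain every data point.
  scale_y_continuous(limits = c(50, 100))

# print(p) draws the figure
```

Because the relabeling happens in the scale, any later calculation can still refer to the plain values august and october.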
And I'll also put a plus sign at the end of this. And now I see that I have my limits going from 50 to 100, which is pretty convenient. Perhaps I want to change the interval of these values. Those are again called the breaks. So just like we have breaks on the x axis for august and october, here we have breaks of 50, 60, 70, 80, 90, and 100. Say I want it to be 50, 75, 100. How would we do that? Very easy. Again, we'll come back up to scale_y_continuous, and I'll do breaks equals 50, 75, and 100. Now we have 50, 75, and 100. Very convenient, right? We could just as easily go every 5 percentage points rather than every 25 percentage points. I think I'll ultimately go back to having it be every 10 percentage points. But one thing that you can see here is that because we're using theme_gray, which is the default theme, we have these minor grid lines for the minor breaks on the y axis. How can we get rid of those? Sure, we could remove them in the theme using element_blank, but we can also get rid of them here in scale_y_continuous, where I can do minor_breaks equals NULL. NULL means I don't want any minor breaks. Otherwise, you could put in whatever values you want to draw those minor grid lines at. And so now we see that those minor grid lines go away. Again, we could have just as easily instead of NULL used something like 50, 60, 90, and 100. And then we see we have those minor grid lines every 10 percentage points. So that's not what we want, right? We'll go ahead and turn this back to NULL. And I'll set my breaks to be every 10. One way we could do that, other than writing out 50, 60, and so on, would be to do seq 50, 100, by equals 10. That means go from 50 to 100 in steps of 10. 
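A minimal sketch of the breaks, minor_breaks, and seq pieces together, again on made-up two-point data rather than the survey data (the column names are assumptions):

```r
library(ggplot2)

# seq() builds the same vector as typing the values out by hand.
major <- seq(50, 100, by = 10)
major
# 50 60 70 80 90 100

# Hypothetical data, just enough to draw one line.
toy <- data.frame(month = c("august", "october"), percent = c(80, 72))

p <- ggplot(toy, aes(x = month, y = percent, group = 1)) +
  geom_line() +
  scale_y_continuous(
    limits = c(50, 100),
    breaks = major,       # major grid lines and tick labels every 10 points
    minor_breaks = NULL   # no minor grid lines at all
  )
```

Changing by = 10 to by = 5 is all it takes to get a break every five percentage points.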
And so looking at what this returns in the terminal, we can see that this seq function does the same thing as writing out all those values, right? Writing out six numbers isn't that big of a deal. But say I wanted every five percentage points, then it gets pretty tedious. So now we have those major grid lines at 10 percentage point increments from 50 to 100, and we don't have any minor grid lines. So I'm describing here how to use scale_y_continuous. Again, it works the same for scale_x_continuous. But sometimes people will say, well, what about using coord_cartesian? Couldn't we also use that to set the x and y limits? And yes, you can. There's a subtle but important difference, though, between using scale_y_continuous and coord_cartesian. So for now, I'm going to comment out my scale_y_continuous, but we'll come back because that's what I want to use. So we can do coord_cartesian, and then I'll do ylim. And just to have something look different, we'll do 60 to 90. And so this looks a bit different, doesn't it? You perhaps noticed that China, this green line, which we're familiar with at this point, was initially up at, I think, 95%. Right. And so what we see with coord_cartesian is that it still draws the line, but it's effectively zooming in on the y axis. And we see that also down here for France, which is falling off the bottom. Now, this is an important point, because if I come back to scale_y_continuous, so let me comment out my coord_cartesian for now, and I use the same limits, say 60 to 90, I'm going to get a warning message. It says removed three rows containing missing values, geom_path. So what does that mean? Let's look at the figure. 
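The two approaches can also be compared without looking at the figure, by inspecting the data ggplot2 ends up with for the line layer. This is only a sketch on made-up numbers (three hypothetical countries, one starting above 90 and one ending below 60), not the survey data from the episode.

```r
library(ggplot2)

# Hypothetical data: "high" starts above 90, "low" ends below 60.
toy <- data.frame(
  country = rep(c("high", "mid", "low"), each = 2),
  month   = rep(c("august", "october"), times = 3),
  percent = c(95, 88, 80, 72, 62, 55)
)

base <- ggplot(toy, aes(x = month, y = percent, group = country)) +
  geom_line()

# Same numeric limits, two different mechanisms.
zoomed   <- base + coord_cartesian(ylim = c(60, 90))
filtered <- base + scale_y_continuous(limits = c(60, 90))

# layer_data() returns the data as the geom will receive it.
# Count how many y values have been turned into NA in each case.
n_na_zoomed   <- sum(is.na(layer_data(zoomed)$y))
n_na_filtered <- sum(is.na(layer_data(filtered)$y))
```

With coord_cartesian every point survives and the view is simply cropped. With limits set on the scale, the points outside 60 to 90 arrive at the geom as NA, which is what the warning message is reporting.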
So now we see in our figure that when we use scale_y_continuous with limits that are within the range of our data, we lose those lines. We no longer have the line for China, and we no longer have the line for France, because it says, well, one of its points is outside of the limits, so we're going to remove those data. And that is why we are getting this warning message about removing three rows containing missing values. So it basically filters the data using your limits before plotting it. That's what scale_y_continuous is doing. Whereas if you use coord_cartesian, it doesn't do that filtering. Again, it's an important but subtle difference. The place that this often comes up as being important is for statistical transformations, things like geom_smooth, right? If you're trying to draw a line through data, the line will look different if you're filtering your data to the range versus drawing that line through all of the data. Another approach would be to use ylim. Here we can give it our pair of values, 60 and 90, and add the plus sign. And again, we get that warning message that it's removing the data. So ylim and xlim do the same type of filtering that scale_y_continuous does. Know that if a value is outside that range, it's being turned into an NA. And then when you say draw lines between points that have NA values, it's going to complain, because it doesn't know how to plot NA values. Again, I'm going to comment that out, and I'm going to roll with scale_y_continuous for the rest of this episode. The last thing that I want to talk about today in manipulating the appearance of our axes is the amount of padding that is added to the axis. So one of the things that drives people nuts when they're making, like, a bar plot in R, and by the way, we shouldn't be making bar plots, but that's another episode. 
Anyway, is that you might have zero here on that tick mark, but then you'll have this space down here at the bottom. People would like there to be no space between the bottom of the plotting window and zero, right? So how do you do that? Well, that is achieved using a special argument to scale_y_continuous called expand. If I do expand, I have to give it a vector of two values, so I can say zero comma zero. And so now we see that we no longer have that padding above 100 or below 50%. Again, think of this in the case where we have zero on the y axis and you perhaps don't want any space underneath that zero. So expand equals zero, zero gets rid of all the padding. The default is to add 5% of padding. I could easily come back and change that from zero to, let's say, 10%. So I do 0.1, and that gives us more padding outside of those two limits. If you wanted to put more padding on the top than on the bottom, then you need to use a special helper function that goes with expand, which is called expansion. Here you can give it values for what to add or what to multiply. I'm going to go ahead and do multiply. So I'll do mult equals, and then I'll give it two values. I'll say zero, so I don't want any padding below my minimum value on the y axis. And then on the top, maybe I'll add 10% padding. And then I need an extra parenthesis here and we should be good. Here you can see that we have no padding below the 50 and more padding above the 100. The other argument that you could use instead of mult would be add, and that's going to add a fixed amount to the bottom or to the top. I'm not so concerned about having different padding on the bottom and top, so I'm going to change my expansion to 3% on the top and bottom. So we see that we get tighter padding on our y axis. It's not quite as big as that 5%. I don't know if you can see that with your eye, but anyway, I'm cool with the way this looks. 
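Here is a small sketch of the asymmetric padding case, no padding below 50 and 10% above 100, using the expansion helper. The data frame is hypothetical, and the last lines pull the drawn y range out of the built plot as one way to check the padding numerically.

```r
library(ggplot2)

# Hypothetical data, just enough to draw one line inside the limits.
toy <- data.frame(month = c("august", "october"), percent = c(80, 72))

p <- ggplot(toy, aes(x = month, y = percent, group = 1)) +
  geom_line() +
  scale_y_continuous(
    limits = c(50, 100),
    # mult is c(lower, upper): nothing below 50, 10% of the range above 100.
    expand = expansion(mult = c(0, 0.1))
  )

# The range that actually gets drawn, after expansion is applied.
# The limits span 50 units, so 10% on top adds 5.
y_range <- ggplot_build(p)$layout$panel_params[[1]]$y.range
y_range
# 50 105
```

Swapping the expand line for expansion(mult = 0.03) would give the symmetric 3% padding settled on above, and expansion(add = ...) pads by a fixed number of y axis units instead of a fraction of the range.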
I think that looks pretty respectable. Now let's think about the x axis, where we have all this real estate to the left and right of our two months. How would we tighten that in? Well, the default expansion for scale_x_discrete is 0.6. So I'll add expand, and let's start with zeros. And so now we see we have no expansion outside of our two months. That's probably not desirable. So let's go back and instead of zero and zero, let's do 0.3 and 0.3. That gives a little less padding on the left and right. I might go ahead and make it even tighter. Let's do 0.1 and 0.1. So I like having this reduced expansion to the left of August and the right of October. I feel it gives more real estate to the lines showing that change in people's attitudes towards receiving the COVID-19 vaccine. And I think that our axes now look pretty good. Yeah, the rest of the figure looks pretty hideous, right? 15 colors, 15 lines, 15 labels, that's just way too much. In the next episode, I will come back to this and we will get rid of these 15 different colors to find another approach. And again, the hint is that the branch is called highlight, where we can highlight an individual country to tell a story about that country. I think ultimately that is what's going to really make this figure shine and perhaps be the figure that we would want to use, say, in a publication or in a tweet or, you know, whatever. Anyway, be sure that you're subscribed so you can see that episode when it's released, and we'll talk to you next time for another episode of Code Club.
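For reference, the axis settings arrived at by the end of the episode could be assembled roughly as follows. This is a sketch under assumptions: the data frame and its column names are invented for illustration, and the labs and theme calls from the real project are left out.

```r
library(ggplot2)

# Hypothetical stand-in for the survey data.
toy <- data.frame(
  country = rep(c("A", "B"), each = 2),
  month   = rep(c("august", "october"), times = 2),
  percent = c(80, 72, 65, 58)
)

p <- ggplot(toy, aes(x = month, y = percent, group = country, color = country)) +
  geom_line() +
  scale_x_discrete(
    breaks = c("august", "october"),
    labels = c("August '20", "October '20"),
    # A two-value expand is c(multiplicative, additive), applied to both sides.
    expand = c(0.1, 0.1)
  ) +
  scale_y_continuous(
    limits = c(50, 100),
    breaks = seq(50, 100, by = 10),
    minor_breaks = NULL,
    expand = expansion(mult = 0.03)  # 3% padding on the bottom and top
  )

# print(p) draws the figure
```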