Hey folks, we're in the midst of stepping through a variety of approaches to looking at temperature variation over the past 140 years or so. In the last episode, we generated what's called the global temperature index plot. Basically, it shows you the deviation in average temperature relative to the average temperature between 1951 and 1980. We drew that as a line plot and we fit a line through that curve, trying to replicate something that was shown on NASA's website. Well, in today's episode, I want to replicate another plot that's actually very similar to what we did in the last episode. In this episode, we're going to generate this plot, which is a bar plot showing, again, the same information that was in that line plot, but showing it as bars. And each bar then gets a different fill color corresponding to that deviation from the average temperature between 1951 and 1980. What we're looking at here comes from a website called showyourstripes.info. I think they actually use a different standardization approach than what we're getting from the NASA GISS data set. We're going to roll with what we've got, but I'll try to make the styling look like theirs. Again, on this website, they've got the warming stripes, which we'll get to maybe in the next episode, but they've also got the bars, and bars with scale. Yeah, sure enough, they use 1971 to 2000. But I want to look at the bars without the scale, because I like this minimalist look. And there's the question of, well, how do we get rid of the axes? And how do we put the dates at either end of that x axis? And how do we line them up at zero? So we'll figure that out in today's episode as we head over to RStudio. I have a new R script created called temperature_bar_plot.R. As always, we've seeded it with library(tidyverse). If you want to get the code and the data that we're working with, down below in the description there's a link to a blog post that can help you get set up, and away we go.
Alright, so we'll start with read_csv, reading in from the data directory; we want that glb.tsblabla.csv file. As we saw in the last episode, it has an extra line above the column names. So we need to do skip = 1. And it's got a special NA character, which isn't an NA or a blank, but three stars. So we'll go ahead and load that. And I actually forgot to load tidyverse. So we'll go ahead and do that. And so we have a table of 143 years' worth of data going back to 1880, all the way to current. We have 12 months' worth of data shown across the screen here. But there's also J-D, so January through December; D-N, December through November; DJF, December, January, February; and so forth, right. And so what I want to look at is this J-D column, getting us the calendar year for those 143 years. So we'll go ahead and do a select on Year. And I also want that J-D. I don't want these names, though, because they're capitalized and they're formatted funky. So I'll do year = Year. And then t_diff equals that J-D column, in backticks. And so this gets us our two-column data frame, which we are now ready to plot. So let's go ahead and feed this into ggplot. In aes, our x will be the year, the y will be t_diff. And we'll of course do a geom_col. So geom_bar generates a bar plot, which has a summary function built into it. We don't want that; we want geom_col, which takes the values as they are and plots them on the y axis. And so we get something that hopefully looks a little bit like what we have over on the website. We've got a long ways to go, but we'll get there. So something I want to take care of that I forgot to do in the last episode is this warning message about removing one row containing missing values. That's not a big deal. We can come in here after the select, we can do drop_na, and that will remove any rows that have missing data. Run that, and now we see that the warning message goes away. Excellent, we're in good shape. I now want to use a new theme, which is theme_void.
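Here is a minimal sketch of the read, select, drop_na, and geom_col steps just described. Since the real file name is abbreviated above, this uses a tiny inline CSV as a stand-in; the numbers in it are made up for illustration and are not real measurements. The column names Year and J-D follow what is described in the talk, and passing literal text with I() assumes a recent version of readr.

```r
library(tidyverse)

# Toy stand-in for the NASA GISS CSV: one extra line, then the column names.
# These values are invented for illustration only.
toy_csv <- I("Toy title line that read_csv should skip
Year,Jan,Feb,J-D
1880,-0.18,-0.24,-0.16
1881,-0.19,-0.14,-0.08
1882,0.16,0.14,-0.11
1883,-0.29,-0.37,***
")

t_data <- read_csv(toy_csv, skip = 1, na = "***", show_col_types = FALSE) %>%
  select(year = Year, t_diff = `J-D`) %>%  # rename while selecting
  drop_na()                                # drops the row whose J-D was ***

# geom_col plots the values as they are; geom_bar would count rows instead
p <- ggplot(t_data, aes(x = year, y = t_diff)) +
  geom_col()

# print(p) draws the plot
```

With the real file, the only change should be swapping toy_csv for the path to the CSV in the data directory.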
And so theme_void, as we see, strips away all of the theming and gives us just the bars. And so I think that looks pretty slick. The next thing I want to do is think about how we can add the 1880 and the 2021 onto the plot. To do that, I'm going to modify my code a bit. And so up here I'll put t_data as basically the data that we generated from reading it in, selecting, and removing the NA values. I can then take t_data and pipe that in to build the plot that we already have. But I want to do this because I want to algorithmically, or programmatically, get those first and last dates. And so to do that, we could take t_data and we could do slice. So slice gets us rows, and I can give it two numbers. So I can say 1 and n(), the n function. And what this will give me is the first and last row of the data frame. Of course, this assumes that the data are in the right order. I guess if I was a little bit worried, I could always do an arrange on year, and then pipe that into the slice, and we get the same result, right? Okay, so now we have our years. So I'm going to modify that t_diff column by using mutate. And I'll do t_diff, and I'm going to set it equal to zero. So the values for these two dates will be zero, because I don't want the year positioned at the t_diff value. I want it right at that, you know, y equals zero point. Okay. And so here I will call this data frame annotation. And again, if we look at annotation, we should see that it's got the two years and the two differences, so those positions, right? And so in here I can then do geom_text. And I'll say data = annotation. And then in aes, I will say label = year. I have to add a plus. And so now what we see is we have the years on either end, right at the zero point on the y axis. I now want to bump those off to the left and right a bit.
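The annotation data frame described above can be sketched like this. The t_data values here are a toy stand-in, not real measurements; with the real data the same pipeline applies.

```r
library(tidyverse)

# Toy stand-in for t_data (invented values)
t_data <- tibble(year = 1880:1885,
                 t_diff = c(-0.16, -0.08, -0.11, -0.17, -0.28, -0.33))

annotation <- t_data %>%
  arrange(year) %>%       # guard against rows being out of order
  slice(1, n()) %>%       # first and last rows
  mutate(t_diff = 0)      # place the labels at y = 0, not at the bar height

p <- t_data %>%
  ggplot(aes(x = year, y = t_diff)) +
  geom_col() +
  geom_text(data = annotation, aes(label = year)) +  # x and y are inherited
  theme_void()
```

Because geom_text inherits x = year and y = t_diff from the ggplot call, the only new aesthetic it needs is the label.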
But before I do that, I'm going to go ahead and output this to the file format and size that I want, because as I start moving things around, things are going to get messy. Okay, so we'll go ahead and say ggsave with temperature_bar_plot.png. And I want this to go into my figures directory. And I'll do a width of seven and a height of four. Very good. So I see now that instead of a white background, it appears to have a gray background. I think it's actually transparent. So we'll want to change that background to be black to match the color that we have in the web version. And like I said, we want to bump those dates off to the left and right a little bit. So I'm going to come back up to my annotation. And I'm going to make a new column that I'll call x. And so that will be year, and I'm going to add a vector to the year column. I can give it two values, say minus five and plus five. And that should then bump things off left and right a smidge. And then down in here, in the geom_text aes, I'm also going to say x = x. And I now have those dates on either side of the plot. And I think that looks pretty good. I'm now going to flip the colors. I'm going to make the background black, and I'm going to make that text white. So while I'm here in geom_text, I'll go ahead and do color = "white". And then I'm going to add a theme function. And I will then do plot.background, and I'll do element_rect, and I'll do fill = "black". And so now I've got my nice black background. I've got my two years at either end of the plot. And it looks good. Now what we need to do is modify those bars to be colored by the temperature difference. And what we could do is, again, change our aes to insert fill = t_diff. And so this gives us a gradient going from dark blue to a lighter blue. I don't want that, because I want my scale to show clearly what's right around zero, right?
And so I would like zero, or the things close to zero, to be white, the things that are below that to be dark blue, and the things that are above that to be dark red. To achieve that, we are going to add a scale. So we'll do scale_fill_gradient. And we can then give it a low value, and I will go ahead and put in here dark blue. And then my mid will be white, and my high will be dark red. And we'll then also go ahead and do midpoint = 0. So I want the midpoint to be at zero. And I gave it the wrong function with scale_fill_gradient. What I actually want is scale_fill_gradient2. So scale_fill_gradient is a single gradient, right? Going from one color to another. scale_fill_gradient2 allows you to have basically two gradients, right? So we're going to go from a low value to zero, and from zero to a higher value. So we do want scale_fill_gradient2. And sure enough, we see the blues below the axis and the darker reds above the axis. I'm noticing that the blue isn't as dark as I would have expected it to be. I think that's because this end is negative a half and this end is one and a half. And so there's basically three times as much range on the red side as on the blue side. So one thing I want to try is adding limits in here. And we'll do c(-0.5, 1.5). And that doesn't really get us anywhere. If you're curious what's going on in our legend, what we could do in the theme would be to say legend.text = element_text(color = "white"). We can't see it because the default color is black, and so it's fading into the background. And so sure enough, we now see that, yep, our zero does cross at the white. If I hadn't included that midpoint, then I think, you know, we'd get something wildly different, which we don't want. Ultimately, we're going to remove this legend because the figure we're copying doesn't have a legend. All right, we're going to have to use another function called scale_fill_gradientn. So I'm going to go ahead and comment this out for now.
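To make the "blue isn't as dark" observation concrete, here is a small sketch of the arithmetic. As far as I understand it, scale_fill_gradient2 by default rescales values with scales::rescale_mid, which stretches the scale symmetrically around the midpoint. The -0.5 and 1.5 are the approximate extremes mentioned above, not exact values from the data.

```r
library(scales)

# Approximate low, midpoint, and high values from the talk
t_range <- c(-0.5, 0, 1.5)

# Position of each value on the 0-to-1 color ramp, where
# 0 = dark blue, 0.5 = white, 1 = dark red
positions <- rescale_mid(t_range, mid = 0)
positions

# Passing the same range explicitly as limits changes nothing
same_positions <- rescale_mid(t_range, from = c(-0.5, 1.5), mid = 0)
```

The lowest value lands at about 0.33 rather than 0, so it only gets a third of the way from white toward dark blue, while the highest value reaches full dark red. That is consistent with limits of c(-0.5, 1.5) not helping: the scale stays symmetric around the midpoint.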
That way we have something to look back to, to compare how these functions relate to each other. So we'll do scale_fill_gradientn. And then we'll give it colors, and that's going to be a vector of colors. And so I'm going to give it, again, dark blue, white, dark red. And then we're going to give it the values. So we're going to rescale the values. Think of it as going from negative one to zero to one, but of course it then has to stretch that back out to negative a half to one and a half. So we get negative a half being the same level of blueness that one and a half is of redness. Hopefully this makes sense. If not, we'll see it here soon enough. And so rescale is the function we'll use. We're going to give it a vector of values. And so I'll do min on t_data$t_diff, and then comma zero, comma max on t_data$t_diff. All right, then I need to give it limits so that it knows what values go on each end. And so we'll then do c() to make a vector. And I'm going to grab the contents of this vector up here that had the min and max t_diff value, but I don't want the zero in the middle. And then we can add this into the rest of our ggplot pipeline. And so that looks pretty good, right? We see that we've got the darker blues at the bottom, and that as we, you know, become more and more positive, we get to the darker reds. One thing I noticed looking back at their figure is that I don't get the sense that it's really a continuous, gradual change in color. I think they've actually got bins of colors. And so there's a way to do that with another scale_fill function. And so that's going to go in place of scale_fill_gradientn, which I'll copy here, again, so we have it. And so the alternative to scale_fill_gradientn is scale_fill_stepsn. And so what you can see is that this basically defines steps, right? It looks at the extreme values.
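To make the rescaling step above concrete, here is a sketch with a toy t_data (invented values whose extremes match the rough -0.5 and 1.5 mentioned earlier). The idea is that rescale() turns the min, zero, and max into positions between 0 and 1, and those positions tell scale_fill_gradientn where each of the three colors sits.

```r
library(tidyverse)
library(scales)

# Toy stand-in for t_data (invented values)
t_data <- tibble(year = 1880:1884,
                 t_diff = c(-0.5, -0.2, 0, 0.7, 1.5))

# Where dark blue, white, and dark red sit along the 0-to-1 color ramp
color_positions <- rescale(c(min(t_data$t_diff), 0, max(t_data$t_diff)))
color_positions

p <- ggplot(t_data, aes(x = year, y = t_diff, fill = t_diff)) +
  geom_col() +
  scale_fill_gradientn(
    colors = c("darkblue", "white", "darkred"),
    values = color_positions,
    limits = c(min(t_data$t_diff), max(t_data$t_diff))
  )
```

With these toy extremes, white sits a quarter of the way along the ramp instead of in the middle, which lets the most negative value reach full dark blue and the most positive value reach full dark red.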
And it makes bins that fit into nice compartments, right? And so by default here, it's putting things into four different bins. I can define more breaks in the data, kind of like you would if you were building a histogram, right? And so I'll add that argument, n.breaks. And let's try something like seven. And so here, again, you can see the scale. I'll remove this eventually, but you can kind of see, you know, the scale being broken down into these different groups. Let's see what happens if we go up to something like nine. And that gets us a little bit more definition on those negative values. Again, if you go back to this version of the plot, my eye is telling me that that's what they're doing, that they're basically stepping through the color gradient rather than having a true continuous gradient. Again, I'm going to leave the code in here. If you want to come back and check it out later, you're more than welcome to do that, of course. And I'm just kind of cleaning up the code here, because that's what I do. All right, the final thing that I want to do is that this version of the figure has a title, right? Global temperatures have increased by over 1.2 degrees Celsius. So we need to put in our own title, right? To add that title, I'm going to add another geom_text. And I will say x = 1880, because that's on the left side of the time series. I'll do y = 1. And I'll then do label equals, let me grab that text again, which was "Global temperatures have increased by over 1.2 C since 1880". And let's add that. Also, I know we're going to want color = "white", because otherwise it's going to fade into the background. Unfortunately, it is centered at that position, right? So we want to justify it to the left. So how do we do that? Well, let's clean up our code a little bit here. We can do hjust = 0. And so that's a horizontal justification. So that'll be left justified at that position, 1880 and one. So let's go ahead and insert the degree sign.
To insert the degree sign, we can use Unicode. If I Google for "Unicode degree symbol R", I can kind of quickly get to a result. And it tells me that down here, I can try "temperature" and then it's \u00B0. So that is the Unicode part, right? And so you can Google for any type of Unicode symbol that you might want. Google is your friend, because these are things that I just have no business memorizing, right? And so we can plug that Unicode symbol in there. And sure enough, we now have our degree symbol. Looks pretty nice, right? One final thing that just bugs me a little bit is the numbers, 1.2 and 1880. Those are the numbers they used for their plot with their data, and I think they actually used 1850. I don't know that this is actually true with my data, right? So I'm going to use the glue package to, again, programmatically insert the actual values from my data. So again, we'll come back up here. I need to get the maximum increase in temperature as well as the first year, and I'm going to use glue to insert all that, right? So I'll go ahead in here and do library(glue). And then down here, I can use the glue function wrapped around my current title, right? So I'll do glue on that and put a closing parenthesis there. And then for my year, what I can use is the min. So in curly braces, I'll do min on t_data$year, close parenthesis, and close curly brace. We'll also then change the temperature to be the max. So we'll do max on t_data$t_diff, close parenthesis, close curly brace. And sure enough, we now have our temperature inserted in there, right? And so we've got 1.02. Maybe we only want it to be 1.0. So let's quickly think about how we can do that here. I don't want to do all that inside the glue call. And so what I'm going to do is go ahead and assign this as max_t_diff. And so now I need to make that variable, right? And so I'll create that as max_t_diff, right?
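As a small sketch of the glue step with the Unicode degree sign: the data frame below is a toy stand-in that only borrows the 1880 start year and the 1.02 maximum mentioned above, and the other values are invented.

```r
library(glue)

# Toy stand-in for t_data; only the 1880 and 1.02 come from the talk
t_data <- data.frame(year = c(1880, 1950, 2021),
                     t_diff = c(-0.16, -0.17, 1.02))

# Expressions inside curly braces are evaluated and pasted into the string;
# \u00B0 is the Unicode escape for the degree sign
title_text <- glue(
  "Global temperatures have increased by over {max(t_data$t_diff)}\u00B0C since {min(t_data$year)}"
)

title_text
```

This prints the title with 1.02 in it, which is why the next step is to round and format that number before it goes into the string.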
So I've assigned the variable here, and now I'm going to insert it down here. But I want to round it to one digit to the right of the decimal point, right? So I'll do round on that, to one, right? And so if I look at this, I get back the value one. And so one is what's going to get inserted. Because it was actually 1.0, it drops the trailing zero. To get that zero back, I can wrap all of this in the format function. And to that, I can then say nsmall = 1. And now if I look at this, I get that trailing zero. And sure enough, we now see "Global temperatures have increased by over 1.0 degrees since 1880". And that all looks pretty good. One final thing is that we still have this legend that doesn't belong. So let's go ahead and get rid of that legend, of course. And so in geom_col we can do show.legend = FALSE. So that legend of course goes away. I think we've done a really nice job of mimicking what was on the web version. I'm pretty happy with the way this looks. It is a little bit removed, I feel like, from the actual data in some ways. If I had to give a critique, I think it tells the story. So clearly after some point, temperatures have been increasing every year. Year on year, they seem to be going up. What's challenging about this visual, though, is, well, I don't know what year that is, because I don't have a grid line linking to a date on the x axis. I also don't really know how much temperatures have gone up. I guess I have that in the title here, right? So like, these bars that are at about one degree Celsius, I guess you could say, well, maybe this should be the one that's at the temperature we're reporting there, right? But again, it just gets a bit confusing. So there's the lack of context, again, for the year, or for the amount of change. And again, I guess we kind of know what things are down here, like they look like they're about, I don't know, half of what they are up here. It's hard to say, right?
And so again, I think that's where that other version that we found on the website, the one that has the scales, might be a bit more friendly for our audience. But, you know, I think this is compelling. It does tell a story, albeit without that information that might make it just a little bit more helpful. Anyway, I encourage you to play around with this. I don't typically use bar plots when depicting time series data like this. And so that's a little bit different from what we saw before. Again, it's the same data presented in a slightly different way. Let me know what you think down below in the comments between these two versions. Do you like this version or the line version better? In the next episode, we're going to take on another data visualization. So that you don't miss that episode, please make sure you're subscribed, you've clicked the bell icon, and you've given me a thumbs up. As always, please tell your friends about what we're doing here, and run through this on your own. And we'll see you next time for another episode of Code Club.
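For anyone running through this on their own, here is one way the pieces from this episode might fit together. It is a sketch, not the exact script from the video: t_data is a toy straight-line ramp standing in for the real annual values, so the colors and title number are only illustrative.

```r
library(tidyverse)
library(glue)
library(scales)

# Toy stand-in for the real annual data: a straight ramp, not measurements
t_data <- tibble(year = 1880:2021,
                 t_diff = seq(-0.5, 1.02, length.out = length(year)))

# First and last years, placed at y = 0 and nudged outward by 5 years
annotation <- t_data %>%
  arrange(year) %>%
  slice(1, n()) %>%
  mutate(t_diff = 0,
         x = year + c(-5, 5))

# Round to one decimal place and keep the trailing zero
max_t_diff <- format(round(max(t_data$t_diff), 1), nsmall = 1)

p <- t_data %>%
  ggplot(aes(x = year, y = t_diff, fill = t_diff)) +
  geom_col(show.legend = FALSE) +
  geom_text(data = annotation, aes(x = x, label = year), color = "white") +
  # Title drawn inside the panel, left justified at (1880, 1)
  geom_text(x = 1880, y = 1, hjust = 0, color = "white",
            label = glue("Global temperatures have increased by over {max_t_diff}\u00B0C since {min(t_data$year)}")) +
  # Binned version of the blue-white-red scale, with white pinned at zero
  scale_fill_stepsn(colors = c("darkblue", "white", "darkred"),
                    values = rescale(c(min(t_data$t_diff), 0, max(t_data$t_diff))),
                    limits = c(min(t_data$t_diff), max(t_data$t_diff)),
                    n.breaks = 9) +
  theme_void() +
  theme(plot.background = element_rect(fill = "black"))

# Uncomment once a figures/ directory exists:
# ggsave("figures/temperature_bar_plot.png", plot = p, width = 7, height = 4)
```

Swapping scale_fill_stepsn for scale_fill_gradientn (and dropping n.breaks) gives the continuous version, which makes it easy to compare the two looks side by side.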