Hey folks, in this series of episodes I'm looking at a variety of approaches to visualizing global climate change. Recently I found this figure over on Wikipedia created by Ed Hawkins. If you've been watching my recent videos, you know I seem to have a love for Ed Hawkins' data visualizations. I think they're really cool, and they're not made in R, so that gives us some extra challenges in figuring out how to implement these types of visualizations in R. Even if you don't care about climate change (and who are you?), I think it's really valuable to try to imitate the work of others using the tooling that you have, because it helps us learn that tool all the better. As simple or complicated as this figure might look, there's a lot in here that you might look at and say, hmm, how would I do that in R? Well, I'm going to show you how to do this in R today. So let's head over to RStudio and get going.

As always, I have an R script started here with library(tidyverse) to get us going. If you want to get the code and the project as it currently stands, with the data and everything, head down below to the description. There's a link to a blog post to get you all set up and ready to roll. I'm going to fire off library(tidyverse) and then we'll do read_csv(). My data is in the data directory, and it's this horribly named file, GLBTS plus DSS, whatever. Who knows? The first line of the file is a title line; it doesn't have the column names. Those are in the second line, so we'll do skip = 1. And then my NA values, as we've seen in previous episodes, are three stars, so na = "***". And so now we can see that, sure enough, we get the year and the different months. We've also got the January through December average, the December through November average that we've been using, and then the four different seasons.
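Here is a minimal sketch of that import step. Since the real file isn't reproduced here, the inline text below is a made-up stand-in with the same shape (a title line, then the header, then data with three stars for missing values); the values and the title line are assumptions for illustration only.

```r
library(readr)

# Made-up stand-in for the real file: a title line, then the header, then data.
csv_text <- "Some title line that is not a header
Year,Jan,Feb,Mar
2021,0.81,0.64,0.89
2022,0.91,0.89,***"

# skip = 1 drops the title line; na = "***" turns the three stars into NA.
temps <- read_csv(I(csv_text), skip = 1, na = "***", show_col_types = FALSE)
temps
```

With the real data you would pass the path to the file in the data directory instead of I(csv_text).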
What I'm interested in for this episode are the different months, because if you recall that plot I showed you, the different months are on the x axis and each line represents a different year. The y axis, then, is the change in temperature. The data that I have is normalized to the years 1951 to 1980. Ed Hawkins' baseline is a couple of decades later, but we'll get the same effect. It's the same globe, right?

So what I want to do is extract the 12 months. Something that's kind of cool about R is that it comes preloaded with certain vectors. If I do month.abb, I get the 12 abbreviated month names. And fortunately for me, these are formatted the same way as the month names in our column headings. So I can do select() with month.abb, which gives me the 12 month columns. But of course I also want the year, so for that first column I can do year = Year, with a capital Y on the right. And now I've got my 12 months as well as my year. Wonderful.

We now need to tidy this so that we have all the months in one column, all the years in one column, and all the temperatures in one column. To do that, we'll use our old friend pivot_longer(). We'll pivot everything but year, so -year. The names will go to month, and the values will go to t_diff. And sure enough, now we have that three-column data frame. I'll go ahead and call this data frame t_diff so that we can save it and work with it in ggplot.

So now we can make that initial version of the plot by taking t_diff and piping it to ggplot(). In our aesthetics, I want to put month on the x axis and t_diff on the y axis. Then I'm going to group by year, so group = year, and I'll also do color = year. Then we add geom_line(). And there we go. We have the initial version of our plot. Again, each line represents a different year.
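The select, pivot, and first-plot steps can be sketched as follows. The wide table here is random made-up data standing in for the real temperature file, and all_of(month.abb) is used in place of the bare month.abb from the narration because newer versions of tidyselect prefer that form for external character vectors.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for the imported table: one row per year, one column per month.
set.seed(1)
month_values <- matrix(round(rnorm(36, sd = 0.3), 2), nrow = 3,
                       dimnames = list(NULL, month.abb))
wide <- data.frame(Year = 2019:2021, month_values)

# Keep the year plus the 12 month columns, then go from wide to long.
t_diff <- wide %>%
  select(year = Year, all_of(month.abb)) %>%
  pivot_longer(-year, names_to = "month", values_to = "t_diff")

# One line per year; months are still plain character strings at this point.
p <- ggplot(t_diff, aes(x = month, y = t_diff, group = year, color = year)) +
  geom_line()
```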
And we have our months across the x axis. Clearly, we have a lot of work to do. The first thing that stands out to me is that I'm getting a warning message about removing seven rows containing missing data from geom_path. It's only a warning, so it doesn't really matter. But I think if I take t_diff and look at the end of it, I'm going to see a bunch of NA values. Yep, there I have six NA values. These are months that don't exist yet, right? I'm recording this towards the end of May, so I'm pretty sure that May and June of 2022 are missing. I can double check that by doing tail(n = 10). And sure enough, April, May, and so forth are all missing. Those are the values that geom_line() is complaining about not being able to plot. So what I can do is come in here and add drop_na(). My warning message goes away and we're in good shape.

The next thing I'm worried about are my months. They're in alphabetical order rather than calendar order. So in this pipeline I can pipe into a mutate() on month and define month to be a factor: factor(month, levels = month.abb), because month.abb already has the months in the correct order. Now if I plot that, I see that I've got the months in order from January all the way across to December, and we're in good shape.

One thing that sticks out to me about the original version, though, is that you can see the neighboring years. December of the preceding year is represented at the left side of the plot, and January of the next year is represented at the right side of the plot. So the lines extend back before January and forward past December. How would I go about engineering that? The way I'm going to do it is to create three separate data frames.
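The two clean-up steps described above, dropping the months that don't exist yet and putting the months in calendar order, can be sketched like this. The temperature values are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Made-up partial year: four observed months, eight that have not happened yet.
t_diff <- tibble(
  year = 2022,
  month = month.abb,
  t_diff = c(0.91, 0.89, 1.05, 0.83, rep(NA, 8))
) %>%
  drop_na() %>%                                       # remove the missing months
  mutate(month = factor(month, levels = month.abb))   # calendar order, not alphabetical

levels(t_diff$month)
```

Because the levels come from month.abb, sorting or plotting by month now follows January through December rather than starting with April.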
I'll create a data frame for the last December and one for the next January, alongside the t_diff data frame that I already have. So I can take t_diff and do a filter() for month == "Dec". This gives me all of the December data. Then I can do a mutate() on year, which is going to be year - 1. And month I'm going to call "last_Dec". So now you can see I've created a new data frame for the last December. I can do the same type of thing for January: take t_diff, pipe that to filter(month == "Jan"), and pipe that to mutate() with year = year + 1, because it's the next January, and month = "next_Jan". And so now I've got the next year's January. Very good. I'll call the first one last_dec and the second one next_jan.

I'm also going to borrow the line where I mutated month to make it a factor, because I'm going to do that once I've concatenated all three of these data frames together. I'll store that down here. And I'll make sure I've got these three data frames loaded. Now I can do bind_rows() with last_dec, t_diff, and next_jan. This gives me the combined data frame. I can double check that things look right by doing a count() on month. And I see I've got last_Dec and next_Jan in here. The numbers vary a little bit, again, because of those months that were missing from 2022. That's great.

Now I'm going to bring back my mutate() to make the factor. But now I also have the last December and the next January, so the levels need to be "last_Dec" (I think that's what I called it), then month.abb, then "next_Jan", all combined with c(). I'll add the closing parenthesis here and feed this into my ggplot().
We now have last_Dec in the first spot on the x axis and next_Jan in the last spot. Looking back at the original version of the figure, they basically drew the x axis so that it cut off the last December and the next January, but we could still see the lines going to those points. So what I want to do is change the scale on my x axis. I can think of it as going from, say, 1 to 14, since I basically added two months. Then I would want to set the limits on my x axis to go from about 1.5 to 13.5, to lop off that first and last month.

To do that, I'm going to create another variable here that I'll call month_number, and that's going to be as.numeric() on month. A factor is stored as integer codes that point into its ordered set of levels, and those codes are what you get back when you give a factor to as.numeric(). If I look at these first couple of lines, I see that I've got a month number of 1 corresponding to last_Dec. I don't want to have to think about what I'm indexing on, though. I would rather have January be 1, December be 12, last December be 0, and next January be 13. To do that, I can subtract one from these values. And so now I see last December is 0. Good.

I can now feed this into ggplot(), and on the x axis I'm going to put month_number. I can see my month number starting at 0, which again is last December, going up to 13. So let's throw a scale_x_continuous() on this. For my breaks I'll do 1:12, the 12 months, and for my labels I'll do month.abb, that vector of month names. I will also add a coord_cartesian() with xlim = c(1, 12). And there we go. We now have January over to December, and we've got that effect of the lines running off either side. I think we basically have an extra half a month on either end.
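As a rough sketch of the month_number idea, the block below builds a single made-up year that already includes the two extra months, converts the factor to numbers, and then labels and crops the x axis. The data values are assumptions for illustration.

```r
library(dplyr)
library(ggplot2)

month_levels <- c("last_Dec", month.abb, "next_Jan")

# Made-up single year that already carries the two padding months.
t_data <- tibble(
  year = 2021,
  month = month_levels,
  t_diff = seq(0.6, 0.9, length.out = 14)
) %>%
  mutate(
    month = factor(month, levels = month_levels),
    month_number = as.numeric(month) - 1   # last_Dec = 0, Jan = 1, ..., next_Jan = 13
  )

p <- ggplot(t_data, aes(x = month_number, y = t_diff, group = year, color = year)) +
  geom_line() +
  scale_x_continuous(breaks = 1:12, labels = month.abb) +
  coord_cartesian(xlim = c(1, 12))   # zooms without dropping the points at 0 and 13
```

coord_cartesian() is used rather than setting limits on the scale because it only zooms the view, so the line segments out to the padding months are still drawn up to the edge of the panel.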
And I think that looks pretty good. I'm happy with how things are coming along. The next thing I want to take on is the coloring. The coloring I think they're using is from the viridis package. They may have made this in Python, and the viridis palette is fairly common. It's a palette that works well for people with red-green color vision deficiency, and also as a gradient. R has a special gradient function for it, so we can do scale_color_viridis_c(). I'm using the _c version because that provides the continuous scale. Throw that in there, and now we can see that we've got that dark purple down by 1880 and the yellow for more recent years. So, yeah, it looks like they did use viridis. If you didn't want to use scale_color_viridis_c(), you could of course use something like scale_color_gradient() to get the same kind of effect and pick the colors for the lower and upper ends of the year range.

Good. Now let's start changing the theming to get the background to be a dark gray and the plot window itself to be black. I'll add a theme(), and we'll do plot.background = element_rect(fill = "black"). Then we'll do panel.background = element_rect() with the fill being a hexadecimal color. Let's do "#CCCCCC". So I flipped the backgrounds. The black should be the panel and the gray should be the plot. Let me fix that real quick. I always get those confused. So, plot and panel. Also, this "#CCCCCC" is a lighter gray, so let's use a set of threes, "#333333". Now the panel itself has the black background, which is good. Let's see if we can lighten up the plot background a bit. I'll go up to fours, "#444444". That looks pretty decent. The next thing I want to do is get rid of those white grid lines. To do that, we'll do panel.grid = element_blank().
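Putting the color scale and the background theming together, a minimal sketch might look like this. The two flat lines are made-up data, just enough to show the two ends of the gradient.

```r
library(ggplot2)

# Made-up data: one early year and one recent year.
demo <- data.frame(
  year = rep(c(1880, 2021), each = 12),
  month_number = rep(1:12, times = 2),
  t_diff = rep(c(-0.2, 0.9), each = 12)
)

p <- ggplot(demo, aes(x = month_number, y = t_diff, group = year, color = year)) +
  geom_line() +
  scale_color_viridis_c() +                              # continuous viridis gradient
  theme(
    plot.background = element_rect(fill = "#444444"),    # area around the panel
    panel.background = element_rect(fill = "black"),     # the plotting window itself
    panel.grid = element_blank()                         # no grid lines
  )
```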
Setting panel.grid to element_blank() gets rid of the grid lines, of course. The next thing I want to do is add my horizontal line. I'll come back up here, and before running geom_line() I'll add geom_hline(). The argument I want to use is yintercept = 0, so it'll cross at the zero line, and I'll set color = "white". Very good. We can see that white line behind everything now.

I'm noticing that I'm losing my axis labels against the dark background. So let's go back into theme() and make those white with axis.text = element_text(color = "white"). Now that text is much clearer. I'm also noticing that on the y axis I have values every half a degree, whereas the original had them every 0.2 degrees. So I'll come back up to my scales and add scale_y_continuous() with breaks = seq(-2, 2, 0.2). It won't use all of those, of course, but that gives us a pretty good range of values. So now we have a value every two tenths of a degree.

The tick marks, of course, are black, so they're hard to see against that dark gray background. We can fix that with axis.ticks = element_line(color = "white"). Now those ticks are white, but they're pointing outwards rather than inwards. I can change that with axis.ticks.length = unit(-5, "pt"). The minus five is going to go in the opposite direction, so the ticks should reach about five points into the plot. And now we see our tick marks pointing in. The original also had a white border all the way around the plot, so let's add that. I could change the axis using axis.line.
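Here is a small sketch of the zero line, the finer y axis breaks, and the inward-pointing white ticks. The single line of data is made up, and the line color is fixed to yellow only so the example stays short.

```r
library(ggplot2)

# Made-up single year of data.
demo <- data.frame(
  year = 2021,
  month_number = 1:12,
  t_diff = seq(0.6, 0.9, length.out = 12)
)

p <- ggplot(demo, aes(x = month_number, y = t_diff, group = year)) +
  geom_hline(yintercept = 0, color = "white") +   # added first so it sits behind the data
  geom_line(color = "yellow") +
  scale_y_continuous(breaks = seq(-2, 2, 0.2)) +  # breaks outside the data range are ignored
  theme(
    plot.background = element_rect(fill = "#444444"),
    panel.background = element_rect(fill = "black"),
    axis.text = element_text(color = "white"),
    axis.ticks = element_line(color = "white"),
    axis.ticks.length = unit(-5, "pt")            # negative length points the ticks inward
  )
```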
Rather than using axis.line, though, I think what I'll do is come back up to my panel.background, where I have the fill being black, and add color = "white" to get a white border. Now I see I've got a thin line all the way around. I can make it a little bit thicker by doing size = 1. That gives me a bolder white border around my plotting window, which is good.

One other thing I'm noticing is that the original had tick marks on all four sides. That's something we don't often see with ggplot-generated figures; we generally only see tick marks on the bottom x axis and the left y axis. Doing it is actually relatively straightforward. Up here in scale_x_continuous(), we can set sec.axis to dup_axis(), called as a function. We can add the same thing to our scale_y_continuous(). This gives us the same y axis on both sides and the same x axis on both sides. I can turn off the extra labels by passing name = NULL and labels = NULL as arguments to dup_axis() for both of these. Now I no longer have those extra labels, and we've got our inward-pointing tick marks all the way around.

Let's fix up our axis titles. On the original, the x axis didn't have a title, since the months are pretty obvious, and the y axis is "Temperature change since pre-industrial times" with degrees Celsius in square brackets. To do that, we can come up here and do labs() with x = NULL and y = "Temperature change since pre-industrial times", then a square bracket, a degree sign, and a C. The degree sign is \u00B0, which is the Unicode escape for that character, with 00B0 being its hexadecimal code. Let's add that in. And yeah, we now see that we've got that title there and we got the Unicode correct, but it does need to be white. So we'll again go into theme() and do axis.title = element_text(color = "white").
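The duplicated axes and the y axis title with the degree sign can be sketched like this, again on made-up data. The y_title variable is only there for illustration; in the narration the string goes straight into labs().

```r
library(ggplot2)

# Made-up single year of data.
demo <- data.frame(
  year = 2021,
  month_number = 1:12,
  t_diff = seq(0.6, 0.9, length.out = 12)
)

# "\u00B0" is the Unicode escape for the degree sign.
y_title <- "Temperature change since pre-industrial times [\u00B0C]"

p <- ggplot(demo, aes(x = month_number, y = t_diff, group = year)) +
  geom_line() +
  scale_x_continuous(breaks = 1:12, labels = month.abb,
                     sec.axis = dup_axis(name = NULL, labels = NULL)) +  # ticks on top, no text
  scale_y_continuous(breaks = seq(-2, 2, 0.2),
                     sec.axis = dup_axis(name = NULL, labels = NULL)) +  # ticks on right, no text
  labs(x = NULL, y = y_title)
```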
And then for the title, we'll do title = "Global temperature change since 1880". You can look back at the previous videos to see how you could generate that programmatically, but I'd rather spend more time getting the styling right on our figure than worrying about using glue and all that. We now need to do plot.title = element_text(color = "white"). Very good.

Now, I'm noticing that the sizes of the fonts and titles are a little bit off. I'm also going to want to change the size of my legend. Because all of that changes as the size of the figure changes, I'm going to save the figure at the dimensions and in the format that I want for the final version. So I'll do ggsave(), put it into figures as temperature_lines.png, and set width = 8 and height = 4.5. My version has a slightly different aspect ratio than the original; I picked mine because it's the aspect ratio of a YouTube thumbnail. But I would still like to make the fonts on my axis labels and titles a bit bigger. I'm also noticing that the title should say "by month", should be a little bit larger, and should be centered. After that we'll tackle the legend.

So again, we'll go down to axis.title, and let's try size = 10. And for axis.text, let's make that size = 9. That didn't seem to do anything, so let's try 14 and 13. I think that looks a lot better. The axis title is a little bit too long, though, so I'll turn it down just a smidge and make the title 13 as well. I think that is pretty good; it's now about the same length as the axis it labels. Let's go back to our main title, where we want to add "by month". And in plot.title, let's do hjust = 0.5, which will center the title. I'll also make the size 15.
And I think that's pretty similar in size to what they had in the original. Great. The next thing we want to take on is the legend. In the original, you can see that the legend is the same height as the plotting window, and it has breaks every 20 years. Whereas my current legend, well, it needs some help. So let's see how to do that. We'll do legend.title = element_blank(). Then I'll do legend.background = element_rect(fill = NA) to get a blank, transparent background. We're getting there, trust me.

Now let's add more years and make the years white. We can do that up here in scale_color_viridis_c(), where we want breaks = seq(1880, 2020, 20). And then for legend.text, I'll do element_text(color = "white"). We have labels every 20 years now, so next we want to stretch the legend. We can do that down here in the theme with legend.key.height and unit(). The height of my figure is 4.5 inches, so I'll try 4 with the unit being inches. So that did not work. I think maybe each year, or each division, is getting four inches. I don't know quite what happened there. I'll just go back to "pt" and try 4 there. That's way too short. So I think what I'll do now is just futz with the number until I get the size that I like. Let's try 40. That's better. Let's see if we go up to 60. It's a little too long, so let's try 55. I think that looks good enough.

One other thing we might want to do is add a white border around the color gradient in our legend. After a little bit of googling, I found out that I can come up to scale_color_viridis_c() and do guide = guide_colorbar().
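The legend changes up to this point (breaks every 20 years, white text, no title, transparent background, and a taller color bar) might be sketched as follows. The data are made up, and 55 pt is simply the value arrived at by trial and error in the narration; the right number depends on the saved figure size.

```r
library(ggplot2)

# Made-up data spanning the full range of years so the legend has something to show.
demo <- expand.grid(month_number = 1:12, year = c(1880, 1950, 2022))
demo$t_diff <- (demo$year - 1880) / 142

p <- ggplot(demo, aes(x = month_number, y = t_diff, group = year, color = year)) +
  geom_line() +
  scale_color_viridis_c(breaks = seq(1880, 2020, 20)) +   # a legend label every 20 years
  theme(
    plot.background = element_rect(fill = "#444444"),
    legend.title = element_blank(),
    legend.background = element_rect(fill = NA),           # transparent legend background
    legend.text = element_text(color = "white"),
    legend.key.height = unit(55, "pt")                     # stretches the color bar
  )
```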
guide_colorbar() has an argument called frame.colour, with a u. If you don't use that u, the argument doesn't work. We'll set that equal to "white", and that now gives me a white border around my gradient. I can make that line a little bit thicker if I'd like, so we'll do frame.linewidth = 1. Very good.

The last thing I notice that they have and I don't is a line for the current year that's a little bit bolder than all the other lines, labeled with the year. Looking at mine, you know what? I actually notice something weird. Right here where my crosshairs are, a line starts in January and doesn't go back to December. It comes up for a couple of months, and then there's this straight line with no movement in it. So I wonder if I'm not thinking about my previous December and next January correctly. If I have last December, I actually want that to be associated with the following year, right? If it was December of 2021, I would want that to be the last December of 2022. So not minus one, but plus one. We'll make that a plus, and then I'm going to want minus one for the next January. Let's rerun everything and make sure those weird artifacts go away. That looks a lot better. We now see the partial 2022 line in here, and it doesn't just shoot all the way across the figure. And it does appear that all of the lines go back before January. Okay, so that's good.

Now, like I said, we want to make that line bold and put a label on it. Back here where I was doing this mutate(), I'm going to create another column that I'll call this_year, and I'll say this_year = year == 2022. Again, you could programmatically figure out what the current year is, so if you're watching this video next year, you could plug that in.
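With the signs corrected, the three-data-frame approach can be sketched end to end. The two years of values are made up, and 2021 plays the role of the "current" year here only because the toy data stop there.

```r
library(dplyr)

# Made-up data: two complete years.
t_diff <- tibble(
  year = rep(2020:2021, each = 12),
  month = rep(month.abb, times = 2),
  t_diff = seq(0.5, 1.0, length.out = 24)
)

# December of year Y is drawn as the "last December" of year Y + 1.
last_dec <- t_diff %>%
  filter(month == "Dec") %>%
  mutate(year = year + 1, month = "last_Dec")

# January of year Y is drawn as the "next January" of year Y - 1.
next_jan <- t_diff %>%
  filter(month == "Jan") %>%
  mutate(year = year - 1, month = "next_Jan")

t_data <- bind_rows(last_dec, t_diff, next_jan) %>%
  mutate(
    month = factor(month, levels = c("last_Dec", month.abb, "next_Jan")),
    month_number = as.numeric(month) - 1,
    this_year = year == 2021   # assumption: 2021 is the latest year in this toy data
  )

count(t_data, month)
```

Note that the shifted rows also create stub years (2019 and 2022 here) that contain only a single padding month; with the real data the stub for the coming year is what lets the partial current-year line start at its own last December.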
I'll leave the programmatic version for you to figure out. This now creates a logical column called this_year, with FALSEs and TRUEs. I can use that as an aesthetic, so I can do size = this_year. Let me wrap this onto a separate line. What I need to add is a scale_size_manual(). My breaks I'll set to be FALSE and TRUE, and my values will be, let's do 0.25 and 1. We'll see if that works; we might adjust those. The warning messages go away. I now see that my lines are a pretty good thickness, and that the line for 2022 is a pretty good thickness too. I'm pretty happy with that. I forget exactly what the default thickness looked like before, but those seem to be about the same thickness as what Ed Hawkins had.

I do need to get rid of the legend for the line thickness. I can try that back up here in geom_line() by doing show.legend = FALSE. But when I turn off the legend for geom_line(), it gets rid of the legend for the gradient as well. That's not what I want. So I'll come back up to geom_line() and get rid of that show.legend = FALSE. Then in scale_size_manual(), I'll do guide = "none". That gets rid of the size legend and brings back my gradient legend. So we can use guide = "none" to turn off the legend for an individual aesthetic.

Now we need to add our label to indicate that the thick line is for 2022. I'm going to come back up here to my ggplot() call. You'll recall that I was binding together those three data frames and then feeding that into ggplot(). I'm going to store that as a new data frame called t_data and feed that in instead. So if I assign to t_data, make sure I've got this run, and pipe t_data into ggplot(), that works well.
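Here is a sketch of the bold current-year line with its legend suppressed. One deliberate difference from the narration: this assumes ggplot2 3.4.0 or later, where line thickness is mapped with the linewidth aesthetic and scale_linewidth_manual() rather than size and scale_size_manual(). The data are made up.

```r
library(ggplot2)

# Made-up data: one older year and the current year.
t_data <- data.frame(
  year = rep(c(2021, 2022), each = 12),
  month_number = rep(1:12, times = 2),
  t_diff = rep(c(0.8, 0.9), each = 12)
)
t_data$this_year <- t_data$year == 2022

p <- ggplot(t_data, aes(x = month_number, y = t_diff, group = year,
                        color = year, linewidth = this_year)) +
  geom_line() +
  # Assumption: ggplot2 >= 3.4.0; on older versions use size and scale_size_manual().
  scale_linewidth_manual(breaks = c(FALSE, TRUE), values = c(0.25, 1),
                         guide = "none")   # hide only this legend, keep the color bar
```

Setting guide = "none" on the one scale is what keeps the color legend intact, whereas show.legend = FALSE on the layer would remove every legend that layer contributes to.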
With t_data, then, let's create another data frame that's just the latest point from the year 2022. We'll do a filter() on year == 2022. That gets me those four rows. Then I'll do a slice_max() on month_number, and that gets me the latest data. Actually, I could do a slice_max() on year instead of the filter, so I don't have to say what year it is. That way this should still work next year. And yeah, we get the same result. I'll call this annotation.

Now I can add a geom_text() with data = annotation and an aes(). The x is going to be month_number, which will be something like three. The y will be t_diff. This all gets inherited from the ggplot() call anyway, but let's roll with it. We'll also do label = year and color = year. And outside of the aes(), I'll do inherit.aes = FALSE, just in case. I think the grouping might cause problems. Who knows, it's probably not necessary, but let's see. Good. We have our 2022.

It's centered right on the point. So we can come back up to our geom_text() and do hjust = 0 to get it left justified. Let's do size = 2. That made it a little bit smaller, so let's up it to 5. That makes it bigger. As always, there's a little bit of fiddling with the code to get it just the way we want it. I want to nudge it over to the right just a smidge, so in geom_text() I'll add nudge_x = 0.25. I also want to make it bold, so we'll do fontface = "bold". Maybe that's a little bit too far over, so let's try 0.15 on that nudge. Again, I think this is a pretty cool way to represent the data.
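The labeling step can be sketched as below: pick the latest month of the latest year with two slice_max() calls, then hand that one-row data frame to geom_text(). The data are made up, with a complete 2021 and four months of 2022.

```r
library(dplyr)
library(ggplot2)

# Made-up data: a full year followed by a partial current year.
t_data <- tibble(
  year = c(rep(2021, 12), rep(2022, 4)),
  month_number = c(1:12, 1:4),
  t_diff = c(seq(0.7, 0.9, length.out = 12), 0.91, 0.89, 1.05, 0.83)
)

# Latest year first (ties keep every row of that year), then its latest month.
annotation <- t_data %>%
  slice_max(year) %>%
  slice_max(month_number)

p <- ggplot(t_data, aes(x = month_number, y = t_diff, group = year, color = year)) +
  geom_line() +
  geom_text(data = annotation,
            aes(x = month_number, y = t_diff, label = year, color = year),
            inherit.aes = FALSE,
            hjust = 0, size = 5, nudge_x = 0.15, fontface = "bold")
```

Because nothing in the annotation step names the year, the same code should pick up the newest point when the data are refreshed.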
In the next episode, we're going to take it up a notch and play around with polar coordinates to represent this data. So that you don't miss that exciting episode, please, please, please be sure you've subscribed to the channel and clicked the bell icon to get all the notifications, so you'll be notified when that episode drops. Tell your friends about what we're doing here. I've loved seeing how people have taken these visuals as a starting point and then added their own styling to give them their own personal flair. I love seeing that. Thanks for tweeting those at me on social media. Keep them coming. Keep playing along with us. That really shows me that you're grasping these concepts, and that's just wonderful. All right, take care, and we'll see you next time for another episode of Code Club.