Hey folks, in the last several episodes of Code Club, we've been looking at a variety of approaches to build climate spirals. A climate spiral is a line plot depicted in polar coordinates, where each lap around the spiral represents a different year. These were originally made by Ed Hawkins at the University of Reading and quickly became quite iconic. More recently, NASA released an updated version with newer data from their data set. Up until now, we've been building out these different versions, based on Ed Hawkins' version as well as the NASA version, to build GIFs or MP4s where you could watch the spiral evolve over time.

What I want to do in today's episode, though, is show you how we can build an interactive version of a climate spiral. And we're going to do it in three dimensions. So we'll have the polar coordinates, but then we'll also have a z layer, which will be the time. You can imagine taking that 2D climate spiral and pulling it up like a Slinky, so you can visualize it and dig into it, if you will, using interactivity.

There are two frameworks that I've used in the past to build interactive data visuals. The first is rgl. I find that rgl has some limitations and isn't as easy to use. The second is plotly, which I talked about maybe a year ago. plotly is great for building web-based interactive visuals. You could make a visual, put it into an HTML document like a web page, and then your viewers could play around with it and move the data around. If you're coming from the microbiome world like me, perhaps you could think about making a three-dimensional ordination, or you could have an ordination in two dimensions plus a time component. So I'm not going to do that.
But what we're going to do today is take our climate spiral and bring it into plotly to make an interactive three-dimensional visualization that I think you'll find is pretty cool. Before we can plot the data, we need to get the data. If you want the data that I have, along with all the code that I'm going to generate, down below in the description is a link to a blog post that will get you everything you need.

I'm going to go ahead and save this R script into my code directory and call it climate spiral plotly. We'll add library(tidyverse) to it, and we're also going to need library(plotly). You might need to install plotly if you haven't installed it before. Great.

Now we need to get our data in. We've done this in previous episodes, typically as a copy and paste over and over again, but I want to do it from scratch for anybody who's just joining us for the first time. So we'll do read_csv on data, forward slash, and it's this GLB file. Go ahead and run that. That brings it in as a CSV. But why isn't it parsing it? Oh, right, because the first line of this file is a header that's just kind of a comment that we don't really need. So we can say skip = 1. Now this reads in and parses properly with the year and the 12 months. Very good.

Now what I want to do is a select on the year, and that was Year with a capital Y. Then I want all of the months, so I can give it month.abb. Now I get the year plus the 12 months. I didn't comment on this before, but the full version also has an annual average, from January to December and from December to November, as well as seasonal ranges like December, January, February; March, April, May; and so forth.
The other thing I noticed when I ran this with the select was a warning message: if we use a vector to get specific columns, the best practice is to wrap the vector of names in the all_of function. So we can go ahead and add that here. We'll do all_of(month.abb). What that's doing is telling the select function that we want all of the values in this external vector, month.abb. And if you haven't seen it before, month.abb is a built-in vector that has the abbreviated names of all the months. Now when we run this, we get the same result without the warning message about using all_of. Very good.

Now we need to get this to be tidy. So we'll do pivot_longer on everything but the year, with names_to as month and values_to as t_diff. This is a temperature difference: each value is the difference for that month relative to the average for that month over the years 1951 to 1980. That's why the numbers look a little bit weird. They're not the absolute temperatures that you might expect to see.

I'm getting an error message that it can't do this pivot_longer because January is a double and April is a character. That reminds me that in the original CSV file, they indicate NA values with three stars. So if I add na = "***" to read_csv and run that, we now see that April is a double. Whereas before when we ran it, let me come way back up here, yeah, we see that April was a character. I should have noticed that the months from April on were formatted a little differently than January, February, and March. Anyway, we should be good to go now. Yeah, now it pivoted longer.

That also means that there are NA values in our data frame, and I think those are the last several values. So if we do slice_tail with n = 12, yeah.
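The import and tidying steps described so far can be sketched as follows. This is a minimal sketch under stated assumptions, not the episode's actual script: the tiny CSV written to a temporary file is a made-up stand-in for the GLB file (its comment line, its numbers, and the name toy_file are all hypothetical), and only the shape of the pipeline follows the walkthrough.

```r
library(tidyverse)

# Hypothetical miniature stand-in for the GLB file: a comment line first,
# then the header, with "***" marking months that have no value yet.
# The numbers are invented for illustration only.
csv_lines <- c(
  "Toy comment line that should be skipped",
  "Year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec,J-D",
  "2021,0.81,0.64,0.89,0.76,0.79,0.85,0.92,0.82,0.93,1.00,0.94,0.86,0.85",
  "2022,0.91,0.89,1.05,***,***,***,***,***,***,***,***,***,***"
)
toy_file <- tempfile(fileext = ".csv")
writeLines(csv_lines, toy_file)

t_long <- read_csv(toy_file, skip = 1, na = "***") %>%  # skip comment, parse *** as NA
  select(Year, all_of(month.abb)) %>%                     # year plus the 12 month columns
  pivot_longer(-Year, names_to = "month", values_to = "t_diff")

slice_tail(t_long, n = 12)  # the last rows show the NA months
```

Without na = "***", the columns containing stars would be read as character and pivot_longer would refuse to combine them with the numeric months, which is the error described above.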
So basically, April through December of 2022 are NA values. I pulled these data down in March, or I guess in April, and they didn't have the full April data yet. So I could replace that slice_tail with drop_na. And now if I do slice_tail with n = 24, yeah, we no longer have those NA values, and everything looks good.

The next thing I want to think about is that my month is of type character, and I'd rather it be a factor. That's because the alphabetical order of the months is not their real order, which is January through December. So I can do a mutate on month to make it a factor of the month column, and I can say levels = month.abb. Now when I run this, I see that the month column is a factor. If I hadn't done that, it would order my months alphabetically. Now it's going to order my months chronologically, following the order of month.abb. Excellent.

For good measure, I'll go ahead and do a sort with arrange on year and month. I don't think that's going to change anything. I realize that I have a lowercase year here when it's an uppercase Year up above, so I'm going to go ahead and change that. In the select, I can say year = Year, and then I need to change that capital Y to lowercase all the way throughout. I prefer to work in lowercase because that way I don't have to worry about whether a variable is capitalized or not. It's always lowercase. So now we've got the data sorted appropriately, and we're in good shape.

Now what we need to do is build out our x, y, and z coordinates, so we'll do another mutate. To get x, y, and z, I first want to think in terms of polar coordinates: the angle, theta, and the radius. For my radius, I'm going to use my t_diff column. And my theta, as we saw before, we can think of as basically the month number divided by 12 times 2 pi.
That'll give us the position in radians in polar coordinates. If this is novel to you, go watch the last episode, where I talked about building one of these climate spirals without using coord_polar. It describes all the trigonometry.

Okay, so to get that, I also want to have the month number. To get the month number, we can do as.numeric on month. Let me just double check that it works the way I anticipate. Yeah, so now we see we've got the month number for each of the months. So September is number nine. That's great.

I can then put a comma here, and radius is t_diff. Theta is then going to be the month number minus one, because I want January to be at the zero position, divided by 12, and all of that times two times pi. pi is a built-in constant in R, which gives us 3.14, whatever. So now what we get is our month number, our radius, and the theta.

Now we need to convert our radius and our theta into x and y. Our x is the radius times the sine of theta, and our y is the radius times the cosine of theta. Then our z is going to be the year. So now we have all this wonderful information that we can use to make a plot. Let's go ahead and save this as t_data. As we saw before, we could take t_data, pipe that to ggplot with aes(x = x, y = y), color by year, or z say, and then add geom_path. That gets us our climate spiral.

One thing I'm remembering from the last episode, however, is that some of our radius values from t_diff are going to be negative. If I look at t_data and at my t_diffs, I've got negative values in there. A negative radius does funky things, like we see here. What we did in the last episode was add a constant to all of the radius values. So let's go ahead and do plus 1.5. Now we can see that hole, if you will, inside of the donut.
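The coordinate conversion just described can be sketched on a single made-up year. The data frame toy and its t_diff values are invented for illustration; the column names and formulas follow the walkthrough (month as a factor, theta from the month number, x from sine, y from cosine, and the 1.5 offset on the radius).

```r
library(tidyverse)

# Toy values, not real anomalies: one year with t_diff from -0.5 to 0.6
toy <- tibble(year = 2000, month = month.abb, t_diff = seq(-0.5, 0.6, by = 0.1))

toy_xyz <- toy %>%
  mutate(month = factor(month, levels = month.abb),  # chronological, not alphabetical
         month_number = as.numeric(month),           # Jan = 1, ..., Dec = 12
         radius = t_diff + 1.5,                      # offset keeps the radius positive
         theta = 2 * pi * (month_number - 1) / 12,   # January sits at angle zero
         x = radius * sin(theta),
         y = radius * cos(theta),
         z = year) %>%
  arrange(year, month)

toy_xyz
```

With this choice of sine for x and cosine for y, January lands at the top of the circle and the months proceed clockwise, like a clock face.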
Okay, we've done this already. What we want to do now is make an interactive version of this where the year is on the z axis. So let's go over and learn a little bit about plotly. If you go to plotly.com, it brings you to this website. plotly is a JavaScript- and HTML-based tool that allows you to make interactive, web-based visualizations. If you go to Docs and then to the graphing libraries docs, you'll see that there's a variety of tools available for interacting with plotly. There's Python, there's R, Julia, JavaScript. There's a plotly ggplot2 graphing library, there's F#, which I've never heard of before, MATLAB, and then something called Dash. What I'm interested in is mainly this plotly R open source graphing library.

There's also the ggplot2 add-on, which basically allows you to take a ggplot-based visual and feed it directly into plotly to make an interactive tool that way. I find that that's a bit limited for what we want to do in three dimensions. So I'm going to work within the R-based open source library, which will allow us to build a figure directly within plotly.

Going into that page, we find a really great gallery of visuals. I scan through it to find something that looks like what I want to do, and then I see how they did it in that visual. There are fundamentals, kind of the basics of using plotly, then basic two-dimensional charts, statistical charts, a variety of scientific charts, financial charts, maps, artificial intelligence, and 3D charts, which is what I'm interested in. We'll also see that there are subplots, ways to make facets, transformations, various controls, some animations, and a variety of other things. Again, what we're interested in are the 3D charts, and specifically 3D line plots.
Again, my approach when I'm learning something new is to look for visuals that look like what I want to make, but in that new framework, in this case plotly. 3D line plots look a lot like what I want, so let's go ahead and click on that. This opens the 3D Line Plots in R page. We can look through here and see a basic 3D line plot. One of the things you might notice is that I can click on the plotting window and spin this around. I can use the scroll wheel on my mouse to zoom in and to zoom out. There's also a tooltip, so that if I hover on a point, I get a bubble that pops up with the x, y, and z coordinates. This, then, would be the code to make a very basic line plot like you see here.

Let's go down. We can make a 3D line and markers plot. Hey, this looks a lot like what we're doing, right? This is in polar coordinates, and we can see that they've got x and y and z, like we do. I didn't intend this, but it's pretty cool to see something like a climate spiral here. One thing they're adding, if we zoom in, is markers on the different points of the line plot, which isn't totally what I want. Also, this has a color scale that comes from the viridis package, which we saw a while back when we were looking at Ed Hawkins' version. I'd prefer to use the blue to white to red color scale that we saw with the NASA project.

Ah, look, it's almost like we asked for it: custom color scale. And wouldn't you know it, it's also in polar coordinates. They've got this cool spirally thing with a custom color scheme. Perfect. This is exactly what we want. I'll go ahead and copy this into a new R script here in RStudio, and I'm going to run these different lines.
As I'm running this code, I don't want to just copy and paste it. I want to think about what the code is doing. So we load the plotly library. There's something here about a count of 3000, so maybe there are going to be 3000 different points represented. They then create x, y, z, and c as vectors, but they're empty vectors. They then use a for loop to go through all 3000 values up to count. r is the radius, which is i, the counter or stepper variable, times the count minus i. The x and y then are, yeah, basically the same transformation that we saw before. They're using c(x, ...) with the trigonometry that we've seen, and what that does is concatenate a new value onto all the previous values of x, y, z, and c. I'm not sure what the c is. Maybe it's color. Then they make a data frame from those four vectors and throw it in to make a plot.

Looking at the syntax for this plot, we can see that it's a bit different from what we're used to. They give plotly the data, the data frame. In our case, that would be t_data. They then define x, y, and z. They're using this ~x, I think, to say that the x column within data should get mapped to x, the y column in data should get mapped to y, and z to z. Then it's using type scatter3d, a 3D scatter plot, and the mode lines. Then there's some styling here that we can add for the line, with the width being four and the color getting mapped from the c column of the data data frame. And remember, this section of the web page was for a custom color scale. What we can see is that we've got this colorscale argument going into the list that defines the line. When c is zero, it should be this color in hexadecimal. When it's one, it should be this color.
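The call pattern described above can be sketched on a small spiral. This is not the documentation's example copied out: the data frame toy_spiral is built with vectorized arithmetic instead of a for loop, it uses a plain growing radius, and the two end colors are placeholders chosen for illustration. The argument names (type, mode, line, width, color, colorscale) are the ones named in the walkthrough.

```r
library(plotly)

# Hypothetical toy spiral: radius grows with the step i, height is the step
i <- 1:300
toy_spiral <- data.frame(x = i * cos(i / 30),
                         y = i * sin(i / 30),
                         z = i,
                         c = i)   # the column that drives the line color

fig <- plot_ly(toy_spiral,
               x = ~x, y = ~y, z = ~z,   # ~ maps a column of the data frame
               type = "scatter3d", mode = "lines",
               line = list(width = 4,
                           color = ~c,
                           # position on a 0-to-1 scale paired with a hex color
                           colorscale = list(c(0, "#0000FF"),
                                             c(1, "#FF0000"))))

# Print `fig` at the console to open the interactive plot in the viewer.
```

Building the columns from the vector i in one step gives the same kind of data frame as appending inside a loop, without growing four vectors one element at a time.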
So I think I have a handle on what it's doing. Let me just run all this and make sure it works like we saw on the web page. And sure enough, here is the visual that we had up on the web page. What I want to do now is copy the code that they used to build the figure into my climate spiral script. I'll comment out the ggplot and paste that in. Maybe I'll remove the fig, because I don't know that I necessarily need all that. And we'll tab this over a little bit to make it look nicer. Our data is going to be t_data, and then we have x, y, and z. Again, type scatter3d, mode lines. Let's leave the width at four. And my color, I'm going to make that t_diff. So let's go ahead and run this and see what it looks like. Of course, we'll have to adjust along the way.

Very cool. There is our climate spiral. You can see again that we can pan around, spin it around, and play with it. We can zoom in and hover over different lines to see what the temperature difference was at those positions. Very cool.

All right. I'm not such a fan of this color scheme, so what I want to think about is how we can adjust this color scale to get the color scheme we want. Let's go ahead and mess with the zero and one. I'm getting the sense that zero to one is a scale, so it's basically transforming t_diff to be a number between zero and one. Let's make our zero the lowest level. I want to make that blue. In hexadecimal, you get two characters for red, two for green, and two for blue. So blue is 0000FF, and red, of course, will be FF0000, because again, it's red, green, blue. Fantastic. We now have our blue to red. Of course, I want that to go through white at 0.5, for a t_diff of zero. So let me see if we can modify this. I'll add 0.5, and that's going to be white, so it's going to be all F.
So 1, 2, 3, 4, 5, 6. Let's see what happens when we do that. Very cool. We now see that it goes from blue out to red. I'm pretty happy with the way that all works, and it's looking pretty good. It's amazingly simple to have gotten this far.

The next thing I want to do is change the tooltip. I don't care so much about the x, y, or z being shown. What I'd rather have is the month, the year, and then the temperature difference. We can modify the appearance of that tooltip and what information it shows. To do that, let's go back to the plotly site and see if we can figure it out from one of their demos. Let's go back to the gallery of different options. Across the top, I remember the first section was Fundamentals, so let's look at what they have available there. We can see that there are all sorts of ways to change the marking and styling of what's going on. There's one here called Hover Text and Formatting, so let's check that out.

So we're on this Hover Text and Formatting in R page. If I hover over each of these points, I get a different label. What we got before would be like the x, y coordinates. What we're seeing here is that when I hover over this point, I get Text D. Coming up to the code, I notice two things: there's a line in here for text, as well as one for hoverinfo. text, then, is the text that shows up, and I imagine that could be a vector of values. And hoverinfo is "text", which I think is telling plotly that the hover info should come from the text argument. So we'll set text to t_diff, and we'll modify this later to make it look better. Then we'll do hoverinfo = "text". Now, returning to our climate spiral, we see that when we hover over a point, yeah, we get the t_diff for that particular value.
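The tooltip change can be sketched like this. The four points in toy are made up, and fig_hover is a hypothetical name; text and hoverinfo are the two arguments picked out from the hover text page in the walkthrough.

```r
library(plotly)

# Made-up points purely for illustration
toy <- data.frame(x = c(0, 1, 0, -1),
                  y = c(1, 0, -1, 0),
                  z = 1:4,
                  t_diff = c(-0.2, 0.1, 0.3, 0.5))

fig_hover <- plot_ly(toy,
                     x = ~x, y = ~y, z = ~z,
                     type = "scatter3d", mode = "lines",
                     text = ~t_diff,       # one tooltip string per point
                     hoverinfo = "text")   # show only that text, not x, y, z

# Print `fig_hover` and hover over the line to see the t_diff values.
```

Because text takes one value per row, any column works here, including a pre-formatted label column.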
Cool. Now I'd like to make it look a little bit nicer. What we'll do is create another variable in our data frame that will be the label. I'll do label = glue(). I'm using the glue function, which comes to us from the glue package, so I need to do library(glue). Again, what I want is the month and the year and then the temperature difference. So I'll do month, a space, and then year. Then I'll do a backslash n, and it occurs to me that I need this all to be in quotes. And then I'll put in t_diff. Now if we look at t_data, we get a column that's got our month and year, and at least on my zoomed-in version, the rest gets trimmed off. Then we'll change our text to be label. Now what we see in the tooltip is the month, the year, and the temperature difference. That's pretty cool, right?

I can scroll in here and grab a point, any point, and I see that I've got blue at negative 0.03. Something that has me a little worried is that here I have a positive value: July 1983 has a t_diff of 0.18. That should be red, not blue. And if I look through these different values, 0.1 is kind of large for the range we're looking at here, and it's still blue when it should be red. My concern is that it's scaling all of the values to between zero and one, with the midpoint at 0.5 being white. I think what we need to do is tell it what t_diff value needs to be at 0.5.

So let's come back to Fundamentals and see if anything sticks out to us about modifying the color. Let's see. I thought I saw something. Built-in color scales: let's look at that, and maybe at discrete colors too. In the built-in color scales, they've got things from ColorBrewer that let us put in specific color palettes. Maybe one of the diverging palettes would work for us, but we'd still need a way to tell it what's at the midpoint. So I'm not seeing anything here.
If we look at discrete colors, yeah, this is about setting specific colors for different variables, and that's not really what I want either. So I think we're going to have to turn to our friend Google. I'll search for plotly colorscale, which is the name of the argument that we're modifying in the list to set the color range, and let's add midpoint. This first link is for the Python version, but let's check it out because maybe it will be useful in the long run. So again, continuous color scales. Of course, it's Python, but maybe it'll translate over to R.

As I look down here, I see color ranges. Reading through this, the minimum to maximum range of the data is mapped to the zero to one input range. In our colorscale argument, we had blue being zero and red being one. Reading a little more, I see cmin, cmid, and cmax among the arguments for things like the color axis. The cmid is intriguing to me, so I'm going to search for cmid. Here they've got cmin within the marker, next to the values. I think that could be it. So next to the color and the colorscale, we could do cmid = 0. Let's see if that works.

Now that looks a little bit different. If we look at 0.5, that's now a reddish color rather than a bluish color, and negative 0.15 is a bluish color. So I think we were successful. That 0.2 is a little bit red, pinkish, but it's still pretty white. So I think cmid was what we needed to associate 0.5 on the color scale with a specific value of t_diff. Again, it takes a bit of troubleshooting: searching for something that looks like what we want and getting the right Google terms. In this case, we were taking an example from Python and bringing it over to R.
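The midpoint problem and the cmid fix can be sketched with a few made-up values. The data frame toy and the names default_mid and fig_mid are hypothetical. The arithmetic shows why a positive value could come out blue: when the color range is taken from the data, the white point sits at the middle of the data range rather than at zero.

```r
library(plotly)

# Made-up anomalies with a long positive tail
toy <- data.frame(x = c(0, 1, 0, -1),
                  y = c(1, 0, -1, 0),
                  z = 1:4,
                  t_diff = c(-0.4, 0, 0.2, 1.2))

# Middle of the data range: 0.4 here, so a t_diff of 0.2 falls on the
# blue side of white when the range is taken from the data alone.
default_mid <- mean(range(toy$t_diff))

fig_mid <- plot_ly(toy,
                   x = ~x, y = ~y, z = ~z,
                   type = "scatter3d", mode = "lines",
                   line = list(width = 4,
                               color = ~t_diff,
                               colorscale = list(c(0,   "#0000FF"),   # blue
                                                 c(0.5, "#FFFFFF"),   # white
                                                 c(1,   "#FF0000")),  # red
                               cmid = 0))   # pin the white midpoint to t_diff = 0
```

With cmid = 0, the color range is stretched so that it is symmetric around zero, which means negative values are bluish and positive values are reddish.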
I don't really know Python, but the plotly library in general is consistent across these different languages. The language, like Python or R, is the interface to the plotly library, so I think a lot of the documentation will be similar-ish, relatively speaking. I don't think it's going to change anything necessarily, but I could also put in my cmin and cmax. My cmin is going to be the minimum of t_data$t_diff, and cmax will be the max of that. Like I said, I don't think that really changed much of anything.

Finally, what I want to do is remove the x and y coordinates. Back in the Fundamentals page on plotly, there is a special button for axes. The first thing we see is that this tutorial explains how to set properties of two-dimensional Cartesian axes, namely x and y, and that other kinds of subplots and axes are described in other tutorials. For 3D axes, the axis object is scene. So let's go to 3D axes. This is how to format the axes of 3D plots in R with plotly. Wonderful. What we'd do is add a layout to the figure, where we have scene = list() with xaxis, yaxis, and zaxis.

I'm looking through here for examples where they've removed or changed the axis values or labels. Let's start with removing the title. I think I'll use axx, axy, and axz, so let's copy this down, and I'll put that right up here above the plot_ly call. My title for x will be nothing, and for y it will be nothing. And z, I'm going to remove that title also, because I think if I leave in the years, it'll be pretty obvious. Let's go ahead and run all that. What we saw down here is that we also need the layout with scene, so I'll copy that, put it at the end of my plot_ly call, and put it on another line. I now see that my x and y axis titles are gone.
Let's put these on separate lines so it's easier to see what's going on. As I look down through here, I see information about changing the grid color and changing the range or the types of values on the axes, but not exactly what I'm looking for. I'm looking to remove the grid lines as well as the tick labels. So I'll search for plotly remove grid lines R, and there's a result about horizontal grid lines in plotly in R. Let's go down here: Axes. This takes us back to the page that we were at before, so let's look for grid lines in here. Toggling axis grid lines, let's go to that. Again, this is the 2D page, not the 3D page, but maybe it'll work anyway. There's an argument here of showgrid = FALSE. So let's come back up to our axx and do showgrid = FALSE, and let's put that in for our y as well. And so we've gotten rid of those grid lines.

There's still a zero line in here. Coming back to this page, I notice that there's also toggling axis zero lines, the next thing down. That's zeroline = FALSE. Way back up here, we'll add that as an argument as well. So that zero line is gone. I do like having the grid lines there for the years, because when it's at an angle, they make it easier to see where the line is.

We do have those labels on the axes that I'm not a fan of. There's also toggling axis labels, I see here. Axis tick mark labels can be disabled by setting the showticklabels axis property to FALSE. So let's try that. We'll come back up here and add showticklabels = FALSE, and we'll do FALSE here too. Great. Let's give this a shot. We've lost the x and y axis ticks, and that looks pretty good. That's exactly what I was hoping for.

Great. The next thing I want to do is save this to an HTML page so that I can look at it and perhaps put it up on a web page, including the blog post. Again, if you go down below in the description, there's a link to a blog post.
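Before moving on to saving, the axis clean-up worked out above can be collected into one sketch. The data frame toy is made up and fig_axes is a hypothetical name; the list names axx, axy, and axz follow the pattern in the plotly 3D axes page as best it can be made out from the walkthrough, so treat them as an assumption.

```r
library(plotly)

# Made-up points purely for illustration
toy <- data.frame(x = c(0, 1, 0, -1),
                  y = c(1, 0, -1, 0),
                  z = 2000:2003)

# x and y: no title, no grid lines, no zero line, no tick labels
axx <- list(title = "", showgrid = FALSE, zeroline = FALSE,
            showticklabels = FALSE)
axy <- list(title = "", showgrid = FALSE, zeroline = FALSE,
            showticklabels = FALSE)
# z: drop only the title, keep the grid lines and the year labels
axz <- list(title = "")

fig_axes <- plot_ly(toy, x = ~x, y = ~y, z = ~z,
                    type = "scatter3d", mode = "lines") %>%
  layout(scene = list(xaxis = axx, yaxis = axy, zaxis = axz))
```

For a 3D plot the axis settings go inside scene, rather than directly in layout as they would for a 2D plot.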
I'll put the HTML version of this interactive over on the blog post so you can play around with it and see what it looks like. To save this, I need to assign the plot object to a variable. I'll call that p. The function that we want comes from a package called htmlwidgets, so I'll do library(htmlwidgets) to make sure I've got that loaded. Then we'll do saveWidget. We give it the widget, so the plot that we want, p, as well as the file name. I'm going to put that in quotes: it goes in figures, and we'll call it climate_spiral_plotly.html. So there we go.

Those lines are a bit thin. I'd like them to be a bit thicker so it's easier to see what's going on. If we come back to our code, you'll notice that we had width = 4 here. Let's put that up to 10 to make it a bit thicker. So that looks a lot thicker and more robust.

One thing I'm noticing, however, is that somewhere we seem to have lost the zeroing of our data at white. Here, at April 1973, we've got 0.27, and that should be a reddish color. I know things were working before we added the cmin and cmax, so I'm going to go ahead and turn those off. And there's our April 1973 in that reddish color. So it looks good.

One final thing I'm going to do, because I can't help myself, is add the degree sign. We'll come back up here before we call it a day, and where we've got our t_diff in the label, we'll add the Unicode escape, which is \u00B0, and then a C for degrees Celsius. And there we go. We now have our nicely labeled interactive plot. I think this is really cool. This is basically what we had been making in the static version, but with the ability to zoom in, put your cursor on different points, and spin it to look at what's going on. I think it's just really powerful and really attractive.
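The saving step and the degree sign can be sketched as follows, under a few labeled assumptions: the data in toy are made up, the label is built with base R's paste0 rather than glue to keep the example short, the file goes to a temporary directory instead of a figures folder, and selfcontained = FALSE is my choice here so the sketch does not depend on pandoc being installed (the walkthrough does not mention that argument).

```r
library(plotly)
library(htmlwidgets)

# Made-up points purely for illustration
toy <- data.frame(x = c(0, 1, 0, -1),
                  y = c(1, 0, -1, 0),
                  z = 2000:2003,
                  month = c("Jan", "Apr", "Jul", "Oct"),
                  t_diff = c(-0.2, 0.1, 0.3, 0.5))

# Month and year on one line, temperature difference with a degree sign below
toy$label <- paste0(toy$month, " ", toy$z, "\n", toy$t_diff, "\u00B0 C")

p <- plot_ly(toy, x = ~x, y = ~y, z = ~z,
             type = "scatter3d", mode = "lines",
             line = list(width = 10),          # thicker line, as in the walkthrough
             text = ~label, hoverinfo = "text")

# Hypothetical output location; swap in your own path
out_file <- file.path(tempdir(), "toy_spiral.html")
saveWidget(p, out_file, selfcontained = FALSE)
```

Opening the saved HTML file in a browser gives the same interactive plot as in the RStudio viewer.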
Of course, this only works in an HTML-based environment like a browser. This obviously won't work in a PDF. If you're in science like me, you know, we can dream of the day when we can have interactive visuals in our papers, right? But we're still stuck in this mode of thinking about a physical paper, a physical PDF, so to speak, rather than using the interactivity that comes with working on the internet.

Anyway, I hope you found this interesting as another way to visualize these data. With a GIF or an MP4, we as the creator are giving the view to our audience, although there is that animation. Here, we allow our audience to interact with the visual, which I think is just really powerful. And I think this is a really cool use of three dimensions, where we can put time, the years, on that z dimension. Let me know what you think down below in the comments. Please share this with your friends. And we'll see you next time for another episode of Code Club.