So often when we're making data visualizations, we have just way too many variables. And when we can limit the number of variables, we have too many levels within each of those variables. If you've been following along, you know that I've been looking at data from 2020 on different countries' willingness to receive the expected COVID-19 vaccine, which hopefully you've received. I know I have. We've been looking at these 15 countries for this one variable, right? But the challenge is, how do you have 15 colors? How do you label it appropriately when the lines are all way too close to each other? That presents a unique challenge.

Well, a couple years ago, I came across a book by Cole Nussbaumer Knaflic called Storytelling with Data. And I think this is a really helpful book. It's not written towards scientists, but more towards general data visualization practitioners. Maybe that includes you, who knows. But one of the take-home messages from her book that really resonated with me, and that I have heard repeated time and again by people more in the design field, is to take your visual, or whatever it is you're trying to design, and make it all gray. Yeah, make it all gray, and then identify the one thing you want your audience to take away and add color to make that easier for your audience to see. So instead of having 15 colors, have two colors: gray, and then your primary contrast color to make it clear what you want them to see.

So in today's episode, we're going to take our slope plot, and we're going to take Cole's strategy and advice, and we're going to turn it all gray, and we're going to highlight one of those lines and see if that doesn't really help to tell a compelling story. Now, of course, there are 15 possible stories here. But at the same time, I can't tell all 15 stories. And if I give all 15 stories, then my audience doesn't know what I think is important. So I'm going to tell them what I think is important.
And that is going to be my country. I'm in the United States. And so if you're, say, in India, well, I'd encourage you to make the line highlighted for India. But we're getting ahead of ourselves. I'll show you how to do it for the United States, and maybe one or two other countries. And then later, you can do it for your own country.

So I'm on my highlight branch. Remember, this is a feature branch where we're going to try a different visualization approach, as I already talked about in my introduction. And we have our code, which again, if you go down below in the description here, you'll get a link to my GitHub repository where you can get all the code and the data that you'd want to follow along with me.

So we start the script by loading the packages that we're going to use to build our visualization, at least as it currently stands. tidyverse is kind of an all-purpose data manipulation and visualization package. showtext allows us to add fonts from Google Fonts to make our fonts look a little bit more attractive than the default Arial font. And then ggtext allows us to embed Markdown and HTML into our axis labels and titles, right, to make it look a little bit more polished also. Then we read in our data from the CSV, we get things formatted so that we can then read them into ggplot and build out our figure. And we've got some theming that goes on as well before we finally save it out as a TIFF.

So this is our slope plot again with those 15 countries and 15 lines and 15 colors for those 15 lines. It's a bit of a mess, right? Again, what we're going to do is start by turning all of the lines gray. And we can do that by coming up to geom_line, and I'll do color = and put in a hexadecimal for kind of a light gray. I'll put in all A's, so "#AAAAAA". Again, if you give the same hexadecimal digit repeated six times, you'll get a shade of gray. So that will turn the lines to be all gray.
And also then we can come down to our theme, where text is set with element_text. I'm going to go ahead and make the text elements of my figure gray. And so I'll do color = that same "#AAAAAA" gray. For my title, I want that to be black. So I'll do color, and I'll put in the hexadecimal. You could write "black", but I like having the hexadecimal because it gives me a lot more flexibility later, I find. And so here, I'll put in six zeros, "#000000", and then a comma at the end of the line. Also, I see I have color = "gray" here. I shouldn't need this. I'm going to go ahead and delete it, because I've got this text = element_text up here where everything is gray. And so that should kind of, you know, cascade down to the other text elements in the figure.

So our title is black, like we specified, and everything else is gray, except for our axis text. So let's go ahead and change our axis text to be gray. So we'll do axis.text, and we'll do element_text with color = and then those six A's. And I'm also going to change my axis.ticks to be element_line, and I'll do color with the six A's again. I also want the background to be white. So I'll do panel.background = element_rect with fill = white, so that'll be all F's, "#FFFFFF". And then I also want to get rid of my grid lines. And so to do that, I'll do panel.grid = element_blank().

So now all of our text and all of our lines are grayed out, even the axis text: the two months, and then the percentages on the y-axis. And so this is kind of our blank canvas for getting going. Again, for this exercise, I'm going to do the USA. But I'll show you some quirks along the way for, you know, problems you might run into if you happen to use one of the other countries.

Okay, so I'm going to come back up to my data. And what I want to do is add a column to my data data frame that indicates whether or not that country should be highlighted. Right. So what I'll do here then, at the end of my pivot_longer, is add a mutate.
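To recap the all-gray canvas described above before moving on to the data, here is a minimal sketch. It does not use the real survey file: the toy tibble, its column names (country, month, percent), and every value in it are assumptions made up for illustration, and the explicit group = country is there only so the constant-gray lines still connect per country.

```r
library(tidyverse)

# Toy stand-in for the survey data. Column names and values are assumptions
# for illustration only; they are not the real survey numbers.
data <- tibble(
  country = rep(c("Brazil", "Canada", "India", "USA"), each = 2),
  month   = rep(c("August", "October"), times = 4),
  percent = c(88, 81, 76, 76, 87, 87, 67, 64)
)

gray_plot <- data %>%
  ggplot(aes(x = month, y = percent, group = country)) +
  geom_line(color = "#AAAAAA") +                       # every line light gray
  labs(title = "Toy title") +
  theme(
    text = element_text(color = "#AAAAAA"),            # gray text by default
    plot.title = element_text(color = "#000000"),      # title stays black
    axis.text = element_text(color = "#AAAAAA"),       # axis text set explicitly
    axis.ticks = element_line(color = "#AAAAAA"),
    panel.background = element_rect(fill = "#FFFFFF"), # white panel
    panel.grid = element_blank()                       # no grid lines
  )
```

Printing gray_plot should give the gray "blank canvas" that the later highlighting steps build on.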
In that mutate, I'll say highlight =, and then in parentheses I'm going to put country == "USA". So if the value in the country column is USA, then the value in the highlight column will be TRUE. If it's not USA, it will be FALSE. Okay. So now if we run this, we see our highlight column is FALSE for all these countries. And of course, the USA is at the very bottom. So what I could do would be data %>% tail(). And I now see that USA has two TRUE values, for August and October. And so those then will be highlighted in my visual.

So now what I can do is come back up to my ggplot. And instead of coloring by country, I'm going to color by highlight. And I'm also going to remove this color = gray from geom_line. So I now see that my line for the United States is this teal color, and that all the other countries are that salmon color.

One thing I want to show you though is what happens if instead of USA, we had used, say, Brazil, so here country == "Brazil". As I zoom in on the line for Brazil, something you should notice is that it's running behind two of these other lines. I think this one here at about 83% is Canada. And I forget what this is. I think this is India, right? So India and Canada, it's going behind those. And so what happens in ggplot, as we've mentioned before, is that before it plots, it converts any text to a factor. And the default order for a factor is alphabetical, right? So Brazil, B, will come before Canada and India. And so it will be plotted first, right? So it's laid down, and then the lines for those other countries that fall after it are drawn on top. And so this is why the USA wasn't a problem. It was on top of everything because USA, starting with U, is the last of the 15 countries we have alphabetically. USA comes after United Kingdom, right? So how do we get that line on top of those other lines?
What we can do is come back up to our mutate, and we're going to revisit an old friend where we will do country = fct_reorder(country, highlight). So we're going to take country, and we're going to reorder our countries according to their value in the highlight column. So again, FALSE and TRUE are zero and one, right? And so what's going to happen then is all those countries with a FALSE, or zero, will be laid down first. And then the country with the TRUE, or one, value will be laid down last. Now if we zoom back in on that line for Brazil, we see the line for Brazil is on top of the lines for Canada and India, and it's just easier to see, right? So with 15 lines, it's not so bad. But if you have a lot more lines, then you really would want that line to be on top so you can more easily see what's going on. And also if our line was down here, it might get a little bit jumbled. So again, we can use fct_reorder to make sure that the country that has the value of highlight equal to TRUE is on top.

So I'm not so concerned right now about Brazil. I want to focus on the USA. So we'll go ahead and put USA back. Next, I want to take all those salmon-colored lines for FALSE, where it's not USA, and make them gray. And I want to make the TRUE line highlighted. I'm going to make it a blue color, right? So we can do that as we have in the past with our scales. We can do scale_color_manual. My breaks are going to be FALSE and TRUE. And then my values are going to be my gray color, so those A's, and then my blue color. Make sure I get my plus sign. My USA line now is blue. And my 14 other countries are in gray.

I'm going to do one other thing to try to make that USA line pop a bit more. I'm going to go ahead and make that line a bit thicker while maintaining the thickness of the lines for those 14 other countries. Again, we can come back up to ggplot. And we haven't mapped anything to the size aesthetic.
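The highlight column, the reordering, and the manual color scale described above can be sketched roughly as follows. The toy tibble and its values are made up for illustration (assumed column names country, month, percent), and highlight_country is a hypothetical helper variable that is not part of the original script. The blue is taken to be "#0000FF", the value used for the title later on.

```r
library(tidyverse)

# Hypothetical helper variable; swap in "Brazil" to see Brazil moved to the
# last factor level so that its line is drawn on top of the others.
highlight_country <- "USA"

# Toy stand-in for the survey data (values are assumptions, not real numbers).
data <- tibble(
  country = rep(c("Brazil", "Canada", "India", "USA"), each = 2),
  month   = rep(c("August", "October"), times = 4),
  percent = c(88, 81, 76, 76, 87, 87, 67, 64)
) %>%
  mutate(
    highlight = country == highlight_country,     # TRUE only for chosen country
    country   = fct_reorder(country, highlight)   # FALSE levels first, TRUE last
  )

highlight_plot <- data %>%
  ggplot(aes(x = month, y = percent, group = country, color = highlight)) +
  geom_line() +
  scale_color_manual(
    breaks = c(FALSE, TRUE),
    values = c("#AAAAAA", "#0000FF")              # gray for others, blue for TRUE
  )
```

Because lines are drawn in the order of the grouping factor's levels, putting the highlighted country last is what places its line on top.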
So in the aes, I'm going to do size = highlight. And maybe I'll go ahead and put this on a separate line. And now the size of the line, the thickness of the line, is going to vary by whether it's FALSE or TRUE. And because I want to be able to specify that value, I'm going to do scale_size_manual. This is going to be very much the same as what we had up here for scale_color_manual. I can even copy this. And then I'll replace that gray with 0.5, which I think is the default thickness. And the blue, I'm going to replace with 2. So the United States should be a thick blue line, and the 14 other countries should be thin gray lines.

Now, we also have these legends on here, which I'm not so crazy about. So let's go ahead and remove those. We can do that back up in geom_line with show.legend = FALSE.

One thing that kind of sticks out to me about this blue line is that its ends are kind of like a rectangle, right? And so there's a little argument that I don't use very much, but I think I'm going to use here, called lineend, and that allows you to change the appearance of the line endings. I believe this line ending is called butt, B-U-T-T, as in two lines butting up against each other. What we could do instead is round. And so if we come up to geom_line, I'll do lineend = "round". So I think that round end does really give a nice look to my line here. Again, this only seems to be a problem when you've got a thicker line, like we do here for size 2. You can kind of see a round end on those thinner lines. But again, those lines are generally so thin that you don't notice how boxy they are. I kind of like that. And, you know, you do you.

The next thing that I want to take on is my title, right? At this point, with the way I'm showing the data, the title doesn't really do much to help tell the story. It needs to say something about, you know, well, what is that line? What country does that represent? So let's go ahead and fix that.
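Before getting to the title, here is a sketch of the thickness, legend, and line-ending changes described above. As before, the toy data are assumptions made up for illustration. The sketch follows the episode in mapping the size aesthetic; recent ggplot2 releases (3.4.0 and later) prefer linewidth with scale_linewidth_manual for lines and may print a deprecation message for size.

```r
library(tidyverse)

# Toy stand-in for the survey data (values are assumptions, not real numbers).
data <- tibble(
  country = rep(c("Brazil", "Canada", "India", "USA"), each = 2),
  month   = rep(c("August", "October"), times = 4),
  percent = c(88, 81, 76, 76, 87, 87, 67, 64)
) %>%
  mutate(
    highlight = country == "USA",
    country   = fct_reorder(country, highlight)
  )

thick_plot <- data %>%
  ggplot(aes(x = month, y = percent, group = country,
             color = highlight,
             size = highlight)) +                  # thickness varies by highlight
  geom_line(show.legend = FALSE,                   # drop color and size legends
            lineend = "round") +                   # rounded instead of butt ends
  scale_color_manual(breaks = c(FALSE, TRUE),
                     values = c("#AAAAAA", "#0000FF")) +
  scale_size_manual(breaks = c(FALSE, TRUE),
                    values = c(0.5, 2))            # thin gray, thick blue
```

The two manual scales deliberately share the same breaks, so the color and the thickness of each line are controlled by the same TRUE/FALSE column.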
We're going to change the title. And I think I'm going to go ahead and add some data into the title to help tell that story a little bit more concretely for my audience. I'm going to come way back up to before my ggplot pipeline. And I'm thinking about the title I want to have. I'm going to say something like, "Between August and October of 2020, people's intention to receive the COVID-19 vaccine dropped by X percent in the USA." Okay. So that's kind of what I want the title to say. Right. So we'll do that. But I need to inject some information in here from my data, right? I need to inject this percent change. And you know what? I want to generalize this so it's not just about the United States. So if I recreate the plot for Brazil, well, would it be "dropped by" or "increased by"? Would it be USA, or more likely Brazil if I'm looking at Brazil? And what would the percentage be, right?

So I'm going to use a function from a package called glue, and that function is glue. And glue allows us to inject information into our text strings. Again, that's from a separate package called glue, so I'll do library(glue). Then we'll take data and we'll pipe that into group_by(country). And then we'll do a summarize. I will summarize to create the difference. So I'll say diff =, and there's a function diff that allows us to look at the difference in those pairs of values for each country. So we'll do diff on percent. And so now we see our 15 countries along with that difference, right? So we could inject the country name and the difference, although we don't want to, you know, say the United States dropped by negative 9 percent. That would seem weird. So we need to add some more to this summarize. What I'm going to do is say direction = if_else with diff < 0, and then I'll say "dropped". And then otherwise I'll say "increased". And as I'm writing this, I'm thinking of Canada and India where it stays flat.
And so I'll let you figure out how you would modify this to make Canada and India work for what we're trying to do here. Okay. So we'll now have direction. And so we'll see that USA dropped. Great. But I don't want to say it dropped by negative 3%. So what I'll do back up here then is take diff and apply the absolute value to that. So I'll do abs on diff. And now what I see is that the USA dropped 3. Great. And so here now I will assign this to highlight_data and make sure that's all loaded.

Then I can take this and embed the information into my glue statement, right? So instead of "dropped", I can put in a pair of curly braces and do highlight_data$dropped. And then here instead of the X, again I'll do curly braces, and that will be diff. And then instead of USA, again I'll do the curly braces with highlight_data$country. And it's not happy with me. Ah, so that's direction, not dropped. All right. Gotta get the names right. And so now if I look at title, I see, ah, I'm producing titles for every country. And that's not what I want, right? I only want the USA at this point. So again, I'll come back up in here for highlight_data and I'll do filter(country == "USA"). And so now if I look at my title, I see that statement. And again, if you're in China or if you're in Brazil, your sentences will look different if you filtered on those different countries.

Now we'll come down to labs and its title argument. And I'm going to replace this title with my title. And let's see what this all looks like. Well, we got our title in there, but the text is really big. And it's doing all sorts of funny things with the appearance of our figure. I'm going to go ahead and shrink the size of my font. So back up in plot.title, instead of 28, I'm going to go ahead with 20. So that fits a little bit nicer. The figure does seem a little bit wide for the amount of data that I'm showing, now that it doesn't have those labels and everything.
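The title-building steps described above can be sketched like this. It assumes if_else for the direction, rows ordered August then October within each country, and a toy tibble whose values are made up for illustration (the USA values are chosen so the change comes out as a drop of 3). The flat Canada and India case is left as in the episode, as an exercise.

```r
library(tidyverse)
library(glue)

# Toy stand-in for the survey data (values are assumptions, not real numbers).
# Rows must be ordered August then October within each country for diff().
data <- tibble(
  country = rep(c("Brazil", "Canada", "India", "USA"), each = 2),
  month   = rep(c("August", "October"), times = 4),
  percent = c(88, 81, 76, 76, 87, 87, 67, 64)
)

highlight_data <- data %>%
  group_by(country) %>%
  summarize(
    diff      = diff(percent),                              # October minus August
    direction = if_else(diff < 0, "dropped", "increased"),  # flat case not handled
    diff      = abs(diff)                                   # report the size only
  ) %>%
  filter(country == "USA")                                  # one row, one title

title <- glue(
  "Between August and October of 2020, people's intention to receive the ",
  "COVID-19 vaccine {highlight_data$direction} by {highlight_data$diff}% ",
  "in the {highlight_data$country}"
)
```

Without the filter step, glue is vectorized over the rows of highlight_data and returns one title per country, which is the behavior noted above.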
So why don't I go ahead and make my figure rectangular rather than square. And to do that, again, we can come back to ggsave and change width to 4. And so now I feel like that's a lot tighter and a lot more attractive than having it spread out. And I think that looks really good.

Something I want to do, though, is highlight "dropped by 3% in the USA" in blue to really make it crystal clear that that's referring to this line. So I'll come back up to my title statement. And I will come over here, and I'm going to add a span. So this is an HTML tag. We'll do style =, and then in single quotes, we'll do color and a colon. And then I'll put in the hexadecimal for my blue, so I'll do #0000FF. And again, you use whatever color you want to use. I think blue works well for decreasing. And then we'll go ahead and close out our span. Now we have that part of the title that connects to the line highlighted in blue. And again, I think that looks pretty attractive.

Something I'm going to add to my caption is an indicator that each of these lines represents one of the different countries that were surveyed. And to do that, again, we can come back up to labs. So the caption will say that this is across 15 countries and that each line represents a different country. And that's running off the right side. And so I think what I need is an element_textbox_simple, put in for plot.caption. And we'll see what this looks like. So that element_textbox_simple automatically puts in a line break so that the text wraps rather than just kind of continuing off the right side of the screen.

One final thing that I'm going to do though is go ahead and remove these axis ticks, because I don't really think they help us very much. And so for axis.ticks, where we have element_line, I'm going to remove that and replace it with element_blank.

I like the suggestion from Cole Nussbaumer Knaflic of turning everything gray and then highlighting the story that you want with color.
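The colored span, the wrapped caption, and the tick removal described above can be sketched as follows. Several things here are assumptions for illustration: the toy data and the one-row highlight_data tibble are made up, the caption wording is a stand-in, and using element_textbox_simple for plot.title (rather than some other ggtext element) is a guess about how the original script renders the HTML in the title.

```r
library(tidyverse)
library(glue)
library(ggtext)

# Toy stand-ins (assumed values, not the real survey numbers).
data <- tibble(
  country = rep(c("Brazil", "Canada", "India", "USA"), each = 2),
  month   = rep(c("August", "October"), times = 4),
  percent = c(88, 81, 76, 76, 87, 87, 67, 64)
) %>%
  mutate(highlight = country == "USA",
         country   = fct_reorder(country, highlight))

highlight_data <- tibble(country = "USA", direction = "dropped", diff = 3)

# The <span> wraps the part of the title that refers to the blue line.
title <- glue(
  "Between August and October of 2020, people's intention to receive the ",
  "COVID-19 vaccine <span style='color:#0000FF'>{highlight_data$direction} ",
  "by {highlight_data$diff}% in the {highlight_data$country}</span>"
)

final_plot <- data %>%
  ggplot(aes(x = month, y = percent, group = country,
             color = highlight, size = highlight)) +
  geom_line(show.legend = FALSE, lineend = "round") +
  scale_color_manual(breaks = c(FALSE, TRUE),
                     values = c("#AAAAAA", "#0000FF")) +
  scale_size_manual(breaks = c(FALSE, TRUE), values = c(0.5, 2)) +
  labs(title = title,
       caption = "Each line represents a different country") +
  theme(
    plot.title   = element_textbox_simple(size = 20, color = "#000000"),
    plot.caption = element_textbox_simple(),   # wraps instead of running off
    axis.ticks   = element_blank()             # ticks removed entirely
  )
```

The span only changes color when the title is drawn by a ggtext element; with a plain element_text the tag would show up as literal text.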
We don't need a bazillion different colors if we're trying to tell a story about one country. And so I would really encourage you to think about what is the one story you want your visual to tell. Now, I know when you're publishing papers, you know, space is at a premium and you're trying to pack a bunch of stories into one figure. But if you think about it, most of the figures you generate are probably going to be for a presentation. So you have more real estate there, so to speak, to make multiple figures where you're telling one story with each slide, with each figure. And I encourage you to try this strategy there, where again, you turn everything gray, or perhaps you start with a multicolored figure. And then you say, now I want to focus on the United States and look at what's happening here in the USA, or Brazil, or whatever country you're interested in, right? And then perhaps you could look at a couple of different countries and build out the stories for those countries. I think this approach of turning everything gray and then using color sparingly to really highlight your story is really attractive and is a great idea.

Again, the book that I got this idea from is Storytelling with Data by Cole Nussbaumer Knaflic. She also has a podcast that I strongly encourage you to listen to. I listen to every episode. And even though she's coming at it mainly from a business informatics visualization perspective, I still get a lot out of it as a scientist who's working in a very different domain than most of her clients. She previously worked at Google, teaching people there how to improve their data visualizations, and she just does a lot of great work. So I'll put an Amazon link down below where you can go ahead and find that book. And I encourage you to check it out. She's got some other great tips in there. The whole book is written with Excel, and I just really admire how much she's able to get out of Excel.
And so it's been kind of cool to take some of those ideas and see how we can apply them using R. Anyway, keep practicing with all this and we'll see you next time for another episode of Code Club.