Hey folks, I'm Pat Schloss and this is Code Club. If you've been watching the past 30 episodes of Code Club, you know that I have been doing a deep dive into COVID-19 vaccination data. We started with data from Ipsos, where we looked at a variety of ways to represent people's stated intention to receive the COVID-19 vaccine back in August and October of 2020, so a year ago as I'm recording this. Fast forward, and we're now in 2021. The vaccine is available. Hopefully, like me, you've been able to get vaccinated. What we now have is data for the 15 countries that were surveyed in 2020, showing whether people in those countries were actually able to get vaccinated.

So the question I've been grappling with over the past couple of episodes is: how did people's intention relate to whether they were actually able to get the vaccine? It's one thing to say I want to get the vaccine. It's another thing to actually go out and get the poke in your arm. And of course, there are political reasons why in some countries or in some places people might not be able to get the vaccine. And then there are political things like here in the United States, where people think that vaccination is a partisan issue, which is just the stupidest thing in the world. Anyway, like I said, I hope you're vaccinated. I am relieved that I am vaccinated as well.

In the last episode, I showed you four different ways to represent the same data set using ggplot. I came upon two types of visuals that I thought did a pretty nice job of telling a story about the discrepancy, for some countries, between people's intention to get the vaccine and whether they actually got it. The first plot that I was interested in following up on is really a dumbbell plot that I converted to show an arrow instead of the dumbbell.
The arrowhead end of the line is where we are today, the percentage of people who actually received the vaccine, whereas the other end is the percentage of people in each of the 15 countries who indicated that they wanted to receive the vaccine. The other plot that I was interested in pairing with this is called a dot plot. The dot plot we're going to make shows the difference between the actual vaccination rate and people's stated intention to get the vaccine. I like the dot plot because instead of asking my audience to visually calculate what's going on in that arrow, I'm showing them the actual number, right? I'm plotting that point. So the dot plot shows the difference between intention and realized vaccination, whereas the arrow plot, as we'll call it, shows the actual intention and actual vaccination rates.

So I want to move those forward, and today I'm going to make a polished figure: I'll take the two figures and combine them using a package called patchwork, which we talked about in a previous episode. We'll put it all together, we'll make it look nice, and I'll introduce a few other concepts about patchwork that maybe I haven't talked about before. I think in the end we'll have a figure that I'll be pretty happy with, and hopefully you'll agree that it's a pretty effective way to convey a message about the discrepancy between people saying they wanted to get vaccinated and actually being able to get vaccinated.

I'm here in RStudio with my comparison_figure.R script. If you want to get this script, as well as the full history of this project, there's a link down below in the description to a blog post that will take you to GitHub, where you can get the entire repository. We've been working with this under version control. So again, we read the data in from Our World in Data.
This contains the actual percentage of people in each country who were fully vaccinated as of the end of October. We also have data from Ipsos on people's stated intention to receive the vaccine from August and October of 2020. We then joined it all together. And then, as I said, in the last episode I looked at four different ways of visualizing the data. I'm going to go ahead and join all the data together and get that read in. The resulting data frame, ipsos_owid, contains the 15 countries, their vaccination rates, and their intention-to-vaccinate rates.

Now, I'm interested in the dot plot as well as the dumbbell chart. So for now I'm going to delete the code chunks for building the other figures, just because they're kind of in the way. So now I've got my dot plot and my dumbbell chart. To remind you what the dot plot looked like: we have ordered the countries by the difference between their current vaccination rate and their stated intention to get vaccinated. India had a large discrepancy, whereas Spain actually got vaccinated at a higher rate than people there said they wanted to. So that's the dot plot. If we look at the arrow plot, you can see we've got these arrows pointing in the direction from 2020 to 2021.

I think the first thing I'm going to do is reorder all of the countries in ipsos_owid by their difference, right? Here in the dot plot code, I've actually done that already. So I'm going to pull that out of here, tack it onto the end of the data pipeline, and load that. Then if I look at my dot plot, that's in the right order. And my dumbbell plot is now in the correct order as well, with the biggest negative difference at the top and the biggest positive difference at the bottom. I'll go ahead and clean some of this up. So now I'm ready to combine these plots with each other.
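The reordering step described above can be sketched roughly like this. The transcript doesn't say which function does the reordering, so using forcats::fct_reorder() is an assumption, as are the column names (intended, actual, difference). The countries and values are made-up placeholders, not the real Ipsos or Our World in Data numbers.

```r
library(dplyr)
library(forcats)

# Made-up placeholder values, for illustration only
toy <- tibble(
  country  = c("A", "B", "C"),
  intended = c(80, 70, 60),
  actual   = c(30, 65, 75)
)

# Order the country factor once, in the data pipeline, so that every plot
# built from this data frame shares the same ordering.
# .desc = TRUE makes the largest positive difference the first level;
# ggplot2 draws the first level at the bottom of a discrete y-axis, so the
# biggest negative difference ends up at the top.
toy_ordered <- toy %>%
  mutate(
    difference = actual - intended,
    country = fct_reorder(country, difference, .desc = TRUE)
  )

levels(toy_ordered$country)
# "C" "B" "A"
```

Doing the reordering in the shared data frame, rather than inside one plot's code, is what keeps the dot plot and the arrow plot in the same order.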
To do that, I'm going to start by assigning each plot to a variable. I'll call this one dot, and this one I'll call dumbbell. So I have dot and dumbbell. You can assign a plot to a variable name and then do all sorts of things with it. You can add things onto the plot and, as we'll see with patchwork, you can actually add the plots together.

To use patchwork, I'm going to come way back up here and do library(patchwork). One of the cool things about patchwork is that I can use addition to add two plots to each other. So we'll do dot + dumbbell. We then get our two figures arrayed side by side. Again, we'll do some cleaning up here. I think I prefer this side-by-side arrangement because I can put countries on the y-axis and then have two different variables plotted on the x-axis. Alternatively, if we wanted a top-and-bottom arrangement, we could use dot / dumbbell, and that would put dot at the top and dumbbell at the bottom. As I said, I want to use the same y-axis, so I'm going to go back and use the plus sign to add these to each other.

We're going to make this look a lot better, and along the way we'll learn a little more about patchwork and review some of the concepts about using themes to make it look more polished. Something I want to emphasize is that dot and dumbbell are each variables. Just as we normally think of variables like ipsos or owid as representing data frames, dot and dumbbell represent figures. They represent plots, and we're adding them to each other. They each have their own styling. So I can add theme_classic() to this. What we see when we add theme_classic() is that it's applied to dumbbell, the last plotting object.
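Here is a minimal sketch of the composition operators just described. The data values are made-up placeholders, and the plotting code for dot and dumbbell is only a stand-in (a geom_point() and an arrowed geom_segment()), not the code from the actual script.

```r
library(ggplot2)
library(patchwork)

# Made-up placeholder values, for illustration only
toy <- data.frame(
  country  = factor(c("A", "B", "C")),
  intended = c(80, 70, 60),
  actual   = c(30, 65, 75)
)
toy$difference <- toy$actual - toy$intended

# Stand-in for the dot plot: one point per country
dot <- ggplot(toy, aes(x = difference, y = country)) +
  geom_point()

# Stand-in for the arrowed dumbbell plot: arrow from intended to actual
dumbbell <- ggplot(toy, aes(x = intended, xend = actual,
                            y = country, yend = country)) +
  geom_segment(arrow = arrow(length = unit(0.1, "inches")))

side_by_side <- dot + dumbbell  # `+` places the plots next to each other
stacked      <- dot / dumbbell  # `/` places dot above dumbbell

# Adding a theme to the combined object only restyles the last plot added
# (dumbbell here); dot keeps its original theme.
last_only <- dot + dumbbell + theme_classic()
```

Printing side_by_side, stacked, or last_only at the console draws each arrangement.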
It's not also applying it to dot. This is a common problem that I know I've had using patchwork: I really want to get my two figures looking good on their own, then join them together, and then add any extra formatting to the overall plotting window using the annotations that patchwork provides. So I'm going to remove theme_classic() from the end of that addition and put it at the end of each of the other pipelines, and again we'll load these. Now what we see is that theme_classic() is applied to both the dot plot and the dumbbell plot. Again, the difference is that we put theme_classic() at the end of each of the code chunks for building those plots rather than at the end of building the combined figure.

As I mentioned, we're going to want to format the two figures individually, and we'll come back and do more of that in a moment. But we'll also want to add attributes, labels, or annotations to the overall combined figure, right? So one question you might ask yourself is: where would we put the title? We might think about doing labs(title = ...), with something like "Countries are not meeting their people's stated desire to receive the COVID-19 vaccine". But as we saw with theme_classic(), that only adds the title to the second plot, the dumbbell plot. What we want is for that title to be applied to the entire composite figure. So instead of labs(), what we want to use is plot_annotation(). That gives us the title going all the way across the figure, and in our case it's going way across the figure. I'm going to go ahead and save this to a file, so I'll do ggsave() with "comparison_figure.tiff", width = 6, and height = 4.
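A rough sketch of the title and save steps above. The data and the plotting code are made-up stand-ins for the real script. Writing to tempdir() is also an assumption for this example; the episode saves comparison_figure.tiff into the project directory.

```r
library(ggplot2)
library(patchwork)

# Made-up placeholder values, for illustration only
toy <- data.frame(
  country  = factor(c("A", "B", "C")),
  intended = c(80, 70, 60),
  actual   = c(30, 65, 75)
)
toy$difference <- toy$actual - toy$intended

# theme_classic() goes at the end of each plot's own pipeline
dot <- ggplot(toy, aes(x = difference, y = country)) +
  geom_point() +
  theme_classic()

dumbbell <- ggplot(toy, aes(x = intended, xend = actual,
                            y = country, yend = country)) +
  geom_segment(arrow = arrow(length = unit(0.1, "inches"))) +
  theme_classic()

# plot_annotation() titles the whole composite figure;
# labs(title = ...) here would only title the last plot (dumbbell)
figure <- dot + dumbbell +
  plot_annotation(
    title = "Countries are not meeting their people's stated desire to receive the COVID-19 vaccine"
  )

# Fix the output dimensions (in inches) so the layout is judged at final size
out_file <- file.path(tempdir(), "comparison_figure.tiff")
ggsave(out_file, plot = figure, width = 6, height = 4)
```

At this width the long title runs off the right edge, which is the problem the next step addresses.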
I prefer to use ggsave() to do the fine-tuning of my figures rather than the plot panel in RStudio, because I'm going to want to set the dimensions. If you change the dimensions, you change the way things look when you finally export from RStudio, so I prefer to work in the format and size of the document I ultimately want to use, whether that's a publication, a tweet, or who knows what, right?

All right, so again we have our title going all the way across, and we see that it goes off the right end of the figure. I would like to apply some theming using something like element_textbox_simple() from ggtext to get that to wrap, and maybe change the font and some other things. So let's go back up to our libraries, and I'm going to do library(ggtext) so that I can bring in element_textbox_simple() and element_markdown(), which allow me to insert Markdown, HTML, and some other nice features into the text elements of my figures and make the theming look that much nicer.

So we have this plot_annotation() with the title. To theme our title, we give plot_annotation() the theme argument, and to that we give the theme() function. In here I can then say plot.title = element_textbox_simple(). Let's run that and make sure everything works. Now we see that our title wraps, which is one of the nice features of element_textbox_simple(). We could of course add to this; we could say size = 20 and face = "bold". So now we have our larger font, it's bolded, and it's automatically wrapping, all because we gave plot_annotation() the theme argument, and the theme argument takes the theme() function.
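As a sketch of theming the composite title through plot_annotation(), assuming ggtext is installed. The plots and data are made-up stand-ins, not the real ones from the script.

```r
library(ggplot2)
library(patchwork)
library(ggtext)

# Made-up placeholder values, for illustration only
toy <- data.frame(
  country    = factor(c("A", "B", "C")),
  difference = c(-50, -5, 15)
)

# Two stand-in panels; the real figure pairs a dot plot with an arrow plot
dot      <- ggplot(toy, aes(x = difference, y = country)) + geom_point()
dumbbell <- ggplot(toy, aes(x = difference, y = country)) + geom_point()

figure <- dot + dumbbell +
  plot_annotation(
    title = "Countries are not meeting their people's stated desire to receive the COVID-19 vaccine",
    # The theme argument styles the composite figure's own title.
    # element_textbox_simple() wraps the text to the available width.
    theme = theme(
      plot.title = element_textbox_simple(size = 20, face = "bold")
    )
  )
```

The design point is where the theme() call lives: inside plot_annotation() it targets the overall title, whereas a theme() added after the plots would only reach the last panel.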
If I had added theme() out here, right, if I'd done something like theme(plot.title = element_text(color = "red")), that's not going to change the overall title. It would only change the title of my second figure. So we need to set the theme inside plot_annotation(), again by assigning it to the theme argument of plot_annotation().

In our figure, patchwork took dot and dumbbell, added them together, and gave them the same width. But maybe I don't want them to be the same width. I would actually like the dot plot to be a little narrower than the arrowed dumbbell plot. So how do we do that? After plot_annotation() we can add plot_layout(), and then we can set widths, giving it a vector of relative widths. I could do 1 and 2, and that makes the dot plot half the width of the arrowed plot. We're going to be changing things a little bit, though. We're going to get rid of these country names, which will expand that panel a little, so I think I want to make my dot plot a little wider. Maybe I'll do 2 and 3. As always, we can come back and fiddle with things later, but know that you can change the widths of your figures. Also, if we had instead stacked them, so if I do dot / dumbbell, I could use heights instead of widths, so that the top plot is two units and the bottom is three. Of course, this looks horrible, so I'll go back to putting them side by side and using widths.

So there are a few things I want to do to this figure to make it look more polished and more presentable. The first thing I'm going to do is change the labels that I have on the x-axis.
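The relative widths and heights discussed above can be sketched as follows, again with made-up stand-in plots rather than the real ones.

```r
library(ggplot2)
library(patchwork)

# Made-up placeholder values, for illustration only
toy <- data.frame(
  country    = factor(c("A", "B", "C")),
  difference = c(-50, -5, 15)
)

# Stand-in panels for the dot plot and the arrowed dumbbell plot
dot      <- ggplot(toy, aes(x = difference, y = country)) + geom_point()
dumbbell <- ggplot(toy, aes(x = difference, y = country)) + geom_point()

# Side by side: widths are relative, so c(2, 3) gives the left panel
# two fifths of the panel area's width and the right panel three fifths
side_by_side <- dot + dumbbell +
  plot_layout(widths = c(2, 3))

# Stacked: the same idea with heights (top is 2 units, bottom is 3)
stacked <- dot / dumbbell +
  plot_layout(heights = c(2, 3))
```

Because the values are relative, c(2, 3) and c(4, 6) produce the same layout.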
Again, for both of those we need to go back to the pipelines that create the figures. Here I can come back to dot and do labs(), with x set to "Difference between actual and intended vaccination rate". I'll also do y = NULL, because I don't need it to say "country", and I'm going to guess and put in a line break right about there. Then for my dumbbell plot I'll do the same thing. We'll come in and do labs() with x set to "2020 intended and 2021 actual vaccination rates", again guessing the midpoint right about there for the line break, and we'll also do y = NULL. Add that, and here we have our titles for the x-axes. I'm not going to get too worried right now about the x-axis title on the dot plot, because I think things are going to move around. We might come back and change the font so I'm not getting this truncation on the right side of the title.

The next thing I want to do is remove these country names. Actually, I want to remove the entire y-axis from the dumbbell chart. To do that, we'll come back up here and do a fair amount with the theme() function. I'll do axis.text.y = element_blank(); basically I'm going to make a lot of things element_blank(). Then axis.title.y = element_blank(); there isn't a title there anyway, but whatever. Then axis.ticks.y = element_blank() (I had an extra period in there), and then let's also do axis.line.y = element_blank(). That gets rid of the y-axis and brings the two plots together. I think something that would really help my audience connect the dots, if you will, all the way across is putting in a grid line for each of the countries.
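The axis label and y-axis changes described above might look roughly like this. The positions of the line breaks are guesses, as they are in the episode, and the data and geoms are made-up stand-ins.

```r
library(ggplot2)
library(patchwork)

# Made-up placeholder values, for illustration only
toy <- data.frame(
  country  = factor(c("A", "B", "C")),
  intended = c(80, 70, 60),
  actual   = c(30, 65, 75)
)
toy$difference <- toy$actual - toy$intended

dot <- ggplot(toy, aes(x = difference, y = country)) +
  geom_point() +
  # "\n" inserts a line break; y = NULL drops the "country" axis title
  labs(x = "Difference between actual and\nintended vaccination rate",
       y = NULL) +
  theme_classic()

dumbbell <- ggplot(toy, aes(x = intended, xend = actual,
                            y = country, yend = country)) +
  geom_segment(arrow = arrow(length = unit(0.1, "inches"))) +
  labs(x = "2020 intended and\n2021 actual vaccination rates",
       y = NULL) +
  theme_classic() +
  # Remove every piece of the y-axis so the right-hand panel can share
  # the country labels shown on the left-hand panel
  theme(
    axis.text.y  = element_blank(),
    axis.title.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.line.y  = element_blank()
  )

figure <- dot + dumbbell
```

The theme() call comes after theme_classic() on purpose: a complete theme added later would overwrite the element_blank() settings.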
To get that grid line, I'm going to come back up to my dot plot and do geom_hline() with yintercept = country. That needs to be inside the aes() function, because we're mapping country to the intercept. Then I'll do color = "gray" and size = 0.25. I'm going to copy this, because I want the same grid lines down here in my dumbbell plot. Go ahead and load all this. So now we have grid lines that show the connection for India all the way across, for Germany all the way across, and I think that looks pretty nice. I also see that because we now have more spacing, the right side of my axis title here isn't getting truncated.

One thing that we did with the Ipsos data before was to abbreviate some of these country names, right? United Kingdom could be UK, South Korea could be S. Korea, United States could be USA. So let's go back into our code and see if we can recode those so that we have more abbreviated names. Here in ipsos_owid, I'm going to set country equal to recode() on country, and we give it the current name and the new name. So we'll say "United States" = "USA". Let's run this to make sure it all works. Sure enough, we see that United States is now USA. Let's do that with the other countries: "United Kingdom" = "UK", "South Korea" = "S. Korea", "South Africa" = "S. Africa", and that should be good. So we've got those abbreviated names.

One little thing I'm going to do is remove those tick marks, because I think they're just too bold considering the grid line right next to them. I think the grid line kind of acts like a tick mark already. Again, because that is part of the dot plot, I need to use the theme() function attached to dot to remove them. So we'll do theme() with axis.ticks.y = element_blank(), and that gets rid of those tick marks.

The final thing I want to do is change my fonts. I want to make my title font Patua One, and I'll
make everything else Montserrat, so it doesn't scream default fonts. I think one of the cool things we figured out how to do in this series of videos is how to use fonts from Google Fonts to spice up the appearance of our figure. So we'll come back up here and add library(showtext), and we'll do font_add_google() with patua-one and then Patua One. Here patua-one is the name of the font family that we'll be using, and what this is saying is: add the Patua One font from Google Fonts. We could make that more explicit by saying family = "patua-one". Then we'll do font_add_google() with family = "montserrat" and then Montserrat. We'll go ahead and run the library() call and those font_add_google() calls so that we can use these fonts in the current session. We'll then do showtext_auto().

Next I'm going to add to the theme() function for the dot plot. I'll do text = element_text() and say family = "montserrat". Get that, and I think we've got the right parentheses. Yep. Then for the dumbbell plot we'll do the same thing: text = element_text(family = "montserrat"). So now that takes care of all of the text elements of the two figures, but I also want to use Patua One for the title, right? So here in my element_textbox_simple() I can add family = "patua-one". Let's give this all a run and see what the final product looks like. Very nice, we've got our custom fonts in here.

One thing I'm noticing is that the Montserrat font takes up a little more space than the default Arial that we get with normal ggplot. What I'm seeing is that I'm losing the word "rates" from my axis title here. So I'm going to go back and make the x-axis titles a little smaller. I'll come back up and add to our dot plot axis.title.x = element_text(), and I'll do size = 10, and I'll need a comma there. I'm going to copy this same theme option down below for my dumbbell plot, because
I want those titles to be the same size. We'll go ahead and get that loaded. Yeah, so now those axis titles fit, and they look pretty attractive.

One final thing. There's always something, isn't there? I'd like to have a 20 on the x-axis of the dot plot here, because it just seems like these points are hanging out in the nothingness. Again, we can come back up to our dot plot, and I will add scale_x_continuous(). We'll do limits equal to NA and 20. The NA says, "scale_x_continuous(), use your algorithm to figure out the appropriate lower end, but I want you to use 20 as the upper limit." Then we can set breaks, and I'm going to use the seq() function. I don't necessarily know what breaks I'm going to need. I know I want them every 20, but I don't know what that lower end is, right? So I can use seq() to go from -100 to 20 by 20s, and that should give me breaks every 20 percentage points. We now have that positive 20 value on the x-axis, which gives pretty good bounds on the difference between actual and intended vaccination rates, and I'm really happy with the way this looks.

I really appreciate you sticking out this series with me. I'm sure some of you were like, "Man, Pat is still talking about this data set." But you know what? I've learned a lot going through these 30 episodes and thinking about how we can represent the data, how we can work with the data to get it into a format that makes for attractive visuals, and how we can make things look a little different than they normally might. I hope you like this version of the figure. I'm positive that if you had to do it, you'd come up with something different, and that is A-OK. In fact, what I'd love for you to do is tweet your version of this figure at me. Maybe you picked a different combination or pairing of figures to tell this story. That's awesome. Tweet it at me, let me see what you've done, and we can continue the conversation over on
Twitter. Of course, feel free to leave any other comments that you have down below in the comments of this video. So please practice with this, and we'll see you next time for something new here on Code Club.
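For reference, three of the later steps from this episode (abbreviating country names, drawing a grid line per country, and fixing the x-axis limits and breaks) can be sketched together as below. The difference values are made-up placeholders. The episode used size = 0.25 for the grid lines; linewidth is the equivalent argument in newer ggplot2 releases and is used here as an assumption about the installed version. The Google Fonts step is left out because it needs network access.

```r
library(dplyr)
library(ggplot2)

# Made-up placeholder differences, for illustration only
toy <- tibble(
  country    = c("United States", "United Kingdom", "South Korea", "South Africa"),
  difference = c(-10, 5, 15, -40)
)

# recode() takes "current name" = "new name" pairs
toy <- toy %>%
  mutate(country = recode(country,
                          "United States"  = "USA",
                          "United Kingdom" = "UK",
                          "South Korea"    = "S. Korea",
                          "South Africa"   = "S. Africa"))

dot <- ggplot(toy, aes(x = difference, y = country)) +
  # country is mapped to the intercept, so it must sit inside aes()
  geom_hline(aes(yintercept = country), color = "gray", linewidth = 0.25) +
  geom_point() +
  # NA lets ggplot2 choose the lower limit; 20 pins the upper limit.
  # seq() over-generates breaks; those outside the limits are not drawn.
  scale_x_continuous(limits = c(NA, 20),
                     breaks = seq(-100, 20, by = 20)) +
  theme_classic() +
  # The grid lines already act as tick marks on the y-axis
  theme(axis.ticks.y = element_blank())
```

Generating breaks from -100 avoids having to know the exact lower end of the data in advance.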