My goal for this episode is to critique these two figures so that in the next episode we can do a mashup of the two, taking the good things from each to make an improved dumbbell chart. Hey folks, I'm Pat Schloss and this is Code Club. If you've been here before, welcome back. If you're brand new, welcome. I'm also going to put a link in the upper corner of the screen so you can watch the rest of this playlist, which we are still in the process of developing. Over the past six or seven episodes, we've been trying to recreate two figures based on data from August and October of 2020, when people in 15 different countries were surveyed and asked, prospectively, whether they would be willing to receive a COVID-19 vaccine when it eventually became available. Of course, it is now available. Even though the data are old and maybe a little stale at this point, I thought it was an interesting set of data to work with. The first goal was to recreate the Ipsos figure solely using our tools, without poking or prodding anything in something like Adobe Illustrator or PowerPoint or any of the other tools people use to make their figures look a little bit better. Then, over the course of about five episodes, we took that Ipsos version of the chart and modified it to look like a figure I received in an email newsletter from a group called Chartr, where they applied a little more styling to the figure. I thought that was a lot of fun. I know I learned a ton about using ggplot to do things with color, to make strips in the background, to add new fonts, and to add those arrowheads. Maybe you're not going to make a plot like this for your own data, but hopefully there are elements of what we did to create the Chartr version that you can now implement in your own figures.
If you like this type of work, I strongly encourage you to sign up for the email newsletter that I produce. I've put out two or three releases so far, and I would like to have you join as well. If you go to riffomonas.org and scroll to the bottom of the page, there is a very simple sign-up. You'll receive a weekly newsletter with my random thoughts on things, some practice exercises using R and other practices in reproducible research, as well as announcements about other things we have going on in the Riffomonas professional development program. So please sign up and you'll get your first email next Friday. As I mentioned in the newsletter that I sent out last Friday, August 27th, I've been binge listening to a podcast hosted by a guy named Chris Do, who is the CEO of an organization called The Futur. Their goal is to teach a billion people how to make a living doing what they love. He's primarily a designer, a graphic designer and motion designer, and I find the content really fascinating. He was talking to a guest in one of the episodes, and sadly I've misplaced which one. This guest was commenting on how he remembered reading something from a philosopher describing a utopia where you couldn't critique something without first having created it yourself and tried to improve upon it. That's basically what I've been doing over the past six or seven episodes: developing what Ipsos and Chartr developed. Now what I want to do is think about what worked well and what we could do better. So my goal for this episode is to critique these two figures so that in the next episode we can do a mashup, if you will, taking the good things from both to make an improved dumbbell chart that has the aspects I like, at least, of these two figures.
Along the way, what we will no doubt learn is that the people who developed these figures had certain limitations. The data may be limited, there are limitations of the medium, and there are just constraints on what you can do. So there is no perfect visual. While I can come up with a critique, and then try to implement that critique and overcome those problems, my version is also going to have limitations. That's something we need to learn, and we need to learn that lesson really well. I'm going to be working on the two figures here on my tablet and marking them up with my pencil. The first thing I want to do when I look at one of these figures is ask, what do I think about this? What do I see? And I'm not going to apply any judgment, right? I could be looking at the most hideous figure with all sorts of horrible colors and fonts and whatnot. But if you strip away all the stuff that's perhaps a little distracting, what do I see? When I looked at this figure for the first time, one of the first things I said was: 15 countries, the US is in there, so where is the US? I found the US, and we were second to last. So we're not doing great, but we're better than France, at least, right? The other thing I found really interesting is that, sure, we dropped about 3 percentage points, but that's probably within the margin of error. I figured we were holding steady at about 64 to 67%, in the mid-to-high 60s. So at a qualitative level, perhaps we're not moving that much. The second question I had was why India and China were doing so well. They were at really high levels: 87% for India and, in August, 97% for China. Even in October, when China's number had fallen to 85%, it was still high. So why were they so much higher? And also, what happened in China? Why did the number fall so precipitously from August to October?
The third question I had was why all of these countries seem to be falling off, except for Germany, Mexico, and South Africa. In those three countries, the number of people saying they would receive the vaccine actually increased, right? Again, this got me thinking and asking lots of questions. I think a data visual is good if it can provoke you to ask more questions. Hopefully it also answers questions, but I like this one because it was provocative and it got me thinking about other things I would want to know. In the original report, this was the very first figure. So it really set up the rest of the report around this August-October comparison. I think they then used it as a springboard to get some summary statistics and do further analysis later in the report. Now that I've stripped away any judgment and thought about what the figure is saying to me, I also want to think about who the audience is. I figured that the audience is most likely policymakers. The figure has a very professional tone to me. It's a lot of black and white and dark, subdued colors. We'll talk about why that might actually be a problem in a moment. But this looks like something that would go to people in think tanks, or to people who might restyle it to reach the general public. So if the audience is public policymakers, then I think it works pretty well. As I mentioned, I think this is a good figure because it's provocative, right? I have lots of questions.
I also like how they dealt with the paired data. They used what we've talked about before, a dumbbell chart. If you think of a dumbbell or barbell, the two weights are the two points in time, August and October, and the handle is the connection between those two points. I think that worked really well to pair the data. We'll talk about other ways we might do that in the future, but it was certainly better than having a bunch of points for the 15 countries from August and then a bunch of points from October without any connection between them. So I liked how they did that. I think it was a pretty good trade-off to have those two points connected and then have that pair connected with the country name. Some other approaches we might think about, like a slope graph, might get a little too busy because the country names might all run into each other. I also really liked the caption. They told me the number of individuals, the ages, and the countries. In the original version, they also tell us the source, who they worked with to generate the data, how they created it, and that you can get the data by going to a link. So I really like what they built into the caption. They didn't need a lot of words to say it, but I thought it was really effective. Now let's turn to what we don't like about this figure. Again, these are my own opinions. You might have different opinions, and if you do, by all means let me know down in the comments. The first thing I struggled with in this figure is, what is it saying? What is the story? Is this figure trying to shame certain countries like the United States or France, right?
Or is it trying to highlight the declining percentage of people willing to receive the vaccine? From this depiction of the data, it's not immediately clear. They've sorted the data by people's responses in October. They haven't sorted it by the decline or the increase in people willing to receive the vaccine. So it's a little confusing what they're trying to say. Again, this figure was part of a bigger report where the headline was something about declining numbers of people willing to receive the vaccine. But if this figure were going to stand on its own, it would be more challenging. And if the story is about declining numbers, then I'm not totally convinced that this is the best way to show the data. My next two critiques come back to the fact that I don't really like clutter in figures. I like to keep things as simple as possible. When I look at the legend, I have two problems. First, "total agree" is repeated for both the October and the August labels in that legend. I do like this legend. If I didn't mention it before, I liked having this as an alternative to a legend off in the right-hand margin. The other challenge is that there's not enough contrast between the colors of the October and August data for me. I kept struggling to remember which one was August and which was October. There are only two points, so I know I should be able to keep track of this. But I think only now, after making seven videos or however many about this figure, have I finally remembered that gray is August and blue is October. It shouldn't be that hard. We could use a more muted color for August and a brighter, more contrasting color than this teal for the October data, which would hopefully signal visually that these data are from October and are being compared back to August.
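Coming back to the sorting point, here is a minimal sketch in base R of how the row order changes when you sort by the change rather than by the October value. It uses only the three pairs of numbers read off the figure in this episode (India, China, and roughly 67 to 64 for the United States). The data frame and column names are made up for illustration and are not taken from the Ipsos data file.

```r
# Percentages quoted in this episode; the names here are illustrative only
survey <- data.frame(
  country = c("India", "China", "United States"),
  august  = c(87, 97, 67),
  october = c(87, 85, 64),
  stringsAsFactors = FALSE
)

# Change in percentage points from August to October
survey$change <- survey$october - survey$august

# Order used in the Ipsos figure: highest October value first
by_october <- survey$country[order(survey$october, decreasing = TRUE)]

# Alternative order that puts the biggest decline first
by_change <- survey$country[order(survey$change)]

print(by_october)  # India, China, United States
print(by_change)   # China, United States, India
```

To carry an ordering like this into a plot, one option would be to turn `country` into a factor whose levels follow `by_change` before handing the data to ggplot.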
The third thing I didn't totally like about this figure is that there's so much text on it. Some of that is good, because they are labeling the points. The scientist in me wants to get rid of all the percentage labels on the points and have people look at the x-axis and then map up. At the same time, maybe that makes things harder, because you're always looking down and up, down and up across the figure. So what I might prefer to do is remove the percentages on the x-axis, because we've got those numbers next to the points. One problem with this, though, is that it could be a little misleading if you don't know the full range of the points. But as we can see, France is at 54 and China had a point up at 97. So the data really span everything from 50 to 100%, and I don't think it's that deceiving. The reader is going to have to trust us on some level that we're not doing some weird compression of the x-axis. And I think if we keep the grid lines in, it'll be clear where the 5 percentage point increments are. So this is one of those stylistic points where we make a decision for simplicity that may have a detrimental effect on clarity. We won't know until we ask somebody how they interpret our final version of the figure. The fourth critique I have is that there is a row in here for the total. These are all countries along the y-axis, and total is not a country, right? So an idea I had was, instead of having the total as a row, what if we had a vertical line to indicate where the average was in October and another vertical line indicating where the average was in August? Those vertical lines would give us a reference position, so that we could compare the data for each country back to those averages.
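The vertical-line idea could be sketched in ggplot2 along these lines. This is only a rough sketch under a few assumptions: it uses the three countries whose numbers are quoted in this episode, it takes a simple mean across those countries as a stand-in for the survey total (the real Ipsos total may be calculated differently), and the object names and colors are made up for illustration.

```r
library(ggplot2)

# Percentages quoted in this episode; names are illustrative only
survey <- data.frame(
  country = c("India", "China", "United States"),
  august  = c(87, 97, 67),
  october = c(87, 85, 64),
  stringsAsFactors = FALSE
)

# Long version of the same data: one row per country and month
long <- data.frame(
  country = rep(survey$country, times = 2),
  month   = rep(c("August", "October"), each = nrow(survey)),
  percent = c(survey$august, survey$october),
  stringsAsFactors = FALSE
)

# Assumption: a simple mean across countries stands in for the "total" row
averages <- aggregate(percent ~ month, data = long, FUN = mean)

p <- ggplot() +
  # one dashed reference line per month instead of a "Total" row
  geom_vline(data = averages,
             aes(xintercept = percent, color = month),
             linetype = "dashed") +
  # the handle of each dumbbell
  geom_segment(data = survey,
               aes(x = august, xend = october, y = country, yend = country),
               color = "gray70") +
  # the two weights of each dumbbell
  geom_point(data = long,
             aes(x = percent, y = country, color = month),
             size = 3) +
  scale_color_manual(values = c(August = "gray50", October = "dodgerblue")) +
  labs(x = "Percent who agree", y = NULL, color = NULL)

print(p)
```

Mapping the reference lines to the same color scale as the points is one way to tie each line to its month without adding another legend entry.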
Again, if the point is comparing change across all these countries, then maybe having total in there isn't super relevant, except that it does show the average change across all 15 countries. I don't know, I really struggle with that. It comes back to what the point of this figure is. If it's to compare countries, then I think it would be best to have those vertical lines. If it's to compare the change in opinions, then I don't know, maybe we leave it in there, or maybe we choose a different way to represent that change than these barbell charts. If our goal is to compare the change in people's receptivity to receiving the vaccine, then maybe we just get over our qualms about total not being a country and leave it there on the y-axis. Alternatively, we start thinking about a different way to represent the data, perhaps something like a slope chart, where the x-axis would be the date, August and October, the y-axis would be the percentages instead of the countries, and we'd have a separate line for each country. That's not a perfect solution either, but maybe we'll talk about it in a future episode. The Chartr version of the figure provoked a lot of the same questions that the Ipsos version did. I actually saw the Chartr version before I saw the Ipsos version, so all those questions I had about the Ipsos version, I also had for the Chartr version. Let me tell you what I liked about this figure. First, I think it's pitched not so much to a technical audience as to a general audience. Whenever I see this type of styling, I think, well, that's not professional or technical, right? To me, black and white is technical, and colors and funny fonts are more whimsical, more about public access to science. I don't know why I feel that way, but I just feel like this is being pitched to a more general audience.
And heck, I received this as part of an email newsletter about data visualizations where half the time they're talking about Marvel movies. So this is not a very technical audience receiving this figure, by any means. The second thing I like is that the title is different and it tries to tell a story. We'll come back to why that might be a problem for this particular figure, but at least it has a title that tries to communicate something: vaccine skepticism by country. Another thing I liked about this figure was the legend. I like that they have the question from the survey as the caption of the legend, at the top. So as I'm reading the figure, and you can go back to the episode I did on the Z pattern, I go to the title and then come across to the legend, which shows us the question as well as the two dates and the colors that correspond to them. I feel like these colors have good contrast, which at least helps me to understand the difference between the two points. I like the contrast of those two colors and how they linked them to the question at the top of the legend. I thought both of these figures had pretty nice ways of approaching the legend. The other thing I liked were the arrows. They show the flow of time, so that we can easily see that the blue was August and the green was October. The final thing I'll say that I liked about this figure is its compact format. It's not landscape, it's square. So the data are much closer to the country names than they were in the Ipsos version, which was more rectangular and where the points were quite a ways from the names. I like that.
I hadn't thought about that before, how the dimensions of the figure can help compress the data to be close to the country names. So what don't I like about this figure? Well, first of all, the data do not measure skepticism. They measure people's willingness to receive the vaccine. That is not skepticism. That is willingness, right? So "vaccine acceptance by country" or "willingness to receive a vaccine by country" would have been a better title. Again, these numbers on the bottom are acceptance. They are not skepticism, right? If we wanted to look at skepticism, then we'd probably need to flip the x-axis and plot 100 minus whatever value we see on the x-axis. And then we might want to sort the countries by their skepticism. As it is, it's sorted by acceptance of a potential vaccine. So I find the title problematic. They tried to tell a story, but I think they got the story backwards. Another big problem with this figure is that the question shown wasn't the actual question that was asked. It reads, "If a vaccine for COVID-19 were available, I totally agree I would get it." No. The options were that I strongly agree I would get it or that I somewhat agree I would get it, right? These numbers are the total of somewhat and strongly agree. It's not "totally agree". "Totally agree" in American English would mean, I'm going to get it, I'm absolutely going to get it, right? Whereas here, it's a total. It's a sum. That's maybe a small point, but I think it's significant, because it suggests that these people are gung-ho to get the vaccine when for some it's perhaps more of a "somewhat", with a little hedging on whether or not they'd receive it. When I was talking about the Ipsos figure, I commented that one of the things I always notice is when there's just too much stuff in the figure.
So although I like the arrows and I like the dumbbell chart aspect of this, I wonder if we need both. What if this one had instead just been an arrow pointing to the left and this one an arrow pointing to the right, right? Perhaps we could use different colors, so that a drop in receptivity to the vaccine would be red and an increase would be blue. Who knows? Where that falls down, however, is in cases like India and Canada, where there was no change. Unfortunately, one of the problems with this layout, where they're only giving you one of the two numbers (remember, the Ipsos version had both numbers, from August and October), is that I start to wonder, were there August data? There were August data, but the way this is shown, I don't get to see them. Again, in the Ipsos version, for India, they had 87% on both sides of the point. So in some cases it's worth thinking about using an arrow to show the direction of time as an alternative to a dumbbell chart like this. But I think you probably need both, and you probably need both numbers to make it clear that you're not missing data. So I'm not sure what to think about combining the arrows with the dumbbell chart. I think if you pick good colors with high contrast, then maybe that will do a better job of showing the flow of time than what we saw in the Ipsos version of the chart. That's something I want to experiment with in a future episode. The arrows might also work as an alternative to the dumbbell chart in a case where every country showed a difference, say, between August and October. So that's a technique worth keeping in your back pocket for another day. I'm not going to fault them for putting the arrows on. I thought it was a nice touch, but it might just clutter up the figure.
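For the back pocket, here is one way the arrows-only idea might look in ggplot2, with red for a drop and blue for an increase as floated above. Treat it as a sketch under assumptions rather than the approach used in either figure: the first three rows use numbers quoted in this episode, the fourth row is entirely made up so that there is an increase to draw, and falling back to a plain point when nothing changed is just one possible way to handle cases like India.

```r
library(ggplot2)

# First three rows: values quoted in this episode.
# Last row: HYPOTHETICAL values, included only to show an increase.
survey <- data.frame(
  country = c("India", "China", "United States", "Hypothetical"),
  august  = c(87, 97, 67, 60),
  october = c(87, 85, 64, 65),
  stringsAsFactors = FALSE
)

survey$change <- survey$october - survey$august
survey$direction <- ifelse(survey$change > 0, "increase",
                    ifelse(survey$change < 0, "decrease", "no change"))

# An arrow needs a start and an end, so unchanged countries get a point
moved  <- survey[survey$change != 0, ]
steady <- survey[survey$change == 0, ]

p <- ggplot() +
  # arrowhead sits at the October value, so the arrow shows the flow of time
  geom_segment(data = moved,
               aes(x = august, xend = october, y = country, yend = country,
                   color = direction),
               arrow = arrow(length = unit(0.1, "inches"), type = "closed")) +
  geom_point(data = steady,
             aes(x = august, y = country, color = direction),
             size = 3) +
  scale_color_manual(values = c("increase" = "blue",
                                "decrease" = "red",
                                "no change" = "gray40")) +
  labs(x = "Percent who agree", y = NULL, color = NULL)

print(p)
```

The `steady` rows are exactly where the arrows-only layout gets awkward, which is the India and Canada problem described above.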
So those are my thoughts on the two figures, things I liked and things I didn't like. I tried to have more things I liked than didn't like, to stay positive, right? Anyway, my goal for the next episode is to take a few things I liked from both figures and meld them together into a new figure, as well as to think about some of the challenges both figures had and propose solutions to overcome them as we make a revised version. The first thing I want to make sure I do is have an angle, a story that I want to tell. I think the story is actually the headline from the report that the figure came from: COVID-19 vaccination intent is decreasing globally. That should be the title of this figure, right? Then we could have a subtitle, between the title and the body of the figure, that would be like what we had for the legend of the Chartr version, which was actually the title of the Ipsos version, but getting the text right. I think that would be a good subtitle to make it clear to people what the actual question was as we're measuring vaccination intent globally. The next thing I want my remixed figure to incorporate is that alternating colored background. Maybe we'll do something with gray and white, to be a little more subdued than what we saw in the Chartr version. That might also help us find more contrasting colors, so that the August and October data are more distinct from each other and perhaps easier to recognize as time points.
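A rough sketch of how the title, subtitle, and gray-and-white stripes might come together in ggplot2 is below. The title is the report headline quoted above. The subtitle wording is my own paraphrase of the survey question, not the exact Ipsos text, and the colors, object names, and three-country data set (numbers quoted in this episode) are placeholders. It is a starting point under those assumptions, not the finished figure.

```r
library(ggplot2)

# Percentages quoted in this episode; names are illustrative only
survey <- data.frame(
  country = c("India", "China", "United States"),
  august  = c(87, 97, 67),
  october = c(87, 85, 64),
  stringsAsFactors = FALSE
)

# Numeric row position so the stripes can be drawn as rectangles:
# lowest October value at the bottom
survey <- survey[order(survey$october), ]
survey$row <- seq_len(nrow(survey))

# Every other row gets a light gray band; the rest stay white
stripes <- survey[survey$row %% 2 == 0, ]

p <- ggplot(survey) +
  geom_rect(data = stripes,
            aes(ymin = row - 0.5, ymax = row + 0.5),
            xmin = -Inf, xmax = Inf, fill = "gray90") +
  geom_segment(aes(x = august, xend = october, y = row, yend = row),
               color = "gray50") +
  # muted color for August, brighter color for October
  geom_point(aes(x = august, y = row), color = "gray40", size = 3) +
  geom_point(aes(x = october, y = row), color = "dodgerblue", size = 3) +
  # put the country names back on the numeric y-axis
  scale_y_continuous(breaks = survey$row, labels = survey$country) +
  labs(
    title = "COVID-19 vaccination intent is decreasing globally",
    # Assumed wording: a paraphrase, not the exact survey text
    subtitle = "Percent who strongly or somewhat agree they would get a COVID-19 vaccine if it were available",
    x = NULL, y = NULL
  ) +
  theme_minimal() +
  theme(panel.grid.major.y = element_blank(),
        panel.grid.minor = element_blank())

print(p)
```

Using a numeric row position with relabeled breaks is just one way to get the bands behind the points; the earlier episodes in this playlist may have built the stripes differently.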
Thinking about that problem of having total on the y-axis, what I'd like to do is remove total from the y-axis and use the legend scheme from the Ipsos figure, where we used text boxes to indicate the legend, and then draw vertical lines in the colors of the two dates that connect up to those two text boxes. I'd like to give that a shot and see what we get. It might not work as well as I think, but it's worth a try. Next, I'd like to remove the x-axis text. I'd really prefer to get rid of the numbers next to the points, but I think those are important for a more general audience, so that viewers can see the numbers next to the points and aren't scanning vertically on the figure to connect each point to the number down below it. Also, I'll try to make the figure square, so that we have a more compact layout where the country name is closer to the actual data. So those are my plans for the next version of the figure. Please make sure that you're subscribed to the channel so that you know when that episode is released. Of course, we will look at that figure and it will not be perfect, but it will be our next iteration on these two great figures. And from the critique of whatever I make, we'll make another version of the figure, and maybe another version after that. Who knows? I think this is a fun set of data that's really allowing us to explore different aspects of data visualization using ggplot and the R software environment. I'd strongly encourage you to take one of your own figures, or one of your favorite figures from the literature, and do a critique like I did on these two figures. And if it's your own figure, by all means go back and see if you can make that figure better. I look forward to building this next version of the figure with you and to getting your feedback on it.
I think that's a really important thing. I can make all the figures in the world, but until I share them with people out in the world, I don't know how they're perceived, right? That's a place where the developers from Ipsos or Chartr make themselves vulnerable. They release a figure to the world and then get feedback on whether or not we like it. Well, we need to get feedback before we release it to the world. So I will definitely be looking for your feedback, and you should also be looking for feedback on your figures from your peers in the community around you. All right, keep practicing and we'll see you next time for another episode of Code Club.