A big, big problem that I see in scientific communication is that we expect our audience to do way too much work to draw the inferences that we want them to see. This is true when our audience is the lay public, but also when we're talking to our scientific colleagues. An underlying reason for this is that we approach the composition of our figures in an unscientific manner. Well, in today's episode of Code Club, I'm going to share with you the concept of the Z pattern and how we can use it to more effectively communicate what we're trying to say in our figures. Stay tuned and I'll show you how we can do this in R.
Hey, folks, I'm Pat Schloss and this is Code Club. In the past several episodes, I've been talking a lot about a rubric that I'm trying to develop to assess and critique different visuals that we might want to make. We've been demonstrating this using a scatter plot of an ordination that I made for data that I published a number of years ago in a supplemental figure that at the time, frankly, I didn't put a lot of thought into, and it kind of shows. So anyway, we're using this as an example to build out different ideas that I want to put in this rubric so that when we get a new visual, we can critique it and perhaps then have the skills and tools we need to make it better.
Today, as I've already mentioned, we're going to talk about the Z pattern. What is the Z pattern, you say? Well, if you know the shape of a Z (zee, or zed, as my Canadian and British colleagues and other people who speak English outside the USA would say), it's that shape. The idea is that when we look at a document or a webpage or anything, really, our eyes instinctively trace a Z. I talked last time about pre-attentive attributes. Well, the Z pattern is really a pre-attentive attribute: my eyes start in the upper left corner of the window, they go to the upper right, then they go down to the bottom left, and then over to the bottom right.
This zigzag or Z pattern can be repeated many times across a document. If we take the Airbnb website as an example, when this page opened up, where did your eyes go initially? Well, if you're like most people, your eyes probably went to the upper left corner, to the Airbnb logo, right? Have you ever noticed that all the companies have their logos in the upper left corner? I wonder why? Anyway, from there, your eyes scan across to the right. And as you go across, there's all sorts of useful information that you can key in on: where you're going, your check-in date, your checkout date, the number of guests you're going to have. And then on the far right side is the information about how you can sign up and whatnot. And then your eyes say, oh, what's down here at the bottom? And your eyes then scan across this very attractive image, and again across to the right for more information that the people at Airbnb want you to see. This layout is not accidental. The web designers at Airbnb know what they're doing because they know about the Z pattern.
So this is the current version of the figure that we've been trying to improve as we critique and learn more about R and ggplot. Again, the data are taken from a paper that we published seven or so years ago in mBio. And while this is better than the original supplemental figure that we built, if we critique it from the Z pattern perspective, our eyes start in the upper left corner. Well, there's nothing there for me. I don't know what I'm supposed to do from there. So now I scan to the right, where I see my legend. I now scan down and I see, oh, NMDS axis 2. This is an NMDS. I then scan over. I see the data. I can see the patterns. I'm perhaps not totally sure what I'm supposed to be seeing. I don't totally have a context for these different names.
And then I come back and see NMDS axis 1 down at the bottom. My eyes are kind of all over the place. And as the designer of the graphic, I'm not really helping my audience to read through the figure by taking advantage of the Z pattern. The other thing you might recall is that in the original paper, the caption that went with it is totally nondescript. The title is "Overall structural differences between experimental groups." My goodness, how boring. What is the point? What's going on in this figure? We have no idea. This does not help us at all. One of the other things that we know, if we take a look at one of these other figures from the paper, is that we generally put the caption, the figure title, below the figure, right? Again, I'm going to kind of do my Z thing here, and only then am I going to look at the caption. It just doesn't help, right?
Now, for a scientific publication, we are not going to change the way that we publish papers today, right? Maybe that's something we can look forward to in the long term. But if you're giving a talk and you're showing a figure, well, then you can do anything you want. And I would encourage you to do anything you want to make the graphic easier for your audience to understand. There you have the power. And again, this underscores the point that if you've got a figure from a paper, do not merely copy and paste that figure into your talk, because it's a different audience. You have different sets of constraints and really different sets of opportunities.
All right. So I think we have an idea of what we can do to improve our figure. Let's go into our studio and see if we can implement some of those ideas. So I have the code here that generates the ellipses around the three different categories from the figure I just showed you. Running it will generate this figure, which I showed you previously.
So if you'd like to follow along, I totally encourage you to, because that's how you will learn the most. Down below in the notes, there is a link to a blog post for today's episode that will get you the code that I am starting with today. If you run all that code, you should get this figure. And that, again, is going to be our starting point.
All right. So one of the first things that I'd like to do is put in a title. And then let's maybe move our axis labels to take advantage of that Z pattern. What if we move NMDS axis 2 up to the top, and NMDS axis 1 over to the left? Let's see how that looks and we'll go from there. All right, we can come down to labs, which we had already started. And we'll do title equals, and then let's say "Healthy individuals have a different microbiota from those with diarrhea and those with diarrhea who are positive for C. difficile." Okay, and we'll put a comma at the end of that, and get our tabs right here. If we run this, we now see that our title runs off the screen. I can go ahead and insert some line breaks. I'm going to put one after the "from," and we do that with backslash n (\n). I will do this iteratively, so I know where the line breaks should be. So we need one between "who" and "are." That looks good. It appears that my "from" has now gotten bumped. I'm going to leave that there for now, because in a moment we're going to see that we need to readjust things.
One of the things I don't like with this title is that it's left justified on the y axis. What I'd rather do is have it left justified with the y-axis title, at the left edge of the plot. We can change that in the theme here with plot.title.position, and we can then put in "plot". That will justify the title to the left side of the plot. And we now see that it's bumped over. I think that looks like pretty good justification. So we've got a declarative title now, right?
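To make the title step concrete, here is a minimal sketch in R. It does not use the data from the paper: the nmds data frame, its column names (axis1, axis2, disease_stat), the group names, and the axis label text are stand-ins assumed for illustration. Only labs(title = ...) with \n line breaks and theme(plot.title.position = "plot") come from the walkthrough above.

```r
library(ggplot2)

# Hypothetical stand-in for the NMDS coordinates used in the episode;
# column names and group names are assumptions.
set.seed(1)
nmds <- data.frame(
  axis1 = rnorm(30),
  axis2 = rnorm(30),
  disease_stat = rep(c("NonDiarrhealControl", "DiarrhealControl", "Case"),
                     each = 10)
)

p <- ggplot(nmds, aes(x = axis1, y = axis2, color = disease_stat)) +
  geom_point() +
  labs(
    # "\n" inserts a line break so the long title does not run off the plot
    title = paste0(
      "Healthy individuals have a different microbiota from\n",
      "those with diarrhea and those with diarrhea who\n",
      "are positive for C. difficile"
    ),
    x = "NMDS Axis 1",
    y = "NMDS Axis 2"
  ) +
  theme_classic() +
  # Align the title with the left edge of the whole plot instead of the panel
  theme(plot.title.position = "plot")

p
```

Where the line breaks belong depends on the size of the final image, so expect to adjust them iteratively.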
So again, we approach this figure, we start in the top left, and we read what it has to say: healthy individuals have a different microbiota from those with diarrhea and those with diarrhea who are positive for C. difficile. Then I read to the right and see my legend, I come back to NMDS axis 2, and I see this is an ordination. I can then see the graph, right? This is kind of like that pretty picture at Airbnb. And I can now interpret my graph. So that's, I think, a good start.
Something I'd like to do is see how we can move this y-axis title to be justified up at the top of the axis. Again, this is a very different way of thinking about the composition of a scientific graph. We're so used to having the title be center justified on our axes. So we can do axis.title.y, and then element_text, and then we can say hjust. And hjust is the parameter where, if I do zero, that should justify it at the bottom, right? If I do hjust equals 0.5, that's going to put it in the middle. And if I do one, that will put it up at the top, right? Very slick. And so you can use values between zero and one to indicate where to justify the text. So that axis title is now up at the upper left. And so again, we can read the title, come back, and see, oh, this axis is NMDS axis 2. And then we scan over, we come down, and go back. So let's make this NMDS axis 1 left justified. We'll do the same idea: axis.title.x, element_text, and then hjust equals what? What do you think we should put there? Well, we're going to put zero, right? Because zero is left justified and one is right justified. And so now we see that we've got our axis label at the left side. Okay, so this is very different already from how we are used to thinking about scientific publications. Great. One other thing I notice is that the spacing between this axis title and the title of my plot is kind of tight. So let's go ahead and add some space.
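The axis-title justification described above can be sketched as follows. The toy data frame and label text are assumptions for illustration; the two theme settings are the ones named in the walkthrough.

```r
library(ggplot2)

# Toy data, only so that there is something to draw
toy <- data.frame(axis1 = c(-1, 0, 1), axis2 = c(0.5, -0.5, 0.2))

p <- ggplot(toy, aes(x = axis1, y = axis2)) +
  geom_point() +
  labs(x = "NMDS Axis 1", y = "NMDS Axis 2") +
  theme_classic() +
  theme(
    # hjust runs from 0 to 1 along the reading direction of the text.
    # The y-axis title is rotated, so 1 pushes it to the top of the axis.
    axis.title.y = element_text(hjust = 1),
    # For the x-axis title, 0 is left justified and 1 is right justified.
    axis.title.x = element_text(hjust = 0)
  )

p
```

Intermediate values such as 0.25 also work if a fully top- or left-justified title feels like too big a departure for a given audience.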
So we've got plot.title.position; let's do plot.title. And let's do element_text. And then we can do margin equals margin. And I'll do top, or, yep, bottom, because that's the bottom of the title, equals one, and I'll do unit equals "lines". And I need a comma there. And so we get a little bit more spacing now between our title and the axis label. So that already looks better using this Z pattern, which is really nice.
Something I'd like to do that we've already talked about is that C. difficile here is in an upright font, and it should be italicized because it's a bacterial name. Similarly, here in the legend, C. difficile is upright, and we'd like that to be italicized as well. So we can make use of a package called ggtext. If you haven't installed ggtext already, I'd encourage you to do that. I will do library(ggtext) and load that, to make sure it's installed and loaded. What I can then do is come to C. difficile and use markdown to get that to be italicized. So I'll put a star on either side of C. difficile there. And then also in my legend, for the labels, I'll put stars (not parentheses, stars) on either side of C. difficile. I'll do that for scale_color_manual as well as scale_fill_manual. So I need to come down here now and change this element_text to be element_markdown and give that a run. And we now see that it didn't work. That's because it doesn't like those backslash n's: what element_markdown does is interpret the string in the title as HTML or as markdown, and so backslash n doesn't have any meaning. To do this, we need to use br in angle brackets, <br>, which is the HTML code for a line break. So wherever I have that backslash n, I'm going to want to put in this br tag. And now we see that we've got our line breaks where we wanted them, and C. difficile is italicized. We now need to go to the legend, and we will do legend.text. And we can do element_markdown there as well.
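Here is a hedged sketch of the ggtext step, assuming the ggtext package is installed. The data frame, group names, and colors are stand-ins, not the values from the original script. It shows the three pieces discussed above: stars for italics, <br> in place of \n, and element_markdown() for both the title (with a bottom margin) and the legend text.

```r
library(ggplot2)
library(ggtext)

# Hypothetical stand-in data; names and colors are assumptions
set.seed(1)
nmds <- data.frame(
  axis1 = rnorm(30),
  axis2 = rnorm(30),
  disease_stat = rep(c("NonDiarrhealControl", "DiarrhealControl", "Case"),
                     each = 10)
)

# Markdown/HTML title: *...* italicizes, <br> breaks the line
title_md <- paste0(
  "Healthy individuals have a different microbiota from<br>",
  "those with diarrhea and those with diarrhea who<br>",
  "are positive for *C. difficile*"
)

p <- ggplot(nmds, aes(x = axis1, y = axis2, color = disease_stat)) +
  geom_point() +
  scale_color_manual(
    name = NULL,
    breaks = c("NonDiarrhealControl", "DiarrhealControl", "Case"),
    labels = c("Healthy", "Diarrhea", "*C. difficile* positive"),
    values = c(NonDiarrhealControl = "gray", DiarrhealControl = "blue",
               Case = "red")
  ) +
  labs(title = title_md, x = "NMDS Axis 1", y = "NMDS Axis 2") +
  theme_classic() +
  theme(
    plot.title.position = "plot",
    # element_markdown() renders the markdown; margin adds space below the title
    plot.title = element_markdown(margin = margin(b = 1, unit = "lines")),
    # Needed so the stars in the legend labels are rendered as italics
    legend.text = element_markdown()
  )

p
```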
And we need that comma at the end. And now we see that "C. difficile positive" is italicized. Pretty slick, right? Cool. So even if we still had the title at the bottom, I think this would be in better shape than what we had before, for a couple of reasons. Again, we're putting our axis labels up towards the top, where your eyeballs will start. We've also italicized C. difficile, which is better than what we had before. So we don't look like chumps because we don't know how to italicize things. No, you don't look like a chump. Anyway.
One other thing that I think would look really slick, if this was a figure that was standalone in a report or in a presentation, would be to turn "healthy individuals" gray to match our gray points, "those with diarrhea" blue, and "positive for C. difficile" red, right? So let's go ahead and modify our title to have some of that HTML code in it, with CSS styling to color our text according to our different groups. Because then we could think about perhaps getting rid of the legend entirely, which would again clean up our plot considerably. So we can incorporate HTML here. We'll use a span tag, and we'll do style equals, and then in single quotes, I'll do color. I'll give the color in hexadecimal. So let's do #999999. The first two digits are the red channel, the second two are the green, and the last two are the blue. And then we'll close that with the angle bracket. And then we will close the span, like so. Let's see what this looks like. And so we see "healthy individuals" in gray. Pretty cool, huh? Something I might think about doing is making this bold. So how would we do that? Well, up here in our title, we could put in a strong tag, <strong>, and then close it with </strong>. And then we see we have bolded "healthy individuals." Awesome. So we're going to repeat the same thing for diarrhea and for positive for C. difficile. So we will do span, and we'll do style equals, color, colon.
And then the hexadecimal for blue is, so red, green, blue, #0000FF. Okay. And I also want to put in strong. So this title is really long. And we also want to do "positive for C. difficile." So here we will do span, style, color. And again, this is red, so it's #FF0000. And then strong, "positive for C. difficile," and we close that back here. So we do the closing strong tag, and then the closing span tag. Let's see if this works. Fingers crossed. And so now we have healthy individuals, diarrhea, positive for C. difficile. You know what, I'm going to make all of "those with diarrhea" the blue, bold text. And I kind of like the aesthetic here, where we have the three groups lined up at the left side. It wasn't intentional, but I think it works pretty well. So I'm going to change that to "those with diarrhea." Sorry for all this talk about diarrhea. I hope that doesn't bug anyone. When you study crap for a living, you know, it doesn't really faze you. So that looks pretty slick.
Again, now we can get rid of that legend. We talked about this before. Do you remember the argument that we could use? Yell it so I can hear you: show.legend equals FALSE. And that will get rid of the legend. And so, wow, that is a much cleaner presentation of the data, right, where healthy individuals are gray. We've read this, we understand what's going on. The legend is baked into the title. My Z then comes back to NMDS axis 2, and I can say, well, red is positive, because I've already seen the legend, blue is those with diarrhea, gray is healthy. Pretty slick, right? So again, this is not something that you would be able to publish in a journal, because our conventions are so fixed on having the title at the bottom. Some people don't even like having those declarative titles, which I think is unfortunate. I think we need to tell people what our data say. Your data will not speak for themselves. Let's listen. Yep, I don't hear anything.
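The color-coded title that replaces the legend could be sketched like this. The data and group names are assumed stand-ins; the hex codes (#999999, #0000FF, #FF0000), the span and strong tags, and show.legend = FALSE are the ones mentioned in the walkthrough. The original figure also draws ellipses, which this sketch leaves out.

```r
library(ggplot2)
library(ggtext)

# Hypothetical stand-in data; names are assumptions
set.seed(1)
nmds <- data.frame(
  axis1 = rnorm(30),
  axis2 = rnorm(30),
  disease_stat = rep(c("NonDiarrhealControl", "DiarrhealControl", "Case"),
                     each = 10)
)

# Each group name in the title is wrapped in a colored, bold span
title_md <- paste0(
  "<span style='color:#999999'><strong>Healthy individuals</strong></span> ",
  "have a different microbiota from<br>",
  "<span style='color:#0000FF'><strong>those with diarrhea</strong></span> ",
  "and those with diarrhea who are<br>",
  "<span style='color:#FF0000'><strong>positive for *C. difficile*</strong></span>"
)

p <- ggplot(nmds, aes(x = axis1, y = axis2, color = disease_stat)) +
  # The title now acts as the legend, so suppress the legend for this layer
  geom_point(show.legend = FALSE) +
  scale_color_manual(
    values = c(NonDiarrhealControl = "#999999", DiarrhealControl = "#0000FF",
               Case = "#FF0000")
  ) +
  labs(title = title_md, x = "NMDS Axis 1", y = "NMDS Axis 2") +
  theme_classic() +
  theme(
    plot.title.position = "plot",
    plot.title = element_markdown(margin = margin(b = 1, unit = "lines"))
  )

p
```

Keeping the hex codes in the title identical to those in scale_color_manual() is what makes the title readable as a legend; if one changes, the other has to change with it.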
We've got to speak for our data, so that the investigator, the person doing the analysis, communicates what we want the audience to see. Cool. So even if you couldn't do the title that way, what other way could you label the plot to make it clear what's going on? Well, I'm not going to put the legend back in. I think what we could do is put some annotations on the plot, so that we could label this as being positive for C. difficile, this as being from people with diarrhea, and this as being from people that are otherwise healthy. So let's go ahead and see how we might do that.
I'm going to make a new data frame that I'll call my_legend, and we'll use the tibble function to make it. So we'll have x, and I'm going to make up some numbers here. This is going to be a vector of the x positions for where we want the three labels. I'll do negative 0.7, and then another negative 0.7. So the first is going to be for case, the second for healthy. What I want is a label here, at about negative 0.7, one down here at about negative 0.7 for healthy, and then maybe one up here, again thinking about a Z, for the blue, and that I'll put at 0.7, 0.7. I'll rough in these numbers, and then we'll bump them around a little bit to get what we want. And then for y, I'll do c() with 0.7 for case, healthy I'm going to put at negative 0.7, and then the third y will be 0.7. Okay. And then I'm going to do label, and I'm going to borrow this vector, like that. I think that looks good. So that's my_legend. And as we've already seen, we can add multiple geoms with different data. So we can do geom_label, or geom_text, sorry. So we'll do geom_text, and my data will be my_legend. And I will then say aes for the mapping: x equals x, y equals y, label equals label, and color equals label.
And then we will add that, and I'm going to go ahead and do inherit.aes equals FALSE. I'm going to be doing some modification here, so I'm going to put these three arguments onto separate lines. Let's give that a run and see what we get. And so we get a mess. No, it looks cool. So we have this legend that it puts in for the geom_text. Let's go ahead and remove that with show.legend equals FALSE. And so we get rid of that. I now have the diarrheal control kind of in the wrong place and the case in the wrong place. I put these in the wrong order, obviously. So this needs to be case, and then the non-diarrheal control should be here. And that looks good. Again, we're doing some iteration here to get to the point we want. So we have case, non-diarrheal control, diarrheal control. I don't want this to be the label. I want this to be the vector for my color. So I will do that. And then I will add a label column, so that the color column will tell ggplot what color to make the text, and the label is what I want it to say. So I'll do "C. difficile positive"; the non-diarrheal control, again, is "healthy"; and then the third one will be "diarrhea." Okay. And that's good. And so I need to update my mappings down here. So label is going to be label, and color is going to be color. It wasn't happy with color. So what did I do wrong? Let me double check that I ran that. There we go. Gotta be sure we run that. So again, we get our labels the way we want them: with the text we want (we still need to fix the formatting), roughly in the position we want them, and with the right color. So now we need to bump things around a little bit. Before I go too wild with that, I'm going to use a different geom that comes to us from ggtext, and that's going to be geom_richtext. That will allow me to use markdown and HTML within my labels.
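The annotation data frame and the geom_text layer described above might look roughly like this. The point data and the exact spelling of the group names are assumptions; the x and y positions are the roughed-in values from the walkthrough and would need adjusting for real data.

```r
library(ggplot2)
library(tibble)

# Hypothetical stand-in data; names are assumptions
set.seed(1)
nmds <- data.frame(
  axis1 = rnorm(30, sd = 0.4),
  axis2 = rnorm(30, sd = 0.4),
  disease_stat = rep(c("NonDiarrhealControl", "DiarrhealControl", "Case"),
                     each = 10)
)

# One row per on-plot label: where it goes, which group's color it takes,
# and what it says
my_legend <- tibble(
  x = c(-0.7, -0.7, 0.7),
  y = c(0.7, -0.7, 0.7),
  color = c("Case", "NonDiarrhealControl", "DiarrhealControl"),
  label = c("C. difficile positive", "Healthy", "Diarrhea")
)

p <- ggplot(nmds, aes(x = axis1, y = axis2, color = disease_stat)) +
  geom_point(show.legend = FALSE) +
  geom_text(
    data = my_legend,
    aes(x = x, y = y, label = label, color = color),
    inherit.aes = FALSE,  # do not reuse the point layer's mappings
    show.legend = FALSE   # otherwise the text layer adds its own legend
  ) +
  scale_color_manual(
    values = c(NonDiarrhealControl = "#999999", DiarrhealControl = "#0000FF",
               Case = "#FF0000")
  ) +
  theme_classic()

p
```

Separating the color column from the label column is what lets the text say "Healthy" while still being colored by the NonDiarrhealControl entry of the color scale.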
And so if we run that, we'll see that the labels look a bit different from what we had with the normal text. We've got these borders around each label as well as a white background. We'll go in and clean that up in a moment. So let's go up to our "C. difficile positive" and put in that break again with the br. And we can also put stars around C. difficile so that it's italicized. We then see we've got "C. difficile positive" with the C. difficile in italics. We still have that border, which I'm not such a fan of. This is also center justified. What I can do here in geom_richtext is hjust. We've seen this a few times, right? Zero, so it's left justified. And so then we see that we've left justified the text as well as the position. Before, it centered this rectangle on negative 0.7, 0.7, and likewise for all of these, right? So again, that's why we didn't go too wild with futzing around with the position of the labels. So I think that looks pretty decent. We do need to get rid of the border and the background and then get things positioned how we want them. To do that, we can do fill equals NA, which will get rid of the background. We can also then do line.color, I believe it is, equals NA. Sorry, I didn't mean line color. I meant label.color. Give that a run. And so now we've gotten rid of the label borders and we've gotten rid of the background color.
So now we need to adjust the labels to get them in the right spot. I'll do this again iteratively. In my_legend, for the case, I want it to be a little bit to the left. So maybe let's do 0.85, and maybe go up a notch. So that's in a good position. The diarrhea, we also want it to be up and over a little bit. And so that is going to be 0.85, let's do that. And then let's do 0.65. I lost it. I think I went to the right instead of to the left. Yeah, so let's do 0.65.
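As a rough sketch of the geom_richtext version, assuming ggtext is installed: the data, group names, and label positions are placeholders, while hjust = 0, fill = NA, and label.color = NA are the settings discussed above.

```r
library(ggplot2)
library(ggtext)
library(tibble)

# Hypothetical stand-in data; names are assumptions
set.seed(1)
nmds <- data.frame(
  axis1 = rnorm(30, sd = 0.4),
  axis2 = rnorm(30, sd = 0.4),
  disease_stat = rep(c("NonDiarrhealControl", "DiarrhealControl", "Case"),
                     each = 10)
)

# Labels may now contain markdown (*italics*) and HTML (<br>)
my_legend <- tibble(
  x = c(-0.7, -0.7, 0.7),
  y = c(0.7, -0.7, 0.7),
  color = c("Case", "NonDiarrhealControl", "DiarrhealControl"),
  label = c("*C. difficile*<br>positive", "Healthy", "Diarrhea")
)

p <- ggplot(nmds, aes(x = axis1, y = axis2, color = disease_stat)) +
  geom_point(show.legend = FALSE) +
  geom_richtext(
    data = my_legend,
    aes(x = x, y = y, label = label, color = color),
    inherit.aes = FALSE,
    show.legend = FALSE,
    hjust = 0,          # left justify the text at (x, y)
    fill = NA,          # remove the white background box
    label.color = NA    # remove the border around each label
  ) +
  scale_color_manual(
    values = c(NonDiarrhealControl = "#999999", DiarrhealControl = "#0000FF",
               Case = "#FF0000")
  ) +
  theme_classic()

p
```

Setting hjust before fine-tuning x and y saves effort, because changing the justification shifts where each label sits relative to its coordinates.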
And this should have been up, not down. Again, the nice thing about having this all coded is that we don't have to think too hard about it; we can change it and rerun it. So we need to be down and over a smidge. So for down, let's do 0.8, and then let's do 0.6. We've got to go to the left a little bit. This is one of the more tedious parts of building these out, right? And maybe a little bit more to the left, maybe just up a little bit. So again, this is pretty tedious, right? And something important to keep in mind is that you should be doing this in the format of the image you want to generate.
Something I'd like to do maybe is make these bold so they pop, kind of like what I had up above. So let's go ahead and do that. Again, we will do strong. I think it's pretty slick that we can use HTML within our labels. And it is perhaps a reason to learn a little bit of HTML and CSS, to get some styling that allows us greater flexibility in how we present our visuals. So that's bolded. And I can readjust everything again. Why can't things just stay? So anyway, I don't want to spend too much time on this, because it does get kind of tedious and I'm sure this makes for just horrible viewing. So that's in good shape. Let's bring the diarrhea over a smidge. Things I never thought I'd say out loud: move the diarrhea over a smidge. So 0.48. And I think that looks pretty nice.
I'm getting weird margins going on here. So I think something that I'll do is turn off this coord_fixed and make this coord_cartesian. That again gives us perhaps a little bit more flexibility, but it moves everything around. You know, maybe just for one last time I'll futz with the x on the case to make that, say, 0.85. And the diarrhea, I'll move that over a little bit as well. Let's make that 0.52. And I think that looks pretty good. So again, I think this is a really nice illustration of what we can do.
I don't know that I would want to do both the title and the labels. I think that is kind of redundant. I like how we have the coloring in the title. I think this would look really good as a figure going into a presentation, again without these labels. But for a paper, something you might have to do is turn off the title, right? So this might be what the figure looks like for the paper. And I think this looks really nice. Maybe we want to make the gray a little bit darker. We probably want to think more about the colors that we're using here. But again, that's not the topic of today's episode.
Anyway, what I hope you gain from this episode is, again, thinking about the Z pattern: when you see a figure like this, your eyes instinctively go to the upper left corner, and then they move over to diarrhea, down to NMDS axis 1 and healthy, and then back over. And that's how we consume information. If you try to pay attention to your eyes, that's what's going on. And I think we have a lot to learn from cognitive psychology about how people interpret and understand what's going on in a figure. Now, let me give you what I think is the figure that I'd be most happy with. I would leave the title, and I would probably comment out this geom_richtext and leave it at this. And I think this, again, is a really attractive visual that I would be happy with. If you disagree with me, by all means, please put a comment down below in the notes. I'd love to get your insight on what you might do differently.
I think this is a very different way of thinking about visualizing data. I've talked in the past about empathy for our audience. We need empathy for our audience to help them to see what we want them to see. But we also need empathy for our audience to respect that they're perhaps not used to having titles justified at weird positions on the axis, right?
And so it might be a bridge too far to top justify your y-axis label or to left justify your x-axis label. And, you know, we have to accept that not everybody's ready for these types of innovations or differences in how we present data. We present data the way we do for a reason. In some ways, it's because we're kind of stuck in the mud and we just can't change. But anyway, I think there's a lot of room for empathy, both in terms of helping the viewer to consume the information, but also in respecting that we need to take baby steps in helping them to understand what we're trying to help them do, right? Sometimes people just aren't ready for our help.
Anyway, I hope you're finding this useful. Again, if you have any comments or your own critiques about this visual, by all means, let me know. We're going to be playing with this a little bit more to illustrate different points in future episodes. If you have a visual that you would love to see critiqued, and you'd like some help in thinking about how you can make it better, by all means, shoot me an email, and I'll see about maybe getting you on a future episode of Code Club. We can talk through some of the constraints that you're working under and what your ideas were, and perhaps different ways that we can use packages from the tidyverse, things like ggplot and ggtext, to come up with a more attractive presentation. Please tell your friends about Code Club, and please be sure to like this video down below. It gives me all sorts of warm fuzzy feelings when you all like it and when you comment. So please keep that going; that encourages me to keep making these. Anyway, keep practicing with this material, be sure you go through this on your own, see if you can tweak it to make it your own, and we'll see you next time for another episode of Code Club.