In scientific publications, it's common for a single figure to actually be made up of several, or maybe I should say many, other figures. How do we do this in R without having to resort to putting things into PowerPoint or Illustrator or some other program? Can we do this all in R? Yes, and we'll do that in today's episode of Code Club. Hey, folks, I'm Pat Schloss. And today, as I said, we're going to build a multi-part figure using ggplot2. We're also going to use a package called cowplot. cowplot is a package that was developed by Claus Wilke, who is a professor at the University of Texas. Cow, it's not about cows. It's about Claus O. Wilke. Hopefully I'm pronouncing his last name right. I don't know. But it was a package that he made for his lab to put together a bunch of functionality for figures, because they needed the same types of functions over and over again. As we've been going through the recent episodes of Code Club, something that I've been trying to do is rebuild Figure 1 from a paper my lab published many years ago. And so we spent a fair amount of time making a bar plot and then a strip chart showing the inverse Simpson diversity index. That's the same as Figure 1A. The last few episodes, we've been looking at receiver operating characteristic curves. We saw those in B, C, and D. So we've got our strip chart to replace the bar plot in Figure 1A, and we now have the ROC curve data that's in B, C, and D. But it's all in one figure. To show how we could go about building a multi-part figure, I'm going to use cowplot today. And then next time we're going to use another package called patchwork that I've actually never used before. In my papers I've used cowplot. So that's what I'm going to show you today. As we head over here to RStudio, I've got two R scripts that we've been working with in recent episodes.
So the first one, Schubert diversity dot R, builds the strip chart showing the inverse Simpson diversity for each of the three different disease status groups. We also have Schubert ROC dot R, which builds the receiver operating characteristic curve. So what I'm going to do is kind of scavenge from these two files and put all the data and all the figures together to make that one four-panel figure. So I see my strip chart here with healthy, diarrhea and C. diff negative, and diarrhea and C. diff positive. And then I've got my receiver operating characteristic curve, using a higher level of inverse Simpson diversity to indicate a higher status of health. Again, we'd like to break this apart. To do that, we basically need to create four figures. I'm going to pull everything together and make a new R script. Why don't we save this, and I'll call it figure one dot R. I do like to have one R script for each of my figures, so figure two would have its own R script and so forth. And I'll start with Schubert diversity dot R, copying that over into figure one dot R. You can go look at the previous episodes for how we came up with this, but it's all there. And what I'm going to do is, instead of saving this back out as the Schubert diversity TIFF, I'm going to go ahead and add this stuff where I was adding the lines and stars to the overall plot. And so this then gets saved as strip_chart. So now I have an object called strip_chart, and if I call strip_chart, it will then build out my strip chart. So that's Figure 1A. Now for Figure 1B, C, and D, we're going to want to look at all this other good stuff. So we've already done the stuff with reading in the metadata and the alpha diversity. I'm going to want all this other stuff for the disease with inverse Simpson to build the ROC curves. And again, let me run all this to make sure it works. This gets a little bit messy.
And again, this old code really isn't relevant except to get us to the point of having something real that we can actually plot. If I come down now to my ROC curve where I'm building the plot, you'll notice we're doing this mutate to change the order of the disease status groups. I'm going to have three ROC curves, each with one line, so I don't really need that information anymore. And my color, I don't need color = comparison. I'm going to group it by comparison, and I'm going to make my color black. So my line here will be color = "black", and that creates my ROC curves; my three ROC curves now are black. And what I need, as I said, is three separate ROC curves, three separate figures, each with its own line. So I could use a filter. And I could say comparison == the non-diarrheal control versus diarrheal control value, right? And I can then pipe that in. I now have a black line for that curve. Alternatively, instead of non-diarrheal control versus diarrheal control, I could use diarrheal control versus case. And so then I get the diarrheal control versus case curve. I could keep repeating that, but that wouldn't really be DRY, right? So what I'm going to do is turn this into a function. So I'm going to do get_roc_curve as a function. And we've talked about making functions in a recent episode. And so I'll then use the curly brace and tab all this over so it looks respectable. And so we're going to take in the comparison that we want to use; I'm going to call it test. And so instead of diarrheal control versus case, I'll put in test. And so if I load this up, I can then do get_roc_curve. And then for my test, let's do what I haven't done before, the non-diarrheal control versus case. We then get that line, right? I can now very easily run this three times. And I'm also going to get rid of the scale_color_manual because I'm not using that anymore. And that will just clean up my code a lot and make things easier to deal with. All right. So we have non-diarrheal control versus case.
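Here is a minimal sketch of that idea in R, wrapping the filter and the plot in a function that takes the comparison as its test argument. The roc_data tibble, its column names, and the comparison strings are hypothetical stand-ins for the data built in earlier episodes, not the actual values from the scripts.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the ROC data built in earlier episodes:
# one row per threshold, per comparison
roc_data <- tibble(
  comparison = rep(c("nondiarrheal_control_case",
                     "diarrheal_control_case",
                     "nondiarrheal_control_diarrheal_control"), each = 3),
  specificity = rep(c(1, 0.5, 0), times = 3),
  sensitivity = c(0, 0.8, 1, 0, 0.6, 1, 0, 0.7, 1)
)

# Build a single-line ROC curve for one comparison
get_roc_curve <- function(test) {
  roc_data %>%
    filter(comparison == test) %>%
    ggplot(aes(x = 1 - specificity, y = sensitivity)) +
    geom_abline(slope = 1, intercept = 0, color = "gray") +  # chance line
    geom_line(color = "black") +
    labs(x = "1 - specificity", y = "Sensitivity")
}

# One call per panel instead of three copies of the same code
ndcc <- get_roc_curve("nondiarrheal_control_case")
dcc <- get_roc_curve("diarrheal_control_case")
ndc_dc <- get_roc_curve("nondiarrheal_control_diarrheal_control")
```

Each call returns an ordinary ggplot object, so the three curves can be stored as variables and passed on to a layout function later.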
Let's do diarrheal control versus case, and then non-diarrheal control versus diarrheal control. And so I'm going to call this ndcc. And then I'll call this dcc. And I'll call this ndc_dc. And I'll go ahead and get rid of that ggsave. And if I run that, now I have these three plots stored as three different variables. So if I do ndcc, I generate that curve; if I do dcc, I get that curve; and if I do ndc_dc, I get that curve. And so you can see that we have those three different curves. So now we have four figures: we have our strip_chart, and we have ndcc, dcc, and ndc_dc. The function that we're going to use is called plot_grid, and that's coming to us from cowplot. So I need to go ahead and do library(cowplot), and make sure you've got that installed and loaded. It's not part of the tidyverse, so you'll probably need to install it as a package if you haven't already. And so for plot_grid, we can then give it the four plots that we want to plot. So we can do strip_chart, and then let's think about what order we want to do this in. Let's do the ndc_dc, the ndcc, and then the dcc. We can also then do ggsave. And I'll call this Schubert fig one dot TIFF. And I'll do width = 6, height = 6. By default, it's laying out these four figures as four different panels in a square. If I were to do nrow = 1 and ncol = 4, then I can force it to put them on one row with four columns. Of course, this doesn't look great. We'd want it to be a much wider figure than it is. And again, we could flip that to get the same type of result. I think I kind of like the two-row, two-column look, to tell you the truth. I would probably prefer this to be a two-paneled figure, one with the strip chart and one with the ROC curves. If you recall from the original paper, though, each of these panels with the ROC curves actually had multiple models represented. So I'm not going to get into building those other models.
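As a rough sketch of the plot_grid and ggsave steps just described, here is a version that runs on its own. The make_panel helper and the mtcars placeholder panels are assumptions standing in for the real strip chart and ROC curves, and the file is written to a temporary directory rather than the project folder.

```r
library(ggplot2)
library(cowplot)

# Hypothetical helper: a placeholder panel standing in for a real plot
make_panel <- function(title) {
  ggplot(mtcars, aes(x = wt, y = mpg)) +
    geom_point() +
    labs(title = title)
}

strip_chart <- make_panel("strip chart")
ndc_dc <- make_panel("ndc_dc")
ndcc <- make_panel("ndcc")
dcc <- make_panel("dcc")

# Default layout for four plots: a 2 x 2 grid, filled row by row
fig_square <- plot_grid(strip_chart, ndc_dc, ndcc, dcc)

# Force everything onto one row with four columns
fig_row <- plot_grid(strip_chart, ndc_dc, ndcc, dcc, nrow = 1, ncol = 4)

# Save at the final dimensions (inches are the ggsave default unit)
out_file <- file.path(tempdir(), "schubert_fig1.tiff")
ggsave(out_file, fig_square, width = 6, height = 6)
```

Because plot_grid returns a ggplot object, the combined figure can be handed to ggsave like any single plot.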
But again, I'm just trying to illustrate how you can build multi-panel figures using plot_grid. And so that's pretty slick. And again, what I'm plotting isn't relevant, because you're not going to make this plot for your paper; you're going to be putting in multiple panels yourself. We could also change the widths of the two columns. We could do rel_widths = and then this will be a vector. Let's do 0.25 and 0.75, so like a one-to-three ratio. And so we see that the first column is really narrow, and then the second column is really wide. Instead of rel_widths, you could do rel_heights. Right? So the first row now is really narrow, not as high, and the second is deeper, right? And again, depending on what you're trying to do, you can alter the size and configuration of those four panels. Of course, for my plot, what I want are two rows and two columns, and I want my relative widths to be even, so I can do c(1, 1), which is the default. One thing that stands out to me is that the names of my disease status groups on my strip chart are too wide. Maybe what I could do would be to make the font a little bit smaller on those x-axis labels. So in axis.text.x, I'm using element_markdown, and I can give it the argument size. And I don't want it to be too small, because I want people to be able to read it. So let's do size = 7 for a seven-point font and see where that gets us. So it's okay, but it's a little bit tight there. So let's go ahead down to six. We don't want it to be too small, right? Now we see we've got those three labels in a smaller, nicer font. It almost makes me wonder, what if we were to flip it on its side? Would we have to deal with that? Anyway, these are the types of issues that you have to start thinking about when you start composing figures, right? And this again is why I like to work with the actual TIFF file, specifying the dimensions that I want and knowing the dimensions that the journal allows me to use.
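The relative sizing options can be sketched like this. The panels are placeholders built from mtcars by a hypothetical make_panel helper, and element_text is used for the axis text so the sketch needs only ggplot2 and cowplot; the episode's own code uses element_markdown from ggtext in that spot.

```r
library(ggplot2)
library(cowplot)

# Hypothetical placeholder panels standing in for the real plots
make_panel <- function(title) {
  ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
    geom_point() +
    labs(title = title)
}
p1 <- make_panel("A")
p2 <- make_panel("B")
p3 <- make_panel("C")
p4 <- make_panel("D")

# First column gets a quarter of the width, second column three quarters
narrow_left <- plot_grid(p1, p2, p3, p4, nrow = 2, ncol = 2,
                         rel_widths = c(0.25, 0.75))

# Same idea for rows: a short first row and a taller second row
short_top <- plot_grid(p1, p2, p3, p4, nrow = 2, ncol = 2,
                       rel_heights = c(0.25, 0.75))

# Shrink crowded x-axis labels to a 6-point font on one panel
p1_small <- p1 + theme(axis.text.x = element_text(size = 6))
even <- plot_grid(p1_small, p2, p3, p4, nrow = 2, ncol = 2,
                  rel_widths = c(1, 1))
```

The values in rel_widths and rel_heights are relative, so c(0.25, 0.75) and c(1, 3) give the same layout.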
I picked an overall width of six. I think I could probably go up to 6.5 on an eight-and-a-half-by-11 piece of paper. Building this out in the actual dimensions that I'm going to be submitting the paper and the figures in is a lot better than dealing with things in the RStudio window. It's also more reproducible. And so I don't have to worry about exporting by clicking; I can export by using ggsave, and that works pretty nicely. Okay, cool. So one other thing that we need for a figure is our labels, right? So A, B, C, and D. You might recall that I think I did A, B, C, D or something like that. It'd be good to go again in the Z pattern, so A, B, C, and D. So how can we add those labels? One option is to add labels as an argument to plot_grid, where we can do c("A", "B", "C", "D"). And that puts a nice bold label A, B, C, and D right where we want it. And so that's, again, pretty slick. And if I were to again play around with my relative widths, so let's make this second column even wider, it still knows to put those labels in the right position. Let me put that back to one. So the label position is set to (0, 1), so zero on the x-axis and one on the y-axis, as relative positioning within each of the panels. You can modify that if you want, say if you don't like exactly where it's putting it, or maybe you sometimes need to move that label to get it away from some text. So what you could do is give it label_x, and then give it your relative positions. Let's do 0, 0.25, 0.5, 0.75. And we could also do label_y; let's do c(1, 0.75, 0.5, 0.25). And you can see that by using label_x and label_y, we can change where each label is being positioned. I'm going to stick with the defaults. So this layout looks pretty good. We want even-sized columns and even-sized rows. But perhaps you would want, you know, A to be narrower than B, and C to be wider than D, right? And so in that case, the relative widths and relative heights don't really work, since they apply to entire columns and rows.
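A small sketch of the labeling options, again with placeholder panels; make_panel and the mtcars data are assumptions, not the episode's plots.

```r
library(ggplot2)
library(cowplot)

# Hypothetical placeholder panels standing in for the real plots
make_panel <- function() {
  ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
}
p1 <- make_panel()
p2 <- make_panel()
p3 <- make_panel()
p4 <- make_panel()

# Default label position: x = 0, y = 1 within each panel (top left)
labeled <- plot_grid(p1, p2, p3, p4, labels = c("A", "B", "C", "D"))

# One x and one y position per panel, relative to that panel
moved <- plot_grid(p1, p2, p3, p4,
                   labels = c("A", "B", "C", "D"),
                   label_x = c(0, 0.25, 0.5, 0.75),
                   label_y = c(1, 0.75, 0.5, 0.25))
```

cowplot also accepts labels = "AUTO" to generate uppercase letters in order, which saves typing the vector by hand.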
I'm going to copy this down, and I'll comment this out to save it. And I'm going to create effectively two plots that we're merging together. So we'll call this row one. And we'll do plot_grid with strip_chart and ndc_dc. And we'll have one row, two columns, and let's make the first column a fourth of the second column. And our labels then will be A and B, right? And so for row two we again have one row, two columns, and relative widths of one and one. Let's, again just for demonstration purposes, skew it the other direction. And then our labels will be C and D. And then we can do plot_grid with row one, row two, and nrow = 2. And so here we see that in the first row, the first column is really narrow and the second column is very wide. And then for the second row, it's the opposite. Again, if you're trying to put together much more complicated designs and layouts, you can build up your plot this way by effectively, you know, running plot_grid as we did three times, right? And you could do it many more times, depending on the complexity that you need. I don't need that complexity, because again, the plots are all kind of the same size and I'm happy with that. So we'll leave that alone. The last thing I'd like to do is add a label to each of the three ROC curves to indicate what comparisons are being made. To do this, I'm going to modify my get_roc_curve function. First, I'm going to create a vector, a named vector, that I'll call pretty_names. And it's a named vector because I should be able to give it a name, like the test value, and have it output the pretty name, right? And again, we have non-diarrheal control versus case. That is the name that will be given. And then we want it to output a pretty name. And so for that, I will then say healthy versus C. difficile positive. And we can then come down and also do non-diarrheal control versus diarrheal control. And so we'll say healthy versus C. difficile negative. Right.
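Going back to the nested layout for a moment, here is a minimal sketch of building each row with its own plot_grid call and then stacking the rows. The panels are placeholders from a hypothetical make_panel helper, and the exact widths are illustrative values, not the ones typed in the episode.

```r
library(ggplot2)
library(cowplot)

# Hypothetical placeholder panels standing in for the real plots
make_panel <- function() {
  ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
}
strip_chart <- make_panel()
ndc_dc <- make_panel()
ndcc <- make_panel()
dcc <- make_panel()

# Row one: narrow first column, wide second column
row_one <- plot_grid(strip_chart, ndc_dc, nrow = 1, ncol = 2,
                     rel_widths = c(1, 4), labels = c("A", "B"))

# Row two: skewed the other direction
row_two <- plot_grid(ndcc, dcc, nrow = 1, ncol = 2,
                     rel_widths = c(4, 1), labels = c("C", "D"))

# Stack the two rows; each row keeps its own column widths
nested <- plot_grid(row_one, row_two, nrow = 2)
```

Since each row is itself a plot, the column widths in one row are independent of the other, which a single plot_grid call with rel_widths cannot do.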
And then the final one we want to do is the diarrheal control versus the case. And so then that will be C. difficile negative versus positive. Okay. And so again, these are pretty names. I must have an extra comma in there. Yep. And so if I say pretty_names, and then in square brackets I put diarrheal control underscore case, it will then output the pretty name. So coming down to get_roc_curve, at the end here I will then do geom_richtext. And geom_richtext comes from the ggtext package. And we can then do data = a tibble. And I will then say x = 0.75, let's say, y = 0.25, and label = pretty_names[test], with aes(x = x, y = y, label = label). And then we'll also want to do inherit.aes = FALSE. And we get our markdown included, and we see we've got this border around it that we'll want to clean up. But also our labels are quite long. So let's go ahead and put in a line break after the vs. And again, we can use HTML, the br tag, to impose that line break. Let's see, this isn't looking so good. So for the C. diff negative one, let's put the break there. And I think it looks a little bit better, but maybe it's too high. What it's certainly doing is centering it on (0.75, 0.25). Let's go ahead and drop it down a little bit. And so let's see where we had geom_richtext. Let's put this to 0.15. So these labels have a color, right, the color of the border, as well as a fill. I can turn those both off by doing fill = NA and label.color = NA. And that then drops off the box around the comparison. Something I can't help myself but want to do is to go ahead and put in our pretty colors that we saw previously. I'll insert some HTML and CSS here. So we'll do strong style = and then in single quotes, because we're already using the double quotes, I will do color: and then #BEBEBE, semicolon, close that out. And then close out the strong. We'll repeat that down below for this other healthy.
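The named-vector lookup and the geom_richtext layer can be sketched as follows. The names in pretty_names, the label strings, and the add_label helper with its toy diagonal line are all hypothetical; only the geom_richtext arguments follow what is described above.

```r
library(ggplot2)
library(ggtext)

# Named vector: look up a display label by the comparison's name.
# Names and label text here are illustrative assumptions.
pretty_names <- c(
  nondiarrheal_control_case = "Healthy vs.<br>*C. difficile* positive",
  nondiarrheal_control_diarrheal_control = "Healthy vs.<br>*C. difficile* negative",
  diarrheal_control_case = "*C. difficile* negative<br>vs. positive"
)

# Hypothetical helper: a toy line plot with the comparison label added
add_label <- function(test) {
  label_data <- data.frame(x = 0.75, y = 0.15,
                           label = unname(pretty_names[test]))
  ggplot(data.frame(x = c(0, 1), y = c(0, 1)), aes(x = x, y = y)) +
    geom_line(color = "black") +
    geom_richtext(data = label_data,
                  aes(x = x, y = y, label = label),
                  inherit.aes = FALSE,   # ignore the plot-level aesthetics
                  fill = NA,             # no background box
                  label.color = NA)      # no border
}

dcc_labeled <- add_label("diarrheal_control_case")
```

Indexing pretty_names with the test string returns the matching label, so one function argument drives both the data filter and the annotation.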
I think this looks pretty nice. What I'm trying to illustrate here isn't going into the weeds of making this figure look amazing. What I really wanted to emphasize was cowplot and how you could use the plot_grid function to get the different types of layouts that you want for multi-panel figures. This is relatively simple, with four panels, all equal size. I've seen figures that maybe have A to, like, P, that many different panels in their figures. And so that's just like a crazy number of panels. This achieved the goal of creating a layout for a scientific publication using cowplot. In the next episode, we'll do the same type of thing, but we'll use patchwork, a newer and, I think, perhaps more feature-rich tool for making layouts of figures for scientific publications. Well, I hope you found this interesting. I know that making multi-panel figures is a real challenge for people. I know back in the day, before I knew about cowplot or had these different options, we would make the four individual figures, then go put them into something like PowerPoint or Illustrator, and then kind of do our composition there, and it never looked good. Invariably, we'd have to change something in one of the figures. So then we'd have to kind of redo the rigmarole. And it was just a pain. Also, the resolution of going through something like PowerPoint is never as good as the resolution of doing it directly from within R. Also, I hate Adobe Illustrator and I just don't want to ever use it again. So being able to build these complex figures directly in R really heightens your reproducibility and is really quite powerful. Again, I hope you found this useful. Thank you for sticking with me through the end of this episode. I really appreciate those of you that are watching these. Please let me know if you find this useful, and if you're able to use this for your own papers, that would be awesome.
As always, please tell your friends about Code Club, be sure that you subscribe to the channel. And we'll see you next time for another episode.