Right, we're back. Thank you very much. Again, some minor technical difficulties, so apologies for the delay. So we have with us Travis Gerke, who is going to be talking about making CONSORT diagrams with R. Dr. Gerke is Director of Data Science at the Prostate Cancer Clinical Trials Consortium, where he directs data science efforts in cloud-based informatics and advanced analytics with a focus on the healthcare sector. Formerly, he was a visiting scientist at the Harvard School of Public Health and Assistant Professor of Epidemiology at the University of Florida, where he focused on developing tools for applied machine learning, causal inference, and biostatistics. Dr. Gerke holds an ScD in Epidemiology from the Harvard T.H. Chan School of Public Health, an MA in biostatistics, and bachelor's degrees in mathematics and statistics. So without further ado, I will hand over to Travis.
Thanks, Chris. That was kind of you. So are we good to see my screen and everything? Can you give me a verbal yes, because I can't see anybody now?
Yes.
Thanks. Great. Well, thanks again for having me. So I know we're a little bit behind, but I ran this talk a little bit short because I knew these things happen. So I think we'll be able to get back on track and nobody has to stress. We're all good. Thanks so much to the organizers of R/Medicine. I'm so happy to be back this year. I was around last year. It was great. It's even better this year. So thanks again, the sessions so far have been awesome. I apologize that I didn't pre-record this talk, which I'm of course giving live right now, so I can't have fun interactions with you in the chat like some other speakers have done. I've learned my lesson. I'll do better next year. But for now, we'll just go ahead and proceed. So again, I'm Travis Gerke. We're actually hiring at PCCTC, so feel free to reach out if you're looking. I'm here today to talk about CONSORT diagrams.
So if you attended last year, you may remember Peter Higgins' excellent talk of the same ggconsort name. To avoid reinventing the wheel, I'm going to borrow two of his intro slides and then provide some context for why I'm back to talk about this topic. If you aren't familiar, CONSORT diagrams are figures which show the flow of participants through a trial. Here you can see an example of a CONSORT diagram from the New England Journal where you can see the number of participants, in this case infants, who were assessed for eligibility, the number who were ineligible for various reasons, and the number randomized and followed up. These diagrams are expected by most medical journals, certainly for clinical trials, and often these days for observational studies as well.
Today, CONSORT diagrams are typically handcrafted into templates provided at the CONSORT group's website, which is linked here. Peter last year used the adjective artisanal to describe this process. These templates work well enough, but copying and pasting from your analysis into Word docs, as you can see me doing in this slide, can be error prone. It is a common gotcha for manuscript or study reviewers when numbers don't match up.
Some implementations of CONSORT diagram construction do exist. Thus far, these are not ggplot based. Some of you may be familiar with the excellent DiagrammeR package, which when carefully used can achieve this goal. Danny Wong, Norbert Kohler, and Ben Gerber's pull request to Peter Higgins' ggconsort repository do use it. The challenge with DiagrammeR is that its output is actually too elegant for blocky CONSORT diagrams. You can see an unanswered Stack Overflow question below, showing DiagrammeR's aesthetically nicer rendering of a CONSORT diagram, but it has these curved lines that don't really conform to CONSORT rules.
In other attempts, my group implemented a LaTeX TikZ version a few years back, and earlier this year I embedded the flowchart.js library into an R package called flowchartr, which was fun, but also didn't permit enough customization to make the full CONSORT diagram possible. So here we are, still lacking an end-to-end, ggplot-compatible CONSORT implementation.
Having tinkered with this idea for years now, I had some building blocks in place, and what I was developing unfortunately did not extend Peter's package from last year. I didn't want to risk a ggconsort2 naming situation, so it was time for an email. I said, hi Peter. I'd like to take a pretty different approach to ggconsort. Are you willing to transfer the name? Travis. Peter, being the archetype of the generous and awesome R community, responded. Hi Travis. Feel free to use the name, hex sticker, and anything else useful. You can give me the link to the GitHub repository. Peter. Great. Thanks again, Peter. This was awesome. And thus, ggconsort was born again.
So here we are. The new ggconsort framework posits that CONSORT construction happens in two distinct stages and provides convenience functions and R objects for each. First, at the time of data wrangling, we would like to capture counts of cohorts or sub-cohorts, for example, consented patients who are randomized, into a data object. This will streamline efficiency and enhance reproducibility when the source data changes, for example, due to a regular data update. Once the counting and annotation object is established, we can begin work on the second phase, diagram layout. We do this by creating a new data object that is ready to pass to consort geoms in ggplot, which specify layout and general aesthetics of the diagram.
So let's work through an example. Here you can see a simulated data frame, which is included in the ggconsort package.
I wish I had been using Peter's package, which I just saw 30 minutes ago, six months ago, but lesson learned, and in the next version we'll see it happen. In any event, trial_data contains simulated information on 1200 patients who are eligible for a study. Some of these declined to participate, as denoted in this declined variable, or were ineligible due to two exclusion criteria, prior chemotherapy or bone metastasis. Eligible patients that were left were then randomized to receive either drug A or drug B, as denoted in the treatment column.
The fundamental building block of ggconsort is a new object of type ggconsort_cohort. We initialize one in a dplyr chain using the cohort_start verb, which also accepts a label that describes the full cohort represented by the source data. New cohorts are added to the ggconsort_cohort object using the cohort_define verb. The full data source object is always accessible as .full, as you can see here. And we construct new sub-cohorts with standard dplyr. Here you can see we've created a sub-cohort of 1141 consented participants out of the original 1200 participants.
And so we can define further sub-cohorts. We can use, if we wish, previously defined cohorts. And here we're going to identify the chemotherapy-naive participants among those who are consented using the consented cohort, which was defined in the previous step. So now you can see we have these 1028 chemo-naive consented patients out of the 1141, which was a subset of the 1200. So you can see the direction this is going. And in this way, we can continue to build out all sub-cohorts which need to be represented in the CONSORT diagram. We've now added randomized patients, that is, those who are consented, chemo-naive, and did not have bone metastasis, and those who in the end were assigned to drug A or to drug B.
Often in CONSORT diagrams, we need to count the difference between two cohorts for enumerating exclusions. anti_join is very useful for this purpose.
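The cohort-building steps described above can be sketched with plain dplyr. This is not the ggconsort API, only the filtering and anti_join logic that sits underneath it. The data frame is simulated here, its column names (id, declined, prior_chemo, bone_mets, treatment) and probabilities are assumptions, and every row gets a treatment value for simplicity, so the counts will not match the 1200 / 1141 / 1028 / 938 figures quoted in the talk.

```r
library(dplyr)

# Simulated stand-in for the trial data described in the talk.
# Column names and random draws are assumptions for illustration.
set.seed(1)
n <- 1200
trial <- tibble(
  id          = seq_len(n),
  declined    = rbinom(n, 1, 0.05),
  prior_chemo = rbinom(n, 1, 0.10),
  bone_mets   = rbinom(n, 1, 0.10),
  treatment   = sample(c("Drug A", "Drug B"), n, replace = TRUE)
)

# Each sub-cohort is an ordinary dplyr filter on an earlier cohort.
consented            <- trial %>% filter(declined != 1)
consented_chemonaive <- consented %>% filter(prior_chemo != 1)
randomized           <- consented_chemonaive %>% filter(bone_mets != 1)
treatment_a          <- randomized %>% filter(treatment == "Drug A")
treatment_b          <- randomized %>% filter(treatment == "Drug B")

# anti_join() keeps the rows of the first table that have no match in the
# second, so it yields exclusions as the difference between two cohorts.
excluded <- anti_join(trial, randomized, by = "id")

# Counts per cohort; excluded plus randomized recovers the full cohort.
cohort_counts <- c(
  full       = nrow(trial),
  consented  = nrow(consented),
  randomized = nrow(randomized),
  excluded   = nrow(excluded)
)
print(cohort_counts)
```

Because every count is derived from the same source table, rerunning the script after a data update refreshes all of them together, which is the reproducibility argument made in the talk.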
Here we use anti_join to identify the 262 excluded patients as the difference between the screened population of 1200 and the randomized population of 938. That math does check out, I promise. Now each of these cohorts should have a label for descriptive printing in the console or in the ultimate CONSORT diagram. We provide those labels with the cohort_label verb. Finally, we store the ggconsort_cohort object into study_cohorts with a right assignment operator there at the end. So I've got this study_cohorts. Again, that's where we're living with this object.
And these objects now have print and summary methods. So here's the summary of what we've built so far. You can see I've added a few more exclusion categories, those who had prior chemo and those who had bone metastasis, that I didn't do in the previous step, but the process is as you saw before. And then I added their cohort name, count, and label. So that's everything inside of study_cohorts now. And we're ready to move on to the diagram layout and aesthetics, the phase two of this process.
But before we do, I want to point out a convenience function for pasting or gluing cohort labels and their counts together into a single string, such as you'd want in a CONSORT diagram box. cohort_count_adorn does this according to the following default format. So it'll be the label and then n in parentheses with the actual count. If you want a custom format, that's possible by way of the label function argument. You can see an example of placing the number in front, which sometimes journals might actually want instead, which is fine.
We can now add our first consort box with the consort_box_add verb. This function needs a name for your box, an x and y location, and a label. Notice the use of cohort_count_adorn in the label argument. And you can see our box below. Looks like it worked. Great. Let's add another box, this time for the randomized participant count.
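The label-plus-count formatting described above can be illustrated with a small base R helper. count_label and count_first are hypothetical names made up for this sketch, not the package's functions, and the exact punctuation of the default format is an assumption based on the spoken description (label, then n in parentheses).

```r
# Hypothetical helper, not the package's function: glue a label and a count.
count_label <- function(label, n, label_fn = NULL) {
  if (is.null(label_fn)) {
    # Default format as described in the talk: label, then n in parentheses.
    return(sprintf("%s (n = %d)", label, as.integer(n)))
  }
  # A custom formatter receives the label and the count and returns a string.
  label_fn(label, n)
}

# Custom format that places the count in front of the label.
count_first <- function(label, n) {
  sprintf("%d %s", as.integer(n), tolower(label))
}

count_label("Randomized", 938)
# "Randomized (n = 938)"

count_label("Randomized", 938, label_fn = count_first)
# "938 randomized"
```

Passing the formatter as a function keeps the counting and the presentation separate, so a journal's preferred style can be swapped in without touching the cohort definitions.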
You can see a lot of the heavy lifting has already been done by setting up the randomized cohort inside the ggconsort_cohort object and then passing that sub-cohort to cohort_count_adorn. Here's a more complicated box, enumerating all of the exclusions. It's really more of the same, though. You name the box, in this case, exclusions, position it with x and y, and then pass a string for its contents, with the help of cohort_count_adorn. Here's our new box. Great. So we've got three boxes here. Now we're really on a roll. So adding boxes for treatment arms is a piece of cake. Same structure here. Just consort_box_add, give it a label, sorry, a name, x and y coordinates, and a label.
We're almost there. We just need to add arrows. So we're going to add some arrows here. As you might expect, this happens with consort_arrow_add. When you want an arrow to connect one box to another, you simply pass the name of the starting box and the name of the ending box, along with which sides of those boxes you want the arrow to connect. Here we are drawing an arrow from the bottom of the full box to the top of the randomized box. If an arrow doesn't start or end in a named box, you can specify the start or end as x and y coordinates instead of box names. So for example, here we'll start the exclusions arrow in the middle of the first arrow by passing start_x and start_y, and then tell it to end in the exclusions box.
Sometimes we need lines without arrow ends. consort_line_add takes care of this and accepts arguments in the same form as consort_arrow_add. So here we're just specifying x and y coordinates for both the start and the end, and you can see that the result is that we've added a horizontal line between the randomized box and the allocation boxes.
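The layout idea described above (named boxes at x and y positions, arrows that go either from box to box or from a free coordinate into a box, and plain lines without arrowheads) can be sketched directly in ggplot2. This does not use ggconsort: box_xy is a hypothetical lookup helper, the coordinates are made up, and using hjust and vjust to make a chosen box edge sit on the arrow's endpoint is one simple approach assumed for this sketch.

```r
library(ggplot2)

# Box names, positions, and labels; coordinates are illustrative only.
# hjust / vjust choose which edge of each box sits on its (x, y) point.
boxes <- data.frame(
  name  = c("full", "randomized", "exclusions"),
  x     = c(0, 0, 20),
  y     = c(50, 30, 40),
  hjust = c(0.5, 0.5, 0),   # exclusions: left edge at x
  vjust = c(0, 1, 0.5),     # full: bottom edge at y; randomized: top edge at y
  label = c("Assessed for eligibility (n = 1200)",
            "Randomized (n = 938)",
            "Excluded (n = 262)")
)

# Hypothetical helper: look up a box position by name.
box_xy <- function(name) boxes[boxes$name == name, c("x", "y")]

# First arrow runs box to box; second starts at a free point on the first
# arrow and ends at the exclusions box.
arrows <- data.frame(
  x    = c(box_xy("full")$x, 0),
  y    = c(box_xy("full")$y, 40),
  xend = c(box_xy("randomized")$x, box_xy("exclusions")$x),
  yend = c(box_xy("randomized")$y, box_xy("exclusions")$y)
)

p <- ggplot() +
  geom_segment(
    data = arrows,
    aes(x = x, y = y, xend = xend, yend = yend),
    arrow = arrow(length = unit(0.1, "inches"), type = "closed")
  ) +
  # A line without an arrowhead, like the bar above the allocation boxes.
  annotate("segment", x = -30, y = 20, xend = 30, yend = 20) +
  geom_label(
    data = boxes,
    aes(x = x, y = y, label = label, hjust = hjust, vjust = vjust)
  ) +
  xlim(-50, 60) +
  theme_void()

# print(p) renders the diagram.
```

Referring to boxes by name keeps arrow definitions valid when a box is moved, since only the coordinates in the boxes table need to change.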
We really need three arrows to finish this now, so we specify them each with consort_arrow_add in the same way that we've done in the previous slides, and lastly we store this to a new object called study_consort, again with the right assignment there at the bottom, which will now have class ggconsort. And ggconsort is special because this can now be passed into geom_consort. So once you've passed it into ggplot like this, theme_consort is a useful helper that sets the theme to void, as you will want, and lets you extend the horizontal and vertical margins with two pretty straightforward arguments there. This is often necessary because some text boxes can extend beyond the ggplot defaults, and it takes a little bit of iteration, a little bit of tweaking to get the canvas right, but thus far I haven't observed it to be too painful, although it could be an area for future development in this package.
Since we're now operating in ggplot, I thought I'd point out that we can of course use other ggplot geoms. Here I'm using geom_richtext from ggtext to add a blue text box labeled allocation in between the two treatment arm boxes like this. And now we have a completed CONSORT diagram for the most part. We can of course add new arms and arrows as we'd want to round out below the allocation arms, but I think you get the point now, this is how it works.
And one final thing I thought I would mention, there is a function called cohort_pull. So after you've drawn the CONSORT diagram, you probably want to pull out a single data frame or tibble to proceed with analysis, where you want to do lm or survival or whatever your analytic aim is. You can use cohort_pull to do that, or to troubleshoot or investigate excluded patient cohorts or other sub-cohorts that you might want to look at.
That's it. So thanks again to the R/Medicine organizers and programming committee, and once more to Peter Higgins for kindly passing the ggconsort torch.
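The idea of pulling one cohort back out for analysis, mentioned above, can be sketched in base R by treating the cohorts as a named list of data frames. pull_cohort is a hypothetical helper written for this sketch, not the package's function, and the data and outcome column are simulated assumptions.

```r
# Hypothetical stand-in for a cohort object: a named list of data frames.
set.seed(42)
full <- data.frame(
  id        = 1:200,
  bone_mets = rbinom(200, 1, 0.1),
  treatment = sample(c("Drug A", "Drug B"), 200, replace = TRUE),
  outcome   = rnorm(200)
)
cohorts <- list(
  full       = full,
  randomized = full[full$bone_mets != 1, ],
  excluded   = full[full$bone_mets == 1, ]
)

# Pull one cohort out by name, failing loudly on an unknown name.
pull_cohort <- function(cohorts, name) {
  if (!name %in% names(cohorts)) stop("Unknown cohort: ", name)
  cohorts[[name]]
}

# The model is fit on exactly the rows that were counted for the diagram.
analysis_data <- pull_cohort(cohorts, "randomized")
fit <- lm(outcome ~ treatment, data = analysis_data)
summary(fit)$coefficients
```

Drawing the analysis data from the same object that supplied the diagram counts is what keeps the reported n and the modeled n from drifting apart.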
A special thanks to my co-author on this package, Garrick Aden-Buie, who provided critically important feedback and ideas about how this package should behave. I'm very lucky to have him as a friend and a software development mentor. I learn tons from him every time I talk to him. So thanks again, Garrick. And lastly, thanks to all of you for listening and hopefully picking up the package and trying it out. As I mentioned here, it's under active development and I'm still working on some of the documentation. So if you find any bugs, feel free to let me know. I look forward to hearing your feedback on your experience, and reach out really anytime. Thanks again.
So I think we actually have time for a few questions, which is great. There's a couple in the chat; please throw them in and upvote quickly. I think we've got another few minutes before the next talk starts. So the first question here is, how does it handle large boxes? Can you use things like dodge?
Oh, that's a good question. So under the hood, all this is driven by ggtext, the geom_richtext. So a lot of this is really just a wrapper for rich text. And yes, we lean heavily on what that package does very well. So I didn't reinvent any wheels there. Yes, it will handle large boxes. Most of the dodging is done by way of vjust and hjust under the hood. And that happens when you say, from this box to that box, it recognizes which side of the box you want to point the arrow into and then shifts it to the left or the right or up or down, if that all makes sense. Yes, it does handle large boxes. But as I mentioned at the end, there is a little bit of iteration that has to happen here, not just for the canvas, but sometimes things just aren't quite landing where you wish them to. It remains a little bit artisanal. It's not fully automated yet, but it does, I think, go a long way from where we were before with those templates.
I will mention that for that iterative process, there is a print method for the ggconsort objects. So when I was going line by line, adding the consort boxes and the consort arrows, if you just keep pushing Enter, it'll keep printing that back to your console without even using ggplot, so you can see where things are landing in kind of a beta fashion.
Okay, I think we've got time for one more question. Probably, I don't know, we might get pulled through. I'm not exactly sure, to be honest, but let's have a go. Can you add sub-cohorts for dropouts, with reasons for dropout such as nausea or moved out of country? And can you add a sub-cohort for how many completed each arm?
Yeah, absolutely. So that cohort_define function accepts arbitrary dplyr on your source data, right? So you can filter, you can select. It's really just a list of tibbles underneath. So your wildest dream of what kind of cohort you want to put in there, yes, it can happen.
Great, thanks. Well, I'm just going to check this slide because I'm expecting us all to be magicked to the next room at the moment. I'm just hoping to make sure everything's going to plan. Let's just keep going. Let's answer questions. So there's one here: Hi, are labels editable, like if you want to use it in Spanish?
Oh yeah, that's also a really good question. I would hope so. Again, leaning on the ggtext package, anything that's possible there will be possible here because it really is just a wrapper. I know that's very flexible with regards to special characters and HTML markup and things of that nature. So it should be possible, but I need verification on that before I say yes all the way. Yeah, great point though.
Okay, I'm not seeing any more questions. And I think we're probably, I'm being called. I think we're going to go. Yeah, looks like we're going to go.
So thank you very much for the talk and we'll see everyone in the next session. Thanks. Everybody else already got pulled over, so you guys can go there if you want.
All right. Great. Thanks. Thanks.
Thanks, Travis.
Thank you, Chris. I appreciate it. Cheers.