All right, can you see it now? Yes, we can see it. Finally. Well, I don't know what technical problems were happening there, but anyway, thank you, everybody, for your patience. So I'll start again. My name is Nick Barrowman. I'm a statistician at the Clinical Research Unit of the Children's Hospital of Eastern Ontario Research Institute in Ottawa, Canada. Today I'm going to talk about my R package vtree and how it can be used to find hidden patterns in clinical data. There are two parts to my talk. In the first part, I'm going to show the components of vtree that let you draw a CONSORT flow diagram. That'll introduce a bunch of the different features that vtree has. Then in part two, I'll look at data exploration with vtree, how some of these features play out and, hopefully, let you find hidden patterns. I'm going to start by talking a little bit about the CONSORT diagram. Randomized controlled trials are generally considered to be the gold standard of clinical evidence. For a randomized controlled trial to fulfill its promise, it has to be both done well and reported well. CONSORT, which stands for Consolidated Standards of Reporting Trials, is all about the reporting. A key feature is the CONSORT diagram, like this one, which I'll get into a little more in a minute. As an example, I'm going to use the oral antiviral drug Paxlovid, which is used to treat COVID. In fact, it was used to treat President Biden quite recently. Also recently, a trial of Paxlovid was published in the New England Journal of Medicine, and here is the CONSORT flow diagram for that trial. You can see how patients are followed all the way from eligibility assessment through randomization into one of two groups. Along the way, there are some exclusions and discontinuations, and a total number followed up.
As I was preparing this talk and thinking, oh, this is a timely example, I noticed there's actually an error in this diagram. Whoops, my sharing went away. I'm going to try that again. Pardon me. All right, I noticed that there was an error in the diagram, and here it is. There were 2,396 patients assessed for eligibility, and 137 of them were excluded. Then 2,246 underwent randomization. But if you look at those numbers, they don't add up: 2,396 minus 137 gives 2,259, not 2,246. So there are 13 patients who weren't accounted for in this diagram. I don't know exactly what the issue was, whether it was just a misprint or whether 13 patients really went unreported; it's hard to say. What I'm going to do is just take away 13 from the 2,396. So I created a data frame and gave it 2,383 rows, and then all the numbers work out. The point I do want to make here is that a reproducible method of producing a CONSORT flow diagram, one that actually computes the values rather than just labeling boxes in a picture, is very important for avoiding errors like this. So I made up this data set with 2,383 rows. Each row represents a patient. Then you have the different columns: randomization, true or false, then the different exclusion reasons, the group assigned, and so forth. If you look at the first row in this data set, you see this patient was not randomized, and the reason was exclusion reason number 2. There's no group, just a missing value for group, because they weren't randomized. So now let's consider a single-layer tree. We could just use the base R function table to get frequencies: 137 that were not randomized, 2,246 that were randomized. If you use vtree instead (and I just want to point out that you specify the variable in quotation marks, and you'll see why a little bit later), then you get the same numbers, 137 and 2,246. You also get percentages and a colorful diagram.
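The arithmetic check and the single-layer frequencies described above can be reproduced in a few lines of R. This is a minimal sketch, not the speaker's actual code: the data frame name `rct`, the column name `rand`, and the row ordering are assumptions made for illustration, and only counts quoted in the talk are used. The vtree call is skipped unless the package is installed and the session is interactive.

```r
# Counts quoted in the talk
assessed   <- 2396
excluded   <- 137
randomized <- 2246

# Patients the published diagram leaves unaccounted for
unaccounted <- assessed - excluded - randomized

# Toy data frame, one row per patient, sized so that the numbers add up.
# The name `rct`, the column name `rand`, and the row order are assumptions.
rct <- data.frame(
  rand = c(rep(FALSE, excluded), rep(TRUE, randomized))
)

# Single-layer frequencies with base R
rand_counts <- table(rct$rand)
print(rand_counts)

# The same single-layer tree with vtree; note the variable name in quotes.
# Drawn only when vtree is installed and the session is interactive.
if (interactive() && requireNamespace("vtree", quietly = TRUE)) {
  print(vtree::vtree(rct, "rand"))
}
```

Because the counts are computed from the rows rather than typed into boxes, a mismatch like the 13 missing patients would show up as soon as the data changed.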
If we go to a two-layer tree, you can see what the colors are all about: they just help you see what's what in the diagram. Now we've got group in there as well, and we've separated the variable rand and the variable group by a space inside the quotes, just to make it convenient. You can see the number randomized, and the number that received Paxlovid versus the number that received placebo. But you also see the missing values. This is a strength of vtree: it will always show you the missing values. Except in this case, we actually want that node to go away, because it's not telling us anything interesting. That's why we might want to use pruning. There are several ways to do pruning in vtree. In this case, I've used the follow parameter, which says: only follow below the node TRUE for randomization. So it doesn't follow below FALSE, and that gets rid of the missing-value node. Now let's go to a three-layer tree where we also include follow-up. Now we can see how many of the patients who received Paxlovid were followed up or not. The ones who weren't followed up discontinued for various reasons. This already gives us a good deal of what's going on in the CONSORT diagram. But to actually get into the reasons for exclusions and so forth, we're going to need another tool in vtree. And that, oh, I'm sorry, I've skipped a step here. I just want to point out what I've done here: there's a parameter called labelnode that lets me replace the TRUEs and FALSEs with more informative names like Excluded or Randomized. And if you look up here, it says showvarnames equals FALSE, compared to the previous page. Oopsie. On the previous slide, I had the variable names appearing, and that's helpful when you're building the tree.
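To make the two-layer tree, the pruning step, and the relabeling concrete, here is a rough sketch using the counts quoted in the talk. The data frame `rct` and the column names `rand` and `group` are assumptions, and the vtree calls reflect the parameters named in the talk (follow, labelnode, showvarnames) as I understand them; check them against the package documentation before relying on them. The two vtree calls are kept separate on purpose, since the transcript does not show how pruning values and relabeled nodes interact when combined.

```r
# Toy data with the counts quoted in the talk (names and row order assumed)
excluded   <- 137
n_paxlovid <- 1120
n_placebo  <- 1126
rct <- data.frame(
  rand  = c(rep(FALSE, excluded), rep(TRUE, n_paxlovid + n_placebo)),
  group = c(rep(NA, excluded),
            rep("Paxlovid", n_paxlovid),
            rep("Placebo", n_placebo))
)

# Two-layer view in base R: excluded patients have a missing group
two_layer <- table(rct$rand, rct$group, useNA = "ifany")
print(two_layer)

# Base R analogue of pruning: look at group only below rand == TRUE
randomized_only <- rct[rct$rand, ]
group_counts <- table(randomized_only$group)
print(group_counts)

if (interactive() && requireNamespace("vtree", quietly = TRUE)) {
  # Two-layer tree, following only the TRUE branch of rand
  print(vtree::vtree(rct, "rand group", follow = list(rand = "TRUE")))

  # Relabel TRUE/FALSE nodes and hide the variable names
  print(vtree::vtree(rct, "rand",
    labelnode = list(rand = c(Excluded = "FALSE", Randomized = "TRUE")),
    showvarnames = FALSE))
}
```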
But once you've labeled the nodes (Excluded for FALSE, Randomized for TRUE, Discontinued trial for FALSE, Followed up for TRUE), you don't need the variable names anymore. So, I was just about to say on the previous slide that if we want to look at the reasons for the exclusions and the discontinuations, we need another parameter, and that's called summary. The summary parameter lets you display information about other variables within a subset of the data, that is, within a node of the variable tree. This is the general structure of the summary string: we specify a variable name and a format string, and the format string consists of text and special codes. So for example, how many patients were excluded because they didn't meet eligibility criteria? That's exc1, one of the variables. So I specify exc1 and then this string, which includes a special code to compute the sum. If you do that for the CONSORT diagram, it produces "Did not meet eligibility criteria 124". When you do that for all the different eligibility and discontinuation criteria, the summary argument gets a little bit lengthy, as you can see here. But then you get a more detailed CONSORT diagram. I've done a few other things, like turning off the colors, but basically it's a fairly short command despite all this stuff about the summaries, and it gives you all the information you need. You see the reasons for the exclusions and for the discontinuations. And of note, if you look in the Paxlovid group, there were no deaths out of 1,120 patients randomized to the Paxlovid treatment. In the placebo group, there were 13 deaths out of 1,126. So it's pretty striking. So that's roughly how you can produce a CONSORT diagram in vtree.
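As an illustration of what the summary string computes, the sketch below reproduces the "Did not meet eligibility criteria 124" line in base R and then shows a vtree call of the general shape described in the talk (variable name, then text with a special code). The data frame `rct`, the column names `rand` and `exc1`, the coding of `exc1` as FALSE for randomized patients, and the exact layout of the summary string are all assumptions; consult the vtree documentation for the definitive format.

```r
# Toy data: 137 excluded patients, 124 of them under exclusion reason 1.
# Column names and the FALSE coding for randomized patients are assumptions.
excluded   <- 137
randomized <- 2246
n_reason1  <- 124
rct <- data.frame(
  rand = c(rep(FALSE, excluded), rep(TRUE, randomized)),
  exc1 = c(rep(TRUE, n_reason1),
           rep(FALSE, excluded - n_reason1),
           rep(FALSE, randomized))
)

# What a "sum" summary amounts to inside the excluded node
n_exc1 <- sum(rct$exc1[!rct$rand])
node_text <- sprintf("Did not meet eligibility criteria %d", n_exc1)
print(node_text)

# Sketch of the corresponding vtree call: variable name, a space,
# then text containing the %sum% code.
if (interactive() && requireNamespace("vtree", quietly = TRUE)) {
  print(vtree::vtree(rct, "rand",
    summary = "exc1 \nDid not meet eligibility criteria %sum%"))
}
```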
The vtree version doesn't have the rectangular look of the classic CONSORT diagram, but it's fully reproducible, and you can rerun it as you update the data from your trial or correct errors. So now I want to move on to part two, data exploration. The point is that the tools I incorporated into vtree so I could do flow diagrams like CONSORT turned out, as kind of an added bonus, to be really useful for data exploration. In my experience, people often have data sets but don't really know what's inside them. So what do they do? They open up a spreadsheet view of the data set and look around, and they might see a few things, but it's hard to see patterns that way. vtree is a simple tool that lets you explore patterns in your data. I think it's an example of a tool that people have been missing: instead of just looking at your data and hoping to get some insights, you can have a structured way of looking at your data. I'm going to illustrate this with a retrospective cohort study. It's a data set from the medicaldata package in R, and it concerns patients who were treated with a bone marrow stem cell transplant. It's about the risk of cytomegalovirus, or CMV, reactivation in these patients. The data set has 64 rows, so 64 patients, and 26 columns. I'm going to start by looking at frequencies. These are the diagnoses, and I'm using the special setting pattern equals TRUE, which in this case just lets me order the diagnoses from the most frequent ones at the bottom up to the least frequent ones at the top. As an intermediate step, I'm going to relabel things a little bit, because the data set just had, say, ones and zeros for male and female and so forth. And now I'm going to look at another pattern tree. This time I've got multiple variables. I'm looking at sex, and I'm looking at a genotype variable, so they're heterozygous or homozygous.
And then also whether they received radiation prior to the transplant. Again, the patterns are ordered from the most frequent at the bottom to the least frequent at the top. So this lets you look at combinations of values. The most frequent pattern here is male, heterozygous, no radiation. Now I'm going to look at the same three variables in the standard variable tree format: sex, the genotype variable, and prior radiation. But I'm also going to look at the primary outcome in this study, which is CMV reactivation, and I'm going to let capital R represent the proportion of patients who experienced reactivation. You can see that overall 41% experienced reactivation, but it was 53% in females and 29% in males. Then, within females and males, you can look at the heterozygous and homozygous groups, and further, you can go on to look at radiation. I'll just point out that if you look at radiation within these groups, the no-radiation group always has a higher reactivation rate: 18% compared to 0%, 79% compared to 75%, and so on. So that's an example. There are many, many more examples I would love to show you, but inevitably there's not enough time, and we'd have to get into more details. There is one final thing I do want to show you, which is what you can do with quantitative variables. Although vtree focuses on discrete variables, it can display summaries of quantitative variables. In this study, the primary risk factor of interest was the number of activating killer immunoglobulin-like receptors, or aKIRs, pronounced "acres". At least that's how I pronounce it. They were especially interested in a low number of aKIRs versus a higher number of aKIRs. We can start by using the summary parameter to just look at the mean number of aKIRs in groups. Overall, the mean was 2.8, with a standard deviation of 1.7.
But when you look at those where there was no reactivation, the mean was 3.1, whereas it was only 2.4 in those where there was reactivation. So that shows the relationship, but it's a little bit back to front. Maybe we should look at a dichotomized comparison instead, but how do we do that? Well, this relates to what I said earlier about putting the variable in quotes, because here I've got an expression: aKIRs greater than or equal to five. That dichotomizes the quantitative variable aKIRs into just two groups: less than five, where the proportion reactivated is 48%, and greater than or equal to five, where the proportion reactivated is just 19%. So that shows it off nicely. That brings me to the end of my talk. Once again, I apologize for the technical snafus at the beginning. vtree is available on CRAN, so check it out there. Also, I have a webpage set up right here, and there are lots of resources there, including a draft paper, which is currently under review, so you can take a look at that. My co-author on that paper is Richard Webster. I also want to acknowledge Sebastian Gatscha, who added Shiny bindings to vtree. All right, thanks very much. And if there's time for questions, I'd love to take them.
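For readers who want to try the quantitative-variable steps from the talk, here is a small sketch in R. The ten rows of data are invented purely for illustration and are not taken from the cytomegalovirus data set; the column names `aKIRs` and `cmv` are assumed. The talk writes the comparison directly inside the quoted variable specification, but the transcript does not show the exact syntax, so this sketch computes the two groups in base R first and then passes the derived column to vtree.

```r
# Invented toy data (NOT the study data): number of aKIRs and CMV reactivation
toy <- data.frame(
  aKIRs = c(1, 2, 3, 4, 5, 6, 2, 3, 5, 1),
  cmv   = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, FALSE)
)

# Mean and SD of aKIRs overall and by reactivation status
overall_mean <- mean(toy$aKIRs)
overall_sd   <- sd(toy$aKIRs)
mean_by_cmv  <- tapply(toy$aKIRs, toy$cmv, mean)
print(mean_by_cmv)

# Dichotomize at five and compute the proportion reactivated in each group
toy$aKIRs_group <- ifelse(toy$aKIRs >= 5, "aKIRs >= 5", "aKIRs < 5")
prop_reactivated <- tapply(toy$cmv, toy$aKIRs_group, mean)
print(round(100 * prop_reactivated))

# Variable tree on the derived grouping, then reactivation within each group
if (interactive() && requireNamespace("vtree", quietly = TRUE)) {
  print(vtree::vtree(toy, "aKIRs_group cmv"))
}
```

Comparing proportions across the dichotomized groups puts the risk factor first and the outcome second, which is the "right way round" reading the talk is after.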