Next talk is going to be by Gabriel Becker, who will be speaking to us about rtables: leveraging data visualization concepts to declare and create clinical trial tables. And I think they will be speaking live. I see you're here and unmuted, so just go ahead and share your slides and get started. Yep, do you see the schedule or do you see my slides? I see your slides. Great, okay. Yeah, so this is gonna be a pretty different talk from the last one. The last one was very interesting, but it covered a pretty different scope than what we're gonna talk about here. So just some basic background about me. I'm a statistical computing consultant. I've been working with an organization called NEST within Roche for a number of years now, developing the rtables software, which provides a general table generation framework that they are using to generate the tables that go into clinical trial reporting. I have my email there if you want to get in contact with me, and I'm also on GitHub. So tables are very important. They're useful for EDA, but they're also a crucial aspect of the reporting process. And now you're probably wondering why I'm talking about tables with a picture on the screen. So let me just go through some things here. What ends up being the case is that we can understand tables, how we should think about them and how we should create them, by thinking about things that aren't tables. This may be a little bit counterintuitive, but it ends up being really powerful, and I'm gonna try to showcase a few of the ways that it helps us today. This is a multi-year project, a very sophisticated and complex piece of software, so I'm not gonna be able to cover everything, but I should hopefully be able to convince you that there are benefits to thinking about tables in a way that's pretty fundamentally different from the way that most people typically think about them.
So the basic foundation of this is that tables are faceted data visualizations. They are. And so we can see here the same information represented side by side in a faceted bar plot and a complex structured table. Each element of the table corresponds to exactly one element or one aspect of the plot, and vice versa. So we have the concept of rows in tables, and we can look at what the concept of a row would mean in this faceted data visualization. Now it's notable that a row is not a set of these subplots; it's actually a slice across these subplots. And then we have a single subplot here, and that corresponds to what we generally call a vertical section within the table. And then we've got the columns here, the faceting information, which is recreated in the table here. And then we have the individual cell element in the table, which corresponds to a single bar within this bar plot, which again is not a full subplot; it's an element within a subplot. Now this bar has width zero because the count was zero here, but the point stands: it is analogous to a single cell right here. So tables are faceted data visualizations, they are plots, and it's useful to think about them as plots. Now, what do I not mean when I say that? I do not mean tables have a similar purpose to faceted data visualizations. That's true, but that's not what I'm saying here. I do not mean that faceted data visualizations are similar to tables. That is also true, and it's also not what I'm talking about here. I do not mean that we should be rendering our tables in ggplot2. That would be very silly and we're not going to do it. And finally, I do not mean that technically everything is made out of pixels, which have an X position and a Y position, and therefore they are data visualizations. That's not what we're talking about, even though it's technically true.
So what do we need in order to describe a faceted plot that we would like to make? We don't need very many things. We need a faceting structure in the row or Y dimension. We need a faceting structure in the column or X dimension. And then we need information about what will be drawn in each subplot, and in the ggplot2 framework that's gonna be your geoms and your stats, the combination of the stats and the geoms, right? And then there are various other miscellaneous metadata annotations like titles and axis ticks and things like that. Some of those do have analogs in the table space: ticks, not so much, but axis labels do. We're not really gonna talk a ton about those in this talk, though. But here we can just see, I didn't draw the plot, but this is pretty basic ggplot code that is probably familiar to many people in the audience here. Here we have the geom_bar, that's gonna be number three, which is what's drawn in each subplot. And then we have the facet_grid, which is gonna define both one and two: both the faceting structure in the Y dimension and the faceting structure in the X dimension, which are the rows and columns respectively. Okay, so that's what we need to describe a faceted plot. So what then do we need to describe a table? It's the same, because as we just talked about, tables are faceted plots. So what do we need? We need the faceting structure in the row dimension, the faceting structure in the column dimension, and what's gonna be drawn in each of the panes, each of the subplots. That's it. That's all we need to define our table. So rtables defines a framework of verbs. We call them layouting verbs, for reasons that I may have time to get into at the end. But we have these verbs, and they allow you to declare the various aspects that we just talked about. So that's it.
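The ggplot2 code on the slide is not captured in this transcript. As a rough sketch of a call with the same three ingredients, assuming the Titanic dataset that ships with base R and an illustrative choice of variables (Class, Sex and Survived are assumptions, not necessarily what the slide used):

```r
library(ggplot2)

# Titanic ships with base R as a 4-way contingency table;
# flatten it to a data frame with columns Class, Sex, Age, Survived, Freq.
titanic <- as.data.frame(Titanic)

p <- ggplot(titanic, aes(y = Survived, weight = Freq)) +
  # (3) what is drawn in each subplot: horizontal bars of summed counts
  geom_bar() +
  # (1) row faceting and (2) column faceting, declared in a single call
  facet_grid(rows = vars(Class), cols = vars(Sex))

# print(p) draws the faceted bar plot
```

Here facet_grid carries both faceting structures at once, while geom_bar with its default count stat decides what appears inside each pane.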
This is the list of things that you need to define any table, other than the actual logic for calculating the cell values, which is up to you to implement. But in terms of declaring table structure, this is it. It's a very dplyr-like situation where you have a relatively small number of low-level verbs and you can combine them to do arbitrarily complex things. So for faceting, we have these split functions: split_rows_by and split_cols_by. Unlike facet_grid, we incrementally declare faceting. And whenever we declare a faceting, by default that's gonna happen inside whatever faceting already existed in that dimension. We're gonna see an example of that, which will I think make it a little more clear. It looks like there's a lot of stuff going on in the chat. Let me just look at that quickly. Okay, so thanks for the feedback. It doesn't look like there are any questions there though. So if you do have questions, please do ask in the chat and we'll try to keep an eye on it. But yeah, so that's for declaring faceting. That's all we need. And then cell value derivation: this is the equivalent of the stat plus geom situation, where you're defining the calculations that are gonna generate the values that appear in the cells of your table. We call that cell value derivation because that's what those words mean. And we have just two of these. And then we have summarization, which is equivalent to marginal or group summaries in a plot. Those weren't actually in the Titanic dataset plot that I showed you earlier, but they could have been, and we'll talk a little bit about that as things move forward. But that's it. That's all there is. So now let's go through some examples. Column faceting: in ggplot, we've got our facet_grid(cols = vars(ARM)). And in rtables, split_cols_by("ARM"). And there you go. Side by side, everything is nice.
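A minimal sketch of that side-by-side comparison, assuming the ex_adsl example data and an AGE analysis with a mean and a range (the analysis function and the histogram are illustrative choices, not taken from the slide):

```r
library(ggplot2)
library(rtables)  # also attaches formatters, which provides ex_adsl

# ggplot2: one subplot per treatment arm, laid out as columns.
p <- ggplot(ex_adsl, aes(x = AGE)) +
  geom_histogram(bins = 20) +
  facet_grid(cols = vars(ARM))

# rtables: one column per treatment arm.
lyt <- basic_table() |>
  split_cols_by("ARM") |>
  analyze("AGE", afun = function(x) {
    in_rows(
      "Mean"  = rcell(mean(x), format = "xx.x"),
      # range() returns two numbers, so this single cell holds two values
      "Range" = rcell(range(x), format = "xx.x - xx.x")
    )
  })

# The layout is only a declaration; the table is built when data is supplied.
tbl <- build_table(lyt, ex_adsl)
tbl
```

The layout object plays the role of the ggplot specification: nothing is computed until build_table applies it to a dataset.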
We can also see, just as a brief aside here, that the range is multi-valued. So we can have multiple values in a cell if we want to. We can also, as I think we'll see later, have multiple cells within a facet, akin to having those multiple bars in a single subplot in the bar chart. So, row faceting: not really very much different from what we were doing a second ago, but now everything's taller. So we do rows = vars(SEX) on the ggplot side, and then we do split_rows_by("SEX"), which is the CDISC-specified variable name for that, which is why we're using it here. And there you go. We've got a nice tall table and plot here. So ex_adsl is a synthetic ADSL dataset that ships with rtables. It was generated by another open-source tool called random.cdisc.data, which also came out of the NEST project. But yeah, that's correct. That particular dataset is actually included with formatters, which is a dependency of rtables, so whenever you have rtables, you will have this dataset. So grid faceting: we can have faceting in columns and rows at the same time. Obviously we know that ggplot can do this, because even if you didn't know before today, you saw it doing that earlier in the Titanic plot. And here we can see that we can also have that in rtables. You just do split_cols_by and then you do split_rows_by. And if anyone isn't familiar, this vertical line, greater-than thing (|>) is the native pipe. You can just think of it as the same as the magrittr pipe for the purposes here. There are a few differences, but they're not important. Thanks, Pavel. Pavel is the technical lead of the larger project of which rtables is one of the packages. So that's that. We can also nest faceting structure.
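Grid faceting as just described could be sketched as follows, again assuming ex_adsl and a simple mean of AGE as the cell content (both the plotted variable and the analysis function are illustrative assumptions):

```r
library(ggplot2)
library(rtables)  # attaches formatters, which provides ex_adsl

# ggplot2: rows and columns are declared together in one facet_grid call.
p <- ggplot(ex_adsl, aes(x = AGE)) +
  geom_histogram(bins = 20) +
  facet_grid(rows = vars(SEX), cols = vars(ARM))

# rtables: faceting is declared incrementally, one verb per split,
# chained with the native pipe (requires R >= 4.1).
lyt <- basic_table() |>
  split_cols_by("ARM") |>
  split_rows_by("SEX") |>
  analyze("AGE", afun = mean, format = "xx.x")

tbl <- build_table(lyt, ex_adsl)
# print(tbl) renders one row group per SEX level and one column per ARM
```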
I don't have the code here because I thought the tree was a little bit more useful, but in both ggplot and rtables, we can nest faceting. So we can split by arm, and then for each arm we can split by gender via the SEX variable, right? You can see that on the left and on the right here, for the plot and the table respectively. And let me just look at what time it is. Okay, so that's that. So I've just talked about how rtables is basically gonna be leveraging the fact that tables, specifically the type of complex structured tables that are involved in clinical trial analysis or EDA, are faceted data visualizations. And now we're gonna talk about some generalizations of how certain aspects of visualization typically work, which arise from applying them to tables. So first is what faceting does. Typically and traditionally, faceting is a partition of the data, which means the different facets are both mutually exclusive and exhaustive, potentially with the exception of the removal of NAs. You take a categorical variable, you look at the values of the variable, and that defines your subsets. This is the group_by in dplyr type of thing. That's fine, rtables is happy to do that, but it doesn't have to be. rtables doesn't care whether things are mutually exclusive or whether they're exhaustive. So what this allows you to do, very easily, as we see on the right here, is have facets that overlap each other. And in the table space, this is actually pretty common. It's very common to have an all patients category, for example, but you can also, as we see here, combine arms to have virtual combined arms for certain analytical purposes and things like that. And rtables is happy to do that. So that's one thing. Faceting has been generalized beyond what people typically think of when they're talking about faceting. Yeah, exactly.
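A hedged sketch of both ideas in rtables, assuming ex_adsl and a mean of AGE as placeholder cell content. The "All Patients" label is illustrative, and add_overall_level is only one of the split functions rtables offers for non-partitioning facets; combined arms would use a different one.

```r
library(rtables)  # attaches formatters, which provides ex_adsl

# Nested faceting: split columns by arm, then by sex within each arm.
lyt_nested <- basic_table() |>
  split_cols_by("ARM") |>
  split_cols_by("SEX") |>
  analyze("AGE", afun = mean, format = "xx.x")
tbl_nested <- build_table(lyt_nested, ex_adsl)

# Overlapping facets: the three arms plus an extra column containing every
# patient, so the facets no longer partition the data.
lyt_overlap <- basic_table() |>
  split_cols_by(
    "ARM",
    split_fun = add_overall_level("All Patients", first = FALSE)
  ) |>
  analyze("AGE", afun = mean, format = "xx.x")
tbl_overlap <- build_table(lyt_overlap, ex_adsl)
```

In the second table every patient contributes to two columns, their own arm and the overall column, which a strict partition would not allow.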
Let's see, both versus everything else would be another example of this. Next, we have subplots and what's happening in subplots. I drew a table on the left here, but in ggplot2, typically, unless you really want to go through a fair bit of pain, the subplots are all gonna look the same. They're gonna be plotted on different data, which is associated with the facet panes of the faceting that you've declared, but they're all gonna be bar charts, for example, as in the example that I had before. Or they're all gonna be scatter plots with different stuff in them, right? But rtables doesn't care if you have the same stuff in the facets or not. So here we're splitting by sex, so the facets are female and male, right? And then within those facets, we have various things: we have a mean and a range. So that's the equivalent of a bar and a line, or a scatter plot and a line, or something like that. And you can also, I don't show this here, have columns that have fundamentally different meanings within the rtables framework, which is another thing that you would not see in a faceted plot. You can have a table where the first column is N, the second column is a confidence interval from a model that you have fit, and the third column is the p-value for that confidence interval including zero, et cetera, right? So I've got about one minute. I'll try to address your question, Eric, but I do want to get through a couple of things first. The last thing is marginal summaries. These are not really possible in base ggplot2, but there are extensions to ggplot2, one of which is ggside, where you can do that. We also have that in rtables: there's a native verb that we saw in the list, called summarize_row_groups, which generates marginal group summaries and puts them in your table.
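A minimal sketch of marginal summaries at two nesting levels, assuming ex_adsl restricted to female and male patients and the synthetic STRATA1 variable; the mean and range analysis is an illustrative choice:

```r
library(rtables)  # attaches formatters, which provides ex_adsl

# Restrict to female and male patients so that the facets are well populated.
adsl <- subset(ex_adsl, SEX %in% c("F", "M"))

lyt <- basic_table() |>
  split_cols_by("ARM") |>
  # drop_split_levels removes the SEX levels that the subset left empty
  split_rows_by("SEX", split_fun = drop_split_levels) |>
  summarize_row_groups() |>   # marginal summary for each sex
  split_rows_by("STRATA1") |>
  summarize_row_groups() |>   # marginal summary for each stratum within sex
  analyze("AGE", afun = function(x) {
    in_rows(
      "Mean"  = rcell(mean(x), format = "xx.x"),
      "Range" = rcell(range(x), format = "xx.x - xx.x")
    )
  })

tbl <- build_table(lyt, adsl)
# print(tbl) shows a summary row on each group heading, with the
# mean and range rows nested underneath each stratum
```

Each summarize_row_groups call attaches a summary to the row split declared immediately before it, which is how the two levels of summary coexist in one table.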
And by default, that means the count and then the percentage of the count in the column. Something that you can't really do in plots is have multiple different levels of information being marginally summarized, but you can in tables very easily. So you can see here we're marginally summarizing both the genders, and within each gender we're marginally summarizing STRATA1, which is a made-up variable. It doesn't really mean anything. So with that, very briefly, and this might actually answer your question, Eric: a lot of people don't know this, but the NEST team has released what is called the TLG Catalog, which is an open-source compendium of code for TLG generation. It is open source with a commercially permissive license. It's publicly available now. And it is also the code that is used internally at Roche to generate these tables during clinical trial analysis, starting this year. It's based on tern and rtables. I don't have time to talk about tern. It is another open-source package that is currently on GitHub. It will be coming to CRAN soon, and Pavel put a clickable link in the chat there so people can go look at that. This has 225 table variants, with code that's just there that you can use. The actual catalog uses synthetic data, obviously, but you can use the code with any dataset that meets the spec, right? There are 88 top-level named tables, and then each of them has a number of variants. These are the numbers. So there are 68 different variants for adverse events tables, which seemed like a lot to me, but you know, I'm not a biostatistician. And then the very last thing, I know I'm very short on time, is another larger project of which rtables is a part. Oh, and the other thing: in the TLG Catalog, the analytics are done by this package called tern, but all the tables themselves are generated using the core rtables framework, which is why I'm mentioning it here.
And then I'm also part of a working group, R Tables for Regulatory Submissions, from the R Consortium. We have authored a book, and it's available now: you can look at the repo, which I have a link to there. I know you guys can't click on my slides, but that's the URL. And it is slated to be published as an official first edition on the 23rd of this month. So that book covers six different table engine packages, of which rtables is one. And then it has worked examples for five archetypal reporting tables from clinical trial analysis, done in all six packages so that you can compare and contrast. And with that, I think I'm maybe a little bit over time, but hopefully not too much. I'd like to thank everyone for listening. Hopefully that was helpful and informative. And I don't know if I have time for questions or not. That's up to the moderator. I think we'll save any questions for the chat. That was a really fantastic presentation of rtables. Thank you so much. It looks like there's a lot of great stuff there, and please answer any remaining questions in the chat. And if there are any relevant links you wanna drop for us there, that would be great. Thanks so much. Sounds great. Thank you. Thanks everyone. All right, and with that, we'll move on to our last speaker of this little block.