Hi everyone, thanks for joining today's R Adoption Series. We're really excited to talk about metadata, and leveraging metadata for speedy delivery. It's an exciting topic, and I think we have two exciting demos today. Quick housekeeping up front: a little about the scope of the R Adoption Series. It's aimed at all those who are leading R adoption, but it's open to everyone. We really like to focus on the how-to of R adoption, as well as technical demos or whatever it might be. Typically our format is a presentation with some sort of focused discussion at the end. And finally, we host the webinar videos on our consortium website, and I've included the link on this intro slide. A little about today's session: we'll have a quick opening, which we'll get through very fast. Then we'll go into two presentations on metadata. The first will be from Christina Fillmore from GSK, focusing on metacore and metatools and how to leverage metadata for dataset creation using R. Then we'll have a presentation on metalite from Yujie Zhao and Keaven Anderson from Merck, about leveraging metadata for analysis and reporting based on ADaM datasets. So I think those will be great presentations and demos. At the end we'll have some time for any discussion and questions that come up. Before we get into that first presentation, big thanks as always to our sponsors, the R Consortium, PHUSE, and PSI. Without their sponsorship, we wouldn't be able to make this happen. So, excited about today's conversation, we'll go over to Christina on leveraging metadata for dataset creation using R with metacore and metatools. Over to Christina.

Hi. Can you guys see my screen? I assume so; I can see it now. So today I'm going to talk a little bit about leveraging metadata for dataset creation with R.
For dataset creation, there are two common places you get metadata from, depending on what your internal process looks like. There are the dataset specifications, which are probably the most common and what everyone's used to, and then there's the Define-XML, which contains a lot of the same information in a different format; it's what we end up giving to the FDA. Both of these are really rich in things that help you create a dataset; they're meant to help programmers by providing the metadata that's needed: which variables go in which dataset, what the controlled terminology should be, what the labels are, all of that sort of stuff. But when it comes to taking that flat Excel file of the specification document, or the XML file of the define, there hasn't really been a great way of getting it into your R environment to start using it. So that's the purpose of metacore. metacore takes all of that information in, and we provide some readers, but other companies also write their own internal readers depending on what their specification looks like, and it puts that metadata into a standard form that makes it easy to use. The metacore object is made up of a series of seven tables. The first one is about the datasets themselves: which datasets there are, how many, their names, their labels, their structure, and so on.
Then there's a table that's a combination of dataset and variable information, then the information that's just about the variables themselves, the value-level information, the codelists, the derivations, and then a supp table that holds information specific to supplemental qualifier variables, so that supps can be built and added back into the datasets when you need to combine them again. So that's metacore, and it's super helpful, except that metacore is really just a package that creates this object and provides some readers; it doesn't actually do anything with the metadata. So we built a companion package called metatools, which does the interesting stuff with this information. It builds out the actions you would want to take on the metadata held in your metacore object. That's the general high-level view of these two packages, which work together to really leverage this metadata. But now let's just try it. We're going to read in this specification, which is a Pinnacle 21-style specification. It has some datasets in it, and we're going to use it to build a little example ADSL dataset. The first thing we do is read everything in and create our metacore object. When you read it in, you have two options: quiet = TRUE or not. If you don't set quiet = TRUE, it reads the spec in, says you've successfully imported, and then potentially gives you a variety of warning messages. Here it's saying things like the supp flag is missing values, and IDVAR is missing values. That all makes sense, because my specification document didn't have anything about supps, so they're not there; that's why it's all missing.
It also tells me things like, oh, these derivations were never used anywhere. In other words, I have derivations in my derivation table that were never applied to any variables. So it gives you some helpful warnings that basically say your specification doesn't fully hold together as a cohesive whole. But you might not want to see those all the time, so you can set quiet = TRUE, and then it just says the spec was successfully imported with some suppressed warnings. That's all you have to do to read in a standard specification. If you're at a company with its own specification type that's unlike anything here, you might need to build your own reader, and that's totally fair. The next step is to read in the DM data, but that just uses haven; there's nothing fancy about it. So let's look at the metacore object for a second. If you print it out, it says that it is a metacore object containing three datasets, which matches what we saw in the specification. To look at those three datasets via the ds_spec (dataset spec) table, all you need is a dollar sign: standard R nomenclature, as you would expect. You can see the three datasets, ADAE, ADSL, and ADPFT, along with their structure and their labels. Great, that makes a lot of sense. You can also look at var_spec, which, like I said, just deals with variables. It gives you the variable, the length, the label, the type, the format, and common, i.e., whether it's common across a bunch of different datasets.
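As a rough sketch of the reading-in flow just described, assuming the metacore package and a Pinnacle 21-style spec file (the path here is a placeholder):

```r
# Minimal sketch: read a Pinnacle 21-style spec into a metacore object.
library(metacore)

metacore_obj <- spec_to_metacore("specs.xlsx", quiet = TRUE)  # suppress spec warnings

metacore_obj          # prints a summary: a metacore object containing N datasets
metacore_obj$ds_spec  # dataset-level metadata: dataset, structure, label
metacore_obj$var_spec # variable-level metadata: variable, length, label, type, ...
```

With quiet = FALSE (the default), the same call surfaces the warnings about missing supp information and unused derivations mentioned above.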
All of that hopefully makes sense. The first thing we want to do is build out the ADSL dataset, so we subset the metacore object so it only has the one dataset, ADSL, because we only need the specs for ADSL to build ADSL. Now if we look at this ADSL spec, it says it contains one dataset, and that dataset is ADSL; if you look at the ds_spec table, you can see it now only has ADSL. Great. Next I'm going to check which variables I need to build, and it gives me a long list of variables to build in the fairly short amount of time I have here. The first step is to pull through all of the predecessor variables. Often, especially in ADaM datasets, variables are really built off a bunch of predecessors; you're adding in a couple of things, but maybe not too much. I've cheated and made the spec myself, so it's largely predecessor variables. So I'm going to use a function from metatools called build_from_derived(). It takes the subsetted metacore object, a named list of datasets, and whether we want to keep the predecessor variables; here we do. If I run this, I now get a fully built-out dataset. This looks pretty good; it already pulls in 13 columns for me, merges everything across, and matches everything up, which makes my life super easy. Here I knew, because I built the spec, that it all came from DM. But if you didn't know which datasets the spec says you should be using to build your dataset, you can also just call it with nothing.
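A sketch of those first steps, assuming a metacore_obj built as above and a DM file read with haven (the path is a placeholder):

```r
# Sketch: subset the spec to ADSL and pull through the predecessor variables.
library(metacore)
library(metatools)
library(haven)

adsl_spec <- metacore_obj |> select_dataset("ADSL")  # keep only the ADSL specs

dm <- read_xpt("dm.xpt")                             # placeholder path

# Build everything that is a straight predecessor in one step
adsl_pred <- build_from_derived(adsl_spec,
                                ds_list = list("DM" = dm),
                                keep = TRUE)

check_variables(adsl_pred, adsl_spec)  # reports which spec variables are still missing
```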
And it will tell you, hey, you didn't pass me any data, and you need to pass me DM. Great. So we pass in DM, and now we have an ADSL that's built off just the predecessors. If we go back and check the variables again, we now see a much shorter list. Great. Okay, let's now do another variable; let's start with SEXN. We can get the controlled terminology that's used to build SEXN by using get_control_term(). This lives in metacore, and it will tell you what the controlled terminology for SEXN is from the metacore object. Great, we know it's 1, 2 and F, M, which is what we predicted. If you look here at the ADSL we've built so far, you can see we already have SEX, so we just need to do the code-to-decode mapping, like you might do in SAS with an input or a put statement; we can do the same thing. We call the function create_var_from_codelist(), which lives in metatools, and pass it the dataset, the specification, the input variable (the variable already in your dataset, in this case SEX), and the variable we want to create. It uses the controlled terminology for SEXN to figure out how to create it, and if we run it we can see the M's and the F's got converted to their numeric codes, because that's what the controlled terminology said to do. So great. Now we can do that for a whole bunch of variables. If you look at the list of things we still have to make, a lot of them, SEXN, RACEN, and so on, are just based on converting that controlled terminology information. So we can do that super easily and very quickly. Now you can see I've gone from 13 variables to 21, where each one of these rows makes a new variable, whether it's SEXN or, you know, TRT01P.
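The code-to-decode step can be sketched like this, continuing from the adsl_pred dataset above (variable names are those used in the talk):

```r
# Sketch: inspect a codelist, then derive the numeric variable from it.
library(metatools)

get_control_term(adsl_spec, SEXN)  # shows the codelist behind SEXN, e.g. 1/2 vs F/M

adsl_decode <- adsl_pred |>
  create_var_from_codelist(adsl_spec,
                           input_var = SEX,   # variable already in the data
                           out_var   = SEXN)  # variable to create from the codelist
```

The same pattern repeats for each codelist-driven variable (RACEN, TRT01P, and so on), one call per derived variable.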
So cool. That's good. We can look back at what we still have to build by calling that check_variables() function again and seeing what's left. If we pass in the dataset we've built so far, along with the ADSL spec, which is what we named our subsetted metacore object, it says we only have two variables left to make: AGEGR1 and AGEGR1N. Great. AGEGR1 is special because it's subgrouping the age column; we're creating subgroups out of a continuous variable. If you look at the controlled terminology for AGEGR1, you can see the entries are almost equations, but not quite; they're kind of helpful, but not perfect. Thankfully, there's a function in metatools called create_cat_var(), which creates those categorical variables from a continuous one. We supply the dataset as usual and the metacore object, same as before, but now we provide a reference variable, in this case AGE, as well as the thing we want to create, AGEGR1. When you do that, controlled terminology that looks like this gets converted into actual, actionable if-else statements behind the scenes. And now you can see that age gets converted into the correct grouping: 67 is between 65 and 80, 58 is less than 65, and so on and so forth, down to 84, which is greater than 80. You get all of those pieces without having to do too much, because at the point the specification was written, someone wrote it out in an equation-like way that makes this possible. Perfect. The other nice thing is that with create_cat_var() you can also get the numeric version; in this case we need both AGEGR1 and AGEGR1N, so we can just create both.
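A sketch of that categorization step, continuing from adsl_decode above:

```r
# Sketch: derive AGEGR1 and its numeric twin AGEGR1N from continuous AGE,
# using the equation-like controlled terminology stored in the spec.
library(metatools)

adsl_grp <- adsl_decode |>
  create_cat_var(adsl_spec,
                 ref_var     = AGE,      # continuous variable to group
                 grp_var     = AGEGR1,   # character grouping to create
                 num_grp_var = AGEGR1N)  # numeric grouping from the same call
```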
And if we look at our dataset, we can see that both AGEGR1 and AGEGR1N got created from that single call, which split out and did the if-else statements from AGE for you, so you didn't have to think about it. Now if we run our variable check again, it looks like we have no missing or extra variables; we are good to go. Perfect. The last couple of things you might want to do with this metadata are ordering the columns, adding the labels, and all of the final things you need to do in order to write the dataset out to an XPT. First we order the columns to make sure they're in the order the specification says, and then we set the variable labels. Now if we look at ADSL, we can see that STUDYID has the label Study Identifier, USUBJID is labeled Unique Subject Identifier, and so on and so forth, all the way down. That's great. Before you're really done, you often want to do some checks: things like checking that your controlled terminology is correct across your dataset, as well as checking the variables (which we've already done twice, so we know we're good). When you do that, you can see that both the controlled terminology check and the variable check report no missing or extra data. Basically, how all these checks work is that if you have a variable you shouldn't, for instance, it will throw a tantrum, which is what you want: it says that variable doesn't belong in there and needs to go. Equally, if you have controlled terminology that's incorrect, it will also fall over. So it does a nice job of telling you what's gone wrong.
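Those finishing steps might look like this as a sketch, again assuming the metatools function names described above:

```r
# Sketch: order columns, apply labels, and check the result against the spec.
library(metatools)

adsl_final <- adsl_grp |>
  order_cols(adsl_spec) |>          # column order taken from the specification
  set_variable_labels(adsl_spec)    # variable labels taken from the specification

adsl_final |> check_ct_data(adsl_spec)    # controlled terminology matches the spec?
adsl_final |> check_variables(adsl_spec)  # no missing or extra variables?
```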
And really, that's the purpose: to help you with those automated tasks that are boring but fairly easy to automate, like the controlled terminology conversion, while also helping you make sure the dataset you've created matches the specification, whether that's the column order, the labels, or the controlled terminology. You want to make sure all of that holds together. So that's how metacore and metatools work today, and how they work together to quickly make a dataset. We've made this ADSL dataset, and okay, it's not the fanciest ADSL in the whole wide world, but it is 23 columns, and we made it fairly quickly. It's not something I could have done as easily without all of this; I'd have to spend ages writing if-else statements to pull through all of this controlled terminology. And that's much riskier for me as a person to write, when you want it to match the specification as closely as possible; it's easier if it's just done for you automatically, so there's no chance for human error. So that's metacore and metatools. I think I'm going to hand off to Yujie.

My name is Yujie, I'm a senior scientist at Merck in the methodology research group, and today I also have my manager, Keaven Anderson, here. Today we will present a powerful tool using the metadata approach for analysis and reporting in clinical trials. I will first share my screen. The package I want to introduce today is metalite. This R package can transform the ADaM datasets in a clinical trial into metadata, and then we can use this metadata for analysis. First of all, I would like to say that everything I introduce in today's presentation is available on GitHub. I will introduce three packages: metalite, metalite.ae, and forestly.
All three packages are open source on GitHub, and we have wonderful pkgdown websites for them, with rich documentation for you to read and learn from. First, I would like to give you the high-level story of metalite. In clinical trials we have a lot of raw data, which can come from a data management warehouse, from SDTM, or from additional data sources. This raw data is transformed into ADaM data, and the ADaM data is what we usually use for analysis and reporting in clinical trials. Our proposed metalite package focuses on the right-hand side, where we assume we already have the ADaM datasets. Given the ADaM data, we can use metalite to transform it into metadata, and in the rest of my presentation I will give concrete examples of how that transformation is done. Given this metadata, we can then use it for documentation purposes, for example creating an A&R (analysis and reporting) grid, a validation tracker, or submission documents. This documentation purpose can also be realized using metalite itself, because metalite contains helper functions for documentation, and I will show you a simple example of creating an A&R grid. In addition to documentation, we can also use this metadata for TLF deliverables. For example, if we are interested in AE tables, we can use metalite.ae to generate AE summary and AE specific tables, and if we are interested in lab data, we can use metalite.lb to generate those tables. The metalite.ae package is mature and ready for you to use, while we are still developing metalite.lb and metalite.sl for future use cases. In my presentation I will also walk through an example of using this metadata to build some AE tables. And finally, this metadata can also be used to generate interactive reports.
For example, we can generate an interactive forest plot, and we can also generate an interactive box plot. In my presentation I will take the forest plot as the illustrative example to show you the magic of this metadata. So without further ado, let me first introduce how to transform ADaM data into metadata. Basically, to build the metadata, we go through four steps. In the first step, we do some initialization: we specify the population dataset and the observation dataset. In these examples we use public clinical trial ADaM datasets, and we assume the population is the ADSL dataset and the observation is the ADAE dataset. After feeding in the population and observation, we create the statistical analysis plans. In my toy example, I assume we have two analyses: one is AE summary, and the other is AE specific. I define the first analysis plan using the function called plan(): I say, this is the analysis, this is the population and observation, and these are the parameters I want to report. When I add another plan, I use the pipe operator and another function called add_plan(). These analysis plans can be added one layer on top of another, following the logic of ggplot2, which makes them more reader-friendly to review and reproduce in the future. After defining the analysis plans, we just feed them into the metadata. At that point, the key question is that we have used a lot of keywords in the analysis plans. What is AE summary? What is APAT? What are week 12 and week 24? What do "any", "related", and "serious" mean? We have used a lot of keywords, but we haven't defined their scope. So in step four, we define all the keywords in the metadata. For example, I start by defining the population keyword APAT: it is the subset where the safety flag equals yes.
I also define the two observation keywords, week 12 and week 24, as the subsets where the safety flag equals yes and the AOCC01FL flag equals yes, respectively. And I further define the keywords in the analysis. For example, what is REL? We know REL stands for drug-related AEs, so we define REL as AE relationship equal to either possible or probable. A similar story applies to AEOSI and the other keywords, AE summary and AE specific. After defining all the keywords, we can build the metadata by running a function called meta_build(), and in this way you get the metadata. Let me see if I can show you an example of what this metadata looks like. This is our pkgdown website, and all the material is here for you to learn from. So this is what the metadata looks like. First, it tells you what the observation and population datasets are, and how many subjects and records you have. Besides that, it tells you how many analyses you have, because each analysis corresponds to an RTF table. Furthermore, it gives you details about the populations; for example, it tells you that APAT is safety flag equals yes, with the label "all patients as treated". It also gives you the details about the observations and all the key parameters, like related, AEOSI, any, etc. And finally, it tells you what analysis functions you have; in our example we assume there are two analyses, AE summary and AE specific, so the analysis functions only cover those two. So that is how we build the metadata from the ADaM data. After building the metadata, we can use it for the three purposes I mentioned. First, let's see how to use the metadata to generate the A&R grid, which is very, very simple. This is the section on creating the A&R grid. Basically, we assume the A&R grid has four columns, and the first column is the RTF table title.
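The four build steps described above can be sketched like this, assuming the metalite API; the dataset, group, and flag names (adsl, adae, TRTA, SAFFL, AOCC01FL, AEREL) are illustrative stand-ins for the study's actual ADaM variables:

```r
# Sketch of the metalite pipeline: initialize, plan, then define every keyword.
library(metalite)

meta <- meta_adam(population = adsl, observation = adae) |>
  # Step 2-3: the statistical analysis plans, layered ggplot2-style
  define_plan(
    plan(analysis = "ae_summary", population = "apat",
         observation = c("wk12", "wk24"), parameter = "any;rel;ser") |>
      add_plan(analysis = "ae_specific", population = "apat",
               observation = c("wk12", "wk24"), parameter = c("any", "rel", "ser"))
  ) |>
  # Step 4: give every keyword used above a scope
  define_population(name = "apat", group = "TRT01A", subset = SAFFL == "Y") |>
  define_observation(name = "wk12", group = "TRTA", subset = SAFFL == "Y",
                     label = "Weeks 0 to 12") |>
  define_observation(name = "wk24", group = "TRTA", subset = AOCC01FL == "Y",
                     label = "Weeks 0 to 24") |>
  define_parameter(name = "rel",
                   subset = AEREL %in% c("POSSIBLE", "PROBABLE")) |>
  meta_build()
```

Printing `meta` then shows the population and observation datasets, record counts, keyword definitions, and the planned analyses, as described above.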
To get all the titles, we can apply the spec_title() helper to the metadata, and you get all the RTF titles automatically. For the second column, we want the RTF file names; to get them, you can use the helper function spec_filename(), and you get the file names accordingly. The third column is the function name. In metalite we assume all the analysis is done in R, so the function name is basically an R function. The last column is the population: if you use the function spec_analysis_population(), it automatically tells you the population subset and observation subset used to generate each specific RTF. In this way, you can generate a very simple A&R grid. If you want to add more columns or more detail to the A&R grid, please feel free to play with all the functions beginning with spec_, because those are the helper functions we implemented in this package for documentation purposes. After the documentation side of metalite, let's take a look at an example of using the metadata to generate some AE tables. Assume we already have the metadata built. With this metadata, you only need to go through two steps. The first step is to call format_ae_summary(); in this example we focus on generating AE summary tables, and format_ae_summary() basically specifies things like how many digits you want to display for the proportions. After formatting the AE summary, we call another function, tlf_ae_summary(), which specifies the text you want to add at the bottom of the table and where you want to save the RTF table. After applying these two functions to our metadata, you get an AE summary table similar to this one.
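The two-step table generation might look like this as a sketch, using the metalite.ae function names from the talk; the source footnote and output path are placeholders:

```r
# Sketch: AE summary table from the metadata in two (plus a prepare) steps.
library(metalite.ae)

meta |>
  prepare_ae_summary(population  = "apat",
                     observation = "wk12",
                     parameter   = "any;rel;ser") |>
  format_ae_summary() |>                                 # digits, display choices
  tlf_ae_summary(source = "Source: [CDISCpilot: adam-adsl; adae]",  # table footer
                 path_outtable = "tlf_ae_summary.rtf")   # where to save the RTF
```

Swapping in prepare_ae_specific()/format_ae_specific()/tlf_ae_specific() follows the same pattern for the AE specific tables mentioned below.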
Please note that these tables are 100% generated by R, and we also verified the R output against that in SAS, and they match. So the validation work is done and the R analysis is accurate. You can use a similar procedure to generate AE specific tables, by calling format_ae_specific() and tlf_ae_specific(). There are also more details you can control. For example, if I want to adjust the column widths of the RTF table, I can add more detail to the tlf_ae_summary() call, and I can easily enlarge the font size of the table, change its orientation, and so on. It gives you a lot of cosmetic customizations to play with. Another nice thing is that we can also generate mockup tables: if you adjust the mock argument here, everything will be masked with an "x" sign, giving you a mockup table. So that is an example of using the metadata to build analysis tables like AE tables. Now let me give you another example of using the metadata to get interactive reports, taking the forest plot as the example. Here is the story of how we create an interactive forest plot. We assume we already have the metadata built, and to get the forest plot, you only need to go through three steps. The first is to prepare the AE forest plot: you specify which population and observation you want to use to generate it, and you also specify the parameters. Please note that these three parameters are ultimately passed on to the AE criteria: here we have any AE, drug-related AE, and serious AE, so in the select list you will have any AE, drug-related AE, and serious AE. If I click drug-related AE, the table is automatically updated, and all the AEs listed here are drug-related AEs. If you choose serious AEs instead, then we only have the two serious AEs listed here.
But if, for example, you are interested in other criteria, like AEs of special interest or grade 3-5 AEs, you can add more parameters here, and you will get an enlarged selection list. After preparing the AE forest plot, the second step is formatting it, which basically specifies the column widths of the forest plot, the font size, and so on. The last function displays the AE forest plot. After these three steps, you get an AE forest plot like this one. In this forest plot, the first column is the AE terms, and then we have the AE proportions, the risk difference, and all the numerical details. The plot has some very fancy interactive features. For example, if you hover your mouse over any of the points, the numerical labels are displayed, and it shows you the details of the confidence intervals. Another interactive feature is the triangle button: if you click it, it shows you the details of all the subjects who experienced that AE. Here we present the site number, the patient ID, gender, race, etc., for illustration purposes, but if you are interested in displaying more or fewer subject details, that can also be customized. In this way, we get AE listing details right inside the forest plot. The third interactive feature is the search bar: for example, if I'm interested in AEs containing the keyword "pain", I can enter that keyword, and it will automatically filter to the AEs containing it. In this way, clinicians can find whatever they are interested in. And another very powerful interactive feature is the slider. Here the default is to display AEs with any incidence.
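The three forestly steps just described can be sketched as follows, assuming the forestly package's prepare/format/display naming and a metalite metadata object `meta` built earlier:

```r
# Sketch: interactive AE forest plot in three piped steps.
library(forestly)

meta |>
  prepare_ae_forestly(parameter = "any;rel;ser") |>  # AE criteria for the select list
  format_ae_forestly() |>                            # column widths, font size, etc.
  ae_forestly()                                      # renders the interactive HTML widget
```

Adding more entries to `parameter` (e.g. AEs of special interest) is what enlarges the drop-down selection list described above.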
If you are interested in, say, AEs with incidence larger than 8% or 7%, you can just move the slider and it will automatically filter to the AEs whose incidence is larger than 7%. You can also switch between different AE criteria, whether drug-related or serious AEs, etc. So this is a very powerful tool for getting all the details in one single plot. And that is the story of our metalite approach. To recap, we have packed the ADaM data into metadata, and this metadata can be used for documentation purposes, to generate RTF deliverables, or to generate interactive reports. At the end of my presentation, I would like to mention some of the key advantages of metalite. First, our tool is an end-to-end tool: it starts from defining the observation and population datasets, and it ends with the RTF deliverables, so it covers all the steps in the software development life cycle. The second key feature is automation. A function call is always better than a checklist, so we can keep the A&R grid and the R analysis functions together, instead of separating them across Excel files and SAS macros. The third key feature is traceability: every analysis record can be easily traced. Let me show you a little more detail. Recall that when we defined the metadata, we used a lot of keywords, and after the definition we may forget some of the keywords we defined. But it doesn't matter, because you can always get the definition of a keyword using the function called collect_adam_mapping(): you input your metadata and then search for the keyword. For example, if I forget what APAT means, I search for APAT and it shows me: this is "all patients as treated", and the scope is safety flag equals yes. You can also search for other keywords; for example, what is SER?
Is it drug-related serious AEs, or just serious AEs? I forget. You can search for the keyword using collect_adam_mapping(), and it shows you all the details: it is serious AEs, and the scope is AESER equals yes. So it is not a big problem if you forget any details of your metadata definitions; everything can be traced. Another key feature of the metadata is single entry: we only need to enter or define each keyword in one place, and after that, a change in the metadata is automatically passed to all the downstream analyses. Here is an example. Previously we were interested in the APAT population, where the safety flag equals yes. But now suppose we don't want to analyze the APAT population; we want to analyze the ITT population. We just switch the APAT keyword to the ITT keyword, define ITT as the ITT flag equals yes, and label it "intention to treat". Then, if we update the metadata, all the downstream analyses automatically switch to the ITT population rather than the old APAT population. So you don't need to update the A&R grid manually, place by place; you only change one place and the synchronization happens automatically. Yeah, that's all for my introduction of metalite. Keaven, do you have any closing comments?

Okay. First of all, both of these presentations were really awesome, and I hope people enjoyed them as much as I did. Several years ago, I went to a short course that Frank Harrell from Vanderbilt gave, and he was working on data monitoring committee reports that were interactive and very visual, but with the ability to, you know, drill down on patients, to basically make your data monitoring committee members much more efficient in doing their review.
And so Yujie, along with our colleague Yilong Zhang, has worked with us to do an initial safety data monitoring committee report that's interactive. And one thing that's really cool about it is that it doesn't share the datasets; it just has the ability, within an HTML file, to go through a lot of the things that Yujie has shown you. And so all the review committee members need is a browser and an HTML file that we can share with them. So it is pretty awesome. So, you know, in terms of our strategy at Merck, kind of the next thing we're thinking about is some larger-scale reporting, where we're doing a lot of modeling for health technology assessment. It was decided that those analyses would be best run in R, partly because of the package and capability availability and partly for other reasons. But that provides us a bigger kind of production stage, to see whether this is something that's really going to have a big impact on us potentially. You know, otherwise, Merck is still at a stage where I think we're thinking a lot about automation and trying different things. But we don't have kind of an end-to-end coherent picture of where our biggest sympathies are and, you know, where we're really going to focus. So I think there are still opportunities to do that. And in any case, maybe a question or two before we open up more broadly. Just, Christina, one thing: you know, somebody in my group says, well, why would I use metatools and metacore? Couldn't I do the same thing in SAS? Do you have any comments on that question? Yeah. So one comment is that, while SAS — or potentially SAS plus other systems — does some of this, there are things like checking controlled terminology, or those slightly more complicated derivations in terms of calculating subgroups. So that's one reason. Also, unlike SAS, this is an open-source tool. So I am the maintainer of metacore and metatools, but I'm just the maintainer.
I'm not necessarily — like, if there's a thing that you want to see, where you think, I would like this automation, it makes sense to automate using this metadata: just put in a pull request. And I will review it. And I will probably accept it. And then you will have that thing, and everybody wins. So it's a little bit more of a "we all work together" model. When it comes specifically to the metadata that I cover — the stuff that's in the define — because the define is required by the FDA, it's fairly standardized at this point. And so there is, I feel, a lot of opportunity to find more easy wins within that metadata, and to build out more and more tools. And the more that we can have standard tools that everyone can use — you guys at Merck, us at GSK, everybody, and even the FDA when they're reviewing — I think the better we're going to be, rather than having to rely on internal SAS macros, where the FDA has to spend more time reviewing, because an internal macro is not something they can be using regularly. Whereas if we're all using standard tools and standard R packages, it's kind of a rising-tide-raises-all-boats argument. But that said, it's a fair comment that a lot of the things currently available in metacore and metatools are kind of what you can do in SAS already, so there's not a huge advantage over doing it in SAS. But it really just opens the door: if you want to develop a dataset in R using admiral and some of the other packages that are out there, this will help. Great, thank you. And so what kind of collaboration group do you have, both internally and externally? So, metacore is under Atorus, so technically Atorus — Mike Stackhouse and I, and a woman called Maya — were kind of the main developers on it.
At this point, it's mostly me, myself and I, so I would love collaborators — if someone is very interested in this, please contact me, I would love that so much. But yeah, we are part of the pharmaverse, so it's part of the general pharmaverse collaboration. So I have been speaking to people — I know that Roche is starting to use the tool as well, and they're starting to adopt it for their needs. So other than general conversations with Roche and others, it's kind of just me, and also Mike. So that's those packages. But I have had — I can't remember his name — someone did put in a pull request to add some more functionality, I think to metatools, which I accepted and which got pulled in. So it's more of an ad hoc than a formal partnership. But I would love help and ideas, if there's anyone who would like to add something to this that they feel like, oh, this thing is missing and I think it could really add a lot of value. Right. Yeah, and I guess, in my own package development I would probably tend to suggest people submit issues rather than pull requests, but that may be partly subject-matter driven. But I accept all things. I accept issues — whatever you feel comfortable with. Presumably you might get emails too? Emails, messages on LinkedIn — it's a combination. We're very active on Slack; metacore and metatools are part of the pharmaverse, and so they have their own Slack channels within the pharmaverse Slack. So that is another option for how to message me if you would like to. I have many channels of communication and I accept them all. That's awesome. So, again, for Yujie — for those who aren't familiar with plotly, everything that ends with "ly", if it looks like curious wording, it's just because those are interactive plots, right?
Yeah, we borrow a lot of ideas from plotly, and we named our forest plot package forestly, so we're following a very similar naming convention to plotly. Yeah, you know, it's just awesome to get the interactivity that you might get elsewhere — that's just a huge advantage of R. You know, I have other things I could bring up; I could talk about this stuff all day and all night, but maybe I should stop. Are you going to say anything, or do we have any other questions? Yeah, so I think there were a couple of questions through the comments. And I wanted to hit a couple of admin questions and then a couple of presentation-specific questions. So, in terms of admin questions: our plan is to post the presentations on the R Consortium website. In terms of accessing code — that was a pretty common question — I think Andy Nichols posted that metacore and metatools are open source; he posted the links, as well as links to the code. Also, on CRAN, we expect a new release relatively soon; you can watch for updates, but other than that, both the CRAN and GitHub versions are available. And then there was an open question about experience with regulatory agencies for self-contained code. So the suggestion would be to go to the previous webinar, where the FDA spoke a lot about that — that would be a good webinar to check out. For the presentation-specific questions: for you, Christina, there was one from Pavel, and he basically wanted to confirm that it would be possible to drive basic derivations of code-decode pairs dynamically from a read-in spec. So, could you loop through and create a variable from a codelist without writing the code, using metacore/metatools? Yes. So let's see. Sorry. With metacore and metatools at the moment, there's not a functionality that just says, do them all — that's not currently in there, but you could theoretically write that. And there's nothing stopping you.
If you want to write that and add it in, great. But yeah, so we don't currently have functionality that will do all the dynamic pairs. If you provide a list of all the dynamic pairs, you could easily use something like map2 in order to — or maybe not quite map2, but you could definitely make it happen in R very easily. Perfect. And then for Yujie — maybe Keaven would know this as well — there was a question about whether modules for lab data, etc., are on the horizon for metalite. And for the lab data, it is still under development; currently we take the AE data as an illustrative example, and the development for lab data will follow a very similar procedure. Okay, perfect. We'll see where it goes, but it would be wonderful to be much more comprehensive in this, and that, you know, would have to be prioritized. And I think also, if people are interested in collaborating in this general area, please reach out and we can consider it. We haven't done much of that so far, but there is obviously potentially a lot of interest, and I generally have the attitude, like Christina, that we're better off with open-source collaboration to get acceptance at places like regulatory agencies and elsewhere. Awesome. Great, so I haven't seen any other questions in the comments section, so I guess, to the audience: if you have any questions, please put them in the comments. I'll give people some time just in case there are questions. I did have a question for the presenters. It's kind of — aspirational is maybe the right way to phrase it — but are there any parts of clinical delivery that could use metadata, and currently aren't, that you think are an interesting problem to solve for speeding up clinical delivery? Go for it, Christina — and I have some thoughts too.
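Christina's suggestion — looping over code-decode pairs from a read-in spec to derive variables — can be sketched in base R. The codelist, dataset, and variable names below are made up for illustration; they are not a metacore/metatools API:

```r
# Hypothetical codelist pulled from a spec: code -> decode pairs
sex_codelist <- data.frame(code = c("M", "F"), decode = c("Male", "Female"))

# Derive a decoded variable from its coded counterpart using the codelist
apply_codelist <- function(x, codelist) {
  codelist$decode[match(x, codelist$code)]
}

adsl <- data.frame(USUBJID = c("01", "02", "03"), SEX = c("F", "M", "F"))
adsl$SEXD <- apply_codelist(adsl$SEX, sex_codelist)  # "Female" "Male" "Female"

# Given a list of all such pairs, one loop (base Map, or purrr's map-family)
# could derive every decoded variable without hand-written per-variable code
pairs <- list(SEXD = list(from = "SEX", cl = sex_codelist))
derived <- Map(function(p) apply_codelist(adsl[[p$from]], p$cl), pairs)
```

This is the "do them all" wrapper Christina says isn't in the packages yet but could easily be written on top of the spec metadata.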
So, one thought is within kind of the metadata space that I've talked about today, around derivations, potentially. Because the way that we do derivations today in the define-XML is kind of just a loose text format, it's pretty hard to build information off of them. But if we could make them a slightly more rigid — not too rigid, but metadata-driven — format, you could definitely get a lot of easy wins from that, I think. And then another place is actually when it comes to displays. We at GSK have another package called tfrmt, which really is looking at generating tables based off of metadata — not doing the analysis, so it doesn't do quite what Yujie presented here. It really just does the displays: taking an ARD, these analysis results datasets that CDISC is starting to push as a new data standard, and building displays off of those. So that's another place that I find interesting. And I think there's a lot of chance for speed there, so that people don't have to spend time formatting spaces, etc. So, yeah. I mean, my dream would be: you've got metadata, you can use it to generate an overall plan for what you want to report in a study, which may be a very different layout than some other tools. And, you know, we have, say, a PowerPoint tool that does that at a high level. So, without looking at a big pile of PDF mock-ups, people can see, okay: how much are we putting in our submission, and where are we putting it? Do we have all the content we want? Is it a gazillion tables or, you know, 200 tables? Are you using standards? Are you making a lot of custom things? And then, from that — and I think a lot of the basis is here — if you need specs generated for your internal processes, you can do that. If you need review plans for checking and validation, you can do that. If you need to generate code and tables, you can do that.
And then the obvious interactivity, so that in the future, presumably, you're thinking about subject matter as opposed to details of implementation — that's probably where we all want to go. But, you know, these kinds of tools are, I think, a great move in that direction. Yeah. And I agree. From my personal experience assisting with clinical trials, we always find that, at least for the RTFs, there are a lot of updates going on. And when there are updates, the programmers and statisticians need to update the A&R grid, also the spec, and then rerun the tables. So there is a lot of repeated work there. And sometimes we just forget to update one single file, and this small mistake may cause a lot of confusion for the statisticians or the programmers when they review the old files. So our high-level plan is: if we can get everything automated in R, which can save a lot of manual data entry, we can save a lot of time and effort for both programmers and statisticians. Awesome. So it looks like we had a question come in from Andy Nichols. So, a question to the presenters: in your experience, how are companies tending to store the metadata outside of R — is it in a database? Excel? And do you think the surrounding infrastructure we have is mature enough to support metadata-driven workflows? I'll take an initial shot at it. Excel is so many people's favorite tool. I wish it weren't, sometimes. But, you know, I think we tend to do a lot of documentation stuff there. And on the question about whether the infrastructure is mature enough to support metadata-driven workflows: I certainly think it's mature enough to be working in that direction. And there probably are updates that would be needed.
But I wish I understood more about what other companies are doing, and whether there are enough commonalities that building reusable tools, like Christina and Yujie have shown, could drive that. Certainly, that's where we'd love to go. But people should also respond to this in the chat, as well as, you know, Ben, Yujie, or Christina commenting. Yeah, I think at GSK we do use a database, but also Excel, because, as Keaven said, Excel is everyone's favorite tool. I mean, metacore accepts Excel documents in part because they are so common. But I'm probably of the belief that building small tools that can leverage metadata in workflows, or within code, is where we're at today, as opposed to a fully automated workflow. Because Excel is not a great tool for putting metadata in — there's no way to check when you've done it wrong, or to say, this doesn't make sense, none of these things can have this over here and this over here. You need things that do that automated checking. But it's also really easy to use. It's a hard problem — even just how you put it in, how you copy from standards and then edit it. I don't have solutions at all. But I think where we are today is probably a hybrid approach — which doesn't mean we can't get to a fully automated approach in the future. Well, I mean, don't you think you should take credit for the fact that once you've gotten it in from Excel, you've got your intellectual property stored in R, it's checkable there, and it's reusable there? Yes. Yeah, so I mean, that is the purpose of metacore. I mean, in some ways it's immutable. So it is a little bit of a weird R object — it's not an S3, for those who really love R — which means that you can't edit it. So you do need to go to the source to edit.
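The automated checking Christina wants from a metadata tool — the kind Excel can't give you — can be as simple as validating that a variable's values fall inside its codelist. A hedged base-R sketch of the concept (the function and terminology values are hypothetical, not a metatools API):

```r
# Hypothetical controlled terminology pulled from the spec
armcd_terms <- c("PBO", "TRT")

# Flag values outside the codelist, in the spirit of the controlled-terminology
# checks metatools performs against the metacore metadata
check_ct <- function(x, terms) {
  bad <- setdiff(unique(x), terms)
  if (length(bad) > 0) {
    stop("Values outside codelist: ", paste(bad, collapse = ", "))
  }
  invisible(TRUE)
}

check_ct(c("PBO", "TRT", "PBO"), armcd_terms)   # passes silently
# check_ct(c("PBO", "XXX"), armcd_terms)        # would error on "XXX"
```

Once the metadata lives in R rather than a spreadsheet, checks like this can run automatically instead of relying on a human spotting the typo.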
So we do try to keep things within MetaCore so it's harder to do things you shouldn't — like, if things are failing, just removing it from your local version. But that balance between having it central, so that everyone can be accessing it, and also having it in your local R session, so that you can access it and leverage it, is definitely a fine line to walk. So, Christina, when you laid out the different components of the metadata, it looked like a relational database. Could you just comment on how much it is, and how much it isn't, the fake relational database in my head? So it technically is not — there's not a little SQLite sort of thing hiding in there. It's officially not. But if you were to put it in a relational database, it would be very easy, because it is more or less that structure. Some of the things, like the codelists, are nested tables, so it's slightly more complicated. But yeah, you could just put that into a relational database. That's a good mind map, anyhow. Yeah, I mean, that is the exact structure of what it looks like. Each of those tables does exist in the object, and so you can pull them out all combined or pull them out in parts. That is totally possible, but it's technically not a relational database. Yeah — more a technicality than an actuality, right? And Yujie, do you want to comment on the underlying data structures? Or maybe — I saw Yilong commenting earlier — he may want to put a chat in too. Yeah, for the metadata structures, it depends on the audience we want to deliver the metadata to. So if we want to deliver this metadata to statisticians, or an audience who has some programming background, this metadata is very easy for them to read and to learn all the details from.
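The relational-database analogy can be made concrete: the metacore tables share keys such as dataset and variable, so recombining them is an ordinary join. A toy sketch with made-up columns, in the spirit of (but not matching) metacore's actual schema:

```r
# Two toy tables in the spirit of metacore's dataset/variable metadata
# (columns simplified and hypothetical)
ds_vars  <- data.frame(dataset  = c("ADSL", "ADSL", "ADAE"),
                       variable = c("USUBJID", "AGE", "AETERM"))
var_spec <- data.frame(variable = c("USUBJID", "AGE", "AETERM"),
                       label    = c("Unique Subject ID", "Age", "Reported Term"))

# Joining on the shared key recovers a flat, spec-like view,
# exactly as you would in a relational database
spec_view <- merge(ds_vars, var_spec, by = "variable")
```

This is why "pull them out in parts or all combined" works: the tables are normalized on shared keys even though nothing SQL-like sits underneath.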
And if we deliver this metadata to an audience who has limited programming background, then I prefer not to deliver the metadata itself; instead, we deliver some RTF tables or some visualizations of this metadata, because with all the technical details, it's hard for them to get it all right. Oh yeah, but I mean, there's the underlying data structure, but then there are different views of it, which is kind of what you're describing. Yeah — you know, a physician or team-planning view, a table mock-up view. Yes, yes. But then, underneath, you have to have all the data structures to support those views. Yes, yes. The metadata we create can be used for different purposes, depending on the project going on and the business need. But with all the essential information in this metadata, it can be used to generate any tables or displays the team needs. Cool. So it doesn't look like we have any more questions from the audience. So, before closing out, any final thoughts from our presenters, Yujie or Christina? Well, I hope other people are equally excited. It was really nice to see comments from people who look like they're interested. So I think both companies are interested in interacting and talking about these things. Oh, and one other thing, Christina: when you mentioned Atorus — I mean, it is great to feel like there is a company out there that presumably is actively engaged in this, so that other companies can feel like, well, this isn't GSK's tool that they can run away and hide with. No — and likewise metatools. So I can confidently say that Roche and Atorus are both actively involved in metacore. Roche doesn't do as much of the coding; it's more that they tell me, I would like this, that, or the other thing, and I do that for them. But it's definitely something that they are starting to use, and that they plan to have in their frozen environment, from what I hear.
So it's definitely something that people are engaging with. And yeah, if you do want to engage with it at your company, and you find something isn't working correctly and you need help — even just getting up and running if you have weird specs — feel free to reach out to me, and I'm happy to help as much as I can. All right. Cool. So, I think we will wrap it up now. Thanks, everyone, for joining the webinar. I thought it was really great to see actual practical demos of how to use metadata within the clinical delivery pipeline. And as always, we will post the webinar on the website, as well as the presentations. Thank you all for joining, and I hope you found it as interesting as we did. So thanks for joining, and hope to see everyone in the future on another webinar. Thank you. Thanks.