for joining us today. So this is our R adoption series event, and we will be talking about R and Shiny regulatory submissions. My name is Ning, I'm from Roche Genentech, and I will get started with an introduction to today's session as well as to the R Submission Working Group, who is organizing this event today. Sorry, let me go to my slide. All right, so going back, yeah. So this event is co-hosted by the R Consortium R Submission Working Group, which is a cross-industry collaboration with the aim of improving open source language usage in the regulatory setting. It is also co-hosted by the FDA Statistical Association, which represents the statisticians at FDA working across different therapeutic areas. As many of you may know, from the collaboration between the R Consortium R Submission Working Group and FDA, we very proudly did multiple pilot submissions aiming at showcasing how to use R and Shiny for clinical trial submissions. We finished pilot one last year, which was aimed at submitting tables, listings, and figures based on the R language, and that was successfully wrapped up last year; you probably already heard about our learnings in the last adoption series event last year. And this year, very proudly, we finished our submission of pilot two, which includes a Shiny component. There is currently an ongoing pilot three, which is trying to showcase how to submit ADaM data, general ADaM data, to FDA; this pilot has already been submitted to the FDA team and is under review right now. And pilot four, which Eric is leading, is exploring new technologies such as containers to streamline the submission, and is under way right now. For pilot one and pilot two, we successfully submitted those pilots and then received feedback from the FDA folks, and all the formal communication can be found on our GitHub repository; here I'm just showing a screenshot of the formal response letter. All the materials from the R Submission Working Group are available on GitHub, and everybody is welcome to join us. This is an open working group, so if you want to join us, we host monthly meetings; please feel free to email the address over there. Yeah, so many of you may have attended our previous session, which happened in 2022, where we shared our learnings from the first pilot submission from the R Consortium R Submission Working Group. In today's session, we would love to share more of our recent learnings since 2022, including the review experience and submission experience for pilot two, which includes a Shiny component. Our session will start with a presentation from Paul Schuette, who will give us an overview of open source software for regulatory submissions. Following that, Eric will dive into the R Consortium R Submission pilot two and share his experience leading that effort. After that, Hye Soo from FDA will share her review experience of R-based submissions. And after that, we will have an interactive panel discussion. I have some prepared questions, but we also welcome any questions coming from the audience.
So if you have any questions, please feel free to share them in the Q&A or in the chat, and I will monitor those. I would also like to give a heads-up that on January 8th we will have another session following this one, talking about the adoption of R in Japan for the pharma industry. It will be very exciting to learn what's happening in Japan, how they use R in their regulatory submissions, and also to learn about their perspective on the adoption of open source languages. And with that, I will turn to Paul to give his presentation. Paul, do you want to start sharing your screen? You need to stop sharing first, unfortunately. Yeah, let me stop sharing. Okay. So let's see. Okay. Can you see the title slide? Yeah, we can see it. Okay, great. Thank you, Ning. So my presentation is entitled Open Source Software for Submissions in Regulatory Environments. I'm the deputy director for the Division of Analytics and Informatics; the whole title goes on for a while, as you can see, but we're in the Center for Drug Evaluation and Research at the US FDA. Our standard disclaimer is as follows; basically, if you don't like what I say, blame me, not my agency. Let me start with where things began for me in some ways, going back to 2015. One of the early questions we had was: are we allowed to use non-proprietary software in our review work, and can sponsors potentially submit something other than the standard proprietary packages? And we came up with this statistical software clarifying statement, which was as follows. FDA does not require use of any specific software for statistical analyses, and statistical software is not explicitly discussed in Title 21 of the Code of Federal Regulations, which is what governs FDA. However, the software packages used for statistical analyses should be fully documented in the submission, including version and build identification. It goes on to say: as noted in the FDA guidance E9, Statistical Principles for Clinical Trials, the computer software used for data management and statistical analysis should be reliable, and documentation of appropriate software testing procedures should be available. Sponsors are encouraged to consult with FDA review teams, and especially with FDA statisticians, regarding the choice and suitability of statistical software packages at an early stage in the product development process. That was adopted in May of 2015. We didn't hear a lot immediately; it took a while to disseminate, but finally things have started to come in. And then we got to the next level, which is: which programs should be submitted, and how should we do that? This is taken from the Study Data Technical Conformance Guide. There's a link here, but if you just Google it, you can find it. Make sure that if you do find it online, you're using the most recent version; the Study Data Technical Conformance Guide is updated twice a year, so if your copy is older than six months, you may be using an old version. So: sponsors should provide the source code used to create all ADaM data sets, tables, and figures associated with primary and secondary efficacy analyses. Sponsors should submit source code in, to use the very technical language, single-byte ASCII text format. Files with Microsoft Windows executable extensions, such as those listed, should not be submitted. For a list of acceptable file types, refer to the document entitled Specifications for File Format Types Using eCTD Specifications.
So there's a subsequent document that I will show. Furthermore, sponsors should submit the source code used to generate additional information included, essentially, in the labeling, is one way I think of this. And I've added the bold here at the very last sentence: the specific software utilized, version, and operating system should be specified in the ADRG, which stands for the Analysis Data Reviewers Guide. So this is fairly prescriptive. Keywords are: ASCII text format, do not submit executables, refer to a list of acceptable file types, and include version and operating system. So which file formats are accepted? Something that we should point out is that a text file does not have to be a .txt file; while .txt files are text files, the reverse may not be true. As of 2021, CDER accepted .c, .cpp, .m (MATLAB), .r, .rmd, .py, and .jl, and .sas and .r are acceptable in modules m3 through m5. You can go to the link here, Specifications for File Format Types, to confirm that for yourself. Again, this is a fairly technical slide at this point. Some caveats: as of the time of this writing, you should basically only use ASCII text codes right now. Variable values are the most broadly compatible with software and operating systems when they are restricted to ASCII text codes. Some systems use UTF-8 for extended character sets; however, use of extended mappings is not recommended. This is actually something we've encountered issues with. But if you stick to ASCII text, you should be good for the foreseeable future. Okay, so switching gears a little bit, or switching from what FDA will look at to some of the possible forms and alternatives, let's talk a little bit about the R programming language. A brief overview: R is based on S, which was originally developed at Bell Labs by Chambers et al. in 1976, and extended to R by some folks in New Zealand in 1993. What was interesting for me when researching some of this is how some of this evolved, how CRAN evolved. RStudio is a widely used integrated development environment, and we now have some governance for this open source software: we have an R Core team and, of course, the R Foundation that supports that, and then the R Consortium on top of that, which supports the R Foundation and the R Core. So here's the link to the R Consortium, which is the sponsor of today's talk. Members of the R Consortium include a number of pharma companies, some tech companies, and the American Statistical Association. The R Consortium, as Ning was pointing out, has a number of working groups: the R Validation Hub, R Repositories, R Tables for Regulatory Submissions, and R Submissions. We'll be focusing primarily on the last one, R Submissions. Ning did talk briefly about some of our pilots that are completed or ongoing. Pilot one was the R-based submission of tables, graphs, and analyses to CDER using ADaM data sets; basically, the analyses of the ADaM data sets were performed in R. Pilot two is a CDER submission with an interactive Shiny component, which has been completed and will be discussed further by my colleague, Hye Soo Cho. Pilot three, an R-based CDER submission with derivation of ADaM data sets from SDTM, is pending. And a pilot four has been discussed and is in development. The GitHub site of the R Submission Working Group is listed here, and the data sets that we're looking at are based on the CDISC pilot submission as well.
So we're going back over where we were a little bit from last year. This is from pilot one. It's just a basic demographics table, and it shows what R can do on its own: this is just taking the ADaM data sets and doing the appropriate analyses. And as you can see, this is back from 2021, we have Kaplan-Meier curve plots, and we can see basically the standard types of graphics we would anticipate. Now, what's different with pilot two is that there's an interactive Kaplan-Meier plot, and it's somewhat interesting; others will discuss things further, but you can see you can actually adjust, with sliders, say, the age category, and you can see the age distribution that's available. So some of these interactive features are actually quite interesting to play with. So those are the first couple of iterations, and they're continuing to evolve. Other groups supporting R: it's a team effort. There's the R/Pharma conference; most recently, it was held in October of 2023, and you can find many of the presentations online and view the slides and, in some cases, recordings. There's the R/Medicine conference that occurs in June. Some of our colleagues in other areas, other companies and agencies, have formed a collective they call the pharmaverse, which has some interesting tools. There is, of course, the useR! conference, which is slated for July, and, of course, the ongoing work of the R Foundation. I did say open source, so it's not exclusively R; we can mention a few things about Python. Python is not typically used as much for submissions, but it is used in some places within and outside of the FDA, for things like natural language processing, text modeling, optical character recognition, machine learning, more of your data science applications. There's less of a focus on statistics and visualization. Some commercial packages, including some deployed at FDA, use Python, and there is an ongoing active effort to essentially have Shiny app functionality with Python. So let's talk a little bit about why isn't everyone using open source immediately. There are some challenges to using open source. One is the rate of change. There's what we might call the who-are-you-going-to-call problem as well: there isn't necessarily a dedicated help system in place. Support by IT: as folks from your IT department will sometimes remind you, open source isn't always free; it does take time for maintenance and updates. There is a plethora of packages, many, many packages. I can easily say there are thousands of R packages available; which ones can be trusted? And even within the ones that can be trusted, one can find that at times there are dependency and version control issues. Some reflections and observations. It is now eight years after we in CDER issued our statistical software clarifying statement, but we have yet to experience a completely R-based submission. We do see some hybrid submissions, some use of both SAS and R, particularly R for graphics, and hybrid workflows are also reported. Recent graduates tend to have more experience with open source tools than proprietary alternatives, we're finding these days. And if you are thinking about an R-based submission, we certainly encourage sponsors to reach out to review divisions.
One of the aspects we had discussed is that most of us doing review work at FDA are using Windows-based environments, whereas most sponsors are using Linux-based environments, and occasionally that can pose some challenges. Some additional reflections and observations: open source tools are popular for AI/ML, and we'll have to include some discussion of that at some point in the future. Emerging technologies, while not specifically open source, have the potential to be disruptive, as does AI/ML. I'd like to put in a quote due to the great Stan Lee: with great power comes great responsibility. Open source tools and methods don't necessarily promote good statistical practices; we do need to avoid p-hacking and cherry-picking. Here is my contact information, and let me stop sharing and turn it back over to Ning. Thank you very much, Paul. And with that we'll turn to Eric to give an update on the R Shiny submission. Just a reminder to everybody: we will leave all the questions to the panel discussion, so if you have any questions, please feel free to put them into the chat box or the Q&A feature. So with that, Eric, I'll turn it to you. Well, thank you, Ning, and thank you, Paul, for the terrific introduction kicking us off here. In fact, Paul has foreshadowed quite a few things that I'll be touching on in my roundup of the pilot two application in particular. We're going to go over a bit of the history of how this came about, some of the very important details in terms of making the application, and a lot of the practical considerations we encountered along the journey to make this a successful submission. So let's dive right into the history, so to speak. I would say that this effort, and frankly a lot of the efforts we've been seeing the last few years that have been transforming life sciences through open source software, had a lot of roots in the R/Pharma conference that Paul and Ning mentioned earlier. In fact, late-breaking news: the recent conference videos from October that Paul mentioned are now officially on the YouTube channel, so you can check those out if you missed the live presentations. This screenshot here is from the very first R/Pharma, which was an in-person event over at Harvard University. It was, at least for me personally, one of the first times that I could get together with like-minded, enthusiastic statisticians, data scientists, and programmers who want to leverage open source and these newer technologies to supercharge our drug development process and really start to work with each other, not just work in silos. And R/Pharma, I think, was a huge step toward making that happen. We were actually quite privileged that year to have Joe Cheng, the author of Shiny himself, come give one of our keynotes, talking about the use of Shiny in the pharma industry. And in fact, he had a bit of a tongue-in-cheek moment in the beginning where he was admittedly kind of terrified about just how far we were trying to extend Shiny, in use cases such as reproducibility and other HPC enhancements and the like. He thought this has great potential, but it really opened his eyes to just where we were going with Shiny as a foundational technology within drug development.
And in fact, we've seen almost an explosion of content, whether at this conference or other conferences with a life sciences flavor, of just what Shiny is doing for life sciences worldwide. And right on cue, Paul himself was part of the first ever R/Pharma, and then part of the subsequent ones as well. This is where we learned about, as he mentioned, the software clarifying statement, which really reinforced to us that we are on the right track and not, you know, going under the covers here, so to speak, leveraging open source in any nefarious way. This is open source software, and the FDA, as he mentioned, is not requiring any specific framework. So really, we're seeing a lot of momentum drawn from these kinds of disclosures, these messages, and a lot of the innovations that have been happening in life sciences. As he mentioned, one of these efforts that really took hold from those initial discussions at R/Pharma and grew into more robust collaborations has been the pharmaverse, which is a collection of packages helping with the data processing and tabular generation of statistical results. It is garnering a lot of positive momentum, and I know there's more work to do in this space, but it is really exciting to see just how far R is being embedded into clinical statistical programming; the pharmaverse is definitely one of those promising efforts. Now, of course, I'm going to talk about Shiny quite a bit, right? Well, alongside the pharmaverse, another effort that has been specific to Shiny, so to speak, has been the teal framework. The teal framework is a set of Shiny modules that help automate the exploration of clinical data, and it's really giving us a lot of efficiencies in how these analyses are produced. It's a really promising effort, and again, there's a lot of momentum around it. Seeing what Shiny can do to help streamline the production of these clinical results definitely got the creative juices flowing in terms of where we could take Shiny in the future. And also, as has been disclosed in previous papers and presentations, yes, our colleagues at FDA are using Shiny themselves. This is just a set of links that you can access after this webinar if you want to check some previous materials: just like life science sponsor companies themselves, FDA and others are definitely using Shiny to help streamline some of their processes. Now, we've seen some momentum already happening from the previous R/Pharma conferences and the previous collaborations: Shiny itself is playing a key role in helping to accelerate internal decisions, helping bring us a more efficient, automated way of reviewing clinical results and clinical data, and being an attractive front end to some of the more complex analytical routines that one might encounter in, say, the design of clinical trials and other pipelines. In fact, we don't have time to get into some of the work that I've been privileged to be a part of using Shiny in the complex innovative designs initiative. But what we've been thinking about for a few years now is: how can Shiny bring some of these same benefits that we're seeing across different aspects of drug development into an actual submission package?
And as Paul mentioned earlier, the R Submission Working Group, part of the R Consortium, was formed as a cross-industry working group, with members from sponsors in life sciences and, as he mentioned, some direct collaboration with regulators such as the FDA. Our goal is to look at the overall clinical submission package process and the overall packaging, and to see how R can be used to generate analysis programs and results and beyond, so that we can have a clear reference, a clear resource, for those who want to take this journey and aren't sure where to start, or are maybe asking themselves if this is indeed possible. Well, yes, it certainly is possible. We'll get to what we've learned in a little bit, but a unique feature of this working group is its very iterative nature: being able to identify any gaps that occur along the way as we build these analysis programs with R, as we built this Shiny app with R, and to really put it through kind of a stress test, if you will, to see what we have to do to get it through the gateway, so to speak, of transfer and into the hands of the reviewers at the FDA. And again, nothing is being done here behind closed doors; all the materials are out in the open on our GitHub organization, with many resources on there, and I'll be highlighting those as we go along. As Ning mentioned earlier in the webinar, the first foundational layer was laid in November 2021, when we had a successful submission using R to produce tables, listings, and graphs, using CDISC open data, and this was done entirely with R. Throughout, we adhered to the eCTD, or electronic common technical document, specifications for sending these materials to the FDA. And in fact, we leveraged a package called pkglite, P-K-G-L-I-T-E; at the time this was written, we had thought, somewhat ironically, that all the R programs had to be bundled as literal text files along the way. Well, we've learned a lot since then, as Paul clarified earlier, but the submission package regardless was a way to demonstrate that R can indeed be part of a clinical submission for generating these analysis results. But that's not what we're going to talk about today, because you heard about that last year. Now we are going to talk about the Shiny component of this. So it's time for a live demo; let's see what happens. What you see on the screen here is the pilot two Shiny application. I'm running it locally here just to make sure I don't have anything crash. Now, on the surface it's going to look somewhat basic to you Shiny pros, but nonetheless, it is a way for us to efficiently present the results that we generated from pilot one. What do I mean by that? Well, we've got our tabular summaries, such as demographics and our primary endpoint analysis. And again, these are using packages in the R ecosystem that are very efficient for creating summary tables, just like they were in pilot one; we didn't change anything about that. Same with this efficacy analysis, again using the same results as pilot one, but adding little helpers along the way, such as tooltips and whatnot. And we even added, at the FDA's request, a new table looking at the summary of completer status across the different time points for the different treatment groups. Again, with R, a lot of this is quite easy.
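(To give a flavor of what Eric means before the graphical piece, here is a minimal, illustrative sketch of such a summary table in R. This is not the pilot's actual code, which lives in the working group's GitHub repositories; the package choice and the adsl/TRT01P/AGE names are assumptions in the style of the CDISC pilot data.)

```r
# Illustrative sketch only, not the pilot two code: a small demographic
# summary built with dplyr and gt. The adsl data set and its TRT01P and
# AGE columns are assumed, in the style of the CDISC pilot ADSL.
library(dplyr)
library(gt)

adsl %>%
  group_by(TRT01P) %>%
  summarise(
    N        = n(),
    mean_age = mean(AGE, na.rm = TRUE),
    sd_age   = sd(AGE, na.rm = TRUE)
  ) %>%
  gt() %>%
  fmt_number(columns = c(mean_age, sd_age), decimals = 1) %>%
  tab_header(title = "Age by Treatment Group")
```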
What gets really interesting is when we get to the graphical summary, which was, as Paul mentioned, a Kaplan-Meier visualization for the time to first dermatologic event in the population. On the surface, you see a very nice-looking Kaplan-Meier plot written with one of the ggplot2 variants in the R ecosystem. But midway through the creation of this application, we thought: well, we are building a Shiny app after all, why not add some filtering capabilities to give a more interactive component to what this application can do? So, as Paul mentioned, we could take a variable that's available in our ADSL data set, such as age, get this nice-looking histogram of the distribution of ages, and simply subset the data a little bit. And like anything in Shiny, by default everything is reactive and updates in real time. I don't have to go back to my desk and create a brand-new output with this age cutoff; I can have it right here in the application, and all the numbers at risk are updated in real time. That's one of the selling points of Shiny itself. Also, I know from my experience in creating Shiny apps that it's very important to give the users a nice starting guide, especially for an application that has more than a handful of inputs and outputs. So we put some nice clarifying information here about what was in the application in terms of the summaries, and a little preview of how to utilize the dynamic filtering capabilities; we get to take advantage of some nice web-based formats to document all this. So with that, let's go back under the hood a little bit to what this application is made of. It is a Shiny application, we all know that, but it is created with some fundamental techniques that we think really benefit a robust application to be shared in life sciences. One of those techniques is the use of Shiny modules, which are a way to have kind of a supercharged R function that's Shiny-aware, I call it, where you can organize your application into distinct pieces and reuse different modules along the way, should they repeat. And speaking of packages, we also pushed the envelope a bit further here: this application is actually composed as an R package, with the golem framework. We think an R package is a really novel way to construct applications, bundling the documentation, the R code, and all the declared dependencies very efficiently, which has been the standard for contributing to the R language itself. Speaking of dependencies, we'll get to the environment stuff in a little bit, but we made sure from the outset of this pilot that we leveraged a package management system tailored to R projects called renv, R-E-N-V. That was a critical component to making sure that the package environment we used in our development of the application would also be the one used by the FDA reviewers, because the last thing we want is to have a transfer of this application and our reviewers not able to run it because of a mismatch in, say, Shiny itself, or maybe one of the dependency packages. You want to make sure that they're having as close to a bit-for-bit replication of that R execution environment as feasible. And I have a lot more to say about that in a little bit.
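(As a rough illustration of the module technique Eric describes: this is not the pilot two source, which is on the working group's GitHub, and every name here, km_ui, km_server, adtte, is hypothetical. A namespaced Shiny module wrapping a reactive Kaplan-Meier plot might look like this.)

```r
# Hypothetical sketch of a Shiny module with a reactive filter, in the
# spirit of the pilot two Kaplan-Meier module; not the actual pilot code.
library(shiny)
library(survival)

km_ui <- function(id) {
  ns <- NS(id)  # namespacing is what lets modules be reused side by side
  tagList(
    sliderInput(ns("age"), "Maximum age", min = 18, max = 90, value = 90),
    plotOutput(ns("km"))
  )
}

km_server <- function(id, adtte) {
  moduleServer(id, function(input, output, session) {
    # Reactive subsetting: the plot redraws whenever the slider moves.
    filtered <- reactive(subset(adtte, AGE <= input$age))
    output$km <- renderPlot({
      # In ADaM ADTTE, CNSR == 1 marks censoring, so the event is 1 - CNSR.
      fit <- survfit(Surv(AVAL, 1 - CNSR) ~ TRT01P, data = filtered())
      plot(fit, xlab = "Time", ylab = "Survival probability")
    })
  })
}

# ui     <- fluidPage(km_ui("km1"))
# server <- function(input, output, session) km_server("km1", adtte)
# shinyApp(ui, server)
```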
But also, as I mentioned at the outset, all the material was in the open: the development journey of this application and its code are all on GitHub, all the commits are there, all the pull requests are there. I was certainly not the only one to develop it, so we had a great collaboration, but all of that is viewable in our GitHub repo if you want to look at the history of it. But speaking of instructions, of getting people on board: the ADRG for this application was hugely critical, because to get all the environment stuff set up, we had to be pretty prescriptive in terms of the R version, how to get renv up and running, and so on, and we created a very comprehensive ADRG tailored to this application. I'll give you a quick preview of it here; this is all in the GitHub repo, of course. When we get to the sections about installation, this is actually where I spent a good chunk of my time developing, because we wanted to leave no stone unturned in making sure that our colleagues at the FDA had a clear understanding of the R version that we used, how to get packages compiled, how to get packages installed with renv and the pkglite unbundling, and then how to actually restore the application itself and execute it. Again, it's all in the GitHub repo, but we wanted to be very prescriptive here so that there was no ambiguity about how the application would be executed.
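(The flavor of those ADRG steps, as a hedged sketch: the authoritative, exact instructions are in the pilot two ADRG on GitHub, and the file and package names below are placeholders, not the pilot's.)

```r
# Hedged sketch of the environment-restore steps an ADRG like this
# prescribes; file and package names are placeholders, not the pilot's.

# 1. Unpack the app, submitted as a pkglite text bundle, into a package.
install.packages(c("pkglite", "renv"))
pkglite::unpack("r0pkglite.txt", output = ".")

# 2. Restore the locked package environment, so the reviewer runs the
#    same package versions the sponsor developed against.
renv::restore()

# 3. Launch the application as documented; golem-based apps typically
#    expose a run_app() entry point.
# pilotapp::run_app()
```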
Speaking of execution, let me take a little diversion here and show you another way that you could run this application. What you see here is RStudio, not on the same development environment that I built the application with; this is a virtual machine. In fact, this virtual machine is trying to replicate, as closely as possible within reason, the environment that we were told would be used by the FDA reviewers, which as of now is R installations on their Windows machines. This virtual machine was created with a utility called Quickemu because, I don't know about others, but on my development environment at work I'm on a macOS system, and on my other systems at home I'm on Linux; I actually don't run Windows anymore. But that was going to be a problem if I didn't at least investigate this, because, whether I like it or not, Windows is still a very important operating system for the FDA reviewers to be able to review submission results. So this virtual machine was created, and actually bootstrapped over and over again as we updated the ADRG with instructions and updated the application with new code changes and whatnot, to make sure that I could encounter as many as possible of the potential issues with bootstrapping this application before we sent it to our colleagues at the FDA. So the application is basically the same thing I just showed you in that overview; it's just executing on Windows instead. But that was the key, right? I wanted to make sure that I could replicate that environment, again as closely as within reason, so that we didn't have any surprises along the way. A special side shout-out, in the virtues of open source: what made that happen was the Quickemu project by Martin Wimpress, so shout out to him for making that. As I mentioned, this has been a highly collaborative process. We had a regular cadence along the whole development journey with Paul and Hye Soo to show them what we were working on, and some questions that we weren't able to figure out ourselves that we wanted their input on. And we would also deploy preview versions of this on an external host such as shinyapps.io, which, we were clear, was not the final version of the app. But it was a good way for them to preview things without going through the transfer process each and every time we made an update. That ended up being quite important, actually, because, Paul, I'm telling you, I didn't see Paul's slides before this, so we came up with this independently, but yes, with great power does indeed come great responsibility. As I showed you in the early part of the application demo, we had the filtering capability in that Kaplan-Meier visualization module. At first, we actually put it in all the modules. So on that, say, primary efficacy analysis, you could do the same thing with those modules, such as age sliders and demographics, clicking on gender and things like that. However, the analyses in those summary tables were based on pre-specified analyses in our analysis plan, which was submitted as part of this mock submission. And of course, if you are going to allow for that dynamic filtering, Shiny is going to react to it: Shiny is going to update the data and regenerate any inference results, any statistical model results, to use that updated data. So having those p-values, those inference results, those effect sizes change as the data is being filtered, that could lead to confusion and perhaps misinterpretation. And hence, what we agreed upon as we got to the successful submission was to allow filtering in the visualization module only, since that was not showing any inferential results, any statistical modeling results. Again, thanks to the iterative nature of this whole process, we were able to get in front of this before the actual transfer, but the discussions that have occurred from this have been very important going forward for future efforts, I would say, of using technologies like Shiny in this space. So, of course, you've all heard it by now, but we are happy to say that yes, as of late September, early October of this year, we have successfully transferred, and the review is complete: the R Consortium pilot two submission package is now official. All the materials, again, are on the GitHub repository, and it's a huge credit to the team; I'll get to the acknowledgements later, but certainly we are very appreciative of this very important milestone. Now, you may be wondering what's next. You've heard it described a little bit, but in terms of Shiny for submissions in our working group, we are looking at a new frontier of two technologies as part of this pilot four package. One is container technology. For those who aren't aware, containers are a way to encapsulate software alongside compute environments based on Linux, executed with a container runtime, and in this case we're trying a technology called Podman. That's pretty exciting, because it is trying to minimize the hurdle a bit more with the compute and reproducibility of the environment: not just the R side of things, but the operating system side of things. But what's interesting, and what's garnered maybe even more excitement across the community in general, is WebAssembly: basically, a way that we can take this Shiny app and bundle it in such a way that the user doesn't necessarily need a full-blown operating system environment; they just need a modern web browser to execute that application. We are in the early stages of this, but we are making tremendous strides in the development, and we hope to have findings early next year as we get this ready to transfer to the FDA for their evaluation.
But we have a link here to the R Submission Working Group pilot four site, if you want to get more information on that. So again, I know my time here is almost up. I just want to quickly acknowledge the pilot two team that worked with me along the way, and again Paul and Hye Soo from the FDA side, for their generous time and their ability to iterate with us and discuss this with us, in one of the most collaborative efforts I've been a part of externally. It's really been a terrific team effort. So I appreciate your time, I'm certainly looking forward to the rest of the webinar, and I hope you enjoyed hearing about the pilot two journey. I guess I'll pass it back to Ning. Sounds great, thank you so much, Eric. Now I'll be passing it to Hye Soo to share her review experience. And I saw a question about whether the slides or the recording will be shared: yes, the slides and recording will be shared. For the recording, there is an R Consortium YouTube channel, so you will be able to find the recording on YouTube after this event. So with that, I'll turn it to Hye Soo. Thank you, Ning. Um, is it presenting? Because it's not in presentation mode yet. Give me a second, my computer is getting slow. Let me swap it. How about now? Yeah, we still see the... oh, nice, now we see it. It's good now. Okay, awesome. Thank you. Give me a second. Okay. Good afternoon, everyone. My name is Hye Soo Cho, a statistician at the FDA. Today I will share my review experience with R-based submissions. Okay, this is the standard disclaimer. It got disconnected, let me... No problem. It seems like there was maybe a network issue just now, but now it's better. Yeah, we can see the disclaimer. Okay, let me start over. So this is the standard disclaimer: this presentation reflects my views, not the FDA's. Here's the outline of my talk. I will talk about what is important to FDA, my experience with R-based submissions, and findings, issues, and recommendations for those who consider an R-based submission in the future. So, FDA is responsible for protecting the public health by ensuring the safety, efficacy, quality, and security of drugs, vaccines and biological products, and medical devices. In the review process, there are many important things to consider, but for R-based submissions I summarize three important things based on my experience: conformance, reproducibility, and traceability. First is conformance. It is important to follow FDA guidance and meet requirements. When submission packages are standardized, it is easier for FDA to review the application and data. The Study Data Technical Conformance Guide provides specifications, recommendations, and general considerations on how to submit standardized study data using FDA-supported data standards, and CDISC standards are also required for some regulatory submissions to FDA. For electronic submissions, FDA requires the use of the electronic common technical document format, eCTD for short. eCTD is the standard format for submitting applications, amendments, supplements, and reports to FDA, CDER, and CBER. If the sponsor doesn't meet the requirements when they attempt to submit something, errors might occur; or, even if they get through the submission gateway, the reviewer might not be able to see or access the material the sponsor submitted.
So I wanted to highlight that: please make sure to look for the appropriate guidance depending on your application, and follow it, for a smooth submission and review process. Second, reproducibility. No matter who runs the code, when the code is run, or in what circumstances, we want consistent results when using the same data. Third, traceability. CDISC defines traceability as enabling the understanding of the data's lineage and/or the relationship between an element and its predecessor; we want to understand how the source data, known as SDTM, is converted to the analysis data, known as ADaM. So in R-based submissions, I wanted to highlight the importance of conformance, reproducibility, and traceability. Now I'm going to talk about R-based submissions. This is based on my limited experience, so it may not cover all cases. The DAI division at FDA has been working with the R Consortium R Submission Working Group for more than two years; we completed pilots one and two and are currently reviewing pilot three. The working group desired to have the first publicly available regulatory submission package using an open source language, so that anyone can refer to this submission for their future applications. I've stated the objective of each pilot here. Pilot one was to test the concept that an R-language-based submission package can meet the needs and expectations of the FDA, including assessing the code review and analysis reproducibility. It contains four simple analyses that were done using R. The sponsor created their own developed package and converted the package to text files using pkglite. It went through the FDA gateway without any issues, and I was able to run the submitted code and confirm the tables and figure. In pilot two, we tested whether an R Shiny application could be successfully incorporated into a submission package and transferred to the FDA. The Shiny app was built using the same source data sets and analyses as pilot one. And please note that the R Shiny application is supplementary material; it does not replace any of the R programs, tables, listings, or figures. We successfully received the electronic submission package, and installed and loaded the open source and sponsor-developed packages using the submission. However, for the purpose of this pilot, we recommended displaying all the tables in a static format and applying filters and dynamic display only for the Kaplan-Meier plot; I will explain more on this in a later slide. We also emphasized that only pre-specified subgroup analyses should be performed. In pilot three, the objective is to retest pilot one with ADaM data sets generated using R: the working group used R to create the ADaM data sets again from the SDTM and then redid pilot one. So we are currently evaluating the ADaM data sets generated using R and comparing the analysis results using the original ADaM data sets and the generated ADaM data sets. For clinical trial submissions, we still see that most submissions are SAS-based. There have been a few SAS-and-R hybrid submissions, but I haven't seen a submission that was fully in R. And in these unfamiliar circumstances, there have been some challenges in reproducing, like replicating, the sponsor's computational environment. I'm now going to share my findings and issues. We received questions like how FDA controls R versions and RStudio, and one of the issues is indeed related to R versions and R packages.
So I looked up how frequently R versions have been updated on CRAN: since April 2020, there have been 14 updates, and in 2023 there were four updates. With this frequent updating, you may run into a problem installing a package in an older version of R. A sponsor might use the most recent version of R whereas the FDA reviewer might use an older version, which may present challenges; this actually occurred during the pilot review. Also, when I switched R versions in RStudio, it didn't work for me, so the version of RStudio should be compatible with the R version. This is a potential issue because, in the future, if reviewers evaluate multiple R-based submissions simultaneously and sponsors use different R versions than the reviewer, they may face challenges. We are currently working to find a solution, but at this moment we don't know what the best solution is. Similarly, managing R package versions is another issue. Some packages are dependent on specific versions of other packages, and updating a package may make changes to package dependencies and how your code runs. In the pilots, and in many of my application reviews, the renv package has been used to make sure the FDA reviewer can restore the same reproducible environment and manage package dependencies. However, I noticed some challenges in using renv. First, some reviewers are not familiar with renv, so it takes some amount of learning to understand what's going on. Second, installation time can be very long, because all packages and dependent packages are reinstalled. Third, when an issue comes up, it's hard to figure out what the issue is. So using renv might not be ideal. The next issue is different environments. In a review, restoring the same environment is critical; however, the different operating systems the sponsor and FDA use may cause some differences. In the pilot project, the sponsor created the materials on Linux while FDA reviewed them on Windows, and we noticed the sponsor's file names had been slightly changed once FDA received them in pilot three. We are not sure of the exact reason, but the different operating systems might be the cause, because for file names, one is case-sensitive and the other is not, and Linux and Windows may treat characters like spaces, hyphens, and underscores differently. So we had to change file names and file paths to match what the sponsor submitted. That sounds minor, but if reviewers have to change all the file names and paths in each script for every submission, it could be cumbersome. And depending on the environment, warning messages may appear differently: even though the sponsor did not encounter any issues while preparing the submission, the reviewer may see errors or warning messages when running the sponsor's code. So in pilot two, we asked the sponsor to provide a list of potential warning messages that may be expected to occur and could impact reproducibility. Also, when running a sponsor's code, there are cases where the reviewer has to select options to proceed; for example, when running renv for the first time, or switching from one version to another, questions with several options pop up, and if there is no clear instruction on what to choose to replicate the same environment, it may result in a different setup. The next topic is flexibility. R offers great flexibility. One of the minor discrepancies in pilot one was that FDA used an approximation to compute the confidence intervals, whereas the sponsor used the precise quantile function. So there are multiple ways to calculate the same value; a small generic illustration follows below.
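(As a toy example, mine and not the actual pilot analysis: two perfectly reasonable quantile choices give slightly different 95% confidence intervals for a mean.)

```r
# Toy illustration, not the pilot code: two defensible ways to build a
# 95% confidence interval for a mean that do not agree exactly.
x  <- c(5.1, 4.8, 6.2, 5.5, 4.9, 5.8, 6.0, 5.3)
m  <- mean(x)
se <- sd(x) / sqrt(length(x))

m + c(-1, 1) * qnorm(0.975) * se                   # normal approximation
m + c(-1, 1) * qt(0.975, df = length(x) - 1) * se  # exact t quantile
```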
And there are diverse packages that do the same thing. R and SAS have different default settings for functions and for representing missing values. And lastly, the sponsor created a Shiny app in pilot two, and Shiny lets the user generate interactive data analyses by applying filters and displaying the generated output. In the initial pilot two review, all the tables and the figure in the app were connected, so when I applied a filter in the figure, the p-values in the efficacy tables were also affected, changing depending on the subgroup. These interactive tools may be useful in an exploratory analysis; however, there are concerns regarding the use of such tools for regulatory submissions, because they can be inappropriately used for selecting subgroups to produce statistically significant results even when there is no real effect, which is known as p-hacking and cherry-picking. So, as I said earlier, we emphasized that only pre-specified subgroup analyses should be performed for efficacy analyses in a regulatory submission, and we recommended displaying all tables in a static format and applying the interactive features only to the Kaplan-Meier plot. Now I'd like to share my recommendations for future R-based submissions. First, if you're considering conducting a clinical trial in R, let FDA know at the design stage. SAS has been the dominant language for almost all regulatory submissions, and we've been trained in and worked with SAS for a long time, so an R-based submission is something new to us as well. If R will be used for a submission, let us know ahead of time so that we can prepare for it. Second, use CRAN or a curated repository for sourcing packages. Validation of R packages was out of scope for this pilot review, and we expect the sponsor to use validated packages for robust analysis. Third, use standard packages and minimize dependency on sponsor-developed packages. We understand sponsors develop their own packages for comprehensive and effective data analysis, and they may be more sophisticated and more convenient for you, but the reviewer may have challenges restoring the same environment. If sponsors use packages that the reviewer can also access, the review process might be smoother. And lastly, provide thorough documentation and detailed comments. Providing step-by-step instructions on how to restore the environment, as well as how to run the code itself, would be helpful. Provide information about the versions of R, RStudio, and each package. If you're using simulation, specify the seed for reproducibility in the documentation. And clarify default values or reference values for functions: what is obvious to you may not be obvious to others.
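(Those last recommendations are easy to act on; a small hedged sketch, my own example rather than anything from the pilots:)

```r
# Small sketch of the documentation recommendations (my example, not from
# the pilots): pin the randomness, then record the environment.
set.seed(20231219)                       # fixed seed so results reproduce
sim_means <- replicate(1000, mean(rnorm(50)))

sessionInfo()        # documents R version, OS, and loaded package versions
# renv::snapshot()   # writes exact package versions to renv.lock
```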
So this concludes my presentation. I'd like to thank everyone who has helped me with this presentation and the R-based submission reviews. Thank you for listening; I'll turn it over to Ning. Thank you so much, and thanks again, Paul and Eric, really good presentations. I see there are already many good questions coming in from the audience; please keep the questions coming, and I will try to bundle them by similar topics. So for the next 15 minutes or so, I want to spend some time really diving into some of the questions and talking points you already mentioned. I guess I'll start with a reflection question, looking back over the last year or two, since the last time we had this adoption series meeting in 2022: what has changed in your organization? Do you see more people adopting R? Has your organization explored new use cases, and so on? Yeah, from my side, I was doing a reflection before this event. Definitely, I feel like last time, when we talked about R for submission, it still sounded like a hobby project or side project for people, but right now, R-based submission is becoming a strategy for Roche; actually, we're preparing our first R-based submission right now. And I definitely see the talent shift, as Paul mentioned. Last year, we had multiple openings, and we had a technical interview for every candidate, where we asked the candidate to select SAS versus R; I think over 90% of the candidates chose R over SAS. It was really, not shocking, but kind of interesting to see. So I would love to hear more from Paul, Hye Soo, and Eric: what is your reflection, now a year after our last R adoption event? Yeah, I can take the first part of this. I would say there's a lot of momentum around this. I think there were a lot of misconceptions in the early days, before these efforts, and a lot of that is going by the wayside; just the innovations that we can achieve, going back to what we've done with pilot two, but even in R itself, the ability to create these richer displays of statistical results, again within reason, within, you know, safe controls to avoid the issues that Hye Soo mentioned earlier. We are seeing a lot of momentum, at least on our side; we're certainly growing in terms of where it's being implemented in the pipeline. We have multiple pilots going on right now that are using R to generate the statistical results, and we have new pilots ongoing for data set generation. We certainly hope to get someday to an R-based submission, but certainly the foundation is being laid, so to speak, and we're hopefully going to have some positive momentum in the coming years. Yeah, I mean, it's difficult to grasp the overall trend at the FDA because it's a huge organization, so I don't see a notable increase in the use of open source tools in regulatory submissions. But even before this pilot, some reviewers, including myself, were familiar with R and used R regularly to review applications and conduct research. We also have been deploying some R Shiny applications as internal tools to perform various analyses and facilitate the workflow among clinicians and other disciplines. So, for example, if verifying the sample size is required for each application, one of the statisticians can create a Shiny application that does the sample size calculations automatically once the data is loaded, so that we don't have to do the same thing over and over from scratch. So we use R regularly for our work, but I don't see a notable increase in open source tools in regulatory review. Maybe Paul can add more. It's been a slow process. I think industry and regulators both tend to be somewhat conservative, so it takes time. Well, I think there's an evolution taking place; I'd say the changes are evolutionary rather than revolutionary so far. Thanks for sharing. And from the audience, I see there are also mentions of several other open source languages, such as Python and Julia, and I'm curious, maybe starting with Eric: do you see uptake of those languages in your organization? Yeah.
I'm certainly seeing it in the fields Paul mentioned, the machine learning space and the artificial intelligence space. We definitely have many data scientists at our organization using Python quite heavily to generate insights, and also leveraging these kinds of techniques on huge amounts of biomarker data, digital biomarker data. I would say in the routine day-to-day clinical analyses, R is still very much a key player on the open source technology side. But the key part for us is that we are trying to modernize our procedures so that they are language-agnostic in terms of deliverable generation and how we're developing tools, so that we pick the best framework and tool for the job. But certainly we are seeing a lot of momentum in the AI and ML space with languages like Python. Yeah, I'm not sure about FDA; I don't see any. I think some people are familiar with Python and are using it for machine learning and big data analysis. I'm not familiar with Python myself, I don't have knowledge of Python, but I think one of the reasons that R has been used for regulatory submissions is CRAN: CRAN is the central software repository, whereas Python does not have such a repository. So that might pose some challenges. Yeah, I can comment that Python has, what's, oh my goodness, I knew that was going to happen, Python has the PyPI package registry. But unlike CRAN, there isn't, I would say, well, again, I might be getting in trouble here for saying it: CRAN has the R Foundation, and the CRAN maintainers are behind the approval process for packages that enter CRAN itself. So it's not just a set of formal checks that you can automate at the R level; you also have a human curation aspect to CRAN. I know that's been an attractive feature for many people in life sciences, differentiating it from other ecosystems that are admittedly a bit more open, though sometimes that can be a detriment. Yeah, I think, go ahead, Paul. Oh, I was going to add, yes, some of our colleagues who are more on the data scientist side of the house are more of our Python users. We still see R as primary in our area, statistics, but clinical pharmacology also uses R to some extent, somewhat more extensively. One of the questions that appeared in the list was Julia, and I originally had a slide that just had Julia, blank, and a question mark, because we haven't seen anything. It's approved, it can be used; we just haven't seen anyone actually use it. Thank you. And I guess this discussion about CRAN and package repositories is a very good segue to a very hot topic in the Q&A: validation. It sounds like people want to get more clarity on what validation means, et cetera. I know for our pilot submissions, validation wasn't in scope for us, but we did have quite extensive discussions within the working group about what validation means. And I guess my take is that there won't be a universal stamp saying this package is validated and that one is not; it's up to interpretation by organizations and individuals. So maybe to phrase the question a little differently: can you share your thoughts on how to establish trust in a package, and how you determine whether a package is reliable when you are doing an analysis, or when you are reviewing one? At a minimum, we want to see something that's fit for purpose.
I think that's sort of the overall idea that we want. Our colleagues in CDRH actually use the term software assurance, more so than validation, and in some ways that provides a little better description: we want to be assured that the software is fit for purpose. So the statistical software clarifying statement says there should be some testing regimes going on, that folks have looked at this and considered whether the packages they're using are giving the correct results. But I also want to acknowledge that there are two other R Consortium working groups who have worked to address this issue: the R Validation Hub, which is an ongoing effort, and the R Repositories group. So our group doesn't have to solve everything; there are other active efforts ongoing to address these issues. And then I'll turn it over to my colleagues. One of the ways that I verify R packages is by doing the same thing in different software. For example, since I'm using SAS and R, I use both SAS and R for my review, and once I get the same result in SAS and in R, that's one way to verify the result and trust the result. That might not be the most efficient way to verify it, but it's one of the ways. Yeah, and I can take a little bit different perspective. Let's say, for example, for the pilot two application that we created: why would I choose the certain packages that I chose for it? Well, there are a few reasons. There are, again, many packages that may do a similar thing, but I look at the overall quality of how a package was built. Obviously, with it being R, you're going to find the package source somewhere. Certainly I look at things like the documentation inside the package; I look at what kind of unit tests they have, so they can get in front of regressions; and you also take a look at its dependency footprint, seeing what other packages rely on it. There, I can say: you know what, I don't think I have to reverse-engineer, reverse-test a package like dplyr myself. But if it's a more niche package that is maybe doing one thing, and it's only from one manuscript, with no GitHub repo, no pkgdown site or whatnot, I'm going to at least take a closer look at it. Maybe it does indeed do what I want it to do, but I might have a clearer focus on honing in and doing that verification myself, versus taking a bit of a risk-based approach for packages that we know are being used day to day in the analytical data processing pipelines that many in the R community are running. But I do acknowledge there are varied opinions across the spectrum on this, and my hope is that we just take an intelligent and pragmatic approach as we try to answer some of these questions.
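(That risk-based approach has tooling behind it; as a hedged illustration, not something the panel showed, the R Validation Hub's riskmetric package scores exactly the kinds of evidence Eric lists.)

```r
# Hedged illustration, not shown by the panel: the R Validation Hub's
# riskmetric package gathers evidence (documentation, unit test coverage,
# download counts, ...) and condenses it into risk scores.
library(riskmetric)

pkg_ref("dplyr") |>
  pkg_assess() |>  # collect the evidence for this package
  pkg_score()      # summarize it into numeric risk metrics
```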
We're also working on some solutions for how to submit such things. As part of phase three, we talked about a so-called phase 3b process where we were looking at whether you can submit a package as a zip file, for example. But as much as possible, we would prefer that things be on CRAN or a similar curated repository. That's the general answer, because we don't always know what sponsors do with their own particular packages, and I'll leave it at that. Thanks, I think that's really clear guidance: open source it early. Cool, and maybe to shift gears a little, I see there are a lot of questions in the Q&A about the usage of Shiny, good practices for Shiny, and how to prevent cherry-picking and p-hacking. All three of you mentioned that in your presentations, and I know we also had a lot of discussion in working group meetings, so I wonder whether you want to share more insight on that. Interestingly, I saw one question asking: how do we deal with p-hacking when we are using SAS? So to me, p-hacking is not specific to any technology. By using R and Shiny we want to make the decision-making process faster, but not by enabling the exploration of everything. You mentioned that pre-specification is still important, so do you want to share more guidance or good practice on the usage of interactive tools? Yeah, I'll take a stab at it. Basically, p-values correspond to hypothesis testing in inferential statistics, and by its very nature inferential statistics is not really an interactive setup. So when we make tools interactive and include hypothesis testing, if we're not careful we can be tacitly encouraging p-hacking. You just adjust all the sliders and knobs and buttons until you get that p-value less than 0.05, and if you torture the data long enough, you will get something significant, as the unofficial motto many have goes. So in some sense our interactive tools can make that manipulation process more efficient. That doesn't mean it's brand new: it can occur with SAS, it can occur with any package. But the whole idea is that we want to avoid what is sometimes called data-driven analysis, where we have some ulterior motive that we're looking for. In that sense, that's why we like the graphical versions better than the tabular ones: you can see how the confidence bands become wider, for example, as we decrease the number of subjects. Maybe I can just echo what Paul said. I think using interactive features and dynamic visualizations for exploratory analysis seems very helpful; someone mentioned signal detection in safety data, and it could be helpful there. However, we have concerns about their use for efficacy analyses and for analyses that have not been pre-specified in protocols, so that people don't use interactive features inappropriately for p-hacking. One of the questions was: how does SAS prevent p-hacking? I think Paul and I meant the interactive features of Shiny applications, not a comparison of SAS and R. I just wanted to say: maybe not for efficacy analyses and not for non-pre-specified analyses. Eric, anything to add here? Yeah, Paul took a lot of my points already, but I will say that with the interactive features we can build with Shiny in this space, I see opportunities to make the process of linking results together a lot easier than with traditional static outputs.
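Paul's confidence-band point can be made concrete in a couple of lines of R: the half-width of a 95% confidence interval for a mean grows as the number of subjects shrinks (a standard deviation of 1 is assumed purely for illustration):

    n <- c(200, 100, 50, 25)            # decreasing numbers of subjects
    round(qnorm(0.975) / sqrt(n), 2)    # 95% CI half-width with sd = 1
    #> [1] 0.14 0.20 0.28 0.39          # the bands widen as n drops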
With Shiny, you might get a summary result for a treatment group, want to dive into it further and look at a subgroup, and then want to look at specific patient profiles and individual listings. This is where I think Shiny can bring an opportunity to the reviewing experience: not p-hacking, but linking all this information together succinctly to get to insights for decisions more easily. I think that's where some great opportunities arise for sure. I totally agree. Having Shiny in a submission doesn't mean a kind of freestyle Shiny where you can run arbitrary sub-population analyses. To your point, oftentimes in a CSR we have hundreds of pages of tables for ten endpoints and countless sensitivity analyses, and I definitely feel a more user-friendly interface is where Shiny could be really useful. And Eric, maybe a follow-up question: I saw a question from the audience asking how to validate a Shiny tool, or what good software release practice for Shiny looks like. Do you want to share a little more on that? Yeah, absolutely. Under the hood, a Shiny app is indeed just R code, but with it being an application, there is a different development philosophy if you want to architect it effectively. I think it's about following best practices that have actually been touched on in recent talks by Daniel Sabanés Bové, your colleague at Roche, on research software engineering best practices. What I mean by that is: make good use of version control to have a traceable history of your updates; keep clear documentation of what your app is intended to do, whether you call that specifications or user stories or whatnot; and really have a strategy for what a production release means for your particular application, while still being able to prototype new features and fixes in a development branch, a development deployment if you will, so you can iterate without compromising the safe, established version of the application. And then also make sure your business logic is effectively created as R functions and not entangled in the Shiny code itself, because you have a much easier time testing business logic with what you already know about R than when it's all tangled into one big app.R file, as sketched below. We've all been there; I started that way, but I've mended my ways a bit. So the package structure, or other frameworks such as Rhino, are trying to get you into that engineering mindset, so your business logic is safely encapsulated apart from the application-specific code that ties it all together. It does take some getting used to, but I think our pilot applications are a good demonstration of those practices in action. Totally, 100% agree. I don't think it's anything different from the good practice of writing R code in general. Cool. And I saw several comments about R versioning and also about differences between Windows and Linux. I found one very interesting comment pointing out that no matter how good the ADRG is, if we ask FDA reviewers to install a sponsor-specific R version and R package environment and such, that sounds like a lot of work for the reviewers. And, as Hisu mentioned, you oftentimes review multiple submissions at the same time.
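A minimal sketch of the separation Eric describes; the function, the CHG column, and filtered_data() are hypothetical stand-ins, not code from the pilot:

    # R/mean_change.R: business logic as a plain, testable R function
    mean_change <- function(adlb) {
      stopifnot(is.data.frame(adlb), "CHG" %in% names(adlb))
      mean(adlb$CHG, na.rm = TRUE)
    }

    # tests/testthat/test-mean_change.R: the logic is tested without
    # ever launching the app
    library(testthat)
    test_that("missing values are dropped", {
      expect_equal(mean_change(data.frame(CHG = c(2, 4, NA))), 3)
    })

    # app.R: the server only wires reactive inputs to that function;
    # filtered_data() stands in for a reactive defined elsewhere
    library(shiny)
    server <- function(input, output, session) {
      output$chg <- renderText(mean_change(filtered_data()))
    }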
And if different sponsors use different R versions and different packages, that can create a real hassle for you in switching between computing environments. I don't feel this has been discussed much before, so we want to hear from Paul, Hisu, and Eric: how can we make life easier for FDA reviewers? It already sounds like a lot of work on the reviewer's side to follow every single sponsor's ADRG. One of the recommendations I put in my presentation is to use standard packages rather than sponsor-developed packages, so that we can access what the sponsor accessed; I think that's the key. And if I have a problem replicating the sponsor's environment, one way I resolve it is to just use my current versions of RStudio and the packages, without the sponsor-developed packages or the package versions the sponsor specified, and do my analysis independently. If it works, that verifies the result. If not, I usually talk to the review team, draft an information request (IR), and communicate with the sponsor. But my fundamental recommendation is to use standard packages and minimize dependence on sponsor-developed packages. I would concur with Hisu that robust programming can be the best way forward. We know folks want to use the latest and greatest, and sometimes that bleeding edge means that a month or two from now we may not be able to run that package or that particular program anymore. I actually had to go back and look at R code that I wrote a dozen years ago, and the code that was written robustly worked like a charm and didn't need many updates. That comparison is not entirely fair, since it was only a couple hundred lines of code, but basing things on robust, established versions rather than the latest and greatest, particularly development versions, avoids unnecessary fragility in the end product. Yeah, those are great points. I'll just mention that I wish we were in a more optimal situation where all of you on the FDA side had a more standard R environment to review these on, but we're not there yet. So what can we do with what we have, right? I think it's a combination of keeping up to date with the technology we have to mimic the reviewer's environment as best we can from the sponsor side, and being as transparent as possible about the instructions to get over those hurdles, like renv for package restoration or the R version installation. I hope we can come to a promising effort maybe next year, based on what we're doing in Pilot 4, to make that piece of execution environment reproducibility a bit easier. I think we have a long way to go, but I am keeping in tune with some of the latest technologies that can hopefully help with this down the road. For now, though, transparency, and making the steps needed to replicate the environment as clear as possible, is probably the best we're going to have in the short term. Maybe I can share one of my review experiences. I've seen a sponsor provide a link to a Microsoft MRAN snapshot; I think they used the MRAN snapshot, though I'm not sure.
But that's one way to replicate the sponsor's package environment: they provided code and brief instructions on how to replicate the environment and the R packages they used. However, when I tried to run it, it didn't work. So I googled it to study what an MRAN snapshot is, and I realized that Microsoft has no longer been maintaining the MRAN Time Machine snapshots since June 2023. That changed only recently, so I'm sure the link must have worked for the sponsor while they were preparing the application, but after the submission the link didn't work. Open source languages tend to change faster, and package and version updates happen more frequently, which I think sometimes makes restoring the environment harder. Yeah, it was unfortunate that they shut down that service earlier this year, or last year. There are some other efforts to replicate it, and I know Posit's own Package Manager service is trying to do some of this. But I agree that going forward we need to keep a close eye on this, whether it's renv together with something like that MRAN-like service, or another type of service, to ensure package reproducibility, because it's probably one of the biggest issues we'll encounter in these submissions. And maybe a follow-up question on R versions. I saw a new question asking whether there are any restrictions on the R version; for example, what is the oldest version considered acceptable? And maybe as an expansion of this question to Paul and Hisu: my understanding is that right now within FDA there is no recommendation on which R version to use; basically, different reviewers can choose whatever version they prefer. Do you foresee, as Eric alluded to, that in the future there may be a shared server, and you might have recommended R versions released once in a while? I don't have an answer at this moment; maybe you have an idea, Paul? If you get two statisticians together, you might get three opinions sometimes. I don't think there will be a resolution in terms of a finalized R version. Different folks have different needs for different projects, and I think that's where we are now; that tends to drive things. We can go back some way in time, but, for example, I don't know whether we would even be allowed to install some of the older versions of R, just because of security issues that have since been addressed and resolved. So part of it is the entire ecosystem. One issue we don't typically worry about too much is security, but it's something we have to live with; we must abide by the security regulations. So that's one consideration in trying to install older versions. And I realize I may have overstepped; I didn't mean to step on Eric's toes with the Pilot 4 WebAssembly work. I think there's some really neat stuff going on there. I do need to distinguish between the development area and the regulatory submission area: for regulatory submissions, we'd like things to be well tested at this stage; for development, we want to encourage innovation, and then it's a matter of bringing that innovation into the production regime. Yeah, certainly. This has been a hot topic in my circles as well, especially in the working group, as we think about how to make the review experience easier going forward.
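To make the restoration discussion concrete, here is a minimal sketch of the renv-based instructions an ADRG might give, assuming the sponsor shipped an renv.lock file; the snapshot URL and date are illustrative, not from any actual submission:

    # Point R at a frozen, dated snapshot of CRAN so package versions
    # resolve the same way after submission as they did before it.
    options(repos = c(CRAN = "https://packagemanager.posit.co/cran/2023-06-01"))

    install.packages("renv")
    renv::restore(lockfile = "renv.lock")  # reinstall the exact recorded versions
    sessionInfo()                          # confirm the R version and packages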
I hope that in the long term we'll have a seamless way to have a robust development environment that pushes the envelope a bit, but within safeguards and controls, and then, when it's time to submit, be ready. I'm kind of spoiled by a lot of the modern package tooling, like CI/CD frameworks, where someone like Posit or another organization can quickly check: hey, is this update to dplyr going to break its existing tests, and also break any packages that depend on dplyr? There are lots of principles in the modern development tool stack, and I'd like us to bridge some of that gap into this execution environment for submissions. The Pilot 2 virtual machine saga that I went on was a very manual, somewhat brute-force approach to it, but maybe there are ways we can automate bits of that down the road. I don't know, but we'll see what the future holds. Eric, in terms of pilot number four, I saw earlier there was a question about Podman versus Docker. Do you want to share a bit more on that? Yeah, I'll let Paul chime in after this, but Docker in itself does have some licensing concerns for certain organizations, and we were hopeful, in the spirit of the pilot itself, that we could leverage as much open source technology alongside R as possible. Podman, for those who aren't aware, is an open source project through and through; it's been sponsored by companies like Red Hat, but it's entirely open source. And there are organizations that like the Podman framework a bit better than Docker for some security and technical reasons as well. But certainly the licensing was a part of it. I don't know, Paul, do you have anything to add to that decision? Yeah, the licensing, and what security would allow us to actually install, basically rendered Docker not truly feasible in some ways; our approved version is obsolete, for example. So there were issues, and then the subscription model changed, which greatly affected the ability of our folks to run things locally. I think that, for us, basically made it not feasible any longer. Podman is one avenue, and I'll go ahead and say it: I'm really excited by what Eric's doing with WebAssembly. I think that has some real potential. We are as well. That's why we're taking our time and making sure we have it all vetted appropriately, because we acknowledge we're pushing the envelope quite a bit on that pilot. But the gains we could get from this, whether it's just for Shiny or for other parts of an open source submission, could be substantial. I mean, the momentum we've seen ever since Posit released Shiny for Python, and alongside it their Shinylive service, where Shiny apps run in somebody's web browser without a Shiny Server or Posit Connect server behind the scenes. That got everybody's attention, and then R got the same support with webR recently. As Hisu mentioned on a number of topics, this is a very fast-moving space. Just in the last couple of months, since I talked to Paul and Hisu about the advancements in webR, it's gotten even better. So we're glad we're going to use this early part of 2024 to see just how far it's really come, to hopefully minimize the burden of running the application but also pave the way to sharing these more effectively.
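For reference, the Shinylive workflow Eric describes is available as an R package; a minimal sketch, with placeholder directory names:

    install.packages("shinylive")

    # Convert a regular Shiny app directory into a static site that runs
    # entirely in the browser via WebAssembly (webR), no Shiny Server needed.
    shinylive::export(appdir = "myapp", destdir = "site")

    # Preview the exported site locally before sharing it.
    httpuv::runStaticServer("site")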
But yes, there is a ton of excitement in the life sciences space around this. Yeah, I totally agree. I sometimes feel that pharma is behind on technology and that we're always following what the tech companies are doing, but for Shiny and WebAssembly I feel we are definitely among the ones shaping the technology landscape, which is really, really exciting. Cool. Maybe another question on Shiny as well. I saw one person ask: what about a sponsor hosting a Shiny app on their own server and having FDA reviewers access that server to do the review? I guess earlier, Paul and Hisu, you shared in a working group meeting that that could have potential security concerns, right? Yes. Basically, my understanding is that that type of arrangement is essentially not permissible under our current security rules. One of the initial concerns we had with this type of arrangement is that, while most companies are not going to abuse the leverage they might gain by doing that, do you always trust your marketing folks as much as you trust your statistical colleagues? I could foresee several ways in which an unscrupulous actor could make use of that information. Part of it is that the circle of trust gets extended a lot further if a sponsor is running the data, running the hardware, and running the software. I know that setting things up and just exchanging data itself has some issues. You can ask: well, do you really trust the data? Not entirely on its own, but we do have data checks; there are processes. The entire system to some extent is built on trust, but we also build in sources of verification of that data: risk-based monitoring and centralized statistical monitoring on the one hand, and other aspects all along. So the concern I would have with someone running a query on someone else's data on someone else's server gets down to that level of trust: how can we be sure that it's open, transparent, and reproducible? Thank you for sharing. Switching gears a little, I saw another interesting question about medical devices. You mentioned a bit earlier that there is software assurance guidance from CDRH. Maybe the broader question is: what is the collaboration between CDER and CDRH? I know here we are mostly talking about drug product submissions, but I feel many of the challenges are similar across CDER and CDRH, so I wonder whether you want to share any insight there. They're governed by a different set of rules. For example, R-based submissions have already occurred in CDRH, and the laws are different; the standards are different; what applies can be different for devices than in the drug world. It's not that they're less rigorous; it's a different type of issue. Our trials in drugs, for example, tend to be much larger than those in devices, and as a result we have much more involved data standards. The device world is something very different. Most of what we see, and most of the guidances, apply to CDER and CBER, the Center for Biologics Evaluation and Research. I would say go to the appropriate guidances that CDRH has developed. We do share some things: a lot of digital health technology has been developed for the device world, and we're picking up on how those technologies are going to be used in the drug development space.
That's also a very exciting field to watch right now. Cool. I think we have five minutes left, so I will go to my last question. You've already shared a lot on this, but I just want to ask: what are some final words of advice to teams who are considering an R-based or open-source-language-based submission in the future? I can start. There's obviously a lot of positive momentum in this space. Leverage the collaborative nature we've been seeing in these foundational efforts; you can have that same atmosphere in your organization as well. Many of you may be transitioning from another language whose culture is not traditionally thought of as collaborative in terms of sharing new innovations or knowledge. With that said, keep a keen pulse on what's available in the package community and leverage best practices along the way as you bootstrap some of these processes; it has to start somewhere. Don't try to do everything at once; that can just slow you down immensely. Find a specific area where there's a need that R can fulfill. For a lot of us that's visualization, interactivity, Shiny, and the like, but it could be many areas. I would say start with a fit-for-purpose project that could potentially lead to a submission down the road; that will build up your confidence, and hopefully your organization's workforce will not be intimidated by this but will instead embrace it as a new frontier that can bring real strides in efficiency, automation, and, frankly, cutting-edge analytics in this space. It's a great time to start; just don't try to do everything at once. I will just repeat what I said during my presentation: if you're planning to use any language other than SAS, let FDA know ahead of time so that we can provide feedback or answer any questions you may have. Also, I recommend using standard R packages from CRAN that anyone can access, which will make restoring the environment much easier, and providing detailed information about the versions of RStudio and each package, as well as instructions on how to restore the environment, along with the code itself. Provide clear and concise comments in the code as well. We're learning by doing, so I'm excited to contribute to more R-based submissions, and thank you all for listening today. And I would echo both of my colleagues, Hisu especially: reach out early; make that a key point. I think Eric had a good point: use the right tool for the right circumstance. We have had R used for visualization purposes; in fact, we've had labels made by FDA reviewers using R graphics that have been approved. So in some sense, at one level, R is already being used. But as we look at other things, and as the area evolves, maybe this is also what Eric was saying, or at least I'll interpret it as what he was saying: it's good to make evolutionary changes rather than revolutionary changes. You don't have to do everything all at once; you can do a bit at a time, take lessons learned, and incorporate them into your process. Thank you very much; very insightful. I really enjoyed the conversation today. And thank you to all the audience; at the peak we had over 400 attendees today. Thanks for the engagement; there were a lot of very good questions. The recording and the slides will be available online.
So please check back if you want to rewatch it or look at the slides. All right, thanks everyone, and have a good week ahead. Thank you so much. Thank you; it's a pleasure to be here. Bye, everyone. Bye.