Yes. Okay. So thanks everyone for joining today. First up, just a quick intro to try and link some of these different initiatives together. We will have a submissions-focused event coming up in our R adoption series. I think last time I mentioned this, but we didn't have a date; now it looks like July 13 is going to be the date for that. And if you follow the link, or go to the R Consortium web page and click on webinars, which is in one of the tabs at the top, you'll get there. It still says something along the lines that it's coming soon — the details aren't there yet. But you will also notice a revamp of that web page, so if you want to have a look at some of the previous webinars in the R adoption series, they're nicely organized now. It's easy to get to that historical content, so that will appear soon. Nothing much has changed other than the date, so keep that date in your diary. We should have a very interesting session there; we've got some confirmed speakers from the FDA, but that will be confirmed in the details of the webinar. Okay. So, we have Johannes and Julian. We finally get to Julian, after you've organized all of these other presenters from everywhere else for this series. So just the two talks today, and then we've got the Q&A with the presenters. The times won't naturally add up to the full hour — probably just under, I expect. Having said that, no doubt there'll be tons of questions and we'll run for the full hour. But we are one talk fewer than we've had previously, so hopefully there's plenty of time for everyone. Okay, I haven't seen — it's Johannes, am I pronouncing the name correctly? I haven't seen him yet. Okay. Then it might be even shorter, but let's do a quick switch-around. Julian, are you able to go first? Given that you'll be presenting, I will try and find Johannes in the background. Yes.
And you should have his details in your contacts, right? Yeah, right. So I'll do that, and I'll just stop sharing my video now. I hope you can all see my screen right now. Yes. That's useful. And you can hear me as well, obviously, so that's very useful too. So, combining these two things: thanks a lot for dialing in to the third part of the case studies series. I'm presenting how we implement a risk-based assessment of R packages at Merck KGaA — or EMD Serono, as it's called in the US. There are some collaborators who helped implement this in the company, and one is also signed in today, so thanks, Stefan, for coming despite vacation time. There's an abstract that was submitted for R/Pharma. So, using this slide again, just as a reminder, this is why we are doing this — why we are doing all this validation: establishing the documented evidence which provides a high degree of assurance that a specific process consistently produces a product meeting its predetermined specifications and quality attributes. That comes from the FDA. This particular talk focuses on the accuracy of R packages, and how we implement, or how we interpret, what validation means as appropriate in our company. So basically it is focused here on R, for obvious reasons. The idea is to keep the framework general enough to also be able to generalize it to other programming languages — Python, and SAS probably not yet in practice. The very general framework is something similar to what we have seen in previous discussions: we differentiate packages, or we try to find a process that classifies packages, into three levels of confidence. The first level is the core CRAN packages: the base packages and the recommended packages. As we all know and have discussed quite extensively, these have minimal risk, as per the documentation that the R Consortium has provided.
Then we have the Merck add-on standard packages, where we say there is enough documented evidence that we can trust these packages as standard packages. We recommend them to be used for our standard analyses, and we assume, in quotes, good enough quality to produce the results that we expect to see. And then we have all the other packages — which doesn't mean that we can't use them, but more user input would be required to ensure the proper quality of the analysis output. So we are trying to differentiate the huge group of contributed packages into two groups: the Merck add-on standard packages, and other packages. Having said that, there's an algorithm, mostly automated, that decides whether a package goes into level two or stays in level three as a general package. After the installation qualification succeeds, with the execution of available tests, the idea is to make the package directly available to the end user — so we are basically in this upper part. If not, we obviously try to resolve the issues, and we directly have it available in level three. That's the immediate step, so we don't have to wait until the entire process is executed; we try to make the time from request until availability as short as possible. Then, after availability, the user can use it, and the assessment of whether the package meets the higher standard is executed. That has two dimensions: one is the test coverage, and the second is the risk metric score. Having a test coverage over 50, because more test coverage is better, and having a risk metric score below 50, because less risk is better — if that is fulfilled, the package pretty much automatically gets promoted to level two, so we have high confidence of package accuracy.
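The two-dimensional promotion rule just described could be sketched roughly as follows. This is a minimal illustration only — the function name, return values, and exact comparison operators are assumptions for the sketch, not Merck's actual implementation:

```python
# Hypothetical sketch of the level-2 promotion rule described in the talk:
# auto-promote when test coverage > 50% AND risk metric score < 50.
COVERAGE_THRESHOLD = 50.0  # percent; more coverage is better
RISK_THRESHOLD = 50.0      # riskmetric-style score; less risk is better

def assign_level(test_coverage: float, risk_score: float) -> int:
    """Return 2 (auto-promoted standard package) or 3 (explicit review needed)."""
    if test_coverage > COVERAGE_THRESHOLD and risk_score < RISK_THRESHOLD:
        return 2  # high confidence of package accuracy
    return 3      # stays level 3 unless a manual risk assessment promotes it
```

A package failing either dimension stays at level three, mirroring the talk's point that it remains available to users but needs the explicit, more manual risk assessment to be promoted.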
Otherwise, there is a more manual route — an explicit risk assessment that the package can go through in order to still be promoted to level two — so we're trying to keep the process as smooth as possible. Looking into the risk metric score: the test coverage comes in again at 50%, then 15% for software development practices, such as maintaining a public code base or a NEWS file, and 15% for bug resolution status. Then there are downloads and community usage, usability metrics, documentation health, and more. You see that all these components basically come from the riskmetric package provided by the R Validation Hub. We also know that these risk metric components are not independent metrics from a statistical point of view — meaning a good package tends to have good results on all of these things. Together they create the overall score that helps us understand whether a package is robust or not. Having said that, I'm a trained statistician, so I couldn't help myself but actually use statistical classification methods to find a good cutoff — what is a good cutoff to create a robust score? The statistical approach is an ROC analysis. That basically tries to find the optimal threshold for package classification, given the continuous risk score, and that was found to satisfy the sensitivity and specificity requirements at 50. So this number does not just fall from the sky; it is actually a data-driven number, and in future developments it can obviously be revisited and adapted. We used training data of 61 manually evaluated packages and obtained an overall accuracy of 77% — so that's not too bad. And at the same time, we also have that second dimension of test coverage, and all together we could get a classification specificity of 88.5%. So we are mostly automating it, and this still gives us a little bit of confidence that we are doing a decent enough job.
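The ROC-style cutoff selection could be sketched like this. It's a toy illustration with made-up data (not the 61-package training set), and it assumes Youden's J statistic — one common criterion for picking the threshold that balances sensitivity and specificity; the talk does not specify which criterion was actually used:

```python
# Toy illustration: choose a risk-score cutoff via ROC analysis.
# labels: 1 = manually judged "risky", 0 = "robust"; scores: continuous risk scores.
def best_cutoff(scores, labels):
    """Pick the threshold maximising Youden's J = sensitivity + specificity - 1."""
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        # classify a package as "risky" when its score >= t
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        j = sens + spec - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t

scores = [12, 35, 48, 52, 60, 75, 20, 55, 80, 45]
labels = [0,  0,  0,  1,  1,  1,  0,  1,  1,  0]
print(best_cutoff(scores, labels))  # prints 52 for this toy data
```

The point is simply that the 50 cutoff can be derived from labelled training data rather than chosen by hand, which is what makes it a data-driven number.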
Having said that, obviously for each statistical model that you develop, you want some trust beyond the training data. So as the next step, we are going to have a test step to verify that the empirical evaluation holds for an independent test set. And basically that's why we are here: we are very happy to seek feedback, because we'd rather have the feedback earlier, from the community, than later, when we are speaking with agencies. Thanks a lot. Yeah, let me switch my video back on. So, I'm just having a look at the attendee list. I think we are still waiting, potentially — I haven't had an email response yet — so let's go with the Q&A and keep things moving. No pressure, but now all the Q&A is on specifically what you're doing. I'll start things off with a question; if anyone else has questions, please post in the chat. My question is around the two thresholds. We get this all the time with things like riskmetric, but to what extent do you think that's open to abuse by package authors, and is that a realistic concern? I mean, I think it's more like a motivation for package authors to actually get their packages to a higher quality. So you ask me now as a package author: I think it's a good motivation — if you have a package that was developed years and years ago, when different standards were good to go, I think it's good to re-brush those packages with some of the things that were not standard back then. So I see it as a motivation. And how much criminal energy do you need in order to game those thresholds, when we are still going and re-evaluating them? I'd still say that GxP environments are still sort of separated from other things, and I think it's very unlikely that something gets through — and what would be the motivation to try to sneak a package through? That's something that I don't really know.
Maybe, maybe if we look at it from a less sinister perspective: what about if someone has a package — there's a popular package that I don't want to name, so I'm not going to drop the author in it — but let's say you have a package with a lot of miscellaneous functions in it that do lots of different things. You're not necessarily intending to use all the functions. You have an idea of what you want to use that package for. It has 50, 60, 70% coverage, but that might not cover any of the things that you're using. Is there a concern about what is being tested in the package? So, we wouldn't go down to actually check, in the automated way, that a specific function we are interested in is tested. That could be something that gets looked at in an explicit assessment. Or take the other way around: the package has 30% coverage, I just want to use this one particular function, and that one is covered — you could catch that in the explicit assessment and still be able to use it as a user. Yes, I absolutely hear your concern about a package whose coverage may not include the function in question, but at the same time, when we are doing the analysis, we also expect all users to be qualified — that's the FDA requirement as well. So blindly using analysis tools is probably a bad idea in any case. We are just doing a risk assessment; we cannot fully avoid all risk, at any time. Thank you. Joe, you have put a question in the chat. Joe, since you've got video on, do you want to ask the question directly? Yes — do you expect any kind of thrashing for some packages, like passing criteria, then failing, and then passing again, especially since you apparently have a semi-automated system? So, picking up on what I heard from other presentations — last time we had the question: if some package doesn't pass,
how do you deal with it? And other companies have been saying: okay, whoever has requested the package — are they willing to write additional tests to make it pass? And then you get a sense of how much the person actually needs the package. Oftentimes people just withdrew the request. That's what I heard — I don't remember who it was, and I don't want to paraphrase wrongly — but I think that's probably a reasonable way of going about it. So, other people who have been on the line from last time — I see familiar names — please feel free to speak up and join in, because there are some new interesting questions, to make it more interesting for everyone as well. Yeah, that's a good point actually: if other people who have previously presented want to chip in, absolutely, please do. Alice, I know you weren't able to join last time. And Alice, you've asked a question, but before you, Steven got in a question. So Steven's asking about the number of downloads and how you get that — that comes from riskmetric. But I guess you can extend that question and say: well, riskmetric gives you a number of downloads, and from that it generates a score. I suppose this is similar to my last question around the testing, but — for everybody who's got their faces online — when you get a single score, how much do you think you can rely on that single score? Because it's just saying, here's a high number of downloads; but if you have a very niche statistical package, it's going to have a low number of downloads, for example, and that doesn't mean it's any less good. I guess what I'm trying to get at is the context of the number of downloads, or some of those other metrics, like how long a package has been around for.
I mean, how long a package has been around for is probably always going to be a positive thing, but you could weigh that against tests, for example: something that has fewer tests but has been around for 20 years is probably still okay. But if you're just relying on a score, you're not able to pick out those nuances. Is that a problem — has it been a problem, and do you see it as a problem? Yeah, I mean, that is one of the considerations that we had in mind when we were actually developing our own processes at Merck. The one at Merck doesn't actually rely directly on the number of downloads — it takes it into consideration, but it's not weighted as in the riskmetric approach. So that's one way. But again, I think the next series, where we actually get to ask the FDA questions about quality as well as what risks they deem important in the assessment, would be a good insight into how we can modify our approaches as well. It was raised by someone at the FDA, in one of our early evaluation meetings, that the number of downloads could be, like, the most important metric, something along those lines — surely if something has been downloaded a lot of times, do we need much more than that? So yeah, I thought that was very interesting. Obviously that comment, by the way, was put in the context of "I do not speak for the FDA", as is normally the case in these things. Alice, you asked a question — do you want to ask it yourself, as you're on camera as well? Yeah — so basically, Julian, I was asking: how do you take into account cases where the use case of the package, or the way it's intended to be used, isn't a high-risk situation — such as it's just doing formatting, or very simple data manipulation — where it's very easy to find if there is something happening that's incorrect?
So you can catch those. Versus if you have a situation like that, but the package itself is assessed using riskmetric to actually be fairly risky — like it doesn't think the test coverage is high enough, or it hasn't closed enough bugs in the last 30 days, or something like that — how do you balance those two? So basically, a package that wouldn't go into the standard library — the standard add-on packages — would still be made available, and that means the user would have to put more work in. And that obviously depends on the task that you're trying to accomplish. There's this whole quality assurance SOP that asks: how complex is the analysis, what is the purpose of the analysis — and that tries to balance out how much quality work I need to do for my code. And I think just setting up the risk assessment for R packages in that particular context, and aligning it with that, makes a lot of sense. Pritam, do you want to ask your general question? This is to all the authors and presenters who are here. We've been trying to get the term "trusted" accepted by our QA teams, and so the general question we had is: what are the actual qualification requirements that would designate a particular organization, or a vendor or author of some packages, as trusted, so that they can be qualified without further user testing? The quickest example I can mention is that we were trying to qualify Stan, as an organization, as a trusted vendor. We were having so many difficulties with respect to how we actually quantify it in terms of, you know, quantifiable requirements — how do you actually make someone a trusted source? Oh, Joe, are you waving — do you want to come in on that question? You feel strongly about this? So, trust, just like trust anywhere, is something that's earned over time. I think you have to treat them like everybody else in the beginning, and then see how their packages pass over time.
I mean, you might start with a prior and say that it's trusted, and that would be fine — so Bayesian thinking is good here, speaking of Stan. But then you've got to build trust over time, and I think that's the strongest track record. All right, so I'm going to let — oh, you have a hand up, actually, you go. Yeah — trust is built over time, and I like the Bayesian thinking of it, but you can obviously help the trust building by publishing: Stan has, for instance, published a little bit of insight about their software development cycle, and I think that really helps to establish, or accelerate, the trust building — let's put it like that. Yeah, I mean, for certain closed-source software that we might use in this industry, we're not necessarily all going in and auditing, so that trust is built up over time in most cases as well. And in fact, in the white paper we said something very much along those lines: over time, anything can become trusted. If you pass the same package for five years in a row, or the same five packages all by the same author or company, and you're always passing their stuff, and they're always scoring really highly — whether you're using riskmetric and test coverage or another mechanism within your company — if you're always finding that they're fine, then over time, yeah, you build up that trust. We're also seeing that some companies are asking: okay, is this something we can say today? Like the R Foundation: we haven't evaluated the R Foundation over a long period of time, but they've existed for a long period of time, they've produced R over a long period of time, and they have a software development lifecycle. RStudio and others — Stan, you've just mentioned — the one thing they all have in common is the software development lifecycle. And we've talked about that as a committee: maybe that's a really important document.
I guess then showing that you actually adhere to that cycle is the next most important thing, because that's essentially what happens if you're getting audited: here's what my quality process is, and then you have to show in the audit that you adhere to that process. So, first of all, someone can have a look at that and make sure they're happy with the process; secondly, you build up some evidence that someone follows that SDLC. Alice, I'd love you to come back, because I know you've written up one of these kinds of documents internally at GSK. Yeah — I mean, you basically spoke to what I was going to say. You can work to identify what their SDLC is and basically do an audit — there we go — of the company, or of, say, Stan, for example, or the package you were talking about, and show that they have consistently done this over a number of years, and basically write your documentation to that, without actually performing your own audit — like going to that company or going to that author — but trying to provide evidence to show: they said this is the SDLC they follow, here's evidence to show that they actually do this, and this is why I trust it. In combination with the other things that you look at for package assessment as well: looking at downloads, looking at how consistently they test their packages, whether you have concerns about them getting dropped off CRAN, or anything like that. So basically, creating a body of evidence. I don't think there's any one thing that you should worry about in particular; it's creating that combined set of evidence to show that you can trust it. So, just to chime in on that.
I feel like, by its nature, the trusted-source assessment is less tied to specific packages and specific versions. Having that history is really important to understand the trajectory, and you feel like your trust in them has stood the test of time and is going to continue; you don't have to worry so much about what's going to happen with the specifics of the packages and the versions and all that. Any thoughts on how, once someone becomes trusted through these trusted assessments, there is any way for someone to become untrusted? I guess you can get married and you can get divorced — sorry for the analogy — but what mechanisms are companies putting in place to monitor this? Will people look back in five years' time and say, okay, do we still trust this vendor? Or is it: once trusted, trusted forever? Are we too early to decide? I mean, this is one of the questions that was raised by our QA team when we were doing this audit, and they did suggest periodic assessments that could be put in place, so that all the vendors, or at least the trusted sources, can be reassessed periodically, on a timeframe that is suitable to all — it could be two years or three years — but something that can be put in place so that when an actual regulatory audit occurs, you have the evidence to back up your claims as well. So, if "trusted" was, let's say, part of the risk assessment criteria — it's trusted or not — it could be built on keeping track of how they've been trusted over time. You could build a lag in there. So what happens is you've got sort of a time series: have you been trusted over the last five packages, over the last ten assessments? And that would automatically give a mechanism for, if not disqualifying a person or a company from being trusted, then at least flagging you to look into what's going on.
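The rolling-window idea just mentioned — trust conditional on the last N assessments — might look something like this hypothetical sketch. The window size and pass-rate threshold are invented for illustration, not any company's actual policy:

```python
from collections import deque

class VendorTrust:
    """Track pass/fail outcomes of a vendor's recent package assessments.

    Trust is conditional on recent history: if the pass rate over the last
    `window` assessments drops below `min_pass_rate`, flag the vendor
    for review rather than automatically revoking trust.
    """
    def __init__(self, window: int = 10, min_pass_rate: float = 0.8):
        self.history = deque(maxlen=window)  # oldest outcomes roll off
        self.min_pass_rate = min_pass_rate

    def record(self, passed: bool) -> None:
        self.history.append(passed)

    def flagged(self) -> bool:
        if not self.history:
            return False  # no evidence yet; start from the prior
        return sum(self.history) / len(self.history) < self.min_pass_rate

v = VendorTrust(window=5)
for outcome in [True, True, False, False, False]:
    v.record(outcome)
print(v.flagged())  # 2/5 = 0.4 < 0.8, so prints True
```

The `deque` with `maxlen` gives the lag automatically: old assessments fall out of the window, so trust is always conditional on recent history, as suggested in the discussion.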
So, I guess what I'm trying to say here is that the trusted flag is conditional on past history, going back some time. And I think the system could easily accommodate conditional metrics. Okay. Are there any other questions in the chat? That's as many questions as we were going to ask for Julian's talk alone. Johannes has confirmed that, for personal reasons, he won't be able to join us. So, what I would propose is: if there are any other questions, it's a great opportunity to ask anything — as you said earlier, Julian — of any author, sorry, any person who's presented thus far. It's actually quite a nice open debate at the moment around some of these topics. If not, I'll just take up maybe five more minutes of your time to round off a couple of things and talk about what we're doing next in terms of the write-up of some of these, and the plans moving forward. Julian. I just want to also invite everyone to continue the discussion on GitHub. If anything comes up, I think it's a good place to continue, and also to learn from each other — to use that space, and continue using that space, and upload your presentations or files into the case studies. I think it's very valuable, and it also helps to get feedback before you get the feedback at the one time when you don't need it anymore — when you don't want it anymore, let's put it like that. You always want feedback, I suppose. Yes. A good opportunity for you to waterproof your frameworks. Thank you for everyone's contribution. Thank you. And that's — I would have taken three minutes to say the same thing, and you were much quicker, so you shaved off some time for everybody.
So yeah, please do have a look at the GitHub. Those of you who haven't written anything up yet — I know you have to get permission within companies and so on — but it'd be great to see some more examples on there. A couple of the ones that are up there already are fantastic, really nice documents to refer to, especially for those people who will be joining this, or watching this video back, and saying: right, how do I get started, how do I talk to my own QA teams? Having that information there, as well as these videos, obviously, is fantastic — so thanks for posting it into the chat as well. And thank you for coordinating these. This was originally going to be one session; it's turned into three because we had so many people step forward, so that's been fantastic — or four or five, depending on how many more people come forward. So, for now we'll pause. It'd be great to get some more input on the GitHub side. Otherwise, thank you everybody, and we will be back in touch in the near future with the next follow-up. And make sure you're on the mailing list, so that you get all of the blog posts and other updates from the group as well. Okay, thanks everyone — enjoy the 20 minutes back in your day. Thank you. Bye.