All right, thanks very much, Stefan, for the kind invitation to speak to you here at R/Medicine today. I'm going to be talking about a project that took place in the throes of the pandemic: how we solved the problem, and how we have continued to use the same rapidly developed software to improve our workflows in virology here at St. Paul's Hospital in Vancouver.

So this is the backstory of how we got into this Shiny development project in 2020. In the summertime particularly, there was a worldwide shortage of reagents for SARS-CoV-2 testing on the high-throughput automated platforms. The most notable fully automated platform at that time was the Roche Cobas 6800. It was really the only end-to-end platform where you could load a sample and walk away. We happened to already have one of those instruments when the pandemic started, for HIV testing, some of which we do for the province of British Columbia. That meant we could rapidly pivot and offer SARS-CoV-2 PCR testing on it. The problem was that a lot of other people had the same idea, and the device is very easy to use. So there came to be a shortage of the reagent, provincially, nationally, and internationally, and suddenly we were faced with having no reagent at all. It's like having a car with no gasoline.

We did have some more manual methods, and at that time the positivity rate for SARS-CoV-2 in the general population was low, and by low I mean less than 5%. That allowed the strategy of sample pooling as a way of stretching the reagent supply and increasing throughput. So let me give you a little bit of background on sample pooling. The simplest protocol for sample pooling is called Dorfman pooling; it was originally developed to stretch resources for serological testing. The idea is quite simple.
If you have a low positivity rate for something you're testing for in a population, and you mix samples together and the mix tests negative, then you can impute that all of the individual samples are negative. This works particularly well with PCR because the signal is amplified exponentially in the PCR process. It wouldn't work as well in other contexts, where you pay quite a significant penalty in sensitivity, that is, in limit of detection, but you don't pay much of a penalty in COVID testing.

One thing I will say, though: whenever you're doing sample handling on the deck (and by deck I mean a liquid-handling robot), whenever you're handling samples with potentially very high viral loads, you run the risk of contaminating neighbouring samples. So pooling does introduce the possibility of pool contamination, but that is mitigated by the fact that you have to test the individual samples when you get a positive pool. When the sample positivity rate is low, this pooling strategy is great, because you run one well and get four results out of it if that well tests negative.

Now, Dorfman pooling is the simplest kind of pooling, and there are lots of other, more sophisticated strategies, with a large collection of papers you can read. Someone notable in this area is a gentleman named Chris Bilder, B-I-L-D-E-R, so you can look that up. But the problem we had is that we had a four-week conception-to-deployment timeline, and if you work in a clinical lab you'll realize that people really don't like black boxes. They didn't want a pooling strategy that wasn't abundantly transparent to them, so that if they had to troubleshoot at a clinical level they would be able to do so. Complex pooling algorithms would also have required a lot of custom automation on the liquid handler.
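The arithmetic behind Dorfman pooling is worth making concrete. Here is a small sketch in R (my illustration, not the lab's production code) of the expected reagent cost under the scheme just described:

```r
# Dorfman pooling: expected tests consumed per sample for pool size n
# and sample positivity rate p. One pooled test covers n samples, and
# with probability 1 - (1 - p)^n the pool is positive, triggering an
# individual retest of all n members.
tests_per_sample <- function(n, p) 1 / n + (1 - (1 - p)^n)

# Results obtained per test used is the reciprocal.
results_per_test <- function(n, p) 1 / tests_per_sample(n, p)

# At 1% positivity, a 4-plex pool yields about 3.5 results per test,
# versus 1 result per test for singleton testing.
round(results_per_test(4, 0.01), 2)  # 3.46
```

The gain erodes as the positivity rate rises, because positive pools force full retests; that is exactly the behaviour we ran into later in the pandemic.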
The liquid handler would have to get the pooled results back and then automatically go and select the individual samples for retesting on a singleton basis, because that's what you do when you get a positive pool: you test all of the samples in that pool individually. And that wasn't going to fly on a number of levels. The techs were already quite agitated that I was proposing to do sample pooling, because they imagined themselves running around gathering specimens one by one and hand-mixing them and things like that. So one of the development principles was that I had to make their lives easier in every way I could.

The other thing that induces fear in technologists is related, and this is an example of a more sophisticated pooling strategy. You mix 12 samples together; if you get a positive 12-plex pool, you break it down into three 4-plex pools, and if you get a positive in one of those, you test individually. The techs would see something like this and have a moment of horror. So we didn't want to do anything except bare-bones Dorfman pooling, and we also wanted to minimize the number of times pipette tips would cross the liquid-handling deck.

So what do you actually stand to gain when you do sample pooling? Well, it all depends on your sample positivity rate. What this graph tells you, for a given positivity rate in the population, is the number of results you're going to get per test used; by test used I mean per squirt of reagents for that device. So, for example, if we're at a 1% positivity rate for SARS-CoV-2 in our testing population, you can see that the optimal throughput is achieved when you mix 10 samples together. But as I said, we were around 3-5% positivity, so it seemed best to us to pick around a 4-plex pool, although we did program 4-plex, 6-plex, 8-plex and 10-plex because we didn't know which way the pandemic would go.
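Under the simple Dorfman model, the throughput curve on the slide can be reproduced in a few lines of base R (a sketch, not the original plotting code; the exact optimum shifts by one or so depending on how retests are counted):

```r
# Results per test used, for pool size n and positivity p.
results_per_test <- function(n, p) n / (1 + n * (1 - (1 - p)^n))

pool_sizes <- 2:20

# At 1% positivity the optimum sits around a 10-plex pool.
curve_1pct <- results_per_test(pool_sizes, 0.01)
best_1pct  <- pool_sizes[which.max(curve_1pct)]

# At 4% positivity the optimum drops to around 5- or 6-plex, so a
# 4-plex pool is a reasonable, conservative choice at 3-5%.
curve_4pct <- results_per_test(pool_sizes, 0.04)
best_4pct  <- pool_sizes[which.max(curve_4pct)]
```

A call like `plot(pool_sizes, curve_1pct, type = "b")` reproduces the general shape of the graph on the slide.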
And of course, if the positivity rate gets too high, you're only buying about one and a half results for every squirt of reagent, so pooling is not so great once you get into high positivity rates. Now, I know there are aficionados out there who will notice that I've made a PowerPoint presentation and used base R plotting, and I know that in some minds that's gauche.

So this is an outline of the analytical process. We have a Hamilton Vantage liquid handler, which was our main high-throughput device. It had a barcode scanner, so it scanned all the barcodes on the individual samples, which were in 75 × 12 mm plastic tubes, having been decanted from the original NP swab tubes. Then we used an R script to take the output of the Hamilton Vantage and convert it into XML compatible with the MP96 extraction device. Note that this is not the Roche Cobas 6800; this is a more manual workflow using a dedicated chemical extractor and a dedicated thermocycler. So we would generate an XML directly from the Vantage, and we also took into account failed aspirations of the sample, because the Vantage liquid handler has clot detection. If there were globs of mucus in the sample, which turned out not to be an infrequent thing because you'd get snot in the sample, the liquid handler would detect that as a clot and not pipette it. And that's a good thing, because if it continued to pick it up, it could drag a big strand of mucus right across the deck and contaminate other specimens. But if the sample was not properly aspirated, we had to remove it from the pooling process and from the tracking of specimens through the system.

For file transfer we just used SCP, that is, secure copy, and I ran into a tool called inotifywait, which does something like what Dropbox does: inotifywait sits and watches a folder and then acts when a file is dropped into it.
So inotifywait would look for the MP96 XML file and move it to a shared folder on network-attached storage, which I'll talk about in a moment. The chemical extraction would occur, and then the samples and files would go over to the thermocycler for amplification, then through an R script for some more post-processing, then to a Shiny app hosted on Ubuntu, then to middleware called Data Innovations, owned by Sunquest, and then out to the lab information system and the electronic health record. So that's the general data flow.

This is the actual network diagram of what we did. Roche did not want us directly connecting to the computers that control their devices, the extractor and the thermocycler. So they put a NAS and a hardware firewall in front of their devices, but they did allow us to have one IP address connect to the NAS, and that is our pooling server, which was Ubuntu with Shiny Server on top. These are the devices that were connected: three liquid handlers, three chemical extractors, three thermocyclers, and the network-attached storage. Roche Diagnostics had remote access into their devices as well, and then the patient data would go out to the middleware and finally to the lab information system.

Obviously we had to validate the pipetting process, not just the data flow: we had to make sure that our pooling software on the liquid handler was correct, so we made lots of coloured solutions to make sure things were pipetting into the right places. That's a low-tech validation. Then we did a higher-tech validation, where we take the four different container IDs for the four different specimens and make a concatenated ID from all of them. Unfortunately, the Roche software cut off the concatenated IDs: we needed 50-some-odd characters and it cut them off at 25, so we had to do some workarounds for that.
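The talk doesn't spell out the workaround for the 25-character limit, but one simple approach is to hand the instrument a short pool key and keep the full concatenated ID in your own lookup table. Everything in this R sketch (the key format, the example IDs) is hypothetical:

```r
# Hypothetical workaround for an instrument that truncates IDs at 25
# characters: send a short pool key, keep the full membership ourselves.
make_pool_record <- function(container_ids, pool_index) {
  list(
    key     = sprintf("POOL%04d", pool_index),        # 8 chars, fits easily
    full_id = paste(container_ids, collapse = "-"),   # the 50-odd-char ID
    members = container_ids
  )
}

pool <- make_pool_record(
  c("R1234567890123", "R2234567890123",
    "R3234567890123", "R4234567890123"), 1)

nchar(pool$full_id)  # 59: would be cut off at 25 by the instrument
nchar(pool$key)      # 8: safe to send
```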
There were many workarounds, because we were asking for things to happen that aren't usually supposed to happen. This let us compare our pool result to the expected pool result based on the individual patient samples. In the validation process we tested each sample individually and then pooled them, so we knew what we were supposed to get, and we made sure that we got it. But as I say, because you're doing so much pipetting, you do run the risk of contaminating individual specimens. And yes, this is a spreadsheet, but it is OpenOffice.

So this is what the interface looks like. You have to select the run you want to process; you select that from the thermocycler output on the network-attached storage device, which as it turns out is just a little Linux server. What we're seeing here is that yellow specimens are positive: this is their Ct value, and there's the concatenated container ID. Those are the four unique identifiers actually associated with this pool, and those specimens are the ones that will have to be individually tested. We racked them in such a way that they would correspond to the well map in a 12-by-8 format, which I'll show you in a moment. Then, when it's time to send the results out to the lab information system, only the negative results go; the yellow results are held back, and that's held back by the software.

So we made these little well maps that we could review, and here we have a positive in well A5. We go down to A5 and say: these are the specimens we need to pull, and they're going to be in position A5 of four different racks of specimens. The techs would pull those and just submit them for individual analysis. And then this is what goes out to the lab information system: it's just a flat file.
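The hold-back logic and the rack lookup are both easy to sketch. This toy R example (synthetic results, not the production app) releases only negative pools and lists where to find the members of a positive one:

```r
# Synthetic pooled-run results, one row per pool well on the plate map.
run <- data.frame(
  well   = c("A1", "A5", "B3"),
  ct     = c(NA, 27.4, NA),   # NA = no amplification detected
  result = c("negative", "positive", "negative"),
  stringsAsFactors = FALSE
)

# Only negative pool results are released to the LIS; positive pools
# are held back until their members are retested individually.
released <- run[run$result == "negative", ]
held     <- run[run$result == "positive", ]

# A positive 4-plex pool in well A5 means pulling position A5 from
# each of the four source racks.
pull_list <- paste0("rack ", 1:4, ", position ", held$well[1])
pull_list[1]  # "rack 1, position A5"
```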
It gets dropped into a shared folder that's networked across heterogeneous systems through Samba, and then, once that transfer has occurred, every five minutes the lab information system slurps up whatever is in that folder and passes it over to the patient record.

We also wanted some statistics about what we were up to, and what you'll see here, if you're paying attention to the figure, is that there's more here than just COVID. People were so happy with not having to do any manual transcription and not having to do anything with USB sticks that they wanted all the testing associated with this lab pushed through this app, and that's what happened subsequently. In the time since deployment, we have had about 600,000 results go through. In earlier times those would have been manually transcribed, which may seem horrifying to you, but this was a low-volume, highly specialized virology lab, and at that time that was not a serious problem. When COVID came, it became untenable.

Some key points. As I mentioned, primary tubes often contain fragments of the swab; actually, they always contain a fragment of the swab, and they always contain a little bit of mucus, and sometimes that's a problem, because the liquid handler does not like it. So we were having to do what we call a pour-over, or pour-off, where we aspirate the sample into a secondary tube and leave the mucus and the swab tip in the primary collection tube that was used at the time of patient collection. Some folks have turned off their liquid handler's clot detection in order to avoid all these errors, but we didn't think that was a very good idea, because it would mean a sample might not be adequately aspirated, and, as I said, the handler can drag mucus all over the deck.
We did consider doing pooling on the Roche 6800, but we didn't have any reagents, so it didn't make sense. It is possible to do: you can connect to the back end of the Roche 6800, pull out the XML result file, and manipulate it, and we set up the pathway to do that, but we didn't actually use it.

So what were the tools of note? Well, we used Shiny, of course. We used ShinyProxy, which is a way of deploying multiple instances of individual Shiny servers inside Docker containers so that more than one person can use the app at the same time, and it allows you to have authentication if you need it. We also used Docker Swarm, so we actually have two Linux servers that can replicate one another's behaviour if one of them goes down. All of the software, of course, was stored on GitLab, and, as I mentioned, the networking was done with Samba.

Unfortunately, the positivity rate started to climb in November of 2020, to the point where we couldn't pool. We were really happy with pooling, but quite sad that there came a point where it wasn't feasible anymore. So one afternoon we decided the best thing to do would be to try to filter out specimens that were likely to be positive, and the best predictor of positivity we could find initially was just where they came from: their postal code, or their zip code; excuse me, we say postal code in Canada. So we wrote this app called Tanya's Little Helper (Tanya is the charge tech up in virology, and you might recognize Santa's Little Helper there), and we would scan specimens and say whether they could be pooled or not. This led to an effective positivity rate in the pooled samples that diverged very nicely from the whole-population positivity rate, so it worked well as a way of extending pooling, but eventually, by January 2021, we could no longer do any pooling at all.
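For reference, a ShinyProxy deployment like the one just described is driven by an `application.yml`. This fragment is a minimal sketch with hypothetical app and image names, not our actual configuration:

```yaml
proxy:
  port: 8080
  authentication: none        # auth could be enabled here if needed
  container-backend: docker
  specs:
  - id: pooling-app           # hypothetical app id
    display-name: Virology pooling app
    container-image: ourregistry/pooling-app:latest   # hypothetical image
    container-cmd: ["R", "-e", "shiny::runApp('/srv/shiny/pooling')"]
```

Each browser session gets its own container, which is what lets more than one person use the app at the same time.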
So we came up with the idea that perhaps we could do better than just using postal code, and that led us to try to build a machine-learning app using other features in the patient record to estimate the likelihood of positivity. We tried a number of machine-learning methods but settled on random forest, because it worked very well and it was easy. We were able to decrease the effective positivity rate of the samples targeted for pooling by about half, but unfortunately the sample positivity rate after January 2021 remained up in the 20% range, so even cutting the effective rate in half wouldn't solve our problem.

So, at the start we had many individualized processes: paper workflows, USB sticks, manual keying-in of results. In the end we have no manual pipetting except that initial decanting event, that is, the pour-off event. We have automated pooling and singleton data workflows for all of the results coming out of the virology lab. We have ten instruments and two Linux servers backing one another up, and we have instantaneous lookup of prior results and of the associated raw data files. A number of benefits have come from this. We established a pathway for Shiny app deployment, and we've written other Shiny apps in the intervening period: to monitor our QC, to look at our live patient turnaround times, and, obviously, to do COVID dashboards. We've also managed to establish a pathway for AI and machine-learning model development inside our lab, which was not something that was likely to happen without a strong push from a practical project.

I'd like to give acknowledgments to the people who helped me make this happen. The big ones are Mahdi Mabini, who was the developer of the front end (I did only the back-end programming), and Grace Vandergoogten, who did all of the liquid-handling programming. And with that I will stop and see if there's any time for questions. Thank you very much for your attention.
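To make the routing idea concrete without fabricating our actual model, here is a base-R sketch on synthetic data. Note the swap: the lab settled on random forest, but this uses logistic regression purely so the example runs with no extra packages, and every feature here is a made-up stand-in for the patient-record fields we actually used:

```r
# Synthetic cohort: features are stand-ins for patient-record fields.
set.seed(42)
n <- 2000
region_rate <- runif(n, 0.01, 0.30)  # stand-in for postal-code positivity
symptomatic <- rbinom(n, 1, 0.2)

# Made-up generating process: risk rises with both features.
p_true   <- plogis(-3 + 8 * region_rate + 1.5 * symptomatic)
positive <- rbinom(n, 1, p_true)

# The lab used random forest; logistic regression stands in here.
fit  <- glm(positive ~ region_rate + symptomatic, family = binomial)
risk <- predict(fit, type = "response")

# Route the low-risk half into pools: the pooled stream's effective
# positivity falls below the whole-population rate.
poolable <- risk < median(risk)
c(pooled = mean(positive[poolable]), overall = mean(positive))
```

The design point is that the classifier doesn't have to be very good; it only has to skim off the higher-risk specimens so the pooled stream stays below the pooling-feasibility threshold a bit longer.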
Yeah, this is very impressive. I had one question, which you answered quite nicely: whether there was any kind of validation done.

Oh, lots, yeah. And it was all in five weeks, though. The pressure was really on, Stefan. I've never felt so much pressure to get something done clinically in my life.

So, being a laboratorian myself, I notice it's very, very unusual to get so much custom development done. What kind of support or structure do you need in your department to actually pull this off? And maybe as a follow-up question: what do you need to pull it off and keep it sustainable?

Well, you've really hit on it, Stefan, and we've talked about this before: you write software and then you become the de facto support person for that software. For this particular piece, we kept the features very, very tight and specific, so it's me and one other person, and functionally it's one other person. The software has been running uninterrupted for the entire duration. We did add the functionality of having the other viruses go through it. But I would just say that if Mahdi leaves, I'm probably going to have to contract him back if I want modifications, or I'm going to have to do them myself. We don't really have an infrastructure for this. Theoretically there's an infrastructure, but then they would have to crawl into the code with the same kind of pain and suffering that I would. So, you know, there are risks associated with in-house development. I will say that Roche has made a software solution for this, but it was not Health Canada or FDA cleared at the time we needed to deploy, so it wasn't an option for us. Although they did give us a pat on the back for recapitulating the core functionality of their software in four weeks.