Hi everyone, happy new year 2023. New year, new location, in this case only because Building 10 had some flooding. But still, it's kind of convenient and nice to come home and just walk downstairs and see everyone, so it's a real pleasure to see you all, and today to host the first NHGRI seminar series talk for 2023. It's a real pleasure to have Blake visiting today, both as a scientist and really, as I mentioned, as someone teaching us. A lot of the conversations that I hope we'll have with Blake are really about the intersection of doing amazing science with amazing people. But even more than that, Blake is thinking about how to encourage scientific curiosity and grow that amongst early learners, doing a Montana wild virus hunt for high school students and even younger students, going into the classroom, and taking a lot of pride in his own scientific efforts but as much, really, in the achievements of the people who work with him and that sense of scientific community that we're building.

So when I asked Blake what he wanted me to highlight on his illustrious CV, he did mention that I should talk about the undergrads, the grad students, and the postdocs who have won awards, from Rhodes Scholarships to K99s. We were fortunate to have Andrew Santiago-Frangos give one of the NIH talks earlier this year, and he has now gone on to win a K99. So Blake has really a wonderful pedigree himself, but has done it with his own sense of integration and style. He originally did his bachelor's and PhD at Montana State University, and he describes it that he was fortunate to be recruited back there as junior faculty, but I think they feel that they were fortunate to have recruited him. He started working on CRISPR, and the work that he's going to talk about today really dates back a decade, even to when he was an HHMI LSRF fellow at Berkeley with Jennifer Doudna, and over the last 10 years he has just been doing really amazing, illustrious work in this field
with CRISPRs and editing and looking at microbial communities, which has been recognized with a number of awards, including the PECASE award from President Obama and the NIGMS Director's Early Career Scientific Award. So with that, I am happy to bring Blake up. Right, thank you. So Blake, welcome to NHGRI, NIH.

Thanks a lot. Thanks, William, thanks for the really kind introduction, very much appreciated, and thanks, Amanda, for the nomination, and anybody else who was involved in the selection committee. It's really an honor to be here. It's a privilege, and it's really one of the most enjoyable aspects of this job, in my opinion, to be able to travel around the globe and chat with some of the most interesting, curious people that you would otherwise never come across, and I really enjoy it. So thanks for taking time out of your schedules to listen to some of the work that we've been doing, but share some of your own as well.

Today I'm going to tell you about some of the work that we've been doing on mechanisms of CRISPR-mediated immunity and some of the applications that go beyond editing. As Julie mentioned, one of the individuals that was spearheading a lot of this work is Andrew Santiago-Frangos, whose initial work was supported by the Life Sciences Research Foundation. Julie mentioned that I was one of these fellows as well, so it was exceptionally rewarding to have one of my own trainees then end up with this fellowship, which was sponsored by the Simons Foundation and also the Burroughs Wellcome Fund. Andrew recently got a K99 from the NIH, and he was working with a graduate student, William Henriques. Oh, one of the things I meant to have in motion here: they've been working on the structure of this CRISPR, this Cas integrase, a complex that delivers foreign DNA to a specific location in the genome with single-nucleotide resolution. I think understanding how that works specifically has pretty big implications for some biotech applications, but just from
a fundamental understanding of how these machines deliver foreign DNA in a very reproducible way. Then, once I tell you about the structure that explains how this complex delivers foreign DNA to the bacterial genome with single-nucleotide resolution, I'm going to shift gears a little bit and take you back in time two years to tell you about how our work on the pandemic collided with our interest in CRISPR in an unexpected way.

All right, but before I do any of that, I'm going to take a few seconds here to quickly give an introduction to CRISPR-mediated immune systems. I know that most of you could probably come up here and give this slide yourself, so I promise I'll keep my intro to 60 seconds or less so that I don't lose you. But I just want to remind you that CRISPRs are found in about 40 percent of sequenced bacterial genomes and almost 100 percent of archaeal genomes, but zero percent of eukaryotic genomes, including eukaryotic DNA that was bacterially derived, like mitochondrial or chloroplast DNA. I think that skewed distribution, particularly in the prokaryotic domains, where there's 40 percent and 100 percent CRISPRs, is not well appreciated. Why is that?
We don't know extremely well, I think, why there's such a distribution, but what we do know is that wherever there are CRISPRs, they seem to be involved in adaptive immunity. These immune systems work in three stages. In the first stage, and I'm going to spend a fair bit of time talking about this this afternoon, this DNA is recognized as antigenic, and that in itself is quite a feat: how does this DNA get distinguished from the host DNA? Then a short snippet of that DNA is site-specifically integrated at one end of the CRISPR, in a way where there's this polarized evolution that maintains a chronological record of all previously encountered foreign nucleic acids.

But the CRISPR is just molecular memory. In order to elicit an immune response, the CRISPR has to be transcribed, and this long pre-CRISPR transcript is subsequently diced, or processed, into a library of short CRISPR-derived RNAs that each have a unique sequence that was derived from, and is then by definition complementary to, this previously encountered foreign nucleic acid. Each of these RNA-guided surveillance complexes then patrols the intracellular environment and looks for target nucleic acids. But they don't do this by looking for a complementary target, as you might think, at least not initially, because most viruses that infect bacteria, that we know of at least, are double-stranded DNA. Since the target sequence is buried in this double-stranded DNA duplex, it would probably be energetically expensive and slow to unwind all the nucleic acid in the cell to look for a complementary target, so that's not what they do first. Instead, they scan for a motif called a PAM, or protospacer-adjacent motif. This sequence of colors is called the spacer, and the spacer is flanked by repeats. The origin of the spacer is the protospacer, and it's an identical sequence that's now in the phage, but that sequence isn't flanked by repeats; instead, it's flanked by this
protospacer-adjacent motif that's two to five base pairs long. When this complex recognizes the PAM, it distorts the DNA in a way that facilitates RNA-guided strand invasion, and once those two criteria are met, both PAM recognition followed by directional unwinding of the DNA and complementary base pairing, that activates the nuclease, which of course is a discovery that enabled targeted genome engineering.

All right, but if these systems are so efficient at eliminating invading DNA, then why do viruses persist? As many of you know, viruses that infect bacteria are among the most diverse and abundant biological agents on the planet. There's an estimated 10 to the 31 viruses on Earth; that's roughly a trillion viruses for every grain of sand. And they cause a lot of infections. Many of these infections are lethal, and in fact phage infection is responsible for turning over 20 to 30 percent of the entire biomass in the open ocean every day. So if these immune systems are so good at eliminating foreign DNA, and yet foreign DNA remains this biologically diverse and highly abundant element on our planet, then how do we reconcile this contradiction?

Of course, one way that viruses escape these systems is through mutations in the PAM. The other possible way is mutations in the protospacer. In each of these instances, these viruses then go undetected by the RNA-guided surveillance complex. But possibly a more interesting and proactive method for escaping CRISPR systems is virally encoded anti-CRISPRs. These are small proteins, usually 10 kDa, maybe 20 kDa, that are expressed immediately early upon entry into the host, and these small proteins oftentimes bind to these large ribonucleoprotein complexes in a way that really blocks function: they either block DNA binding or block activation of the nuclease, one of these critical functions. And because they do that, they've been really critical tools for teaching us about how CRISPRs work. Anti-CRISPRs serve as these molecular
beacons that point experimentalists to the parts of the machine that are most critical for their function.

All right, but a consequence of this kind of conflict is that it drives diversification, and that's exactly what you see in CRISPR systems. CRISPR systems have evolved independently at least two different times, in the so-called class 1 and class 2 kinds of CRISPR systems. Everybody's familiar with class 2 systems; they've been popularized, of course, by the Cas9 RNA-guided endonuclease, which is critical for the class 2, type II immune systems that are RNA-guided DNA nucleases. But the work that we've been doing in my lab has been primarily focused on the class 1, type I systems, which are biologically far more abundant but less popular, given that they're not the primary tool for targeted genome engineering. These complexes consist of multi-subunit assemblies that oftentimes recruit a trans-acting nuclease for destroying the DNA. I'm going to spend most of today talking about the type I-F system from Pseudomonas aeruginosa, although this type I-F system is found in many different types of bacteria. Then at the end of the talk, I'm going to transition to these type III systems. You can think of type III CRISPR systems as sort of the Swiss Army knife: they include a polymerase within the CRISPR complex, they cleave both DNA and RNA, and I'm going to explain some of the sophistication of these systems, how we've repurposed them for a diagnostic, and why I think these particular type III diagnostics have a distinguishing feature that makes them perhaps more valuable or useful for diagnostics than some of the other alternatives.

All right, so let's zoom in on these type I-F systems. As I said, I'm going to talk primarily about the type I-F system that happens to be in the Pseudomonas aeruginosa PA14 genome. If you look upstream of the CRISPR, there's an AT-rich sequence that's known as the leader,
and it contains a transcriptional promoter that results in transcription of the CRISPR, which includes a repeat, a spacer, a repeat, and so on. Downstream of the CRISPR, in this particular case, there are six cas genes. We know that Cas1 and Cas2 are among the most highly conserved cas genes of all the different systems, and that they're involved in adaptation, which means putting foreign DNA into the CRISPR locus. We know that Cas1 forms a stable homodimer, and that Cas2, the other component of this adaptation complex, also forms a stable homodimer. But strangely, and unique to this system, there's a fusion that's conserved in all type I-F systems, where the Cas2 adaptation protein is fused to this large nuclease that's involved in target destruction and includes both an HD nuclease and a superfamily 2 helicase. The Cas2 protein, like Cas1, also forms a stable homodimer, and these two homodimers assemble into a heterohexameric integration complex that's responsible for inserting foreign DNA here. But since this complex contains this unique fusion of Cas2 fused to this Cas3 nuclease that destroys invading DNA, the question has been: how does this complex achieve these two different goals, both adaptation and interference?

So I'm going to talk about the interference complex here. What we know about this is that the CRISPR, again, is transcribed, and the repeats within this CRISPR locus are palindromic, and they form these stable hairpin structures. These stable hairpins are recognized, by both sequence and structure, by an enzyme called Cas6 that binds to the stable stem-loop and cleaves at the 3′ end, resulting in a mature CRISPR RNA species that includes eight nucleotides from the repeat on the 5′ end, the entire phage-derived sequence, and then this 3′ stable stem-loop structure. The Cas6 protein remains stably associated with this hairpin, with picomolar binding affinities. We think
that this is a subcomplex that the rest of the complex assembles around. This complex includes six subunits of the Cas7 protein that oligomerize along the entire length of the CRISPR RNA, and we've in fact shown that if you artificially lengthen the CRISPR RNA by six nucleotides, you can add one more subunit; 12 nucleotides, two more subunits; 18, and so on and so forth. So it's really an RNA-template-driven assembly of this complex, which is capped at the head by Cas6, and on the other end is capped by the tail, composed of the Cas8 and Cas5 proteins, which form a stable heterodimer. The tail has a distinctive vise-like feature here that grips onto DNA. I'll show you how that works, but I think you can kind of get a sense for that just from the complex.

So the way that we think this thing works is that the phage attaches to the surface, of course, and injects its nucleic acid, which in most cases is double-stranded DNA, and these complexes initially look for a PAM sequence. As I described, they don't unwind all of the double-stranded DNA to look for a complementary target. Instead, they probably slide, hop, and scan along the DNA until they encounter a PAM, in which case the vise closes around the PAM, and that drives a conformational change in this complex as it unwinds the double-stranded DNA duplex. This is a really critical aspect of this complex that I'll talk more about in the upcoming slides, but it recruits the trans-acting nuclease.

Here we're zooming in on the PAM and the mechanism for PAM recognition. This vise closes specifically around a PAM, making nucleobase-specific interactions in both the major and minor groove, but when it does that, it also drives in this amino acid side chain that serves as a molecular wedge that splits the DNA. One strand of the DNA base-pairs to the RNA guide, and the other strand gets displaced entirely and is what's targeted by the trans-acting nuclease. So I'll
show you here a couple of snapshots of this structure that I think demonstrate how the complex recruits this nuclease. This is the RNA-guided surveillance complex without binding to any target. Sriram's lab actually determined a structure of this complex where the complex was bound to an artificial DNA target that's duplexed here in the beginning and contains a PAM, but then it was non-complementary, and only complementary to the RNA guide, and then it has this flap of DNA that was hanging off the end. What they learned was that the vise closes on the double-stranded DNA around the PAM, as I described before, and then hybridization drives an elongation of this complex, and a consequence of that elongation is that it creates a big gap up here at the head. When they published this structure, we thought we were scooped, but we continued to build our model, mostly because we were stubborn, not because we really had any particular insight. It was far too late that we realized that we had used a different DNA substrate, and this had a big structural consequence that had functional implications.

So we used a completely complementary double-stranded DNA substrate, and much like what they saw, we saw that the vise closed around the PAM. But as this complex unwinds the double-stranded DNA, this displaced strand is critical for driving a conformational change in this helical bundle that you see here: it rotates almost 180 degrees and translates up to fill this gap that they saw in their structure. Here's why that's important. You can see that this target DNA, in yellow, is solvent exposed right here. But what happens when this helical bundle rotates 180 degrees and moves up into the slot is that this feature right here and this feature lock this helical bundle over the top of that foreign DNA, locking this complex onto a target in a way that it basically has no off-rate. Once it finds a target, it's on and it stays on. And the other thing that
we learned about this structure, that we might have been able to surmise but really came from a structural prediction, was that this is the surface that recruits the nuclease for target degradation. In this complex, this surface right here is buried, and the same is true of that opposite surface; it's buried. It's only when this thing undergoes a 180-degree rotation and rotates up into this position that this entire face of this helical bundle is now exposed. What we learned after determining the structure is that when we submitted these proteins to a structural homology search, basically a BLAST but at the structural level, we found that this helical bundle had a structural homolog that was encoded in a phage genome, and this is a virally encoded anti-CRISPR. There was a structure of this virally encoded anti-CRISPR bound to the Cas3 nuclease, and by simply taking that structure and doing a structural superposition of that protein onto the structure that we had just determined, we determined precisely how this trans-acting nuclease gets recruited to this complex that has captured a piece of foreign DNA. I'll show you that a little bit more here in this movie, where you can see that this target-bound complex recruits this trans-acting nuclease. Initially there's this major clash here between the Cas1s and the backbone of this complex, so that complex has to disassemble, and when it does, the Cas3 nuclease fits perfectly on top of this DNA-bound complex, in a way where the shape and charge of the nuclease fit with the complex that's recruiting it.

All right, so what I've shown you so far is that CRISPRs are transcribed, they're processed into these mature RNAs, and these mature RNAs serve as a template for the ordered assembly of all these Cas proteins into these complicated RNA-guided DNA surveillance complexes, and that this DNA surveillance complex undergoes a pretty radical conformational change upon binding,
and this exposure of this surface here is what then recruits the trans-acting nuclease for target destruction.

But when Andrew came to my lab, he wanted to understand what the role of this complex is in capturing foreign DNA and inserting it precisely into the CRISPR. How does that work? Initially we thought that his work might be just a bunch of single-molecule experiments that we had kind of roughed out, really trying to understand the kinetics of the assembly of these machines and the delivery of foreign DNA to this position. But he had an observation that really changed the course of his work, and I'll try to highlight that for you here. Before I do that, I should say that we weren't the only people working on these kinds of things. In fact, my previous advisor had continued to pioneer some of these kinds of questions, and Jennifer's lab had made a couple of really important discoveries. I'm going to highlight them here and then explain how this sent us in a slightly different direction.

So I already told you that Cas1 is a critical and conserved protein of all CRISPR systems. It forms a homodimer; it forms two homodimers on opposite sides of a Cas2 homodimer, creating this dog-bone-shaped complex that's responsible for capturing and delivering DNA to this position. But what they discovered was that there's another protein involved in this process that wasn't previously known. It's a protein called IHF, and IHF plays kind of an ironic role in this position. Some of you will know that IHF, which stands for integration host factor, was discovered maybe back in the 1950s by Esther and Joshua Lederberg, when they were studying lambda phage and showed that integration of lambda phage into the E. coli genome requires a host factor, called integration host factor, for inserting the lambda phage genome into the E. coli chromosome, where it replicates as a prophage. That was a really fundamental aspect of the biology, I suppose, and played
important roles in science history in terms of understanding how viruses replicate and hide out, in terms of retroviruses and so on. But nevertheless, the reason I described it as an ironic role is that here IHF is playing a critical role in integrating phage DNA, just like it did during development of the lysogen, but here it's just a small fragment of DNA, 32 bases of DNA, that's inserted specifically at the leader end of the CRISPR. The way that it does that is that IHF is a DNA-kinking protein that bends the DNA and presents an upstream sequence motif to the CRISPR complex, or the CRISPR integrase, which is now bound to a piece of foreign DNA. This complex gets recruited to the toe of this DNA horseshoe that's created by IHF, and that creates critical interactions between the upstream DNA and the Cas1 protein, and between Cas1 and IHF. That's all shown here in these structures and the biochemistry that they used to complement this. I won't go through the details, but I think you can see the upstream sequence gets bent by almost 180 degrees by IHF, and as a consequence of that, there's this upstream sequence motif that gets presented to the integrase. If you make mutations in the DNA or in the amino acids responsible for these interactions, you perturb new sequence acquisition; the CRISPR can't evolve if you make mutations in those critical positions.

So again, when Andrew came to my lab, one of the things he wanted to do was to study the kinetics of this entire process. But while he was out at Berkeley working in Carlos Bustamante's lab on one of these single-molecule experiments, for watching each molecule get recruited to a CRISPR, he wrote me an email. It was more elegant than this, but the gist of it was: the distance between the first repeat and the IHF binding site is eight base pairs more than what's seen in E. coli. Initially I was kind of concerned about Andrew. I thought, wow, that seems like a fairly detailed, nuanced aspect of this system; is that really the most important question that we could chase? Maybe it's already obvious to you why that's really critical. It's not a huge number of bases, but as a consequence of inserting eight base pairs there, not only do you break this interaction, because you translate IHF away from the Cas1 protein along the helical axis of the DNA, but DNA also has a helical phase to it, so as a consequence of inserting eight base pairs there, you introduce a 260-degree rotation of the upstream DNA. So both of these critical interactions that had been described in this structure couldn't possibly be true for this complex. And if you just model it, you know, very back-of-the-envelope-style structural modeling here, not only is this interaction broken, but the interaction that was up here is now rotated to the other side of the complex and clashes with the other molecule of the Cas1 homodimer. So it seemed like that was a completely impossible structural explanation for how new sequences would be inserted here.

But one of the questions we had, back to what I was concerned about in the beginning, was: is this just some sort of mutational idiosyncrasy between these two genomes, or is this more of a common trend than what might be apparent from this simple pairwise comparison? So instead of looking at two genomes, we looked at 20,000, and we compared 15,000 CRISPR leaders. What we're doing here is taking the first repeat of a CRISPR and then looking upstream and asking, do we see that IHF binding site? If you look in E. coli genomes, there's strong conservation of an IHF binding site. But what was a little bit surprising already is that the understanding at the time was that this was the paradigm for all type I-E immune systems, that they would require this IHF binding site. But lo and behold, two-thirds of the I-E immune systems don't have an IHF binding
site, and instead they have a highly conserved motif that's probably involved in new sequence acquisition, but nobody knew about it, and it still hasn't been functionally characterized. Moreover, if you look upstream of this novel leader motif, what we call the LAM, or leader-adjacent motif, there's another sequence motif even further upstream, also completely uncharacterized. So I guess the point that I'm trying to make at a bigger level is this: I think if you ask anybody off the street, they know that cas genes are pretty diverse, and they'll probably even tell you that CRISPR repeat sequences are diverse, but I think what we're finding is that leader sequences are extremely diverse and probably more biologically important than what has been previously appreciated.

All right, so if we come back to the E. coli model, here's the IHF binding site, and here's the upstream sequence motif that's shown to be critical in the structure and the corresponding biology. But now let's go back to the question that we started with when we started analyzing all these genomes: is this difference of eight nucleotides something that's found in more than just these two genomes? In fact, if you look in all the type I-F systems, you see that there's this difference in distance between the I-E systems and the I-F systems. So at least it's not idiosyncratic just between these two genomes; it's something that seems to be highly conserved. And in fact, if you look further upstream, there's not just one IHF binding site, but there are two, and this IHF binding site is flanked by two upstream sequence motifs that happen to be inverted repeats. So the question is, what are these conserved sequence motifs doing upstream of the CRISPR, which weren't previously thought to be involved in any aspect of CRISPR biology? We just took an artistic approach to this, again, at the beginning, and just started doing some modeling, asking what would this complex look
like if we introduced two 180-degree kinks, and how would that impact this inverted repeat sequence that's upstream? You'd sort of think that this is an enhancer in some ways. We had speculated that this inverted repeat sequence might be involved in interacting with a homodimeric DNA-binding protein that would coordinate these two different sequences, so we thought there might be a host factor involved. But in fact, when we reconstituted this system in vitro, we could do in vitro integration reactions, delivering fragments of foreign DNA to a specific site without any other host factor. So it suggested that there's no missing factor here; what we have is everything necessary to do the integration reaction.

So what we thought was happening is that this Cas1-2/3 complex is snatching a piece of foreign DNA and delivering it to this leader-repeat junction, and doing it in a way that splits the repeat, such that you maintain the repeat-spacer-repeat architecture. If you just insert new spacers, then you end up with an array of spacers, but you lose the repeat-spacer-repeat architecture that's absolutely critical for every aspect of this system. Previous work had been done to show that this model was generally correct, that the first repeat serves as a template for making the next repeat that would be created, meaning if you make a mutation in the first repeat, then that mutation would be carried on to subsequent repeats that are added. Then this situation is resolved by DNA repair enzymes that would fill in these remaining CRISPR sequences, or sorry, repeat sequences, on either side of the spacer.

All right, so what's happening in the leader sequence that might participate in this reaction? As I told you previously, we found these IHF binding sites in the leader sequence and these upstream sequence motifs. So Andrew set out to test what might be happening between the leader, the CRISPR, this 400-kilodalton integrase complex, and these IHF proteins. He did that by purifying the Cas2/3 protein, the Cas1 protein, the IHF heterodimer, and all the DNA sequences involved, and then purified this complex using SEC. Then we put this sample on a new cryo-electron microscope at Montana State University, and I need to give a shout-out to Martin Lawrence, a colleague of mine in biochemistry, who is the PI on an NSF MRI grant that enabled us to get this microscope. Without his hard work, we would never have been able to collect the preliminary data necessary to get time to come to New York to collect data at the National Center for CryoEM Access and Training. With Ed Eng at NCCAT and some of the people from his team, we recorded 11,000 movies and selected over a million different particles, and then classified those particles into 2D class averages. In some of them you could see considerable structural detail, which was pretty encouraging, and others looked more common or contaminant-like. Through this iterative process of pruning, 2D classification, and 3D classification, we eliminated most of those particles, which resulted in a stack that included 200 particles, with 2D class averages that captured the complex in many different orientations, and then we used that stack to determine the three-dimensional structure.

I'll show you that now. What you're looking at here is the repeat sequence, the first repeat of the CRISPR, and then looking upstream, as I told you before, there's an IHF binding site, an inverted repeat motif, an IHF binding site, and then the other inverted repeat. In this movie, you'll see that these IHF proteins bind to the DNA and kink the DNA in a way that positions this complex onto this position here for target integration. All right, so just to be clear, we don't know that this is exactly the order of events that happens, but this is how we imagine this process occurring. So this IHF binding site gets kinked, creating these two DNA pillars that present these inverted repeat motifs for
symmetrical interactions on either side of a Cas2 homodimer. These proteins recognize these inverted repeat motifs through non-sequence-specific interactions, and I think that's pretty interesting: every protein in this complex recognizes DNA, and all of them do so through non-sequence-specific interactions. The repeat itself is going to continue through this pore right here, although we don't have the electron density to see it, and there's an active site in the Cas1 molecule over here. What we're showing here are some dimensions in the DNA that explain why the complex is driven to the integration reaction rather than disintegration; the reaction can go in both directions.

The other thing I kind of wanted to highlight here is that this upstream sequence motif is recognized by Cas2, and it introduces a DNA bend. That same kind of thing happens on the other side of the complex: the other Cas2 subunit recognizes the other inverted repeat and also introduces a DNA bend. Collectively, the IHF proteins and these Cas2 subunits are bending the DNA by almost 360 degrees, sum total, so it's creating this quite contorted structure. You can see here that the foreign DNA is trapped on one face of this integration complex, under this DNA bridge that's created by this IHF protein. So this major groove of the foreign DNA is pinned by the major groove of this upstream genomic DNA on this side, and the same is true on the other side, suggesting that the integrase has to capture foreign DNA first, before it docks onto that complex, because it's exceedingly unlikely, I think, that this foreign DNA could be threaded under this DNA bridge after the integration complex is formed.

All right, so what I just showed you in that structure, I think, explains how this complex captures foreign DNA and facilitates this transesterification reaction, where it trades a phosphodiester bond from the foreign DNA to one strand of the repeat sequence, and then the other strand of the foreign
DNA then makes a bond with the other end of the repeat sequence, effectively tying a non-covalent knot around the Cas2 dimer in the middle that has to be resolved through a mechanism that we don't understand. That creates a scenario where you have the foreign DNA that's duplex, flanked by a single-stranded repeat on either side, and that single-stranded repeat then gets filled in by the DNA repair machinery. The last part of this model we don't understand; that's a little bit speculative. All right, I think I've covered some of the main points of what I think we've learned from this integration complex, so I'm going to shift gears a little bit here and take you back in time two years, which maybe nobody wants to do, but I think there was a reality at the time that changed the course of my research program in a way that I couldn't really anticipate. It ended up colliding with CRISPRs in a way that was unintentional, but I think it is interesting and valuable, and I hope you share that. So I'm going to tell you a little bit about our foray into SARS coronavirus and how it ultimately led back to CRISPRs. When the pandemic hit, I was on an airplane, actually, back from Hawaii, and as soon as I got off the airplane, one of the people in the lab said, hey, there's this pandemic and it's probably going to be a big deal. Our vice president for research at MSU had sent an email that articulated this same concern and asked us: what are we going to do to protect our students and citizens, and what does MSU have to offer in this response? At first I really thought, at least for me, probably not too much. I felt intimidated by the question. But once one of the individuals in the lab said that SARS-CoV-2 was detected in wastewater, I literally got in the car and drove down to the wastewater treatment plant, because, as Julie mentioned, we had a biosafety protocol for sampling wastewater for viruses that infect bacteria. So
although the language in my biosafety protocol wasn't at all to do with SARS-CoV-2, I think we were within the constraints of the protocol to go there and sample for SARS-CoV-2. What we learned from this temporal analysis was that, at a time when tests were in extremely short supply, we could use one test and measure changes in the spread of SARS-CoV-2 in the community. So we measured the concentration of SARS-CoV-2 in the wastewater multiple times a week, and we showed, using epidemiological data from the county, that we could predict the surge in the community four to five days in advance by testing wastewater. I think that makes sense in retrospect, because there's 100% compliance, everybody's contributing, and people who aren't symptomatic but are still shedding the virus were contributing to the viral load. So I think that was a relatively simple but really important contribution; in fact, it ends up being one of my most highly cited papers, even in the last two years. But at the same time, in addition to sampling the wastewater, we were also helping the local hospital set up a SARS-CoV-2 diagnostic testing center, and as part of the deal for helping them get set up, we also started collecting those samples and bringing them to our biosafety level three lab, and we were sequencing the SARS-CoV-2 genomes. It was on a small scale compared to what a lot of other people were doing. Ultimately we ended up sequencing thousands of SARS-CoV-2 genomes, but even so, not a huge contribution: a small contribution from a big state. But what we were surprised to find is that the first seven genomes that we sequenced in Bozeman all contained a deletion mutation in ORF7a, and we thought, wow, what are the odds that in Bozeman, Montana, we would stumble across a variant that's going to end up being, you know, a global problem? So we were initially kind of concerned about that, and so we
asked the first question, which was: is anybody else seeing this mutation and just not reporting it, because most of the attention was focused on the spike? It turns out that that's true. Once we downloaded what was, at the time, probably a million genomes from GISAID and did a phylogenetic analysis, we found that other people sequencing SARS-CoV-2 genomes around the globe were also seeing deletion mutations in ORF7a. So we wanted to ask: what's the phenotype of this deletion mutation? We isolated the virus and competed it against the Wuhan wild type and showed that it has a replication defect, and the reason it has a replication defect is that ORF7a is an immune suppressor. It suppresses the interferon response, so if you delete ORF7a you get more of an interferon response, and that knocks down the virus; if you overexpress ORF7a, then you get more immune suppression. All right, but we were doing all this work at a time when, as I told you, we were at the hospital trying to set up the diagnostic center. You know, we're at an ag school, not a med school, so as a consequence of having, for the first time, this connection to clinicians and being just a little bit closer to patients, we were intimately aware of the backlog of samples that were waiting to be tested by quantitative PCR, and of the time delays that were adding to the problem of spread of SARS coronavirus. We wondered: would there be a role for CRISPRs in some sort of aspect that could accelerate diagnostics? But as I just told you, most CRISPR systems are RNA-guided DNA-targeting systems. I just spent the last 20 minutes telling you about these type I CRISPR systems, these large protein complexes that unwind double-stranded DNA and destroy it, and that's true for a lot of the class 2 systems too, these single-protein effectors that bind to nucleic acids and target double-stranded DNA. But there are both class 2 and class 1 effectors that target RNA, not DNA. They're
rare, but they're around. Some of them are called type VI systems; these are classified as Cas13s. But the ones that we were focused on were these type III systems, and I think that for diagnostics these offer some really differentiating features that might be particularly valuable for detecting any RNA in a fairly programmable way. So I'm going to spend a little bit of time talking about the type III systems. Different from the type I systems, these type III systems target RNA, and here's how they work in nature. This complex binds to a complementary RNA target, typically a transcript made by a DNA virus, or, in the rare case that there is an RNA phage, some of them can target the RNA phage genome as well. When they do that, this complementary binding drives a conformational change that activates a polymerase domain that was originally identified by Kira Makarova and Eugene Koonin here at the NIH, probably a decade before anybody showed biochemically that they actually polymerize. And they polymerize something that's very specific: they take four ribo-ATPs and convert them into this cyclic molecule. It's not always cA4; sometimes it's cA6 or cA3. It's a library, really an entire lexicon encoded by nucleic acids, that then signals a very complicated downstream immune response. In the simplest case, the cyclic nucleotide binds to an otherwise dormant nuclease and activates that nuclease, which then starts to cleave RNA in a largely indiscriminate fashion, and as a consequence of that it pushes the cell into dormancy. The other outcome, of course, is that if it stays in this activated state, then the nuclease stays activated and the cell commits suicide, and that's an abortive phenotype. But built into these complexes is this timing mechanism that shuts this machine off. So the apo complex binds to a complementary RNA, which generates the cyclic nucleotides that activate these downstream effectors. In the simplest case the downstream effectors are nucleases, but they can also be
proteases, they can also be porins, they can also be transcriptional regulators. They're extremely diverse and really interesting, and they respond to a variety of different cyclic nucleotide activators. I think we're starting to see a theme here, where bacteria are producing these nucleotide-based signaling molecules from many different kinds of immune responses, cGASes and these kinds of things, that trigger these really complicated immune systems downstream. So nevertheless, in some instances these complexes then cleave this RNA in this timed fashion, and then they revert back to this unactivated complex that's no longer polymerizing ATP. But what we did was in the interest of making a diagnostic, which I would argue is exactly what CRISPR systems naturally are. They're viral diagnostics; they're just bacterial diagnostics that are sensing viral infections in bacteria. We thought we'd just exploit that natural function and program this complex to recognize a virus that it never would in nature, and we eliminated the nuclease activity that's normally part of this complex, so that it only binds, activates, and then stays perpetually activated, creating this signal amplification effect that's unique to something that has a polymerase activity. All right, so this work was originally done by Andrew Santiago-Frangos again, and I'll show you how another postdoc has picked this up and kind of taken it to the next level. But initially what we showed was that this complex can bind SARS-CoV-2 RNA and produce these three different products: the cyclic nucleotide that I already told you about, and, just like any other polymerase, it makes pyrophosphate and protons. We could detect each of these different signals. The cyclic nucleotide is detected by activating these CARF-containing nucleases, and they cleave a tether that links the fluorophore to a quencher, and this liberated fluorophore, of course, is a signal that we
can detect using a fluorimeter. The pyrophosphates we can detect with calcein, and a change in pH, which is the increase in protons, we can detect using a pH-sensitive dye, in much the same way that a LAMP-based assay works. But the problem with these assays is that they weren't sensitive enough to be clinically relevant. So we did what a lot of other CRISPR-based technologies were doing at the time: we took the clinical sample, we extracted the nucleic acid, we did RT to convert the viral RNA to DNA, and at the same time we inserted T7 promoter sites so we could turn this amplified DNA back into RNA, which would then be recognized by the type III CRISPR complex called Csm, which would then make the cyclic nucleotide, which would then activate the nuclease, which would then liberate the fluorophore. It's even exhausting to say, right? And I think as a consequence of that it was obviously not really going to have too much of a market impact. But I think what we did show was that there's no cross-reactivity to common pathogens, and the sensitivity was modest but still clinically relevant: 200 copies per microliter, which is around a Ct value of about 30. All right, but then Anna came along, and she thought, well, maybe we can do better. Her ambition was to eliminate all of these intermediate steps and just go straight from a clinical sample directly to a fluorescent signal, eliminating even this RNA extraction step, which is a step that many people just ignore, both in terms of cost and time to extract the nucleic acid prior to performing a diagnostic assay. And the way that she did this, I think, was pretty clever. She decorated magnetic beads with these type III Csm complexes and added those directly to a patient sample. In this case you can use a large-volume sample, where you add the complex to a large volume that contains a low concentration of a target RNA of interest. It doesn't have to be SARS-CoV-2; it's whatever RNA you're interested in. If it's in a complex mixture of RNA, DNA,
metabolites, whatever else you might find up somebody's nose, you can extract and isolate that rare RNA to a very small volume by just adding a magnet, pelleting these target-bound complexes to a small volume, decanting whatever's not bound there, and then, with a single liquid-handling step, adding ATP, a new improved nuclease (we're making improvements all the time), and an RNA reporter. I think as a consequence of this, what we showed is that we can reduce sample handling and time to result. There's no RNA extraction that requires, you know, a high-complexity lab or organic solvents or any of that kind of thing, and there's no requirement for pre-amplification, so there are none of the artifacts that come with amplifying nucleic acids, and it's not susceptible to many of the polymerase inhibitors that you find in a lot of different samples. All right, so I guess with that I just want to thank some of the people in the lab and acknowledge their efforts, in particular Andrew Santiago-Frangos, Anna, Artem, and Will, and many of the funding sources. In particular, I guess it's apropos to really say thank you to the NIH, which has consistently funded my research program for the last decade, most recently through support with an R35, as well as continuing to support some of the trainees in the group, including Andrew, who has a K99 from General Medicine, and Artem, who has a K99 through NIAID. I mentioned that we wouldn't have been able to do the microscopy without support from the NSF, and the microscope that we purchased was with support from both NSF and Murdock. We do some work on algae and biofuel engineering through the Department of Energy, and I have conflicts of interest with VIRIS Detection Systems and SurGene. And with that, thank you for your attention, and I'll take any questions.

Well, thanks very much. Sean, you can lead the way by going to one of the microphones, and we'll also welcome questions from the online group.
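
An aside on the wastewater result described earlier in the talk: the claim that wastewater concentrations anticipate community case counts by four to five days amounts to a lead-lag relationship between two time series. The following is a minimal sketch of how such a lead could be estimated, assuming daily measurements and a simple scan over Pearson correlations at different shifts. The data are synthetic, the function names (`pearson`, `best_lead_days`) are hypothetical, and this is not the lab's actual analysis.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_lead_days(wastewater, cases, max_lag=10):
    """Return (lag, r) for the shift at which wastewater best matches later cases."""
    best_lag, best_r = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Pair wastewater on day t with reported cases on day t + lag.
        w = wastewater[: len(wastewater) - lag]
        c = cases[lag:]
        r = pearson(w, c)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r


# Synthetic illustration only (assumed values, not measured data):
# case counts are the wastewater signal delayed by 4 days and rescaled.
wastewater = [1, 1, 2, 3, 5, 8, 13, 21, 30, 38, 42, 40,
              33, 25, 18, 12, 8, 5, 3, 2, 1, 1, 1, 1]
delay = 4
cases = [10 * v for v in [wastewater[0]] * delay + wastewater[:-delay]]

lag, r = best_lead_days(wastewater, cases)
print(f"best lead: {lag} days (r = {r:.3f})")
```

On this noise-free synthetic series the scan recovers the 4-day delay that was built in. Real surveillance data are noisy and sampled only a few times a week, so an actual analysis would need smoothing, interpolation, and uncertainty estimates that this sketch leaves out.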
So, start us off.

Yeah, hi, very cool, thanks. I'm interested in these changes of binding sites in that leader sequence, and also the arms race between phage and bacteria, and I was wondering if there's a correlation between the categories and the kinds of anti-CRISPR proteins that are generated within those classes, and if this is one of their escape mechanisms for preventing inhibition of the CRISPRs.

Yeah, it's a really good question; in fact, we have the same kinds of questions. I guess what I can tell you about that is yes to the first part of your question: they do seem to be categorized according to the different types, and you could even see that in the phylogenetic analysis that I showed. They categorize pretty well, but nowhere near perfectly. In fact, what we've been doing recently is developing new software that complements the CRISPR detection software that's out there now. It seems obvious that you could detect CRISPRs; you know, we've been doing this for so long now, you'd think you'd be able to at least find them in genomes. But in fact there's lots of confusion, and it's an imperfect process when you try to do this in a high-throughput fashion. In fact, there was a recent paper that reported over 13,000 CRISPRs in the human genome, so I'm curious about some of your thoughts on that. But nevertheless, I say that just to highlight the fact that for an unsupervised algorithm like this, that's just looking for these patterns of repeats, and there's lots of variability between repeats even within an array, this becomes extremely hard. So one of the things that we've been doing is categorizing CRISPR leaders according to subtype and according to their 16S-based taxonomic classification, then looking for new sequence motifs that are more indicative or predictive of CRISPRs, and then using those algorithms to complement existing CRISPR detection software to look for new CRISPR systems. So that doesn't answer your question about conflict, but I think there's a
lot of value in looking at these leaders, and I'll just say that to date I don't think there are any virally encoded anti-CRISPRs that stop acquisition in particular, at least none that have been discovered.

Yeah, so with the readout for your assay being this cyclic nucleotide, does that sort of lend itself to multiplexing? Do you see a path forward for multiplexing these, either using multiple different systems or something?

Yeah, I think that we're about as far as you got in the last five minutes. We've been wondering about multiplexing as well. I think you can use different systems to generate different cyclic nucleotides, which could have different readouts, but we haven't made any real meaningful progress in that area. I think the multiplexing that we talked about that would be a little bit more practical is to try to separate one complex that's bound to one target into one geographic location, on a chip for example, and another in a different location, but then there are complications with that too. Lots of people are interested in that; if you have any good ideas, please let me know.

Thanks. I'm just wondering where you see the role of academic labs in this viral diagnostic space, or maybe even just the diagnostic space, because it seems to me that there are the large assays that could be run at diagnostic companies, Quest or something, then there's something that could be done at the point of care at a doctor's office, and then there's something that could be done at home, and maybe there are other things. Are you looking at an academic lab as working towards one of those three spaces, or do you think that these kinds of diagnostics could be across all of them, in terms of where you're seeing this?

Well, I mean, I think that the platform itself is generally applicable and really kind of agnostic to whether it's a centralized diagnostic or a point of care, or whether it's being used in
ag or in the healthcare setting. Something I learned about Montana recently is that we happen to test all the seed potatoes for the entire Northwest. One of the major things in potatoes is a virus called PVY, and the lab across the hall, or really across the street, that I didn't even know about is one of the major testers for PVY in seed potatoes. So right now we're seeing if our technology can outperform their technology in seed potatoes. I haven't seen their system yet, but I'm eager to go over there and understand it; some sort of high-throughput mashed potato generator kind of thing. But I guess, more to the philosophical part of your question, what role does an academic lab have to play in this? My approach has been: can we learn fundamental aspects? What are the kinetics of these interactions, what are the on and off rates of nucleic acid binding, what are the fundamental aspects that may have biological insights but also biotechnological applications? The real applied thing, really bringing it to market, that's the role for industry. But I think by understanding some of those fundamentals we'll have some insights in both biology and biotechnology that are valuable. That's at least our ambition. I mean, I don't want any of my students working on making, you know, nuanced changes to a concoction that might lead to slightly better marketability of some product, but rather asking which nucleotides are made, and if it's not just one, then what are the other ones that are made, in what stoichiometries, and what are they signaling, or is it just noise that's made by these polymerases? Those kinds of questions.

Well, that's a great answer. That is what we're all after, this basic understanding, so thank you for bringing it back to that. With that, I'd like to close the seminar. Thank you all for coming today; it's really nice to see everyone. And thanks again, Blake, for coming and giving us such a great talk.