Yeah, thank you very much. Hopefully everybody can hear me. So this is my conflict of interest slide. It does take a village to take ideas from a very primitive stage, such as the stage that I typically am involved in, and get them actually out into the real world, and I'm indebted to these companies and I will say nice things about them.

For those of you who want to use your smartphone instead of listening to my lecture, I would recommend this as a distraction: pged.org, where you can put your dot on the map here. These are all sites where there are genomically literate individuals, and this is my wife teaching a class, everywhere she goes. In addition to being an amazing researcher, she teaches these classes on genetics.

So hopefully you won't be disappointed by a survey. I've found that this tends to work. It's rather like a tour, but there are articles behind all of this, and I haven't found it useful to go through the articles. Many of these are unpublished, but it's the same idea.
Anyway, these are the topics: DNA as a material, an in situ method (an in situ sequencing method), then the open cohort that we just heard about, then CRISPR, which probably needs no introduction, and then the BRAIN Initiative, which may also, but...

So, DNA as a material. I'll give three quick examples: the super-resolution example, then drug delivery, and then information. This is just a small experiment that I did with Peng Yin and William Shih; the three of us, and Pam Silver as well, constitute the DNA nanostructure component of Harvard. Three of us, three labs, collaborated on making these long rods that allow you to barcode, literally in multiple colors, at high resolution. It's about 42 nanometers between the closest-spaced dots, so this is way beyond the resolution of a fluorescence microscope. And unlike every previous method for super-resolution (and I'm not going to go into the whole physics of super-resolution), the blinking of the fluorophores is here regulated by the binding and unbinding of fluorescently labeled oligonucleotides to the barcode, so that the labeling itself provides a neat way of doing flickering, for those who are interested in super-resolution.

Now we call these DNA nanorobots.
They are robots only in the sense that they have three of the key components of robots, which are sensors, logic and actuators. If you only have two of those, you're a machine rather than a robot. Here what they do is sense either immune cells or cancer cells based on two aptamers, single-stranded DNAs that fold up and recognize surface proteins of T cells or cancer cells. And only if you have an AND, both aptamers activated (that's the logic), will it open up and deliver the antibodies for either modulation or killing, in this case killing. You can see this little animation showing it opening up when the little blue aptamers bind. And here's some electron microscopy where we replaced the antibodies with gold nanoparticles, showing that what we designed is what we got. DNA nanostructures are actually much easier than protein design, which we also do. Pretty much all of these things work without much ado, and undergraduates do it in a summer competition called BIOMOD.

So that's example number two of DNA nanostructures that we think are fairly useful. Most of the history of DNA nanostructures has been toys, but these two we think are potentially useful. And here's another one, though we don't know, because it's still early days. This is the Colbert that was mentioned earlier, and here he is hawking my book. We made it to number 57 on Amazon thanks to Stephen. And we also encoded this book, all 5.2 million bits, from zeros and ones into A, C, G and T. All books these days are zeros and ones at some point or another. This is what we call archival storage. It's not like your thumb drive, where you're fiddling with it all day. It's archival storage, and it has advantages for that.
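A minimal sketch of the bits-to-bases step mentioned above. The talk only says the book went from zeros and ones to A, C, G and T; the particular mapping used here (0 becomes A or C, 1 becomes G or T, one bit per base, picking whichever base avoids repeating the previous one) is an assumption for illustration, and all function names are hypothetical.

```python
# Illustrative one-bit-per-base encoding. The mapping is an assumption,
# not something stated in the talk.
ZERO_BASES = "AC"  # either base can stand for a 0 bit
ONE_BASES = "GT"   # either base can stand for a 1 bit


def text_to_bits(text: str) -> str:
    """Turn text into a string of '0'/'1' characters (UTF-8, 8 bits per byte)."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits: str) -> str:
    """Inverse of text_to_bits."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def bits_to_dna(bits: str) -> str:
    """Encode each bit as one base, never repeating the previous base."""
    out = []
    for bit in bits:
        choices = ZERO_BASES if bit == "0" else ONE_BASES
        # Use the first choice unless it would repeat the previous base.
        base = choices[0] if not out or out[-1] != choices[0] else choices[1]
        out.append(base)
    return "".join(out)


def dna_to_bits(seq: str) -> str:
    """Decode: A or C reads as 0, G or T reads as 1."""
    return "".join("0" if base in ZERO_BASES else "1" for base in seq)


if __name__ == "__main__":
    dna = bits_to_dna(text_to_bits("DNA"))
    print(dna)
    print(bits_to_text(dna_to_bits(dna)))
```

Having two bases available for each bit value is what lets the encoder avoid long runs of the same base; the cost is that each base carries one bit rather than two.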
It's a very specialized field, although it's a multi-billion-dollar business that most of you don't know about. DNA has the advantage over typical archival storage, which we have to replace every three years or so, in that it has had a 700,000-year test run, which is better than most other storage media. And we can store about 200 exabytes per gram, so we can fit a zettabyte in something the size of a phone, where an exabyte is 8 times 10 to the 18th bits, or 10 to the 18th bytes.

We'll come in a moment to exactly the chemistry and instrumentation, but this is the landscape. In blue we have standard consumer storage, ranging from old-fashioned CDs up to modern flash memory and hard drives. This is log 10 on the y-axis for the density, and the production scale on the x-axis gives you some idea of how big the market is and how cheap it is, because you don't go out to lots of bits unless it's inexpensive. In purple are a bunch of physics-based methods, which are hard to scale into consumer markets because they involve things like liquid helium, which most people don't want to carry around. And in red are some biologicals. This is where we bumped up essentially six orders of magnitude, almost seven or eight, relative to most storage media in terms of density. And this year we have some funding from one of the archival companies to improve it ten-thousand-fold, so ten-thousand-fold in one year, on this axis of cost.

And this, by the way, was published in Science along with Sri Kosuri, who was the professor in this case: my postdoc pretending to be professor and me pretending to be postdoc.
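The density figures quoted above are easy to sanity-check. This is a back-of-the-envelope sketch using only the numbers given in the talk (200 exabytes per gram, 10 to the 18th bytes per exabyte, a 5.2-million-bit book); the variable and function names are made up for the example.

```python
# Back-of-the-envelope check of the storage figures quoted in the talk.
EXABYTE_BYTES = 10**18
ZETTABYTE_BYTES = 10**21
BITS_PER_BYTE = 8

DENSITY_BYTES_PER_GRAM = 200 * EXABYTE_BYTES  # "about 200 exabytes per gram"


def grams_needed(n_bytes: float) -> float:
    """Mass of DNA needed to hold n_bytes at the quoted density."""
    return n_bytes / DENSITY_BYTES_PER_GRAM


exabyte_bits = EXABYTE_BYTES * BITS_PER_BYTE       # 8 x 10^18 bits
grams_per_zettabyte = grams_needed(ZETTABYTE_BYTES)

BOOK_BITS = 5_200_000                              # "all 5.2 million bits"
book_grams = grams_needed(BOOK_BITS / BITS_PER_BYTE)

if __name__ == "__main__":
    print(f"bits per exabyte:      {exabyte_bits:.1e}")
    print(f"grams per zettabyte:   {grams_per_zettabyte}")
    print(f"grams for one book:    {book_grams:.2e}")
```

At the quoted density a zettabyte comes out at a few grams of DNA, which is consistent with the claim that it fits in something hand-sized.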
I did this mostly with my own hands, so you can see why I'm giving it a little extra time. There are many different methods that we worked on with these companies from the very beginning; the two main ones are inkjet and photolithography, which actually originated here in Madison, and I would dedicate this slide to Franco Cerrina. These are two methods which are essentially based on analogous methods in the non-biological world. Here you have four different colors, four different phosphoramidite chemicals, G, A, T and C, and here you have them flowing through a flow cell one at a time, where you get photodeprotection. You can get up to six million oligos per chip, and the chips can be as little as two hundred dollars. So this is really inexpensive, and this brings us to an interesting topic, which is: why is everything so inexpensive?

So that was all about writing genomes. We have to read them as well, and we read everything we've written. This is an example of a Moore's law curve, which is pretty aggressive. It's 1.5-fold per year, multiplicative, so these are factors of 10 on the y-axis. If you predicted how long it would take to go from the three-billion-dollar genome to an affordable one, meaning around a thousand dollars (which is where it is right now, at about a thousand dollars cost, slightly higher price), it was predicted to take six decades at Moore's law rates, like electronics, but it actually only took six years. So don't feel limited by very long-term trends, I guess, is the lesson here. You can't say whether it'll take longer or shorter.

Okay, and that's for cost, but reading is not just about cost.
You could reduce cost by making a worse product, but it turns out the quality is improving exponentially as well, and just like the previous curve, the exponent has increased, at least in my limited analysis here. The error rate was in the 0.3 to 3 percent range at the beginning of the genome project, and now it's around 10 to the minus 7, as we'll see in just a moment. That's one error in ten million. And the haplotype phase length, which we'll also see in a moment, is how long you can read while knowing that it came from the mother or the father, in the case of a diploid human, let's say. Okay, so exponentially improving cost and quality.

So what have we done for you recently? What's the unpublished sequencing? First I have to introduce one way of getting super high accuracy, and it's relevant. We originally developed this, we meaning Rob Mitra in the 90s, then Jay Shendure in 2005, and then Rade Drmanac and his colleagues at Complete Genomics, with a series of papers that resulted in the Complete Genomics system, which is like sequencing by ligation. But we also helped with sequencing by polymerase synthesis. Here, this is an ordered array. Most of the other methods are unordered; they're Poisson limited, they're kind of random. But this is kind of pretty. It shows that each of these dots was microfabricated, which is coarse, and then a single molecule, albeit a rolling-circle-amplified molecule, completely occupies each dot. So the dot might be 200 to 300 nanometers, but each is completely dominated by one molecule, and hence you get these pure colors: yellow, magenta, blue and green, and you just don't get any bluish greens or yellow magentas. That beats the Poisson distribution, and in combination with another method it gives us this really high accuracy. So you dilute 10 cells' worth of genomes, broken into roughly the 200 kb range, into 384-well plates. And now the probability of
finding mom's and dad's genes in the same well is very low; it's about a tenth of a genome per well. Then you amplify and sequence by this ligation-like method from the previous slide, and you can now get the haplotype phase, which is clinically significant. Let's say the good alleles are G at two positions in a particular gene. So this is one gene, two mutations, and the paternal copy of the gene had two C's as alleles. Or you could have the same two C's, if you lose track of where you are as you're sequencing along because everything's very similar, with one C from mom and one C from dad, which would mean you have zero functional copies. One functional copy versus zero functional copies should matter to you, if you're going to get your genome sequenced, and to your physician.

Oh, by the way, just audience participation: how many people here have had their genome sequenced? Okay. We've got some work to do.

Okay, so that was done with ten cells, not because we were showing off, but because ten cells is the right dilution, so that you get enough copies for high accuracy and you get the separation you need for distinguishing the maternal and paternal copies, and for that matter distinguishing other homologs in the genome. But we can go to single cell, and indeed we can go to subcellular, and we kind of skipped straight there because there's a lot of interesting context in transcriptomes. Now we're switching from DNA to RNA, where a cell is not a homogeneous object. It actually varies down an axon, or from one side to another of a secretory cell, and so we want three-dimensional information in a multicellular context. Typically in cell biology we feel limited by the number of colors. They overlap in a spectral sense, and we're limited to about four colors, practically speaking. But if we can reuse those four colors without moving the sample, so that we can find each pixel again, we essentially get, where n is the number of
cycles in which you reuse the colors (and here we're calling them next-generation sequencing cycles), four to the n colors, and that blows up very quickly into a wonderfully large number of colors. So how do we do this? How do we get 3D imaging with four to the n colors, where n might be, you know, 30 or so, at subcellular resolution, limited only by the resolution of the light microscope, which we've already shown is getting into the 10 nanometer range?

We don't need to drill down on all the gory detail of the method, but basically we take a fixed specimen. We fix with this odd PEG connecting these succinimidyl groups, so these will cross-link any amino groups. There are aminos in the proteins, which are fixed, and then as we make cDNA we add additional amino groups, and they get fixed to the protein. So you essentially get something that's like a porous gel made up of proteins cross-linked to circular cDNA. We circularize with CircLigase and then we very lightly amplify, just enough that it raises the signal to noise but doesn't make the dots big. So now you have a three-dimensional array of dots, and each dot can yield a sequence tag. You have all the advantages of in situ hybridization, but more colors, and you have all the advantages of RNA sequencing, but you haven't lost your place in the cell. The position of each sequence read is meaningful now, rather than random. It's not on a... okay.

So this is postdoc Je Lee and Evan Daugharthy, mainly, who have been pioneering this for many years. It's not published yet, but it's in review.

It's rather crowded in the cell. Depending on the cell type, the cell has a mean distance of about 100 nanometers between polyadenylated RNAs, or any site in the RNA. One solution to this is super-resolution; I showed how we can go into the 10 to 40 nanometer range. Molecular stratification and deconvolution I'll show in the next two slides. The idea here is you do a subset of the
transcripts, and then it will be sparse, and it will be easy for the computer to read the sequence tags. It'll be sparse, but you'll be missing some RNAs. You come back in and do another subset, different from the first, and it will also be sparse. But you can add the two together, and now you've got higher density in the computer even though you never had high density in the microscope. You could subset in a variety of ways. Here you do a subset that ends in A, and here a subset that ends in AC, and here's a subset that has no preference, where you get basically all the priming. You can see that as you go down this, it gets more and more sparse and easier to read, and the correlation coefficient gets better and better.

Deconvolution is something you do anyway. I don't mean to make a big deal about it, but it's something you do computationally to improve the sharpness, and it has had a very big impact here. We have various methods for quantitating the quality of these things. Here's one that's easily explained. We have a database of the transcriptome, or genome, that you're drawing from, and even though there are some antisense RNAs, there tends to be a dominant RNA in every place. And 97% of what we see is on one strand, the correct strand according to the database, so we think there's relatively low noise. We'll come back to FISSEQ in a particular application with the brain in a moment.

Now, what is it about this Personal Genome Project? This is the only open one, it's one of the few integrated ones, and we argue that it's N equals one. There's big data, which you usually think of as big cohorts. Here we have something sort of cohort independent.
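Returning briefly to the in situ sequencing arithmetic described earlier, two of the ideas can be sketched in a few lines of Python: reusing four colors over n cycles gives four to the n distinguishable tags, and several sparse imaging rounds can be added together in the computer. This is only an illustration; the color-to-base assignment, the coordinate format and the function names are all assumptions, not details from the talk.

```python
# Sketch of two ideas from the in situ sequencing discussion.
# The color-to-base table and data layout are hypothetical.
COLOR_TO_BASE = {"yellow": "A", "magenta": "C", "blue": "G", "green": "T"}


def n_barcodes(colors: int, cycles: int) -> int:
    """Number of distinct tags when `colors` are reused over `cycles` cycles."""
    return colors ** cycles


def decode_spot(colors_by_cycle: list) -> str:
    """Turn the color seen at one fixed spot in each cycle into a sequence tag."""
    return "".join(COLOR_TO_BASE[color] for color in colors_by_cycle)


def merge_sparse_rounds(*rounds: dict) -> dict:
    """Combine several sparse rounds, each mapping (x, y, z) -> sequence tag.

    Each round images only a subset of transcripts, so spots are easy to
    resolve; the union is denser than anything seen in a single image.
    """
    merged = {}
    for spots in rounds:
        merged.update(spots)
    return merged


if __name__ == "__main__":
    print(n_barcodes(4, 30))  # four colors, about 30 cycles
    print(decode_spot(["blue", "yellow", "green"]))
    round_a = {(10, 4, 2): "GAT", (55, 9, 1): "CCA"}
    round_b = {(11, 4, 2): "TTG"}
    print(len(merge_sparse_rounds(round_a, round_b)))
```

The key requirement, as the talk notes, is that the sample does not move between cycles, so that the same coordinate can be read again each time.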
We're treating each person as if they are an N equals one, even though we have ambitions to have a big cohort. The idea is not just going from personal genomes to traits with some hugely predictive algorithm. We leverage the million-fold improvement in cost and quality to fill in some of the blanks in between. So we can fill in the environmental components. Some of these either naturally have a sequence tag or can be made to have one. The microbiome is extremely important for this project, and the immune response to the microbiome, in fact our immune response to all sorts of environmental factors, tells us what our body is thinking about the environment. Chemicals leave an impression either on the immunome or on the epigenome; that is to say, our RNA and chromatin reflect certain kinds of chemical insults. We also biobank many different cell types, including B cells, fibroblasts, induced pluripotent stem cells and derivatives thereof. And then traits. Traits is a big open-ended thing, and this is intended to be like Wikipedia: if there's a trait that's missing, you can recommend it and we'll try to get it in place.

This has now spread internationally, in the US, Canada, the UK, and about five other countries are in process. And we have approval for fully open access; it's the world's only open-access source for this. We have an annual meeting on DNA Day, April 25th, and I hope some of you will join us there. Oh, is there anybody here who's in the PGP? If there are people here who are and you haven't had your genomes done, that's a real pity. That's our fault. Okay.

So this is genomes, environments and traits. We require a hundred percent on an exam. This is so that we don't tell them later, "Oh, there's something really bothersome we found in your genome. Do you want to hear about it?"
That's not the time for them to say, "No, we don't want to hear about it," because at that point you've already told them that there's something to hear about. So we try to make sure they know what they're getting into, in terms of privacy as well.

And then we have stem cells, which can differentiate into a whole variety of human tissues, here via the teratoma model, where you get diverse tissues, and here fully in vitro. And some of the traits are quite whole-organism. This is functional magnetic resonance imaging of my brain. These slices are virtual, not actual, as you might have guessed.

This project has now achieved some governmental traction, in the sense that, in an unprecedented collaboration, the National Institute of Standards and Technology and the Food and Drug Administration wanted to have genome standards for the world. They looked around for properly consented individuals and found only one project worldwide, which is ours. So they're starting with eight trios from the Personal Genome Project, meaning mother, father and child. You can learn about it at genomeinabottle.org. Also, the NIH ENCODE project has now taken on the Personal Genome Project. ENCODE is trying to get at all the regulatory elements, or really every base pair in the genome, finding functions for it, and it's been limited by access to properly consented cell lines, and there's more beyond consent.
You'll see some other advantages, technological ones, that we have. Now, I'm not going to spend a lot of time on this, but these are examples of N-equals-one studies, where you either discover a new allele and what you can do with it, or in some cases you rediscover one, or you apply it. These are cases where people were not getting what they wanted out of medical care and they found it in their genome. Probably the most famous one is the Volker child, who was three years old and getting lots of intestinal surgery. They did his genome and found out it was an immune function problem, and so: cord blood, and he's fine five years later.

John Lauerman was one of the PGP early adopters. He was basically healthy, although he had in his medical records these dark spots on his retina, or scotoma, and he had leg pain that was bad enough that he got hospitalized. And, this surprised me, they did a genetic test because of his leg, thinking it might be a clotting disorder. They did a standard genetic test, even though he had no family history, for Factor V Leiden, that clotting factor polymorphism, and he did not have it. Now, you can't conclude from that that he has no genetic risk factors, and indeed when we did his whole genome we found that he had a JAK2 mutation, which was somatic; it wasn't germline. That does explain the scotoma and leg problems, and he's taking aspirin, which is not such a bad lifetime price to pay.

So the point is that all of these are N-equals-one studies, some of them more researchy than clinical, looking for deleterious alleles and what you can do to solve them. In each of these cases the solution is on the far right. But what about the opposite problem, which has been underrepresented: what if you have a rare protective allele instead?
So these are people that managed to live past 110. They're called supercentenarians. The take-home from this is not that if you want to live past 110 you should smoke and drink to excess, but that maybe they have the same kind of environmental problems all of us have, and they have something else, something biological, that is protecting them and allowing them to get past all the infections and the chemicals in their environment. So we and others are in the process of sequencing many of these individuals, and you'll have to stay tuned as to whether we actually find any protective alleles or not.

But there are a few that we can look at as examples, and they illustrate something more important than just this idea of protective versus deleterious: what we can do to move from correlation-based human genetics research to causality-based research, because in many cases the N equals one is quite clear. That is to say, I'm arguing that all of us are N equals one, and to do research properly we need to treat it as such. But there are some cases where it's very clear. In the whole literature there's only one child who is documented as having a myostatin double null, meaning both maternal and paternal, and there's one child who has a myostatin receptor double null, if I've correctly interpreted the literature. So it's hard to get 10,000 of these children to do a big study. But you can switch to causality, and it actually becomes more convincing than many GWAS studies, because here are three different animal models. You don't have to do three, but in this case there were. This is the double-muscled cow, on the right is the myostatin dog, which is, you know, to scale, and this is the intentionally made mouse, on the right as well.

So that's myostatin double nulls, and that's the top of this list. Here's a list I've collected of rare protective alleles in addition to myostatin.
There's LRP5, which is not a double null; there are a few alleles that result, in a heterozygous state, in extra strong bones, so strong that they will damage surgical drill bits and saws. And these people are very heavy in water, dense in water. PCSK9 has kind of stimulated the pharmaceutical industry, because these people have incredibly low LDL cholesterol levels. In fact, when I first heard about it I said, "Can they be alive?" They're not only alive, they have very low coronary artery disease. CCR5 and FUT2 are viral receptors. People with the double null state are apparently fairly normal health-wise, but they are abnormal in that they are resistant to HIV and norovirus. Now, APP could be on another slide about deleterious alleles. The reason it's on this slide is that deCODE Genetics found that, although there are many alleles of APP which cause you to be at higher risk for Alzheimer's, there are some that put you at much lower risk, maybe delaying it by 10 or 15 years relative to controls. So I want some of that.

Okay, now how do we get some of that?
How do us regular folks get the good alleles, the protective alleles that other people have? Timothy Ray Brown was one of the first. It wasn't exactly gene therapy, but he had leukemia and AIDS, and went from being one of the unluckiest people in the world to one of the luckiest, because his donor for fixing the leukemia was not only matched for HLA but was also a CCR5 double null, and CCR5 is the co-receptor for the AIDS virus. This, along with other biological facts, in part inspired a clinical trial by this company, Sangamo, which I played a very tiny role in many, many years ago but am not currently involved in. They made zinc finger fusions with a bacterial endonuclease, FokI. Only when these two sets of four fingers each are near one another will they bind well enough that the nuclease will cleave and make a double-strand break, and then make a mess of the CCR5 gene, both copies of CCR5. This is now in phase two clinical trials and looks quite promising in terms of toxicity and efficacy, although it does work better on heterozygotes than homozygotes, and most people in the world are not heterozygous for this. But anyway, I suspect it will get approved by the FDA fairly soon.

But there might be something better. This is effective and low-toxicity for cleaving targeted genes, where you design the zinc fingers to bind to whatever gene you want, wherever you want. These are very challenging to do experimentally.
They're very expensive to do. And there is a better way, we think now, although it's still early days. This is CRISPR for genome engineering, and not just in humans but in many other organisms. We've participated, since almost the very dawn of each of these methods, in seven different ways of doing recombination, maybe more. Two of them involve RNA, the two at the top: group II introns, which was part of my thesis, and CRISPR, which is more recent. Then there are five that are protein based, and one of them is actually DNA plus protein based: the DNA finds a homology region, while in most of these other ones it's just the protein. I'll quickly summarize these two in a couple of slides. MAGE, multiplex automated genome engineering, uses lambda Red and single-stranded oligos. It's currently limited, in its best form, to E. coli, and we have a somewhat similar method in yeast. And then CRISPR.

But let's talk about what you can do with MAGE rather than exactly how it works. What we've done is apply it genome-wide to the E. coli genome. Each of these little blue dots, or bars, is a place in the genome where we've engineered a codon. We've changed every single instance of the stop codon UAG into the stop codon UAA, so 321 changes genome-wide. We did this in little blocks and then conjugated them together. The important thing was that this was a case of truly genome-wide engineering where we changed a function.
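The UAG-to-UAA recoding described above is, at the sequence level, a simple substitution at the end of each gene. The sketch below works purely on strings and says nothing about how MAGE makes the edits in the cell; the input format (a list of in-frame coding sequences written as DNA, so TAG and TAA) and the function name are assumptions for illustration.

```python
# String-level sketch of recoding terminal TAG (UAG) stop codons to TAA (UAA).
# Input format and names are hypothetical; this is not the MAGE protocol.
def recode_amber_stops(coding_sequences: list) -> tuple:
    """Replace a terminal TAG stop codon with TAA in each coding sequence.

    Only the final in-frame codon is touched, so an out-of-frame 'TAG'
    inside a gene is left alone. Returns the recoded sequences and the
    number of genes that were changed.
    """
    recoded = []
    n_changed = 0
    for cds in coding_sequences:
        if len(cds) % 3 == 0 and cds.endswith("TAG"):
            cds = cds[:-3] + "TAA"
            n_changed += 1
        recoded.append(cds)
    return recoded, n_changed


if __name__ == "__main__":
    genes = [
        "ATGAAATAG",     # ends in TAG: gets recoded
        "ATGCTAGCATAA",  # out-of-frame TAG inside, already ends in TAA
        "ATGCCCTGA",     # ends in TGA: untouched
    ]
    print(recode_amber_stops(genes))
```

A plain find-and-replace of every "TAG" substring would be wrong, because most occurrences are not in frame as stop codons; restricting the change to the terminal codon is the point of the check.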
We didn't just make a copy of a genome. We engineered it, and it wasn't just gene engineering; it was really genome-wide. There were three things that we were aiming at and three things that we found. First, metabolic isolation: we can make the organism dependent upon a non-standard amino acid. That brings up topic number two: we can put in non-standard amino acids now at very high efficiency, without any crosstalk with other machinery in the cell. And finally, we can get genetic isolation, so that functional DNA can't go in and out of the cell. Our ultimate goal is to get multi-virus resistance. In this case it was T7 virus, but the idea is that if you change the genetic code enough, it'll be resistant to all viruses a priori, all natural viruses. This was done by Farren Isaacs, who is now a professor at Yale, or assistant professor at Yale, and Marc Lajoie, who is a graduate student. We have three Science papers on this; one of them has come out and two more come out this month. Oddly enough, I mean, who would think that you'd get three Science papers about E. coli codons? But there you go, some things happen.

But CRISPR is not MAGE. It can use MAGE as its input, but it is a phenomenon that really traces back to E. coli in 1987; at least, this is the earliest one I can find. It was basically neglected as a technology. It was actually considered junk DNA for a while, and it was finally discovered that it was a kind of bacterial adaptive immunity. And then just in January it turned into a technology, where we and others showed that it could work in humans. Since then there have been a few articles. This is actually just a tiny subset of the literature now.
It's a few months later: humans, mouse, zebrafish, Arabidopsis, Drosophila, C. elegans, yeast and many others. So it seems to work in other people's hands, in contrast to many of our other technologies. Not that they don't work; they're just hard, and they require companies to get them to work. This doesn't require a company, except maybe to make some short oligos. What happens is these short oligos go into a guide RNA vector. There's a little bit of constant RNA and then the variable part, here in green, and rather than the protein recognizing the DNA, the protein helps the RNA recognize the DNA. There's a little bit, two bases, that is recognized by the protein. And you make a triplex, where you have two DNA strands and an RNA strand, so basically you really only have to program about 15 to 20 base pairs in here.

This was published in Science in January. Prashant Mali is a postdoc, and Luhan Yang just got her PhD a couple of weeks ago. This demonstration was doing homologous recombination, but you can also use it for what I showed before with CCR5; we can make nulls. For this homologous recombination we made two targets right next to each other, showing how close together you can get these guide RNAs. You can do one or the other or both; if you do both, then you typically get a deletion with the correct endpoints. And we compared it to another recent technology that we and others worked on, called TALENs. I'm not spelling out most of these acronyms because they don't really make much sense to me anyway. But anyway, it went zinc fingers, TALENs and CRISPR, and we compare them in the next slide.

So here is that case of correction with two different guide RNAs near each other, overlapping a TALEN site, so they're as close as they can be to one another and to being at the same point in chromatin, with all the chromatin effects and so forth. And this was picked to be a site that's not particularly
TALEN-insensitive. There are sites where we really cannot design a TALEN to work, but this was a fairly typical site, and there is a 9- to 22-fold improvement going from TALENs to the guide RNAs. You can see there's a fairly small range here, about a factor of 2 in efficiency, between the two different guide RNAs that are near one another.

What else can we do with guide RNAs? We can make deletions. This is two guide RNAs, so two of these double-strand breaks, where the left end is held fixed and we move the right guide RNA along. You can see it's fairly flat in terms of efficiency, in the 10% range for these particular cell types; we can get up to 40 or 50 percent. The biggest deletion we've attempted is 83 kb, and it's about as efficient as a 1 kb deletion. So this works over pretty long distances.

This is the work of postdoc Susan Byrne, who also did this experiment. Almost all of what we do is in human cells; we're trying to develop gene therapies and so forth. In this particular case we took human Thy1, which is an interesting immunological locus, and we made either one or two double-strand breaks on either side of it in order to replace it with mouse Thy1. These are really convenient because there are great immunological reagents for FACSing. So here you see FACS sorting with just the left guide RNA, just the right guide RNA, with both, and with none. What we're looking for is this box here, which is about 0.1 percent in the control and 0.5 percent with just one end; if you get both ends, it shoots up to 16 percent. So you can replace little bits of genes, large bits of genes, and make deletions up to 83 kb, maybe more.

And we're not limited to double-strand breaks. We can make nicks, and we can make variations on this where we've knocked out both nicking activities, so there's no cleavage whatsoever.
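As an aside on how little has to be programmed for a guide RNA: the talk mentions roughly 15 to 20 bases of variable sequence plus two bases recognized by the protein. The sketch below scans a DNA string for candidate target sites on both strands. It assumes the two protein-recognized bases are the GG of an NGG motif immediately following a 20-base target, which is the commonly described arrangement for the Cas9 in wide use but is not spelled out in the talk; the function names are hypothetical.

```python
import re

# Sketch: list candidate guide RNA target sites in a DNA string.
# Assumption (not from the talk): a 20-base target followed by N-G-G.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_guide_targets(seq: str, guide_len: int = 20) -> list:
    """Return (strand, start, target) for every site followed by NGG.

    `start` is the 0-based offset on the strand being scanned; for the
    '-' strand that means an offset into the reverse complement.
    """
    seq = seq.upper()
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        # A lookahead is used so that overlapping sites are all reported.
        pattern = rf"(?=([ACGT]{{{guide_len}}})[ACGT]GG)"
        for match in re.finditer(pattern, strand_seq):
            hits.append((strand, match.start(), match.group(1)))
    return hits


if __name__ == "__main__":
    example = "ACGTACGTACGTACGTACGTTGGCCATT"
    for hit in find_guide_targets(example):
        print(hit)
```

A real design tool would go on to filter these candidates for similar sequences elsewhere in the genome; that step is not attempted here.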
So it's completely nuclease-minus, and then we can add on domains, either onto the protein or onto the RNA by coupling it to RNA-binding domains, so you can put things that modify chromatin, or transcriptional activators or repressors, on the protein or the RNA. We've done both; some of this is described in Nature Methods and some in Nature Biotech.
Now, specificity. This in principle should come up with the meganucleases, with MAGE, with zinc fingers, with TALENs, but it comes up a lot with CRISPR for some reason, and that's fine. This is what we've done. In our first paper we focused on computational and empirical searches. The empirical one is the very small one that I showed you, but computationally we generated guide RNAs for the entire human genome, focusing on sites that were unlikely a priori to be off-target. So we took the whole genome and filtered out the ones that might have a hit elsewhere, and that allowed us to get 90% of the genes and about 40% of the exons within each of those genes. That's one; there are two ways there, computational and empirical. We can also use nicks rather than double-strand breaks. Two nicks are better than one double-strand break because each is independent, and we've done that. That's item three.
And then for activation, so this is not cleavage but activation, we can use not just two but multiple sites. Here's an example of that, where we rigged it in such a way that each individual activating CRISPR-Cas9 with an activation domain gives a fairly low fold enhancement of expression. But as you get up to five and ten sites, you get up to a 30-fold increase over baseline in the transcription of this endogenous gene. This was published recently in Nature Biotech. And these are the positions of those sites; we engineered the guide RNAs to match natural endogenous sites scattered up to
2.6 kb upstream of the transcription start site. Okay, and we have other things in the pipeline for improving specificity, but right now it's already adequate for almost everything that we have in mind, except possibly human gene therapy, which we're working on as well.
Now this is John Aach. He co-runs my lab with me, and we've had wonderful decades together. He's the one who designed these 90,000 guide RNAs that we published in January; as I said, they cover 90 percent of the human genes and 40 percent of the exons of those genes. We're also targeting not only exons of these messenger RNAs, but also activation upstream of messenger RNAs, long non-coding RNAs, and microRNAs. These are the numbers that we've targeted, and we're using those in an interesting project on transdifferentiation, which is in the next slide. These are just cell types around the edge; you don't need to read them. We wonder whether any cell type can be turned into any other cell type, directly or indirectly, and if so, how many steps does it take?
How many factors does it take? This is a kind of general question that one could ask, and it could be fairly hypothesis-independent. I talk about this a little bit in this systems biology opinion piece. As an example of that, here's reprogramming, not too different from what other groups have done in terms of turning iPS cells into neurons, including some groups here in Madison. In this case you can see this clump of pluripotent stem cells; they love to sit on top of each other. Then with one of four different genes that we found (this one shown is Neurogenin 1, but it holds for three others) we can get them to differentiate into a specific neuronal type, which appears by RNA-seq analysis and morphology to be very specific. They migrate out from this mass and make this kind of loose network of bipolar cells, we call them, where they have exactly two processes coming out of them, almost population-wide. This is a 98% conversion, maybe more, in merely four days. If these are what we think they are in the brain, this would be on the order of 70 or 80 days during normal human development. So you could call this an artifact or you could call it a technology.
This is kind of old data, but we have electrophysiological activity and synapse formation in these neurons. The point of all this is not this particular neural pathway, but that we're trying to do this systematically, using either the CRISPR method or other methods of overproducing or of making nulls, and nulls are something that's hard to do with RNAi. We did RNA sequencing on these, and I'm not going to go through all the details, but the point is we can characterize these cells. The problem is that there is brain tissue in the databases, but there are not a lot of examples of single-cell transcriptomes, where you have a pure cell type or a single cell type.
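The "how many steps, how many factors" question raised above can be framed as a path search over a graph of cell types. The sketch below is only a toy framing, not the lab's method. The graph holds just two conversions: pluripotent stem cell to neuron with one factor, as described above, and fibroblast to pluripotent stem cell with four factors, which is the commonly cited number and is an assumption here. All names are hypothetical.

```python
from collections import deque

# Toy conversion graph: source -> {target: number of factors}.
# Only illustrative entries; a real map would be filled in experimentally.
CONVERSIONS = {
    "fibroblast": {"iPSC": 4},
    "iPSC": {"neuron": 1},
}

def fewest_steps(graph, start, goal):
    """Breadth-first search for the path with the fewest conversion steps."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        cell, path = queue.popleft()
        if cell == goal:
            return path
        for nxt in graph.get(cell, {}):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None  # no known route, directly or indirectly

def total_factors(graph, path):
    """Sum the factors used along each step of a path."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))

if __name__ == "__main__":
    route = fewest_steps(CONVERSIONS, "fibroblast", "neuron")
    print(route, total_factors(CONVERSIONS, route))
```

Breadth-first search minimizes the number of steps; minimizing the total number of factors instead would call for a weighted search such as Dijkstra's algorithm.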
So we're hoping the in situ sequencing will help us find out exactly what kinds of neurons these correspond to, if any. It could be that these are not natural neurons, but they're interesting in that this is what you can push pluripotent stem cells to become, and so they're telling us something.
Now, we've worked with Don Ingber's group and William Pu and Kit Parker on organs on chips, where all these different organs are represented by model systems in which you can have multiple cell types in a mechanically significant environment. For example, here's the lung, which has air, epithelium, support layer, endothelium, and blood, and this behaves much more physiologically than any previous culture condition, which was rather like having cells on just flat plastic. With Kit Parker, we've done this contractile assay for cardiac tissue, and we can reproduce individual human mutations. We can test causality, as I mentioned earlier, not just in animal models but in this organ-on-a-chip model.
Now I'm just going to end on a quick introduction to the BRAIN Initiative, on innovative neurotechnologies. This was something that my colleagues listed here and I somehow managed to get onto the political radar, not that any of us are particularly good at politics, but this was announced by Obama on April 2nd. I love the acronym that he and his staff came up with, which was "innovative neurotechnologies," because it's meant to convey something other than a juggernaut, unstoppable megaproject, but really to try to bring down costs and improve quality and maybe bring new modalities. So what might those modalities be? We've had a series of papers on this, some of which predated that announcement, in Neuron, Science, ACS Nano, and a neuroscience journal, illustrating the various points. But the one I want to focus on, just to wrap up here, is this idea of integration, or a Rosetta brain, we call it. The idea is you'd like to have a background
information on a behaving animal, where you monitor perception and behavior, and then use four different kinds of fluorescent in situ sequencing (remember fluorescent in situ sequencing from the beginning of the talk). We are developing a way of turning activity, meaning brain activity, various biological activities like calcium or other signaling molecules, into a DNA ticker tape, which we can read with fluorescent in situ sequencing. We've got a little publication on this, but it's by no means reduced to practice. There is the connectome, which we and Tony Zador are collaborating on, where you can find the connectivity of every neuron to every other neuron through the synapses that they make, a thousand synapses per neuron in some cases. There is a developmental lineage, where we can use CRISPR to make barcodes in real time during development. And then finally the transcriptome, which I illustrated already you can do with fluorescent in situ sequencing.
The idea is not just to read the genome, but to model it and then write, meaning that you change the developmental biology, you change the neuronal activity, in order to test your ideas. The earlier you do that, the sooner you know how far off you are in your knowledge. So this is the big wake-up call that synthetic biology is. I'm not going to go through all the bytes that we get from fluorescent in situ sequencing; it's not, in my opinion, computationally daunting, but other people will almost certainly disagree. These are the different flavors of the Rosetta: the lineage, the connectome, the brain spike potentials or other activity that might be slower, and so on.
Here's an example. The Allen Brain Institute did serial sections through mouse and human brains, looking for transcripts, and they did them one transcript at a time.
So you got kind of sparse sampling of each brain, and you needed lots of brains, thousands of brains, to cover all the transcripts. But it's still a terrific resource to have out there. We'd like to repeat this, getting the transcriptome from one brain. It's easier to align the different images; different brains have different connectivity, and so comparing different brains is problematic. So we'd like to do it all in one, and that's the idea behind the Rosetta brain, where we not only get all the transcripts in one brain, but we get all these other modalities as well. I'm going to conclude on that particular fantasy.
These are people I want to thank on the Brain Activity Map, now renamed BRAIN, for innovative neurotechnologies, and they're on our papers. Sorry I couldn't spend more time on it. And these are people to thank for the Personal Genome Project, and other people on the technologies that I mentioned all the way along. I'll just leave this up while we have a conversation with questions. I think there's still plenty of time left, but anybody's free to go.
I guess so. Thank you.
It seems like TALENs are more methylation-sensitive than CRISPRs. Partly, they recognize DNA from the major groove, which is where the 5-methyl-C is. There may be some new amino acid or something that will help with the 5-methyl-C. But there are other sites in human that don't have any CG dinucleotides where, essentially, my colleagues at Harvard, Chad Cowan and others, have found you just get zero recombinants, or NHEJ, out of hundreds of tries, and we don't know why that is. They did ten of these, and they did CRISPR side by side; in every case CRISPR gave like 30 to 80% efficiency for things where we were getting 0% with the TALEN. So it's a little early to declare it, but I think that's the way it looks.
Well, I hope neurons aren't special. But I'm certainly not naive enough to think that everything's going to be easy, and I even said these may be somewhat artifactual in some states, and we won't know until we really get lots of single-cell transcriptome data. We have reason to believe that it takes more than one factor to, say, go from fibroblasts to induced pluripotent stem cells, right? And I think to some extent many of us felt that, well, it's going to take four factors to do everything. But the good news is there are some cases where it's four and some where it's one. What we hope is that by finding all the cases where it is one, we can have that as a starting point, and then we can do two factors at a time, sort of semi-systematically, maybe using the ones as starting points to seed it, and then three and then four and five, and go from there. It's basically a sort of genomic, hypothesis-independent way of finding out what the developmental landscape is, starting from a cell that clearly wants to go in a lot of different directions, based on the teratoma assay.
A teratoma really goes all over the place.
Yeah, I didn't mention that, but any question is fair, so go ahead. What are the limits currently on de-extinction? Yeah, so the limits. Okay, so de-extinction is a big topic; we have a little talk on that, but it's a non-field so far, although there are a lot of people working on it, talking about it, holding whole meetings on it. I think the first set of things is bringing back things that are in the freezer. The second set is bringing out certain traits which will enable species to survive in the modern world, in other words, addressing the issues that made them extinct. So, for example, turning elephants into mammoths so they can survive in the tundra, which might be a better place for both them and the tundra, and I could give a whole rationale for why that actually could be good for global temperatures. But it may not take changing the entire genome to achieve those ecological goals, and you could get back quite a few traits with a fairly small number of genetic mutations. Our hope is to explore that space initially. As the technology keeps on its exponential curve, by the time we actually want to change every base pair, for some compulsive reason, the technology will be much cheaper.
I think that's a project that could have big environmental positives, but we need to be cautious, and we should wait for, or work to bring, the cost down.
Yeah. Yes. Yeah, right. So fortunately, this is one case where I was obsessing at the beginning about getting poly-A-adjacent tags, as was the case in ESTs and SAGE early on, and fortunately the people working on this project completely ignored that and just went for random priming. That has an advantage: if you have something blocking the poly-A site junction, you're doomed, potentially, while if you do random priming you get multiple places. You might miss an exon in a particular molecule, but unless that exon is systematically covered with a protein, you shouldn't miss it in the cell as a whole. That said, if you really want to get all these things, we should do more than we're currently doing to get rid of the proteins and to get rid of the secondary structures. I didn't show the plot, but we have a plot where we saturate a certain RNA. We looked at the saturation of the RNA, and you can see every place we get a read is where the secondary structure is lower. It's pretty well covered; the whole transcript is covered, but you can see the spikes. So anyway, it's like RNA-seq, basically randomly primed RNA-seq. In principle you can get splice forms, and you can get allelic variation, so if you have a heterozygous position you can get allele-specific transcription, which is a cool thing. But it's still a baby technology, so try to keep expectations low.
Yes? You'd like to see what's in your genome? Well, would you like to do it with your own hands, or would you like somebody to help you?
Well, so there are different ways. If you wanted to do it quickly and not with your own hands, Illumina offers their clinical service, which is CLIA-approved. Then if you want to participate in the Personal Genome Project, it's getting better, mainly because the international groups are far more efficient than we are, meaning the Boston group, but that's mainly because we're doing so many things. They're kind of proof-of-concept things rather than pushing people through. There is a line for the Boston part, but you may be qualified for one of the other countries, or better yet one of the other ones that's starting up in the United States. We have ways we're working on that will get through the backlog, and there are rumors that there may be ways of getting a million done. This is how rumors spread, okay.
Other questions? Yes. Coverage. So I'll repeat the question: for the Personal Genome Project, are we getting good coverage of the sequence relative to other public genomes? There are very few public genomes other than ours, at any level. Even the so-called Thousand Genomes Project does not have a thousand high-quality genomes.
It's a handful of high-quality genomes. We have 200 high-quality genomes that are covered at 40x or better. Well, so, for public or private genomes, there's no human genome that's been finished. There's about two to five percent of the genome that's not known, and that section is not something we should dismiss. My colleague Steve McCarroll in my department recently showed that in some of those regions that are not sequenced are many, many human genes that have been making it very hard to interpret their paralogs, which are in the parts of the genome that have been sequenced. They look like SNPs, but in fact there's just a completely different gene or a related gene that's hiding in the centromeres, for example. So one of our ambitions is to sequence through all the centromeres and other gaps. We haven't done that yet, but the Personal Genome Project genomes are among the highest-quality genomes, period, as far as I know. Thank you.