Okay, thanks, Aviv. In the next 15 minutes or so, I'd like to reinforce a few themes that I think I've heard emerging throughout the course of this workshop. The first is the importance of doing genetic perturbations, or perturbations of gene function, in order to functionally annotate the genome. The second is the need to develop detailed and scalable phenotypic assays so that we can assess the consequences of perturbing functional elements in the genome. The third is the importance of continued technology development so that we can get better at doing these types of experiments and doing them at large scale. And the fourth is the necessity to address the genetic redundancy inherent in eukaryotic genomes in order to do any of these things. I'll start by telling you about a project that we've been doing in a model system, budding yeast, that uses systematic combinatorial genetic perturbations to build genetic networks. And that, I hope, will set the stage for a discussion of how these types of projects might be implemented in mammalian cells and why that might be important. All right, so of course what we've been talking about over the course of this workshop is what we've done in the context of ENCODE projects so far, which is to generate data to predict functional elements in the human genome and selected model organism genomes. What I think we need to do, and many people at this meeting have been reinforcing this, is generate unbiased genome-scale functional data that will enable useful predictive modeling of biological processes. This means we need to develop methods to look at regulatory events across the genome and phenotypic assays, as we've been talking about, on time scales that are relevant to molecular mechanisms. And again, all of this, I think, is necessary if we're going to reach the goal of predicting quantitative traits, including disease states. So how do we do this?
We continue to support efforts to engage both model organism and human geneticists in joint efforts to systematically functionally annotate the genome. And I think this idea of cross-fertilization between the model organism and human genetics communities, I can't remember if it was John who was just saying that, that link is there, but it's not strong enough. And I think we need to continue to reinforce that link. So I'll start by reinforcing the point, using the work that's been done in yeast, that there's a serious problem with genetic redundancy when you functionally explore the genome by genetic perturbation. And we've known this for some time in yeast, since the yeast community has spent a lot of time creating reagents for doing systematic genetics, including the yeast deletion array, which contains about 5,000 strains with complete loss-of-function alleles of all non-essential genes, as well as arrays of strains with conditional alleles of essential genes. And so we have these reagents, and we've been looking at them using a variety of readouts, including cell fitness assays. And it's very clear, as many have pointed out here in the context of functional elements, that most single genetic perturbations are of little phenotypic consequence. That means even complete gene deletions do not affect cell growth very much, and in fact it's hard to find a phenotype unless you do very detailed assays. So this means the eukaryotic cell is highly genetically buffered from the consequences of single genetic perturbations. And so this has motivated us and many others to develop systematic methods for the study of genetic interactions, and also to explore phenotypes that we can assay on a large scale. So we've been doing a project to systematically explore genetic interactions in budding yeast, and of course we need to know what we're looking for.
And we're simply looking for phenotypes that are not predicted by the combination of the individual mutant phenotypes. And so in looking for genetic interactions, you can think of measuring any phenotype of interest: expression of a gene, cell growth rate, cell morphology. So our big question is, can we use genetic interactions to systematically define gene function? The answer there is clearly yes. Can we define biological pathways, connections between bioprocesses, general principles of genetic interactions? I think discovering general principles of gene regulation has come up at this meeting, and so has missing heritability. Can we use maps of genetic interactions to explain the missing heritability that's common in the genetics of complex disease? So the phenotypic readout that we've been using mainly is cell growth rate. We look at colony size as a proxy for cell fitness in our assays, we can do this in a very rapid way using robotic replica pinning, and we've developed methods for automated creation of double mutant arrays. So all this is very fast, and we can quickly generate many double mutant arrays, measure colony size, and then look at genetic interactions using that phenotype. And so the type of genetic interaction that we're looking at here is typically called synthetic lethality or a synthetic growth defect, and the idea is that we should be able to define biological pathways because genes that function in the same biological pathway should share patterns of genetic interactions. And this is very important: if we can make genetic interaction profiles, and that's a phenotype, for all of the genes or genetic perturbations, we should be able to infer these biological pathways and predict gene function. So the goal is to assay all yeast double mutants for these interactions, so that's 36 million double mutant combos.
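As a rough illustration of what "not predicted by the combination of the individual mutant phenotypes" can mean for a fitness readout, here is a minimal Python sketch. It assumes a multiplicative expectation, where the predicted double-mutant fitness is the product of the two single-mutant fitnesses; the talk does not say which null model is used, and the gene names, fitness values, and threshold below are all hypothetical.

```python
def interaction_score(f_a, f_b, f_ab):
    """Observed double-mutant fitness minus the multiplicative expectation.

    Assumption: fitness is relative to wild type (wild type = 1.0), and the
    expected double-mutant fitness is f_a * f_b when there is no interaction.
    """
    return f_ab - f_a * f_b


def classify(score, threshold=0.08):
    """Label a score; the threshold is an arbitrary illustrative cutoff."""
    if score <= -threshold:
        return "negative"  # e.g. synthetic sick or synthetic lethal
    if score >= threshold:
        return "positive"
    return "none"


# Hypothetical colony-size-derived fitness values.
single = {"geneA": 0.9, "geneB": 0.8, "geneC": 1.0}
double = {
    ("geneA", "geneB"): 0.30,
    ("geneA", "geneC"): 0.91,
    ("geneB", "geneC"): 0.80,
}

scores = {
    pair: interaction_score(single[pair[0]], single[pair[1]], f_ab)
    for pair, f_ab in double.items()
}
labels = {pair: classify(score) for pair, score in scores.items()}

for pair in double:
    print(pair, round(scores[pair], 3), labels[pair])
```

In this toy data only the geneA/geneB pair grows much worse than the product of the single mutants would predict, so it is the only pair labeled as a negative interaction.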
We consider this project done, at least in standard growth conditions. Because of technical limitations, we've assayed 16 million double mutants, and we're looking at half a million or so double mutant interactions. This explores most of the non-essential gene space and a lot of the essential gene space in yeast. And what we do to visualize these data is take advantage of that principle I just told you, that genes that function together in the same biological process should share patterns of genetic interactions. So what we want to look at is not individual genetic interactions, but genetic interaction profiles as a phenotype. And the way we display our data was developed by Anastasia Baryshnikova when she was a student in our lab, now at Princeton, and Chad Myers at the University of Minnesota. The way we visualize our data is to take each of the genes for which we have a genetic interaction profile and link it to another gene through an edge that reflects the degree to which they share a genetic interaction profile. So the more genetic interactions you share in common, the closer together you are on the network. And the idea is then we should be able to visualize these clusters of functionally related genes in a biologically meaningful way. So here's a picture of the yeast genetic interaction network on the non-essential genes only. And the only point I'm making at this stage... there's a buzzing here, can you hear it? Oh, good. I thought maybe it was me. The only point is that you can see clear clusters of genes with correlated genetic interaction profiles on this network. You can make the same network just considering essential genes. And the only point here is that this is a much denser network, because essential genes are hubs on the genetic interaction network. So they share many more genetic interactions with other essential genes than do non-essential genes.
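The idea of linking genes by how similar their interaction profiles are can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a Pearson correlation between profiles and a fixed cutoff to decide whether to draw an edge, neither of which is specified in the talk, and the three profiles are made-up numbers.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def profile_network(profiles, cutoff=0.5):
    """Return {(gene, gene): correlation} for pairs at or above the cutoff.

    Assumption: the cutoff is an arbitrary illustrative value.
    """
    genes = sorted(profiles)
    edges = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(profiles[g], profiles[h])
            if r >= cutoff:
                edges[(g, h)] = r
    return edges


# Hypothetical interaction scores of three query genes against five array genes.
profiles = {
    "gene1": [-0.4, -0.3, 0.0, 0.0, 0.1],
    "gene2": [-0.5, -0.2, 0.0, 0.1, 0.0],
    "gene3": [0.0, 0.1, -0.4, -0.3, 0.0],
}

edges = profile_network(profiles)
for pair, r in edges.items():
    print(pair, round(r, 2))
```

Here gene1 and gene2 interact with the same array genes, so they end up connected, while gene3 has a different profile and stays unlinked.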
And then you can combine these in a combined network where we get about 4,000 of the yeast genes on the network, simply by looking at correlated genetic interaction profiles using a single phenotypic readout. So is this really biologically informative? Anastasia decided to computationally annotate function in this network by taking node A in the network, for example, and finding all the other nodes, B, that can be connected to node A through a defined number of steps. So in other words, we're defining A's neighborhood, B. And then we look at functional enrichment for various GO terms in that neighborhood. And then we apply that to the network I just showed you. So here's an example of that. Here's the combined genetic interaction network I just showed you. And here's Anastasia's computational annotation of this region of the network. It's highly enriched in GO terms that indicate roles for these genes in vesicle function. So it's a very nice, tight, functional definition of genes in this neighborhood as having roles in vesicle function. And so you can continue to do this for the entire network, doing functional annotation at different levels. What I'm doing here is showing you a network view. You can also display these types of data as a dendrogram, because we have measurements for the degree of interaction between each pair of genes. And then you can look at the dendrogram at various levels. And what's cool about this is you can get different functional information at every level. So for example, at level one here we're looking at the whole network. It's simply the network of correlated genetic interaction profiles. If we look at level two in this hierarchy, we start to define basically cellular compartments. So the nucleolus, the cytoplasm, the vacuolar membrane, endosome, et cetera. So you can start to see that level of functional organization in the cell.
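The neighborhood annotation described above can be sketched as follows. This is a minimal Python illustration, not the published method: it assumes a breadth-first search to collect nodes within a fixed number of steps and a one-sided hypergeometric test for enrichment, and the toy network and the "vesicle" annotation set are hypothetical.

```python
from collections import deque
from math import comb


def neighborhood(adj, start, max_steps):
    """Nodes reachable from start in at most max_steps edges (start excluded)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_steps:
            continue  # do not expand beyond the step limit
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return set(dist) - {start}


def enrichment_pvalue(k, n, K, N):
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


# Hypothetical undirected network, stored as an adjacency list.
adj = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B", "E"],
    "E": ["D", "F"], "F": ["E", "G"], "G": ["F", "H"], "H": ["G"],
}
vesicle_genes = {"A", "B", "C", "D"}  # hypothetical GO-term annotation

hood = neighborhood(adj, "A", max_steps=2)
overlap = len(hood & vesicle_genes)
p = enrichment_pvalue(overlap, len(hood), len(vesicle_genes), len(adj))
print(sorted(hood), overlap, round(p, 4))
```

In practice one would repeat this for every node and every term and correct for multiple testing; the sketch only shows the single-node, single-term case.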
Look at a deeper level in the hierarchy and you start to define biological processes. As you can see here, we've got vesicle trafficking and sorting, ribosome biogenesis, mRNA processing, transcription, and chromatin. So really at this level you can define about 20 or so functional neighborhoods of biological processes, again, using this very simple readout. And then if you delve further into that, we can get much more information. Again, just looking at correlated genetic interaction profiles, you won't be able to see this, but we're just looking at a part of that big network. And each of these circles shows you basically a protein complex. So we're getting down to the level of protein complexes and pathways. For example, the MRX complex, polyhazardial organization processes, RNAs, ACEs, the single pathway, et cetera. So by getting deep into this functional hierarchy, we can start to really define gene function. And why this is interesting from the point of view of any kind of functional annotation of the genome is that, because this type of network is very rich in functional information, we should be able to use the functions of genes that are known to infer the functions of unknown genes in the same biological neighborhood. And so we started that kind of exercise with these data, among other things. And this is just a picture of that neighborhood I'm showing you. And what we've done here is place all 37, that's all there is, of the completely uncharacterized yeast essential genes on this network. And they all fall into these functional groups on the network. Oh, sorry. I wander around when I talk. Okay, it can't be helped. So we've placed them all in these functional neighborhoods here. And so we can make predictions, for example, that this particular gene has a role in mRNA processing.
But if we want to make a more precise prediction, we can delve deeper into the network and see that this gene shares genetic interaction profiles in common with a couple of protein complexes involved in messenger RNA cleavage and polyadenylation. So we would predict it has roles in both of those processes. So you can call up your buddies in the yeast community, in this case Claire Moore, who does an assay and shows that indeed this particular essential gene has a clear role in both of those predicted processes. And again, it's kind of remarkable. This is a cell fitness readout, correlated genetic interaction profiles, and we can make these precise molecular predictions. And importantly, we're renaming all these genes now from their ORF names, like YJR141W. This one's called IPA1, for important for polyadenylation. But I should point out that this really reflects our aim, largely Charlie Boone's aim, to name all of the genes after beer of some sort. So far we've managed to do that. Yeah. In any case, in the context of these genetic interaction profiles, I'm going to reflect on some of the questions that were posed to us when we were asked to speak at this workshop. What are the gaps that we need to fill, potentially through ENCODE projects? And again, just getting back to the major point, we do need now to move on to functionally annotate the human genome. Mainly we've been talking about regulatory elements. I would argue that you need to combine that with perturbation of coding elements so that you can make the comparative analysis. And this requires, of course, considerable investment in unbiased genome-scale data collection in a variety of experimental systems.
This will be useful from an overall philosophical point of view, because we know that these types of baseline studies, studies where we're trying to develop a reference map or a reference data set, are very useful, not just for systems biologists, but for any biologist who's studying a specific problem, because we can put genes in their global cellular context. So I think we need to apply what we've learned about systematic annotation of the genome from yeast to human cells. I think this does require continued support of model system projects. This was brought up before. And this is especially important because we can generate lots of good data to help develop computational methods and models so that we can infer gene function. It requires investment in establishing community-wide resources. The yeast community has done this very well. We've developed all kinds of arrays of strains that allow us to systematically perturb gene function, and as people have been saying, things like genome-scale mutant collections or CRISPR-based collections. And we need tools for the rapid and quantitative analysis of cell states and phenotypes. Here I would argue a good thing to think about is automated analysis of cell images as a biological readout. And this can be developed in a model system and then transported to any cell for which you can do a comparable assay. And no surprise, I'll argue that we need to consider genetic interactions on a large scale in mammalian cells. So here are a couple of things I'm thinking about for the phenotypic readout of genetic perturbation we might use. And again, this could be regulatory element perturbation, but I'm not gonna make a strong argument that that's the only thing we should do. And so we need continued technology development for genome engineering so that it will get easier.
We need to map all the essential genes, discover all of the essential genes in our selected systems, and measure the growth rates of single genetic perturbations. And this should be done, as Mike I think was saying, across a variety of models that we select. You need this kind of accurate measurement to look at genetic interactions or to make any kind of reference map. We use this information to select specific perturbations we'd like to look at in a combinatorial fashion. So what query genes would we like to take through a genetic interaction screen? And we can also use information from yeast work and other model organism work to make those choices. So you don't have to do the all-by-all. That's not really feasible. You can make sensible choices about what query genes you'd like to look at. And so I think this would produce a functionally rich reference map for interpreting the consequences of perturbing regulatory elements. And it really will be necessary for functional annotation. And so getting back to this phenotypic readout thing and the cell biological readouts, I'm thinking of big-data cell biology type projects. It might be useful to think about developing a toolkit of reporters of sorts. So say fluorescent markers of all subcellular compartments, which has been done in yeast, markers of cell cycle position, reporters of TF activity. And then we could use this toolkit to look at the consequences of genetic perturbation using CRISPR and other methods, but using single-cell quantitative cell biological readouts. I think this could be very powerful because these assays are readily adapted across a variety of systems. But this means we need a serious effort to improve our ability to analyze information in cell images. And again, this demands a continued cross-fertilization of biology and computer science. But really what we need is a plug-and-play type of approach where scientists can interpret their images using the same automated imaging system.
This could be a community project if we wanted to talk about that. So the SGA projects in the lab are led by Michael Costanzo, a senior research associate in our group. As I mentioned, Anastasia Baryshnikova, who's now at Princeton University, still works with us on analysis of our data. And Chad Myers is the lead computational person on these projects. And all this is done in collaboration with Charlie Boone. That's it. Questions?
Vinda, Vinda.
So, I mean, Brenda, this is really beautiful work. I've heard you speak many times, but I'm just wondering, at its base, this depended on somebody creating a set of deletion strains.
That is correct.
I mean, without it, you know, and so after all this experience in yeast, what do you think is the possibility of doing it? I mean, yeah, the mammalian genome is 20,000, but you did close to, you know, it was done in 6,000, so at least maybe half the genome.
Right.
I mean, there could be a careful selection. So how feasible is this today, given when the yeast stuff was done?
Yeah, all-by-all is not feasible today in mammalian cells, and I don't even think it's necessary. What we know now, from all the information we've got in yeast, because we've, excuse me, done all the experiments, is that you can actually get a very high fraction of that information by screening a smaller number of array mutants, right? So we don't have to screen the whole array anymore to get almost all of the information. So that really is a scale thing that makes it feasible, and we could apply that same approach to mammalian cells.
So what is that small number? That's what I'm trying to get at.
This is a lot of the essential, well, 1,000, 1,000 mutants in our case. We can get about 80, can't remember, 88% of the information we get by screening the entire genome. So you'd have to apply that kind of computational approach to decide what array you might screen in mammalian cells, and then you can use full-genome CRISPR libraries.
You can build those query lines, for example, that you choose and then do whole-genome CRISPR knockdown experiments.
Knockdown experiments. The reason why I'm bringing this up is, you know, Daniel MacArthur and others, Chris Talasman, you know, there's enough exomes now, there's enough curation, that finding a set of 1,000, 2,000, whatever...
I think you could do it.
Top genes from just exome sequencing in humans can give one a very good idea of tolerance of...
And we know a lot about this. So what you wanna know, and there is information on this, is what is the fitness consequence of the single genetic perturbation? And we know that if something has a fitness defect when you knock it out on its own, it will give you a lot more genetic interactions than something that doesn't. So you would pick those, stuff like that, right?
I would point out that in mammalian cells, often you would not go with an arrayed strategy, you would go with a pooled strategy, as was pointed out yesterday, which means that you wouldn't necessarily be building a resource of cell lines that you maintain. You would be creating it anew.
I think you would build a resource of query lines, but the other resource would be a sort of CRISPR library with barcodes, yeah.
And then we have Dana, then Frank, Olga, and I think someone else was raising their hand and I missed it.
So I've always been enamored by interactions, and that's why I love that yeast work so much, but in human I'm concerned. There's a couple of obvious levels of complexity, the larger number of genes and what tissue to look at them in, but there's also the question that many more genes or gene families have replicated themselves and have more copies. Are double mutations enough for human, even?
No, clearly not, but I think in the first pass, in your list of queries, you would eliminate big gene families. You're not gonna get information that way.
So I think in the first pass, to start building a reference map, you would have to just not deal with that. And even in yeast, a lot of the duplicated gene pairs don't show strong profiles of double mutant interactions. You have to look at triple mutants.
We'll do Frank and then we have to come to the...
So you mentioned a subset of genes would be enough to fill out pretty much the entire pattern. Could you take the homologs of those in human, if they exist, and use that as your basis?
Yeah, for sure, and in fact, essential genes are much more highly conserved, so you could definitely take the information we have in yeast about what gives us large numbers of genetic interactions and apply that to making your list.
We move on. Thank you very much, Brenda.