so far, especially if people have ideas on how these ideas integrate with each other. We've heard something about the history of ENCODE and what it has done, recommendations from the NHGRI sequencing workshop, and recommendations from ENCODE PIs. We've also heard about related projects, if people are interested in discussing more where we might house data and have interoperability across projects; that's one thing we could do, or there are other topics people may want to discuss. Nancy? Thanks. So, picking up on the level-two comment that was made about getting to the gene targets of some of the elements in ENCODE: there could be really good avenues for collaboration between ENCODE and projects like GTEx at a variety of analytic levels. Beyond just the eQTL approach, we could move into different kinds of analytic strategies, say involving prediction, where you could look at how different weightings on different kinds of ENCODE annotations alter your ability to predict expression of genes; that might give you insights into the targets of some of the elements. So I definitely see the ability to layer these. Of course, I know less about FUNVAR and some of the other things, but it's easy to see that there are opportunities cutting across all of those that involve that particular level-two question: how do you link the units being characterized to gene functions? Yeah, I think this is a common piece of feedback we hear at NHGRI: people would love to be able to say, this is the gene I care about, what are the regulatory elements for it? Or, I think this is a regulatory element, what is the gene or genes it relates to? One of the issues that we also hear as holding back that science is: are we there yet, where we can reliably say this is the best way to do it, and this is the confidence of those predictions? So do people have thoughts on how important it is to have only absolutely correct predictions, versus predictions that are testable but could turn out to be incorrect, or different sets of predictions? Where would we want the slider set to give a useful trade-off? I think they should be testable and actionable, not correct. You can't tell whether predictions are correct until you've actually tested them, so correctness is the wrong bar to set. It should be a draft, it should be provisional, but it should be testable. And that is a big challenge: that's where you have to decide whether it's a model-organism thing or a cell-line thing, or whether the technology needs to be pushed further. Also, in that context, you need to define testable for what. I heard today, and also I think at the previous meeting, that there are two very distinct views of what level of vetting an element needs. One is, I would say, transcriptionally mechanistic: looking at the effect on events like transcription, or what is actually regulating what at the molecular level. The other is a much more physiological definition of phenotype. Both are phenotypes; they're just at very different levels of resolution. And I think it would be very useful to understand where the community stands on that spectrum of functionality. Over here, and then you.
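To make the annotation-weighting idea raised above concrete: a minimal sketch of scoring how much each class of ENCODE annotation contributes to predicting gene expression. Everything here is a synthetic stand-in; the feature matrix, the gene count, and the drop-one-class loop are illustrative assumptions, not ENCODE's actual pipeline.

```python
# Hypothetical sketch: X[i, j] is the signal of annotation class j (e.g. DNase,
# H3K27ac, TF binding) aggregated over candidate elements near gene i, and y
# is measured expression per gene. All values are synthetic.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_genes, n_annotations = 500, 10

X = rng.random((n_genes, n_annotations))
y = X @ rng.random(n_annotations) + 0.1 * rng.standard_normal(n_genes)  # toy expression

model = Ridge(alpha=1.0)
# Cross-validated accuracy: how well the annotations together explain expression.
r2_full = cross_val_score(model, X, y, cv=5, scoring="r2").mean()

# Re-weighting experiment: drop one annotation class at a time and watch how
# prediction degrades; the classes whose removal costs the most accuracy are
# the ones carrying the most element-to-gene linkage information.
for j in range(n_annotations):
    X_drop = np.delete(X, j, axis=1)
    r2_drop = cross_val_score(model, X_drop, y, cv=5, scoring="r2").mean()
    print(f"annotation class {j}: delta r2 = {r2_full - r2_drop:+.3f}")
```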
I absolutely agree. For one thing, it would be a real disservice if we decided to go with correct predictions only, because we don't actually know what interpretation algorithms or other data will come along that might turn something we would toss out now into our most exciting next correct prediction. The phenotype is a moving target, and with the data all of these other projects are generating, something we cannot interpret based on the cell line ENCODE looked at may become interpretable once we see data from different tissues, from GTEx, or eGTEx I suppose. And that makes it even more critical that when we make these data and predictions available, we are completely clear, especially for end-user biologists. Right now I was speaking with my computationalist hat on, where I just want all the data and will figure out what to do with it, and I have a feeling every other computational biologist feels the same. That said, for end-user biologists who don't want to dig through all this data to figure out to what extent each prediction is validated, we need that very clearly labeled. The ENCODE DCC has actually been doing a good job moving toward that, really making the website usable and easily parsable. It's not fully there yet for the level of complexity we're talking about for ENCODE 4, or whatever this is going to be called. But I do think we need to be very careful, and it's not just a DCC question; it's also a question of evaluation. I know I harp on this at every advisory board meeting, or whatever we're called. But for ENCODE 4, or whatever the next project is called, we really need to think about how we evaluate these things: how do we evaluate methods and data so that we have some label we can give a biologist or a computational person for how much we trust things? So, Aravinda and Dan. I was going to comment on "correct." Correct, of course, is a probability measure in many of these cases; that's the first point. The second point is: correct for which cell type? We all know that different enhancers link to different promoters depending on the cell type, and that any given enhancer can also link to multiple promoters. So at some level we have to understand how the different complexes of transcription factors present in a given cell type choose the enhancer that links to one particular promoter rather than another. And I don't know whether that, which is mostly protein-protein and protein-DNA interactions beyond ChIP assays, is within the mission of NHGRI. That's all I wanted to say. You're thinking maybe of something like the large-scale Stark lab STARR-seq analysis, finding that some promoters pair with some enhancers and not others, by promoter class? Well, even more than that. Suppose you have a catalog of the enhancers that, in ten different cell types, link to a particular promoter. You then have to cross it with the transcription factor catalogs available for those cell types: if we've done ChIP-seq and can say transcription factor A binds in cell types one, two, and three, but not in the others, what might be cooperating? What other transcription factors are binding to the other enhancers that make that enhancer specific for that promoter in that particular cell type? That, I think, is protein-protein interactions as well as protein-DNA interactions. So, for what that's worth.
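A minimal sketch of the cross-referencing just described: toy catalogs of which enhancer links to a promoter in each cell type, and which transcription factors bind it there. All names and sets here are hypothetical; real inputs would be chromatin-interaction links and ChIP-seq peak calls.

```python
# Hypothetical toy data: which enhancer links to promoter P in each cell type,
# and which TFs bind that enhancer in that cell type.
from collections import defaultdict

links = {"cellA": "enh1", "cellB": "enh1", "cellC": "enh2", "cellD": "enh2"}

tf_binding = {
    ("cellA", "enh1"): {"TF1", "TF2"},
    ("cellB", "enh1"): {"TF1", "TF3"},
    ("cellC", "enh2"): {"TF4", "TF2"},
    ("cellD", "enh2"): {"TF4", "TF5"},
}

# Group the TF sets by enhancer choice, then intersect across the cell types
# that use that enhancer: a TF present wherever the link is active is a
# candidate cooperating / specificity factor.
by_enhancer = defaultdict(list)
for cell, enh in links.items():
    by_enhancer[enh].append(tf_binding[(cell, enh)])

for enh, tf_sets in by_enhancer.items():
    shared = set.intersection(*tf_sets)
    print(f"{enh}: TFs shared across all cell types using it -> {shared}")
# enh1 -> {'TF1'}; enh2 -> {'TF4'}: each a candidate for making that
# enhancer-promoter link specific to those cell types.
```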
So I think I'm going to try to wrap up and pinpoint a few things that were said. It's important to go beyond correct versus tested versus not correct: a very important component of any prediction is a confidence score. If you have a good confidence score associated with these predictions, and that score's accuracy has been tested on a large number of cases, then you can get a better feel even for the predictions that aren't tested. And, as John said, context is very important. Some contexts are easier to test than others; are we going to limit ourselves only to the contexts that are easy to test? So if you have a confidence score associated with these predictions, plus a large body of tested predictions so that you gain confidence in the confidence score itself, that is what you really need: a number, a probability for each prediction, that is reliable. So let me make sure I got what you said: you're interested in a confidence score not of how accurate we think the prediction is, but of how often the predictions turn out to be accurate? No, I'm interested in having a computational method give a confidence score, and the computational methods we'll believe are the ones where the method's confidence score matches the empirical confidence from testing. If we have a method that can automatically generate a confidence score from the data, we can test it, not on every single prediction, because that's unlimited, but on a large number of predictions, to gain confidence that the score accurately reflects what we see in empirical validation. Because today we have some predictions that are based on statistical correlations, so they have p-values as part of the prediction, but we don't have testing on them to say how much confidence we should place in them, which is the part you're addressing. Yeah, the score just needs to go beyond the correlation. Right. I think Aviv is next. I think this is something that Rick and Mark alluded to, and that is the issue of interactions, and I would say more generally nonlinearities. A lot of the ways the existing programs are trying to assess these systems go at them one at a time: one nucleotide change at a time, or one genetic perturbation at a time in an engineered setting, and so on. We all know that these systems are actually nonlinear, so you need to go after combinations, and there are big technical challenges with doing that, both experimental and computational. I think this is something that, again, we should be discussing as a community. So to be clear, are we talking about gene-gene interactions, or element-element interactions, or...? Anything from, if you're thinking very mechanistically, cooperativity in transcription factor binding and its effect on expression levels, which is nonlinear, to, if you're thinking about genetic variants and modifiers, a nonlinear interaction that can manifest through who knows what molecular mechanism. Or...?
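The calibration criterion in the exchange above, trust the methods whose confidence scores match the empirical validation rate, comes down to a binned comparison. A minimal sketch, with synthetic scores and validation outcomes standing in for real tested predictions:

```python
# Hypothetical arrays: `scores` is the method's per-prediction confidence,
# `validated` is 1 if an experimentally tested prediction held up, 0 otherwise.
import numpy as np

rng = np.random.default_rng(1)
scores = rng.random(1000)                             # claimed confidence in [0, 1]
validated = (rng.random(1000) < scores).astype(int)   # toy: a perfectly calibrated method

bins = np.linspace(0.0, 1.0, 11)
for lo, hi in zip(bins[:-1], bins[1:]):
    mask = (scores >= lo) & (scores < hi)
    if mask.sum() == 0:
        continue
    # For a calibrated method, the empirical validation rate in each bin
    # tracks the mean claimed confidence in that bin.
    print(f"confidence {lo:.1f}-{hi:.1f}: "
          f"claimed {scores[mask].mean():.2f}, "
          f"validated {validated[mask].mean():.2f} (n={mask.sum()})")
```

A prediction set claiming 0.9 confidence but validating only half the time would show up immediately as a gap between the two columns.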
Hi, I don't know if I'm too close. So I just wanted to say two things: one in response to what you just said, and a second thing expanding on that. First of all, I very much agree that we should be thinking about interactions and connections between things. The second thing I want to point out, though, is that when we start thinking about that, it changes to some degree the way we tend to visualize the genome. One of the powerful things about the annotation, about the way we think about genomes, is that we've had this very simple linear system for looking at things: genomic coordinates and so forth. If we start thinking more and more in terms of connections, it's going to require simply a new way of visualizing things, a new way of interacting with them. And enlarging on that a little more, I want to say one small thing. I think you made an excellent point, which I hadn't realized before you made it, but it really crystallized in my mind: traditionally with ENCODE, we always think of the DCC as being where the data goes, and also as the portal. But I actually think it's really useful to split those two concepts. And I think it's useful for NHGRI in general to have a really good genome portal, or maybe a number of portals. It's just really important that NHGRI have this way for the whole world to view the genome. And that is almost separate from the depositing of ENCODE data and the ENCODE project itself. I just wanted to put that out there; I think it is important. I think John, and then Mike is next. I just wanted to second that, back to the point you were making before, that there are many different types of users. There are users who are pretty unsophisticated, and if this is all about precision medicine and that sort of thing, you'll have a lot of clinicians. In many ways ENCODE is a great gateway drug, to abuse the metaphor we've been using: getting people in slowly and having them climb that learning curve. But many people won't have the sophistication, ability, interest, or time to get into the detail. So I think that is key. If I might quickly add to what John said, on this issue of who's a sophisticated user and who's not: what I often see is that people are sophisticated on different aspects of what ENCODE data is and how to use it. Some people think they're unsophisticated because they don't understand the biology but know how to do the computation, and others think they're unsophisticated because they don't know how to compute on ENCODE data but understand the biology. What I'm seeing is sophisticated people who don't get every aspect of it. So I think Mike is next and then you.
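The shift described above, from linear-coordinate tracks to a connection-centric view, can be illustrated with a small graph structure. The element and gene names below are hypothetical, and the edge attributes are assumptions about what a real link record might carry.

```python
# Elements and genes as nodes, regulatory links as edges; the same element may
# touch several genes, and a gene may be reached from several elements.
import networkx as nx

g = nx.Graph()
g.add_edge("enh_chr1_1500k", "GENE_A", cell_type="liver", confidence=0.8)
g.add_edge("enh_chr1_1500k", "GENE_B", cell_type="K562", confidence=0.4)
g.add_edge("enh_chr1_1720k", "GENE_A", cell_type="liver", confidence=0.9)

# Queries that are awkward on a linear browser track become one-liners:
print(sorted(g.neighbors("GENE_A")))          # all elements linked to a gene
print(sorted(g.neighbors("enh_chr1_1500k")))  # all genes an element touches
```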
Yeah, just two quick comments. One, on this particular point: I do like the idea of a common portal, coupled at least with a mechanism for capturing all the stray data. I know this relates a little to a project we have, which is trying to get all this other data in and commonly processed, because there is a lot of data out there that could be brought into this whole picture. Another topic, and this might shift things in a slightly different direction, but I think it's worth exploring more, is the relationship between GTEx and ENCODE and how to leverage it better, because if you look at what's happening in ENCODE and what's happening in GTEx, in some respects they are merging a little. That is to say, in GTEx there's a pilot to look at all the ENCODE types of assays on many tissues, and in ENCODE there's a pilot to look at lots of different tissues. So it does seem like this could be leveraged at a much higher level than it is now. I know it's been touched on as part of these discussions, I think as part three and part one of the future vision, if I understand it right, but it could be fleshed out in a much more meaningful fashion, to get the maximum out of both projects in a way that could go well beyond either alone. If you think big-sky about this, you get a lot of information and bring in a lot of different angles that touch on many of the points that have come up through the discussion so far. I can elaborate further, but let me turn it over to someone else. Well, maybe Nancy should go, because she was clearly responding; I want to move to a different place. Well, I just want to add to that the opportunities in this systems-genomics space for the intersection between ENCODE and GTEx, because part of the way you can think about prediction is using genome variation. Thinking about the genetically determined part of gene expression, or of protein levels: part of the way we might provide feedback on the annotation of these elements to genes and proteins is in the quality of the genome-predicted level versus the measured level that we're getting out of GTEx. This sort of prediction performance is a key thing for geneticists, right? We're looking for the genome variation that leads to disease through these very endophenotypes that are being measured. So there is an opportunity, I think, to pull this into a more systems-genomics perspective and give some feedback on these elements and maybe how they get annotated further. So I absolutely agree with Mike. I see your comment as following up on what Dan was saying earlier: that one needs predictions, but then also testing of them, to learn how accurate those predictions are, and to see how to improve them or how much to trust them, depending on the time and money you have. It goes a little further than that. There's a distinction: when we measure protein levels, that's affected by many things, including the genetic part of the transcriptome and lots of non-genetic factors, so it's not a perfect readout. But when you can see that including ENCODE annotations in developing the genetic predictors of protein levels, for example, improves the correspondence between your predicted level and your measured level, you've captured some additional valuable information. That's all I'm saying in terms of using this to get to that level two that was raised: how do we move from the elements to the units they control? Ewan? So I also wanted to take the questions to a slightly different place. The discussion seems not to have touched on the completing-the-catalogue component of this, and I'm curious why. I think there are three possible reasons. One, it's so obvious that it's boring and not worth discussing. The other reason is that people believe it's going to be done soon enough that it's not worth worrying about. And the third is that it's too hard to do, so we're not going to do it. I guess there's a fourth one.
So I'm just curious which of those it is, and I'm up for a show of hands on whether people find completing the catalogue boring, not worth doing, or a given in the next two or three years. Okay. Would it be useful to have a straw poll? Too boring, too obvious, and too hard. Can we vote for more than one? And we're largely going to be discussing that as topic one. But Ewan, hold on a minute. It doesn't matter if Mark thinks it's boring, because he's not going to be doing the experiments. Or does it? Maybe it's the analysis too. Again, I'm just curious why we're not discussing it. Ah, well, that's a good question; that's part of the discussion. That's the thing. For the record: too boring, too obvious, or too hard to say? I think the "too obvious" option, boring but obvious, means it's a no-brainer; is that what you were saying, Ewan? Yeah, what do you mean by too obvious? Yes, what I meant by too obvious is that it is going to be pretty much done, someone will do it, in the next three years. It's a given that... I thought that was number two. That is number two. What is number one? Sorry, sorry, it's the wrong time zone for my brain. I just said that. Okay, so let me go back to just two options. Option one: it's too boring to discuss because we know we're going to do it. Option two: it's too hard, slash, expensive, so it's not worth doing. For option one, who believes that? Okay, "we" meaning the consortium. Okay, I'm just trying to make this work for my brain, which is at 1 a.m. And I think this could be a useful exercise. Who said it wasn't a useful exercise? Let's vote on it: are we in favor of a catalogue? Okay, here come the four options from Aviv. Aviv, nice one, let's go. There was option number one, which was too boring in the sense that we won't learn that much from it. Option two was too obvious. Let me just go through them, because I've been saying the same thing about five times now. Too boring was option one. Too obvious was option two, and by that Ewan meant somebody else would do it, so why us? Option three was that it's too hard, slash, expensive as is. And the fourth got spurred on right here, because you were sitting right next to him. I think Ewan's points have been washed through some transformation here; these are not the points I heard him making. Yeah, we could discuss this over dinner and come back to it. Also, are we talking about human or mouse? Some experiments are more easily done in some systems than in others. So how about this, since I think it's a good idea to advance the discussion of why this hasn't come up: how about we regroup on this either during our working dinner or after dinner, and we'll clarify what the choices are that we're going to consider. So I think after Ewan we had a question. Just a comment, as a phenotype-driven geneticist. I guess my question is: is this just another database that I go into and look at to try to develop a mechanism, or is this going to change the way I look at phenotype and search the sequence databases? Because this is very powerful: I think it can lead to interactions among different regions of the genome that are more accurate than what we're currently getting with crosses, the Collaborative Cross, and so on.
So how does this integrate for people who are just simple, old-fashioned geneticists? Might I ask you to clarify what kinds of things you would hope it could do? Well, ultimately you want to get to a mechanism. Ten or fifteen years ago, if you got a sequence difference that you could causally relate to the phenotype through a knock-in or knock-out, you were ecstatic. Now things are much more complicated. So if it's a modification in a certain cell type, is this going to change the way I look for modifications? Did you have a follow-up on this before we go to Aravinda, or is that okay? I just want to point out that at least Ross and John and several others, and people outside of ENCODE, though these are the ones I know of, have taken genetic data and layered a whole bunch of ENCODE data on top of it. And actually, even though there are only a few cell types for some of the assays, there are a lot of cell types for others, and the combination of all of those got you into categories of disease where a whole lot of things made sense. That's for complex disease; it's been even better for rare disease, because of the stronger, Mendelian-like effects. So I think the idea is there, and if we go further, it'll get even better with sub-cell-types and things like that. Can I just make one point? Reflecting on that previous workshop in the summer: it's not exactly clear what function is, what genome function is, but it seemed very clear at the summer workshop that one of the things people want is exactly what you said. They want to be able to go from a statistical association to some form of mechanism; that's what they mean by function. That became very clear to me, and I think it is important. I agree very much with you, whatever that means; function is not exactly defined, but I think that problem is very well defined. If I can just stop us for a second: how many of you know that ENCODE offers some statistical predictions of linkage between regulatory elements and genes? So, maybe one, two, three, four, five, six, seven, eight, nine, ten. How many of you find them useful, or think they're problematic? Find them useful, some? And how many of you think they're problematic and not the kind of thing we're looking for? Do they have confidence scores associated with them? Confidence scores in the first-level sense, of how much confidence there is in the prediction as it's made, not in the sense that the predictions were then experimentally tested to see whether, if a prediction's confidence was 0.9, it turns out right half the time, or 90% of the time, or whatever. So I, at least, found them tremendously useful. I think what's lacking goes back to Aviv's point about the distinction between proximate molecular function and physiological function. Validation at the level of proximate function would enhance the value of the resource much more than validation at the physiological level. At least that's my view of it right now: you can get a lot more done, and you don't have to worry about false negatives and all kinds of other things. So Aravinda, and then Brenda, who has been waiting patiently for a while, and Ross too. So hopefully I'm not going to muddle the issues; it's late in the day and we have no alcohol.
So I think both Nancy and Ewan raised some very important questions, but I'm just going to answer this in the following way. Perhaps all of us who also do biology tend to think of biology as being infinite in almost every dimension we look at. But I tend to look at the world as somebody who is quite ignorant, in the sense that we have a good estimate of the total number of genes in the human genome and the mouse genome, and we still don't know very well what they do. And I think the same goes for all these other elements, regulatory and otherwise: exactly what they do, how they do it, and whether this universe is infinite. I suspect it is, but I have no idea. So Ewan's question is relevant. If we keep doing it, it's like drawing any function: you add more points. And I think we need to add more points to know: is it just combinatorial reuse of the same programs again and again, which could still give a very large number, or are they completely distinct programs, which is what we've been led to believe so far? That experiment needs to be done. The part that would be useful to have, and I think Nancy was speaking about the specific prediction from genotypes to gene expression, is what having the elements does to expression. I suspect there are some systems in which we can describe that exceedingly well, and many others, maybe even in the same cell type, with other regulatory networks, where we can't. So finding out where we sit, that is, how many of the elements we have to know, and in how much detail, to at least get the expression right, would be a first step for this phenotype-driven genetics. Let me elaborate a little more on that, but also get back to the point about how, quote unquote, traditional geneticists might take advantage of these types of data, which is happening already but will happen more as we move into looking at different phenotypes. I think the next phase of this project that we've been hearing about should involve incorporating a lot of other readouts and phenotypes into how we examine genetic perturbations and functional elements in the genome. And once we generate that kind of data, more phenotypic data, then for any traditional geneticist who's looking at a set of genes or a disease gene, putting that in the context of the global pathway and the global network will be very valuable, and I think it will change the way people think about a particular pathway or gene. Well, yes, so I want to start with that last thing from Ewan, which kind of broke down, and pull it back to something you said earlier, which I got confused about, but I think this will be helpful. I personally, and I'll make this point in my talk later, think the number of dimensions we would need to fill in the full matrices is too much to ever really achieve. However, we can get a lot further along than we are now, and our measure of how far along we are is exactly the end-to-end test we were just mentioning: we have to see how good our predictions are. Now, talk about being obvious: I thought it would be obvious that we need to pull in data. I mean we, the larger community, but this is where NHGRI could again be a catalyst and should play a strong role in pulling in data from multiple consortia.
And I would love to see it uniformly processed, but at least get the data in, even some processed data that's not from the reads, so that people can work with it: computationally savvy people at a very detailed level, but we also need output that helps phenotype-oriented geneticists and biochemists. But I heard Ewan say this was impossible, that pulling all the... and I got confused, because I don't know what a data broker is. It seems to me this is a resource issue. It's not going to be cheap, but I don't see why data from these multiple consortia can't be combined in a way that will work. Please tell me what I'm confused about. It's about setting the boundaries on this. As those consortia connect up, and they connect to other things, before you know it you end up with all of human biology, all of mammalian biology, trying to be present in your portal, and that doesn't scale particularly well. So you've got to decide: this kind of organization is a good thing, but you've got to be a good citizen and play in an ecosystem with many other players. You have to make sure that your data, or the organization of your data, isn't an island: you're brokering back into a bigger system so other people can use it, and you're allowing other people to build alternative portals on your data. As Mark pointed out, once you've separated the broker functionality from the portal functionality, this becomes much clearer. Portals are focused on user groups; brokers are focused on data streams. Those two things don't have to be the same; they're really quite separate. Okay, I need to learn more about that, but maybe you could tell me: do we need an uber-structure for this to work? Or can it go like Wikipedia or something? Or maybe Wikipedia has an uber-structure? Well, the uber-structure in Europe is the ELIXIR project, and arguably in the US the analogous project is BD2K. They have flaws, neither process is perfect, but they are the shows in town for coordinating this. Rather than standing up and saying, well, we'll coordinate this and somehow everybody should follow us, which doesn't work, everybody has to come together inside a community that does that coordination, and those two are the community-forming processes for this big-data connectivity. Now I understand better. And certainly ENCODE shouldn't claim to be the center of the universe. You'd get in trouble with that. Well, it is the center of the universe for certain people, quite rightly; it's just not the center of the universe for all people. I'd just quickly point out that the IHEC data portal, in a remarkably short time, brought together a lot of the processed data from a lot of the projects. The way it was done is not ideal yet, but it was done in a very short time, and it's very helpful and a big advance. And it works in two ways. First, a lot of data are there. But second, you can go to that site without necessarily knowing that all of these projects exist, and because data from more than one project is there, you might find things you didn't know to look for, which is a powerful thing. I'd also point out that the ENCODE DCC, in addition to hosting ENCODE data, is going to host the GGR, Genomics of Gene Regulation, data. And we're also bringing in the Roadmap Epigenomics (REMC) metadata and processed data, and we'll point to the raw data. So that's collecting some of that data in one source.
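A minimal sketch of the broker/portal separation described above, with hypothetical classes: the broker side only registers and serves uniformly described dataset records, while any number of portals can be layered on top for different user groups.

```python
# Hypothetical sketch; class and field names are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Dataset:
    accession: str
    assay: str
    biosample: str
    url: str

class Broker:
    """Data-stream side: register and serve dataset records, nothing more."""
    def __init__(self):
        self._datasets: list[Dataset] = []

    def register(self, ds: Dataset) -> None:
        self._datasets.append(ds)

    def query(self, **filters) -> list[Dataset]:
        return [d for d in self._datasets
                if all(getattr(d, k) == v for k, v in filters.items())]

class ClinicianPortal:
    """User-group side: one of many possible views over the same broker."""
    def __init__(self, broker: Broker):
        self.broker = broker

    def summarize(self, biosample: str) -> str:
        hits = self.broker.query(biosample=biosample)
        return f"{biosample}: {len(hits)} datasets ({', '.join(d.assay for d in hits)})"

broker = Broker()
broker.register(Dataset("ENC001", "RNA-seq", "liver", "https://example.org/ENC001"))
broker.register(Dataset("ENC002", "DNase-seq", "liver", "https://example.org/ENC002"))
print(ClinicianPortal(broker).summarize("liver"))
```

The design point is that a second portal for, say, computational users could be added without touching the broker at all.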
I think we have Matthew, Tully, and then Dana. Yeah, so I'd like to follow up on this debate we've had about finding the connections between enhancers and target genes. That's totally valid, and I definitely buy it; there are initiatives to support it. But one of the main unique strengths of the ENCODE data is the sheer number of transcription factors, hundreds of them, for which we have binding profiles. These allow us to assess the upstream regulators of genetic variants. So for instance, if you have a variant, you figure out the exact... So we've done that for breast cancer, for instance. We've asked, across all the breast cancer genetic variants, which key transcription factors' binding sites those variants typically map to. And what we find is the estrogen receptor: the number one targeted protein, the number one targeted agent, in breast cancer. So we can learn a lot by going upstream, as opposed to always going downstream. I find it very interesting that we're going back to finding the gene, when ENCODE was actually built around understanding the non-coding DNA. So I think that's a strength we should not dismiss. Finding the downstream target, yes; but increasing the number of transcription factors for which we have data, in a larger population of cell types, is also a very valid way to move forward, and very unique to ENCODE. Who is next? So I wanted to ask, or comment, on the balance between annotating functional elements in the genome and annotating the functional effects of genetic variants. This is something Mark already mentioned: I think most of the people in this room are interested in what genetic variants do. But ENCODE data and other similar datasets are hugely useful to, and I think very much used by, people doing case-versus-control or treatment-versus-no-treatment types of studies, comparing different environmental conditions and how genome function changes with them. And it's not completely clear to me how much resource is put into the variant side versus the genome-function annotation side. Two more quick ones; Dana? I wanted to come back to two points that Aravinda raised, actually. The first is that he said, and I think this is what I got, that even just getting expression right would be very useful. I think that's the key point, unfortunately. Yes. And if this is indeed the case, I was actually amazed, I didn't know this, at the relative paucity of expression data collected as part of ENCODE compared to other things. If I look at the RNA-seq data and so on, it looks like so little, compared to the ease of actually collecting expression data and the fact that it is very functional data that can be interpreted in many ways. So I want to put in a little plug for RNA. The other thing is that Aravinda was saying that everything seems very idiosyncratic, an ever-expanding universe; but at least looking at transcriptional programs, they're incredibly modular and reused. The space of possibilities is so much bigger than what we actually see in profiles, even as we get deeper and deeper into them; that's why imputation also works well on expression data. So I think the problem is probably more bounded than we think in terms of transcriptional states, if not in terms of every idiosyncratic molecular interaction.
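A minimal sketch of the upstream enrichment approach described a few turns back for the breast cancer example: for each transcription factor, compare how often disease-associated versus background variants fall in its binding sites. The counts are toy numbers standing in for real ChIP-seq overlap tallies.

```python
# Hypothetical counts: for each TF, (disease variants in its peaks,
# background variants in its peaks), against total variant counts.
from scipy.stats import fisher_exact

n_disease, n_background = 200, 10000

overlap = {"ESR1": (40, 300), "CTCF": (25, 1200), "FOXA1": (30, 400)}

for tf, (d_in, b_in) in overlap.items():
    # 2x2 table: [in peaks vs not] x [disease vs background variants]
    table = [[d_in, n_disease - d_in], [b_in, n_background - b_in]]
    odds, p = fisher_exact(table, alternative="greater")
    print(f"{tf}: odds ratio {odds:.1f}, p = {p:.2e}")
# A TF whose sites are disproportionately hit by disease variants (here the
# toy ESR1 row) is a candidate upstream regulator of the trait.
```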
Just for clarity, there is an RNA-seq experiment, I suspect, for every cell line or tissue, so it's not as if there's a void sitting there. And obviously there are a lot more experiments besides RNA-seq on each of these samples; that's why the numbers look different. I can't imagine there's any cell line or tissue that doesn't have an associated RNA-seq experiment, usually multiple. Some of the primary cells and tissues don't have RNA-seq, but RNA-seq within ENCODE has covered a relatively large number of the biosamples; I think RNA-seq has covered more biosamples than DNase. So it's covered a large space, not necessarily every sample, but certainly all of the deeply sampled ones. I'm surprised there are any gaps. I guess the point is that a single RNA-seq profile is not as informative as, say, a perturbation to see what those elements might be doing, or single-cell data to see how those elements might be driving variability at the single-cell level. Because RNA is so easy, the value of what you can glean from one profile, even if your only goal is to interpret the role of a DNA element, is much smaller than what you can get if you play a little more with RNA-seq: multiple profiles per tissue, which is pretty cheap and easy to do these days. So, it is six o'clock, and we've reached our working dinner time. Those of you who ordered meals, they're outside and you can go pick them up; those of you who brought meals, they're wherever you left them, along with whatever snacks you have. We are the federal government, so I have no food to hand out. And I could use help from a small number of you, you know who you are, to set up the poll; that can be one of the things we do over the working dinner. I, for one, would be quite curious to see whose thinking has changed through our discussion of unbiased mapping, who learned what from it, and who just kept the same opinion. It's an NIH-review kind of thing; it's not ready yet, but that will be fixed. So please go ahead and get your food, and thank you very much.