Okay, morning everybody. We were part of a group asked to think about function and integrating function with sequence variants, and we went a little bit beyond that because we think that not all function is genetically inherited. It was this group here. We had a spirited interchange for more than two hours yesterday that was very helpful. I don't think there was a lot of disagreement. There were certainly some new ideas that came up, but one of the key themes is that this is a really important part of what genomics, and obviously NHGRI, should be doing. The time is actually right for a variety of reasons. One is that there are lots of variants, and the other is that there are lots of ways of studying function being developed that are much, much higher throughput and way better than what we've had in the past. We actually hope that even better new ideas will come up, in addition to the ones available now, to really tackle this problem. But clearly there needs to be some way, and NHGRI can lead the way on this, and sort of does already but can in maybe a bigger way, to integrate the functional information with the DNA sequence variants that we've been spending this time talking about. One of the first things that came up in our planning and in the discussion is the F word: what does function mean? We didn't get into the debate about different approaches and different levels of function with regard to evolution or other things, but we did talk about the two that probably really matter to this group. When you talk about function, you can have a molecular function: a DNA sequence variant in a promoter affects transcription, which is something we know how to measure, or a DNA sequence variant makes a protein not be made any longer.
We can understand, measure, and codify those, study them on a large scale, and systematize it. But the other way of thinking about function, of course, is whether that DNA sequence variant then leads to the outward phenotype, because the outward phenotype at the level of the organism or the person is way, way more complex than measuring a promoter in vitro, for instance. Part of the goal is to try to link those, but we did talk about the differences. One thing that is being done to some extent, and that we think needs to be done in a much more systematic way, is that first part: the molecular, biochemical, and even cellular assays for function are ones that can be done in high-throughput ways, where we really can think about measuring not just thousands but hundreds of thousands of those variants. You heard about some of that already. One point was also made that a DNA sequence variant and its effect on, let's say, a transcriptional cis-acting element are very closely linked, whereas a variant changing an outward organismal phenotype is a little bit further removed, or maybe even a lot further removed. So what is it that we need to do? One of the members of our group made the point that we want to find the sweet spot. We ought to be able to do this on a large scale, but not have such ambition that it's ridiculous and can't be reached. So the idea is to figure out how to get lots of readout for modest investment, and you're already seeing some of that; you heard about some of it yesterday. And then there was discussion about the different models that we have. If we work with a particular cell line or a set of cell lines, are those really good models for a disease or a phenotype? Is mouse a good one or not? People even talked about other model mammals or vertebrates.
There were multiple interesting discussions about this. There are a couple of ways to think about coming at this, top down and bottom up, and we've cleverly color-coded these so that you can see the yin and yang here. One of them, and somebody referred to this yesterday as going agnostically, is that you just go and figure out all the variants. It's not just that you find them; you might even make them. And then you figure out how they intersect with those found in disease studies. Jay Shendure's talk last night was an example of that. The other one is starting with the list of disease variants and then figuring them out functionally. You don't want to do one or the other; we think you probably want to do both and have them kind of meet in the middle.

So I'm just going to continue on from Rick. Another aspect of this resource we're talking about is another dichotomy. On one hand, we would envision a kind of functional resource covering a large number of genes and variants of all types, a sort of breadth-oriented thing that puts together the results of many standardized high-throughput experiments.
The contrast to this, of course, is studying function in detail, a kind of depth-oriented approach looking at particular diseases and particular genes. Obviously this requires domain experts and very detailed assays, and many of these things can't realistically be scaled, but we felt that they were very important. We also felt that they're not necessarily the province of NHGRI, but perhaps they could be done in partnership with other groups. The consensus of the breakout group is that we need both of these things together: we need both the breadth and the depth. We felt that the glue to put them together would be a really good informatics infrastructure that ties the overall breadth type of landscape to the detailed analysis of particular things, and how to do this in a simple fashion we see as quite a challenge. So we've now talked about the main aspects of this resource that we envision, but there were other considerations that came up in the breakout group. One was that we're really talking about scaling to a whole-genome level, but there's of course another type of scaling that goes beyond the whole genome to the whole population: doing functional genomics on an entire population. We were very enthusiastic about this, to some degree motivated by the great recent success of all the eQTL and related projects, and we urge people to really start thinking about an almost personal functional genomics that goes with personal genomics. The vision is that in the future, in addition to having your personal genome sequence, you will be doing personal functional genomics on yourself, measuring gene expression, methylomes, and whatnot over time and analyzing them, and we need to be prepared for that type of data. And then another thing that came up in the breakout was that functional genomics is really valuable beyond just characterizing variants.
There's this idea that we can use high-throughput sequencing to do more things, for instance to characterize cell types regardless of genetic variants, or to develop biomarkers. A great example of this was the challenge talk last night from Aviv Regev, where she talked about single-cell transcriptomics and this human cell atlas project. So now I'll just try to summarize everything that we talked about in one slide. The key idea is that our breakout was about integrating function and sequence, and we think this is a great time to do this on a large scale, basically because we have lots of variants now and we also have lots of new technologies for giving us function on a large scale. We felt that the right way to do this would be as a large-scale resource project providing a functional framework for the entire genome. There are some aspects of this resource that we looked at. What type of function should we look at: molecular, cellular, organismal? We think the first two really scale well and can be systematized well. There was the dichotomy between bottom up and top down: should we look at all possible variants or all possible regions of the genome and then intersect them with disease variants, or should we start with the disease variants and then do broad functional assays on them? And then there was the other dichotomy between the breadth-wise standardized resource and the domain experts really drilling into the genes. Again, we need both, and we need to integrate them with an informatics architecture. And then finally there were the other aspects that came up: scaling functional genomics to a population, personal functional genomics, and the idea that there's more to functional genomics than just looking at variants. That was really a summary of our session.
Let me add one thing that I meant to emphasize. Multiple members of the group felt that where you see something and say, well, that's not scalable, maybe one thing NHGRI should do is challenge people to make it scalable. So think the impossible now. I think it was Joe yesterday who said we would never have dreamed five years ago that we'd be where we are with sequencing, or at least most of us wouldn't have. So it's hard to scale the organismal? Well, maybe somebody will come up with a way to do it, and maybe one of the things NHGRI should do is put those challenges out.

So it seems like you're talking about two things. One is, locally in the genome, what changes in gene expression are mediated by a particular variant. I'm not sure exactly where that fits, whether some of that fits under ENCODE or whether some of that is its own project. And then, once you think you know what product is being affected and how, you're trying to figure out how that product is deployed and altered. The former is more obviously scalable than the latter, although you can imagine at least creating resources so that when you want to go in and do deep studies, you have that. But did you talk much about how this interacts with ENCODE?

Yes, we certainly thought about that, and there were several ENCODE people in the room as well. But it's beyond ENCODE too: the epigenome project and several others as well. Whether you call it ENCODE or not, at least a big part of that mission is to find elements, a combination of noncoding as well as coding. In some sense the coding elements are not completely finished, but they are much, much further along than the probably million different cis-acting regulatory elements, and that clearly is part of this, a major part of it. But, Bill, I'm not
sure if you were getting at this, but somebody yesterday, I can't remember who, you and others maybe, said let's not forget about the proteins, and certainly that matters too. We know how to take cis-acting transcriptional regulatory elements and test them at high scale, and we probably could do a lot better than that. You heard about one approach yesterday, and there are several others. When you start to think about proteins and you have variants, it seems like each protein becomes its own project, except that there are classes of proteins, ion channels for example. So you could consider having ways of testing at least classes of those, and maybe, way beyond what I can imagine now, ways of doing that in even an ultra-high-throughput way. That's probably valuable to consider.

Can I just add that I agree with that? We have lots of examples in model organisms, even ones that don't have a backbone, where if you measure RNA levels you have a very incomplete view of functional gene product deployment, either because of RNA editing, or because of variations in the translatability of different products, or because of localization of products. Localization of RNAs and proteins is extremely important as well. So having all of that information as a catalog would be extremely valuable.

So this is, on the one hand, a very exciting direction to consider, and I'm sure that you thought about this in your session, but it can be helpful to learn from history when one thinks about scaling up systematic functional efforts, because those can be very vulnerable to scaling up false positives and false negatives, type one and type two errors, since one often doesn't know exactly when one is making those errors once you get into systematic experimentation. So one could imagine a couple of different ways
of approaching this. One would be taking very well-defined assays where one understands signal to noise extremely well, and in those running systematic studies where you're only going to look for extremely strong signal, so, you know, things that are sort of in the two-fold range here and there, I mean, whatever one defines, but they would be extremely strong signal. Maybe that would be looking at variants for which we really don't know the function, so the very strong signal would give us a clue as to what the function might be. But then the other stream would be to pick contexts where one can partner with disease experts, people who have been studying these areas for years and can help advise on context-specific, extremely exquisite assays that are really only practiced by a few but where the parameters are very well understood. In those you would put maybe more of a candidate set of alleles where we're trying to understand function, as opposed to saying we're going to take, you know, four assays and run everything across them. Otherwise these kinds of things can easily lead to lots of spurious data, artifacts, etc., if we're not careful.

So you just articulated, better than we did, exactly that point about the two different approaches, top down versus bottom up. Clearly you would want to do that when you want to delve into the details. And while we said that's not scalable, there are lots of experts, and part of it is just the coordination. So there's no question that that should be done. But on your point about the false positives, that you might be testing things that aren't important: if the assays are ultra-high-throughput and very, very inexpensive, you could argue for testing every single variant we've ever seen, and maybe we could do that, but we probably don't need to because we have other
hints and maybe even other evidence that a region, even though it's out in the middle of nowhere, is probably involved in transcription because there's a whole bunch of stuff going on there, histone marks, etc. So a guided approach, slightly agnostic but still guided by other kinds of functional evidence, might be one of the ways at least to study transcription elements.

I just want to add to Rick's comment that part of really integrating the domain experts with this breadth-oriented resource could be developing error estimates for these things. If you put this data together in an intelligent way, you can use one to calibrate the other and vice versa, which I think would be very powerful.

Yeah, those are great comments, and we're hearing some very exciting ideas for large-scale projects that could have a really big impact. I wanted to generalize what he just said to emphasize that we have to have objective metrics of progress. We are stating these big, lofty goals, and we need to have some idea of whether or not we're really making progress. A really tough metric for this one is to take your predictions to a panel of domain experts and ask, as one member of our group said yesterday, would this really give an accurate diagnosis and an action plan to a patient? Those are really tough, but I think we should embrace them. I think you might have alluded to this, but there are also things like the CASP competitions, where in that case you had a set of unknown protein structures and asked, can you predict them? We could do a similar thing; we're doing it for RGASP and other things. But metrics of progress, you've got to have them.

I was going to say, I certainly agree that having metrics of progress is important, but I think it's also important to consider that maybe this resource is not necessarily going to have predictive value for the patient, but maybe it's going to be something that
people are going to use as the starting point for follow-up experiments. I think an encompassing resource that's really meant to enable genomics throughout the NIH, beyond just NHGRI, is going to be the starting point for people to drill into specific things and really understand them. So I think we need to think about exactly what metrics would be good in that context.

So if one goal of NHGRI is to connect sequence with health and disease, it's striking to me that what's missing from both of these breakout groups is an effort to connect drugs that can impact genome function with efforts to understand variation. Just one new chemical probe, JQ1, directed against BRD4, has been shown to have a positive impact in preclinical studies on cancer, cardiac hypertrophy, and inflammation, and it's a reversible male contraceptive in mice. There's a broad effort to develop preclinical small molecules that impact the genome. Why in the world wouldn't we want to be engaged in that?

We should have written that out, because we did discuss it. With iPS cells or whatever cells, pharma and other groups are actually doing very high-throughput screening with cellular types of assays, so there's no reason why we can't mesh with that, and in fact some people do. The idea is that you have the variants in a cell, which could be from the person or however you want to do it, and you use that to screen thousands of compounds, not just drugs but other environmental things as well. That's what we meant by cellular assays. We should have expanded on that, but I think that's a really important one.

Let me just add one thing connected to that. I think we have a very unsophisticated view of all the proteins that interact with our genome. We've just touched the surface of that. So if you want to connect sequence variation with
function, we just need to understand the other 99% of the proteins that are interacting with the genome.

I know you mentioned this, and it was mentioned yesterday, but I think this is an area where it is really quite important to have a model organism line. There's going to be a scenario where you want to be able to close the loop: here is a set of assays we did to study the variation in this protein, or in all these cis-regulatory elements, this is how it perhaps impacted this cell model, and then in a model organism you can go all the way and say, for some subset of those, this is how it impacted the organism. It's that latter thing for which you can only do observational studies in humans, and by having a model organism component one can close out that whole system. The obvious one is mouse, but I think fish may well be quite useful in this space as well, just because of the cheapness and scalability of it.

Well, obviously we very much agree with your point, and I'd just like to emphasize that one of the things that did come up in the breakout, and that unfortunately we didn't emphasize on the slides, is the importance of conservation. Things that are conserved, both in terms of sequence and in terms of how they behave across humans and model organisms, really give a strong handle on what's important in the genome, and I think that's a really important way of understanding variants and prioritizing them.

And just to argue a little bit: yes, observational in humans, except when you start to take cells out and then manipulate them, so you can do experiments.

No, no, cellular experiments are totally in zone, and maybe organoids. I think we should do all of those things, and I think it's an incredibly positive thing, but
there's obviously a level that you can't do, and for that you've got to have a model.

Yeah. One thing we didn't talk about much was how to implement these sorts of things. Is this a series of R01s, or what? I don't know if that's beyond the purview of this workshop. Carlos?

Maybe. I could see this is a very heterogeneous area. I think the concept could still be described now, and Bob Henry was just sidebarring on it a little bit. You can envision this implemented in a smaller fashion, but you could also see it implemented in center sorts of activities, especially the scalable parts, and I think it will take some creativity to find the best way to implement the different aspects of what's presented here.

I just want to ask if there's one more question, because we do have to move on.