Hi everybody. Today I want to describe work that was carried out as part of the ENCODE 3 project, in a collaboration with Bing Ren at the Ludwig Institute, UCSD, Len Pennacchio and Axel Visel at Lawrence Berkeley National Laboratory, and Barbara Wold at Caltech. The project involved spatiotemporal analysis of fetal development in the mouse. Our part of the project was to study the dynamics of DNA methylation during these developmental stages in a variety of organs, shown here, which resulted in 120-plus whole-genome bisulfite sequencing methylomes. This data was integrated with histone modification and chromatin accessibility data from Bing Ren's group and gene expression data from Barbara Wold. I just want to describe briefly some of the highlights of the study, which we recently published.
The project was headed in my lab by Yupeng He. Yupeng found nearly two million differentially methylated regions, from comparisons of DNA methylation profiles either between tissues or within a tissue over developmental time. Remarkable dynamics were observed from E10.5 to adult, with both programmed losses and gains, and in fact dramatic gains after birth, of DNA methylation at what we presume are developmentally regulated enhancers. We predicted these enhancers by combining the DNA methylation data with the other modalities (histone modifications, open chromatin, and changes in gene expression), using an algorithm called REPTILE that Yupeng developed and published in an earlier paper in PNAS to predict putative enhancers. And this performed well in the validation experiments carried out by Len and Axel.
One of the most interesting things we found in these studies, I believe, is that in addition to the CG methylation shown here, which is symmetrical methylation, there was a significant fraction of non-CG, or CA, methylation that accumulated during fetal development, mostly in non-dividing, postmitotic cells in various muscle tissues as well as in neurons. Very interestingly, it accumulated on the gene bodies of many transcription factors that are actually involved in the genesis of these organs: the master regulators of these organs accumulated mCH over their gene bodies. Presumably this is associated with repression; it is at least correlated with reduced expression of RNA. And really that's all I want to say about this work, since it was published recently and has been available on bioRxiv for over a year.
I want to describe more recent studies in our group that have focused on the brain, and particularly on the role of non-CG methylation in cell identity. Shown here on the left is a cartoon of symmetrical CG methylation and asymmetric CH methylation, which is largely in a CA context but can be found in the other contexts as well, along with the dynamics compared between CH and CG. CG methylation, as I showed you, is actually very dynamic during fetal development, and it's also dynamic after birth, but you can't see that here, because the gains and losses of methylation sort of normalize out in terms of the absolute level of methylation. Non-CG methylation, although I told you that it exists in fetal stages, is present there at only a fraction of what it accumulates to in adult tissues, and you can see that here in mouse and in human. These are bulk cortical neurons, from mouse and human cortex.
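To make the CG versus CH distinction above concrete, here is a minimal sketch that labels each forward-strand cytosine in a sequence by its dinucleotide context. The function names and the toy sequence are illustrative assumptions, not part of the pipeline described in the talk.

```python
# Minimal sketch: classify cytosine context on the forward strand.
# Function names and the example sequence are hypothetical.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def cytosine_contexts(seq):
    """Return (position, dinucleotide) for each forward-strand cytosine.

    'CG' is the symmetrical context; 'CA', 'CC' and 'CT' together are 'CH'.
    A cytosine at the last position has no following base and is skipped.
    """
    seq = seq.upper()
    contexts = []
    for i in range(len(seq) - 1):
        if seq[i] == "C" and seq[i + 1] in "ACGT":
            contexts.append((i, "C" + seq[i + 1]))
    return contexts


print(cytosine_contexts("ACGTCATCC"))
# CG reads the same on both strands, which is why it is called symmetrical;
# CA does not (its reverse complement is TG).
print(reverse_complement("CG"), reverse_complement("CA"))
```

A CG site therefore has a cytosine on each strand that can be methylated, whereas a CA site has a cytosine on one strand only.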
And this occurs in a very interesting period, the period of synaptogenesis, when those connections in the brain are forming. Most of the volume of the brain here is actually due to connections in the brain and not new cell division, although cell division is going on at some level in certain parts of the brain. So we've been studying this for some time, in part because it turns out that this is the dominant form of methylation, at least in the human brain.
So we've now taken this to the single-cell level, trying to understand the role of these epigenetic marks in the diversity of neurons. Shown here are the cortical layers in human cortex, layers one through six, and the excitatory neuron groups, represented by the different kinds of morphologies of neurons that occur in the different regions of the brain. There are neurotransmitters that define the cell type, and GABAergic interneurons that are dispersed throughout the brain, with their different morphologies. Our hypothesis is that those cells and their morphology are linked to changes, obviously, in gene expression, but also to epigenetic changes.
So we developed a pipeline, and the "we" here was Chongyuan Luo, a former postdoc who now has his own laboratory in the Human Genetics Department at UCLA, to carry out analysis of the entire mouse brain by producing 18 different slices that are 600 microns thick and then dissecting regions from them. And this is the work of my long-term colleague and friend Marga Behrens, who's an outstanding neuroanatomist and neuroscientist. Marga has done careful dissections of these anatomically defined regions of the brain. We prepare nuclei. We have actually been enriching for neurons relative to non-neurons, because the project is somewhat neuron-focused, although we collect a fraction of non-neurons in each experiment. And then we use the gold-standard sodium bisulfite conversion, which converts cytosines into uracil except where the base is methylated.
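The bisulfite step just described can be illustrated with a toy simulation: unmethylated cytosines are converted (C to U, read as T after amplification), methylated cytosines stay C, and methylation is then called by comparing reads with the reference. This is a simplified sketch that assumes forward-strand reads that are already aligned without gaps; the function names and sequences are hypothetical and do not represent the lab's actual software.

```python
# Toy bisulfite conversion and methylation calling.
# Assumes forward-strand, gap-free alignments; all names are hypothetical.

def bisulfite_convert(seq, methylated_positions):
    """Unmethylated C is read as T after conversion; methylated C stays C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, reads):
    """Count methylated and total calls at each reference cytosine.

    reads: list of (start, read_sequence) on the forward strand.
    Returns {position: (methylated_calls, total_calls)}.
    """
    counts = {}
    for start, read in reads:
        for offset, base in enumerate(read):
            pos = start + offset
            # Only reference cytosines are informative, and only C or T
            # read bases are valid bisulfite outcomes at those positions.
            if reference[pos] != "C" or base not in "CT":
                continue
            methylated, total = counts.get(pos, (0, 0))
            counts[pos] = (methylated + (1 if base == "C" else 0), total + 1)
    return counts


reference = "ACGTCATCC"
read = bisulfite_convert(reference, methylated_positions={1})
print(read)
print(call_methylation(reference, [(0, read)]))
```

In this example only the cytosine at position 1 survives conversion, so it is the only site called as methylated.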
And we use high-throughput robotics and NovaSeq sequencing to profile, so far, 45 of the 118 regions that we're planning to profile: isocortex, olfactory areas, hippocampus, and cerebral nuclei. In parallel, each batch of nuclei is actually split and given to Bing Ren's group, and Bing has been carrying out single-nucleus ATAC sequencing in all of these regions, which I won't have time to talk about, but that work is also on bioRxiv along with our preliminary results.
So I just wanted to show you what this data looks like. I really encourage the folks in my lab to actually visualize the data, and so we built a browser that allows you to visualize single-methylome tracks. This is a heat map showing the different layers of the cortex, with excitatory neurons here, inhibitory neurons here, and some non-neurons down here. For this region of the genome, these are the levels of methylation: depletion of methylation is marked here in yellow, and increased methylation, in the flanks here, is shown in blue. What you can see just by eye for these 166 nuclei, or 69, or 56 nuclei, are these depletions, which are very highly correlated with gene expression. So essentially, out of the 50,000 genes in the human genome, including all the non-coding RNAs in the latest GENCODE annotation, we can identify something like 90% of those genes. The 10% we can't find are too small for our level of sampling, which is about 1.5 million reads passing filter per nucleus. So these are obviously not complete methylomes, but we've done some 100,000 of these so far.
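As a rough illustration of how gene-body depletion can be read out from sparse single-nucleus data, the sketch below computes a per-gene mCH fraction for one cell, normalizes it by that cell's overall mCH level, and skips genes with too few covered sites (which mirrors the point that short genes are missed at this sampling depth). The coverage threshold, the normalization, and the gene names are assumptions made for the example, not the method used in the study.

```python
# Sketch: normalized gene-body mCH for one nucleus.
# The min_cov threshold and the normalization are illustrative assumptions.

def gene_body_mch(cell_counts, min_cov=10):
    """Return {gene: gene-body mCH fraction / cell-wide mCH fraction}.

    cell_counts: {gene: (methylated_CH_calls, covered_CH_calls)}.
    Genes with fewer than min_cov covered calls are left out.
    """
    total_methylated = sum(m for m, _ in cell_counts.values())
    total_covered = sum(c for _, c in cell_counts.values())
    if total_covered == 0 or total_methylated == 0:
        return {}
    global_level = total_methylated / total_covered
    return {
        gene: (m / c) / global_level
        for gene, (m, c) in cell_counts.items()
        if c >= min_cov
    }


# Hypothetical counts for three genes in one nucleus.
cell = {"GeneA": (2, 100), "GeneB": (30, 100), "GeneC": (1, 5)}
levels = gene_body_mch(cell)

# Values below 1 mark gene bodies depleted of mCH relative to the cell as a
# whole, which the talk describes as correlated with expression.
predicted_expressed = sorted(g for g, v in levels.items() if v < 1)
print(levels, predicted_expressed)
```

Here `GeneC` is dropped for low coverage, and `GeneA` is the only gene flagged as depleted.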
But if you collapse that data into tracks that you might be more familiar with looking at, when you average all these methylomes together, you can begin to see that there are depletions of methylation in the gene body here, which predict expression. So you can see, for excitatory neurons, inhibitory neurons, or non-neurons, whether this gene is expressed or not. This information allows you to identify not only what the cell type is but where it exists in the brain. I don't have time to go through the machine learning of this, but by anatomical dissection we know where these neurons are from. Some excitatory regions of the brain, like the CA region of the hippocampus, are largely one cell population here. Non-neurons are scattered throughout the brain, and inhibitory neurons can also have very enriched locations within the brain.
So this just shows the UMAP embedding of those 110,000 methylomes, with excitatory neurons here, inhibitory neurons here, and some of the clusters of non-neurons here. We then carry out additional levels of clustering, as shown here for some of the inhibitory neuron cell types. And for the level-three clustering of this group here, you can see that if we use gene markers, we can even go to a subtype level, an additional level based on other markers that are identified de novo. That same information is shown here across the different layers for these 45 brain regions, ultimately resulting in about 161 subtypes. This is very similar to the numbers that the Ren lab has found; 160, I believe, was the number they identified.
So what can we do with this information? Here again is a pattern from two different regions of the brain, let's just call them A and B, where you have hypomethylation in the CH context here, which would predict expression of this Bcl11b gene.
And below is the pattern, again summed over a number of cells in these two regions, region A and region B, where you also see hypomethylation here, upstream. These would be predicted to be upstream regulatory elements, and this is the gene body. So what we can do is take all of the thousands of regions that are identified as differentially methylated among those 68 excitatory neuron subtypes, and we've done the same thing for inhibitory neurons, and compare every region pairwise. For example, the hippocampal CA1 region can be compared with the hippocampal CA3 region, each identified by these marker genes. Within just this little cluster comparison here, we identify hundreds of differentially methylated genes and thousands of differentially methylated upstream regions in these pairwise comparisons.
So what we would predict here are the genes that are expressed, from hypomethylation in the gene body, and here what we would predict would be hypomethylation, in B for example, of putative upstream regulatory elements. We can overlap these upstream regulatory regions with the transcription factor database and identify enrichments that are cell-type specific throughout the brain and are associated with specific gene expression. So carrying out differentially methylated gene analysis finds us marker genes and genes that are expressed, and we can also carry out motif enrichment on the regions identified by the pairwise CG analysis. We're particularly interested in the transcription factors, and you'll see why in a second.
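The motif enrichment step mentioned above can be sketched as an overlap test: given how many differentially methylated regions from one side of a pairwise comparison contain a motif, ask whether that is more than expected from the background set. The sketch below uses a one-sided hypergeometric test, which is an assumed choice for illustration; the talk does not say which statistic was used, and all counts are made up.

```python
# Sketch: one-sided hypergeometric test for motif enrichment in a DMR set.
# The choice of test and all counts below are illustrative assumptions.
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) when drawing `draws` regions without replacement from
    `population` regions, `successes` of which contain the motif."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Hypothetical example: 1,000 background regions, 100 of which contain the
# motif; 50 regions are hypomethylated in cluster A, 20 of them with the motif.
p_value = hypergeom_sf(k=20, population=1000, successes=100, draws=50)
print(f"enrichment p-value: {p_value:.3g}")
```

With only 5 motif-containing regions expected by chance, observing 20 gives a very small p-value. In practice one would run this per motif and per pairwise comparison and then correct for multiple testing.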
So what Hanqing has devised is an impact score, which summarizes all of these pairwise comparisons between groups of clusters. He created, let's say, a phylogeny, or more or less a hierarchy, with nodes, as you'll see in the next slide, where you have enrichment of methylation in one cell cluster and not so much in the other cell cluster, and you can add these up and create an impact score for the left branch or the right branch. These branches are built from both the gene body methylation of genes and the motif enrichments, and you'll see this on the next slide.
So this is the kind of hierarchy that Hanqing has built for both excitatory and inhibitory neurons. These are the different regions of the brain. These are the major cell classes, which are then subcategorized based on gene body hypomethylation of sets of genes that are enriched in one branch versus the other. Some of the major branches are shown here in these examples. Here is a list of brain-specific differentially methylated genes; there are much longer lists, but this is just to show a few examples. Here's a split between branches one and two, or two and three, or three and four, et cetera, and shown here are the enrichments, either on the left side or on the right side. At the top, in bold, are the enrichments for transcription factors. We're particularly interested in transcription factors that are hypomethylated in one branch and hypermethylated in the other. The details of the clusters are shown here, where you can see by the colors that there are enrichments on one side of the branch versus the other. And then what we can do is link that information to branch-specific enrichments in the motifs.
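The talk does not give the formula for the impact score, so the following is only a guess at its general shape. For one node of the hierarchy, it takes every pair of clusters spanning the left and right branches, counts how often a gene or motif is hypomethylated on each side, and divides by the number of pairs. The data layout, the normalization, and the example names (`GeneX`, `GeneY`, `GeneZ`, and the `DG` cluster) are all hypothetical.

```python
# Sketch of a branch-level score built by adding up pairwise comparisons.
# This is an assumed formulation, not the published impact score.
from itertools import product


def branch_impact(left, right, hypo):
    """Score features for the left and right branch of one node.

    left, right: lists of cluster names under each branch.
    hypo[(a, b)]: set of features (genes or motifs) hypomethylated in
                  cluster a relative to cluster b.
    Returns (left_score, right_score), each mapping a feature to the
    fraction of left-right cluster pairs supporting it.
    """
    n_pairs = len(left) * len(right)
    left_counts, right_counts = {}, {}
    for l, r in product(left, right):
        for feature in hypo.get((l, r), ()):
            left_counts[feature] = left_counts.get(feature, 0) + 1
        for feature in hypo.get((r, l), ()):
            right_counts[feature] = right_counts.get(feature, 0) + 1
    left_score = {f: c / n_pairs for f, c in left_counts.items()}
    right_score = {f: c / n_pairs for f, c in right_counts.items()}
    return left_score, right_score


# Hypothetical pairwise results for a node splitting CA1 from CA3 and DG.
pairwise = {
    ("CA1", "CA3"): {"GeneX"},
    ("CA1", "DG"): {"GeneX", "GeneY"},
    ("CA3", "CA1"): {"GeneZ"},
}
print(branch_impact(["CA1"], ["CA3", "DG"], pairwise))
```

A feature scoring close to 1 on one branch and absent from the other is consistently hypomethylated on that side of the split, which is the kind of branch-specific signal the hierarchy is meant to surface.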
So in other words, if you find that there's an interesting set of transcription factor gene bodies that are hypomethylated, and so predicted from our data to be expressed, you can then look at the corresponding motifs for similar enrichments. And what you find is that even for members of gene families, where many members might have the same motif, only those transcription factors whose gene bodies are hypomethylated are likely to be the ones targeting those motifs that have reduced methylation. You can see here the differences between EP... these are two different regions of the brain, where you have hypomethylation in both, or hypomethylation only in one cluster or the other. So the predicted expression of the transcription factor can then be correlated with hypomethylation of the motif.
So just to wrap up, I want to first thank Yupeng He, who did a Herculean job on the ENCODE 3 project. He's now a scientist at Guardant Health in the Bay Area. And Chongyuan, who carried out most of the development of the methods that we're using in my lab; I only had a fraction of the time to talk about his primary work. He's now at UCLA. Jingtian and Hanqing have done all the analysis of the 100,000-plus methylomes, which were produced by staff in my lab. I should mention Jesse's group, who, with Dong-Sung, were critical in the development of the 3C assay that's linked to methylation. We also have a collaborating group on other projects that use methylation to study projections. Marga carried out all of the annotation, and her group purified all the nuclei. All this work has been supported by NHGRI or the BRAIN Initiative. Thanks very much. I'm happy to take questions.