So next, we have Wouter Meuleman from Manolis Kellis' lab at MIT. He's going to show us how to use another public tool for annotation of non-coding variants, HaploReg.
All right. So like the previous speaker, I am also not the developer of HaploReg, the tool I'm presenting. I've also been an active user, mostly yesterday. But I will do my best in putting across the main concepts behind it. So like the previous tool, the first two bullet points are essentially the same, right? In the sense that we have a lot of variants that we find in all these GWAS studies. They are mostly in non-coding regions and we have no idea what they do, right? So we can use data from ENCODE and from the Roadmap Epigenomics project to get a little more insight into what the regions underlying these variants might actually be doing.
Now, I think the key difference between RegulomeDB and HaploReg is that HaploReg takes a slightly different approach, in the sense that it exploits the LD structure, the correlation structure, the haplotype structure of the genome. And I'll try to illustrate a little bit what that means. So this could be some example genomic region with a number of SNP locations. In red you have some lead or tag SNP that you've measured on the SNP array. And on the y-axis, you see some kind of association with a particular trait. And as you can see, both of these are in non-coding regions. Now what you can do is use chromatin state data from ENCODE or from Roadmap and overlay that with this region for a number of different cell types, right? And maybe you can find that your SNP of interest, your red SNP there, is actually inside an enhancer region, for example. It turns out that this is not the case here.
So you can say, okay, well, now we can't really explain what this SNP does or what the underlying function of the SNP might be, so we just move on to the next SNP in the list. What HaploReg does, and I'm just showing you the chromatin states for GM12878 here, is look at the LD structure: which nearby SNPs around the lead or tag SNP are in very strong LD with it? And the idea is that if that LD structure is strong enough, if that correlation is strong enough, you can also look at the other SNPs, which you can impute or derive in other ways. So even though our lead SNP is not inside an enhancer region, we now have a bunch of candidates here that we can also look at, if we assume that this is the haplotype block, and we can check whether these SNPs are maybe in interesting regions. And as you can see in this case, at least two of them are in enhancer regions that are specific for GM12878.
This is just a cartoon example. Now if we look at a real example, it's not that different. Maybe the cartoon example was derived from the real example, maybe. So here we see again our lead SNP, and it actually falls inside an enhancer region specific for GM12878. But maybe we don't have any further evidence of what's going on there. Maybe if we look at the underlying sequence, so we take this SNP and we look in RegulomeDB, there's little evidence to be found, right? So that's essentially where it stops with RegulomeDB. With HaploReg, you can now say, okay, let's look at the other SNPs that are in the same LD block, and see whether there's anything underlying those SNPs that might give us clues as to what that region might be doing.
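The idea of widening the search from the lead SNP to its LD block can be sketched in a few lines of Python. This is only a toy illustration of the concept, not how HaploReg itself is implemented: the SNP names, positions, r² values, enhancer intervals, and the 0.8 threshold are all made-up assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data (not real rsIDs or coordinates).
SNP_POS: Dict[str, int] = {
    "lead": 1_000,
    "snpA": 1_450,
    "snpB": 2_100,
    "snpC": 2_900,
    "snpD": 5_000,
}

# Hypothetical r^2 between the lead SNP and each nearby SNP.
R2_WITH_LEAD: Dict[str, float] = {
    "snpA": 0.95,
    "snpB": 0.88,
    "snpC": 0.91,
    "snpD": 0.20,
}

# Hypothetical enhancer intervals in one cell type (start inclusive, end exclusive).
ENHANCERS: List[Tuple[int, int]] = [(1_400, 1_600), (2_800, 3_000)]


def in_any_interval(pos: int, intervals: List[Tuple[int, int]]) -> bool:
    """True if the position falls inside at least one interval."""
    return any(start <= pos < end for start, end in intervals)


def ld_block(lead: str, r2: Dict[str, float], threshold: float = 0.8) -> List[str]:
    """The lead SNP plus every SNP whose r^2 with the lead meets the threshold.

    The 0.8 default is an assumed cutoff for "strong LD", chosen for illustration.
    """
    return [lead] + sorted(snp for snp, value in r2.items() if value >= threshold)


def candidates_in_enhancers(
    lead: str,
    r2: Dict[str, float],
    positions: Dict[str, int],
    intervals: List[Tuple[int, int]],
    threshold: float = 0.8,
) -> List[str]:
    """SNPs from the LD block that overlap an enhancer interval."""
    return [
        snp
        for snp in ld_block(lead, r2, threshold)
        if in_any_interval(positions[snp], intervals)
    ]


if __name__ == "__main__":
    print("LD block:", ld_block("lead", R2_WITH_LEAD))
    print(
        "In enhancers:",
        candidates_in_enhancers("lead", R2_WITH_LEAD, SNP_POS, ENHANCERS),
    )
```

In this toy data the lead SNP itself misses every enhancer, but two of its strongly linked neighbours fall inside one, which mirrors the cartoon example in the talk.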
If you look at that one in particular, a SNP that's in strong LD with our first SNP, it turns out that if you look at the sequence at this particular SNP variant, it actually strengthens a binding motif for ETS1, which is a predicted activator of lymphoblastoid enhancers. Moreover, it turns out that if you look into other variants found in this particular GWAS study, a lot of those also affect the ETS1 locus. So by exploiting the LD structure, you basically get a second chance at finding a potential regulatory mechanism. So this is really the main difference between RegulomeDB and HaploReg.
The initial paper was published in 2012, and more recently there's been another update, with an update of the ENCODE data, an update of the Roadmap Epigenomics data, and more motif data in there. There are links to eQTL studies, et cetera. So this is what I just said, basically. And again, all these slides are going to be or are available, so look at this if you want. One very useful thing to point out is that this latest NAR paper contains a very short but sweet tutorial that takes you step by step through the process of using HaploReg for a particular study of interest.
So what does it actually look like? This is the stunning graphic design behind HaploReg version 4.1. There are different things you can do. If you have a particular SNP of interest, you can just fill in the rsID here. Alternatively, you can provide a genomic region in this format here. You can also upload a text file with a number of SNP IDs. Or alternatively, and that's what I'm going to show now, you can select one of many, many GWAS studies in which a number of variants have been associated with a particular trait of interest. So here I'm going to focus on this ADHD study, which found 26 variants.
As you can see here, these are associated with ADHD. So if you select this one and press Submit, you end up with a long list of 26 haplotype blocks, one associated with each of these 26 SNPs. Okay, so what I'm showing you here is only one out of these 26 haplotype blocks, right? And all of these SNPs are in very strong LD with our lead SNP, as you can see in this column here. So that means that all of these are very good candidates for further studying the regulatory mechanism underlying that particular locus. The list is actually a little longer, and there in the red, did you see that in the red? If you focus here as it passes, you see the tag SNP, right there it is. So all of these are in very strong LD with our lead or tag SNP.
If we stay on the results page and click one of these SNPs here, then it looks a little bit like this. So in this spartan view, what you can see is that for all Roadmap epigenomes, so these are all different cell types, and I'm not showing all of them because there are about 130 of them, you can see what the chromatin state is, according to different chromatin state models, underlying those regions. And if you do that for this particular SNP, part of the ADHD study, you see that it actually turns out to be a relatively specific brain enhancer.
Now you can do a little bit more than this. If you scroll down on this page, you get more information on other studies in which this particular variant was discovered, right? So for example, HaploReg also includes the GRASP database, which basically contains a huge catalog of all kinds of QTL hits. You can see that there's a bunch of studies in which this particular variant has been associated with changes in expression.
So that's another thing that HaploReg provides on top of RegulomeDB: it provides links to eQTLs, so you can get a sense of whether your variant actually has an effect on expression. And it includes several of these kinds of databases. Importantly, and this is where HaploReg is again similar to RegulomeDB, it also looks at whether a particular variant is going to disrupt or create a particular binding site. So in this case, we see that there is a p300 binding site in the reference allele, or around the reference allele, and this is the log-odds score of that particular motif instance. And we see that our particular SNP here disrupts that motif very, very strongly, and the log-odds score plummets.
So these are the different things you can do with HaploReg. You don't have to know the LD structure of your variants, right? HaploReg takes care of that. So it allows you to widen your search a little bit. You don't have to stick to your particular variants. You can look around them, basically, and HaploReg takes care of all of that. What it doesn't do, and what RegulomeDB does do, is provide a score, you know, the different tiers or levels of evidence for your variant. There are several reasons for that and we can talk about that later. What it also does not do, and this is something that is great about RegulomeDB: RegulomeDB allows you to score any kind of genomic location, right? So it also means you can use novel variants or rare variants. HaploReg is completely pre-computed. So for RegulomeDB, you could provide any kind of VCF file of regions of interest. HaploReg basically is a VCF file, right? It's just a heavily annotated VCF file, which is also available for download.
Right, so this is the paper. Like I said, take a look at it for the tutorial.
It takes you through some of these ADHD example loci. Take a look at the website itself. It's a slightly shorter URL right now. And lastly, I want to thank Luke, who developed this thing, and Jill Moore, who both provided me with some of these slides. Thanks.
Yeah, you can sort them. I think it's very debatable; I think it's very hard to make a scoring non-arbitrary. And I think the scoring of RegulomeDB is certainly very useful in the sense that you can at least get a sense of what level of evidence there is. But binding of transcription factors or the presence of motifs is not everything, right? And it goes a little bit against the unbiased nature of GWAS studies, I guess, right? So it would definitely be great. But I think devising a score like that is kind of a field of its own. It would be great.
But it's not necessarily a question for HaploReg. But is there a plan to do the same type of interface or analysis for the mouse? Or is it just for human?
Yeah, good question. Yeah, I totally agree. There should be. I don't know. I will ask Luke and Manolis. I will ask them. That's a good point. It should be there. I agree.
The mouse there is only mm9.
Okay, thanks.
So before I forget, I also want to announce, for all the people who are presenting in the workshops, that we will be in the HAPDAX tomorrow from 7 to 9, or 7:30 to 9:30, whatever. So we will be there. So Jason Ernst or Eric or Wouter will be sitting at a table with our names on it. You can come over, grab a beer, and ask us all the questions, all right? So let's, again, thank all the speakers in this session. And we'll take a break now.
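The motif comparison described in the talk, where the log-odds score of a motif instance drops when the reference allele is replaced by the variant allele, can be sketched as follows. This is a minimal illustration under stated assumptions: the 4-position weight matrix is invented (it is not a real p300 or ETS1 motif), the background is assumed to be uniform, and nothing here is taken from HaploReg's own scoring code.

```python
import math
from typing import Dict, List

# Assumed uniform background frequency for each base.
BACKGROUND = 0.25

# Hypothetical 4-position motif: per-position base probabilities (each row sums to 1).
PWM: List[Dict[str, float]] = [
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
]


def log_odds(seq: str, pwm: List[Dict[str, float]] = PWM,
             background: float = BACKGROUND) -> float:
    """Sum over positions of log2(P(base | motif) / P(base | background))."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match motif length")
    return sum(math.log2(col[base] / background) for col, base in zip(pwm, seq))


def allele_effect(ref_seq: str, alt_seq: str) -> float:
    """Change in log-odds score (alt minus ref); negative means the motif is weakened."""
    return log_odds(alt_seq) - log_odds(ref_seq)


if __name__ == "__main__":
    ref = "AGCT"  # hypothetical sequence around the reference allele
    alt = "AGTT"  # same window with the variant allele at position 3
    print("ref score:", round(log_odds(ref), 2))
    print("alt score:", round(log_odds(alt), 2))
    print("change:", round(allele_effect(ref, alt), 2))
```

A negative change corresponds to a disrupted motif, as in the p300 example; a positive change corresponds to a strengthened motif, as in the ETS1 example.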