Good afternoon, my name is Michael. I'm delighted to talk to you today about SCREEN. I'm one of the students who has been working on it for a bit over a year now. My hope is to give you a high-level overview, a roadmap to the different pieces of SCREEN, so that as you're playing with it yourselves, you can figure out where to look for a feature or a piece of information you want. If you're playing with SCREEN now, which you can get to at screen.encodeproject.org, and you find something really interesting, feel free to ignore me, and you can point it out later. Hopefully the demo gremlins will leave you alone for now.

So we have these millions of candidate regulatory elements, and unless you're using the command line in Linux to sort through them, it's very difficult to actually manipulate them. So we needed a visualizer, which we've called SCREEN: Search Candidate Regulatory Elements by ENCODE. This is where we're trying to integrate all the candidate regulatory elements and all of the information we have about them from other ENCODE data sets, as well as some external resources, into one place. It's kind of one-stop shopping: you can search for your favorite gene, region, or SNP and see if there's any potentially functional CRE there.

This is the home screen. Before we dive into the real meat of SCREEN, I just wanted to point out a couple of extra tabs here that you might be interested in. The first one is the ASHG tab. Right here, you can download today's workshop slides. You can get the handout that Jill's gonna go over in a few minutes. There's a survey that you can fill out once we're done. And if you're as interested in and excited by the ENCODE Encyclopedia as we are, there's also a mailing list we just started. In addition, on the main screen, there are a few other tabs.
One is an About page, where you can find all the details Jill talked about on how the CREs were actually built. One of the other developers has devised a series of tutorials that are similar in nature to what I'm talking about with you today, so if you want to go home and take a look at these, feel free. And the last part is, well, where are these CREs? This page, the Files tab, has download links to all of the CREs: both the cell type agnostic ones that Jill talked about and those for the over 700 different cell types we have across human and mouse, broken down, as Jill mentioned, into the five-group and nine-state classifications.

All right, so let's take a look. The search page bifurcates in two directions: a human search and a mouse search. Today we're gonna look at the hemoglobin subunit gamma 2 gene, and we can just click on it here and load the main page.

Okay, there's a lot going on here, but let's focus on the most important thing, which is this main table: the result of the search for CREs. We put in the coordinates, chromosome 11 and this particular start and end in base pairs, and here are all the CREs in that region that meet certain thresholds we've specified. First off, all of these candidate regulatory elements have a DNase Z-score greater than 1.64. Basically, you can think of this as taking all the DNase sites and slicing off the top 5%; these have the strongest DNase Z-scores.

The main table is then a breakdown of each CRE itself. We have a million of these elements, so how do we number them? Well, we've come up with a scheme: ENCODE, hg19, CRE, and then a certain number of digits after that. If a CRE is within plus or minus 2 kb of a TSS, a transcription start site, we label it proximal, or P. If it's farther away, we label it distal, or D.
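
As a rough illustration of the two rules just described, here is a minimal Python sketch. It is not SCREEN's actual code: the function name `label_cre`, the half-open coordinate convention, and measuring distance from the nearest edge of the element to the TSS are all assumptions made for the example. The first part simply checks that a Z-score of 1.64 corresponds to roughly the top 5% of a standard normal distribution.

```python
from statistics import NormalDist

# A Z-score cutoff of about 1.64 is the 95th percentile of a standard
# normal distribution, i.e. it keeps roughly the top 5% of values.
cutoff = NormalDist().inv_cdf(0.95)


def label_cre(start, end, tss_positions, window=2000):
    """Return "P" (proximal) if the element [start, end) lies within
    `window` bp of any TSS, otherwise "D" (distal).

    Hypothetical helper: measuring from the nearest edge of the element
    is an assumption for this sketch, not SCREEN's documented rule.
    """
    for tss in tss_positions:
        if tss < start:
            distance = start - tss
        elif tss >= end:
            distance = tss - (end - 1)
        else:
            distance = 0  # the TSS falls inside the element
        if distance <= window:
            return "P"
    return "D"


print(round(cutoff, 2))
print(label_cre(5_000, 5_300, [6_500]))   # TSS about 1.2 kb downstream
print(label_cre(5_000, 5_300, [20_000]))  # TSS far away
```

The 2 kb window is exposed as a parameter only so the toy example is easy to experiment with.
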
For some special CREs that have both DNase and H3K4me3 or H3K27ac Z-scores greater than 1.64 in the same cell type, we label those as concordant and give them a little star. In addition, for these cell type agnostic CREs, we indicate the max Z-score for three different epigenetic marks. So we look at whether it has an H3K4me3 Z-score greater than 1.64 in any of the cell types, and likewise for H3K27ac and CTCF. For each CRE, we show you the maximum Z-score across all the cell types for DNase, H3K4me3, H3K27ac, and CTCF. We give you the chromosome it's on, its start site, and the length of the CRE. We also show you the six nearest genes to this particular CRE: the three nearest protein-coding genes and the three nearest genes in general.

Since there's still quite a number of CREs here, 93 in this selection, we have a little mechanism where you can select the CREs of particular interest from these 93. This is basically a little shopping cart. If you click on the cart icon itself, you come to a separate page with just the CREs that you're truly interested in. And last but not least, you can visualize these in the UCSC Genome Browser.

Now, these are the cell type agnostic CREs. On the left, we give you more tools to further filter and restrict the search. The most important is perhaps the cell type. A lot of times you have a particular cell type of interest, so you can do a little search here and, for instance, find our cell type of interest, K562. We select K562, and the search is repeated. And now we have fewer CREs; I think from 90 we're down to 53. So these are the strongest CREs in this particular cell type. We have other facets here: you can change the chromosome if you'd like, you can change the region manually if you really want to, and you can also adjust the Z-score thresholds.
For instance, if you were interested in CREs that have promoter-like signatures, you could bump up the lower threshold for H3K4me3 to something like 1.64 or 1.4, and the search will repeat with a smaller number of CREs returned. Yeah, sorry, the region changed.

Now, for each CRE, you can click for more details. If you go to the row and click on it, we give you a details page. This is where we start to tie in all of the external resources that we have, as well as start to interrelate the CREs amongst themselves. The first tab here is Top Tissues. This shows you, across all the cell types available, the Z-score of the signal for that particular CRE. You can search across these, sort them, and whatnot. This is present for all four of the assays: DNase, CTCF, H3K4me3, and H3K27ac.

The next tab is the nearby genomic features: nearby genes, other CREs within a certain window, and other SNPs that are nearby. These are also searchable and browsable. We're also showing you other genes in the TAD that the CRE is in, as well as other CREs within that TAD, with a certain cutoff for distance.

The next tab is where we intersect all of the ENCODE project TF and histone data sets. For the thousands of ChIP-seq experiments that are available, we run through those, do a fancy bedtools intersect, and give you this list. In particular here, there is a ChIP-seq TF, the TF MAX, that has 10 experiments that actually intersect this CRE. If you click on the 10, you get all the experiments that intersect here, and you can click on the link and actually go and grab that file.

In addition, we link out to one of the tools that we've been building for a while, Factorbook. Factorbook is another visualizer, one that's centered on ChIP-seq TFs and motifs. Here, we give you a little overview of the particular TF.
We take all the ChIP-seq TF peaks for these different experiments and build aggregation plots of the histone marks around them within a plus or minus 2 kb window. We run a motif finder, MEME-ChIP, on the top 500 TF peaks for this particular experiment to see what kind of motifs there are. We have some techniques for tossing out erroneous motifs, and we can talk more about that at my poster. We've also built heatmaps around all of these ChIP-seq TF peaks. Here, the columns are ChIP-seq TF peaks, and the signal is, in this particular instance, the histone mark signal in a 10 kb window, and for other TFs in a plus or minus 2 kb window. And last but not least, we also have some nucleosome positioning data for this particular TF. As you can imagine, the CREs and the motifs might be very deeply interrelated and very important. So one of the things I'm working on now is integrating Factorbook more deeply into SCREEN, but as a separate tool it at least stands alone right now and lets you investigate some things.

All right, back to SCREEN, next tab. We've also intersected the CREs with the FANTOM5 CAGE transcriptome data, and you might be interested in this if you're interested in lincRNAs.

We've also taken all of the RNA-seq data from the ENCODE project and are displaying it here, in this case for the gene nearest the CRE. There are a few different ways of viewing this data. We give you the cell type it's in, and we give you different ways of looking at it. You can look at the RNA-seq data in TPM or FPKM. You can look at it grouped by tissue. You can look at it grouped by maximum per tissue, for whichever score or units you like. Similarly, for TSS activity, we have some RAMPAGE data, and we are also displaying this here now.
If you have multiple TSSs, so multiple transcripts available, you can scroll through these and also change the sort order and how you display these, too.

Jill was talking about relating mouse and human CREs. Here we do a liftOver from human to mouse and from mouse to human. If the lifted-over region intersects a real CRE in the other species, we display that link here. You can click on that and investigate those.

Two more tabs for CRE details. Because we have so many cell types, it's very difficult to get an overall view of how active these elements are in different cell types. So we came up with this idea called mini-peaks. These are basically little snapshots of a window of plus or minus 2 kb from either end of the CRE, and we show you a down-sampled version of the signal track for each of these cell types for DNase, H3K27ac, and H3K4me3. These can be sorted and searched likewise.

One of the projects that Jill's been working on is enhancer-target gene finding, and we're starting to give you an indication here of links based on ChIA-PET data or eQTLs. You can also look at the gene involved and the data set the link is based upon. So that's the CRE details for you.

For a lot of people who come to this, the most important thing in many ways is to look at this in the genome browser, and we've tried to build a bunch of tools to help you configure the genome browser and give you the best experience when looking at the CREs in UCSC. Here, for instance, you can search across the 700 or 400 or so cell types available and add in whatever you'd like. This configures a dynamically made UCSC Genome Browser track hub that you can then open up, view, add your own tracks to, and do whatever you like with. This does take a little while, though.
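
The mini-peaks idea mentioned a moment ago can be sketched as simple binning: take the per-base signal across the element plus 2 kb on either side and average it into a small number of bins. This is only an illustration. The function name `downsample_signal`, the choice of bin count, and the use of a plain mean are assumptions for the example, not a description of how SCREEN actually computes its mini-peaks.

```python
def downsample_signal(signal, n_bins=20):
    """Average a per-base signal into at most n_bins roughly equal bins.

    Hypothetical helper for illustration; `signal` is a list of floats,
    one value per base pair across the window of interest.
    """
    if n_bins <= 0:
        raise ValueError("n_bins must be positive")
    n = len(signal)
    if n == 0:
        return []
    n_bins = min(n_bins, n)  # never create empty bins
    binned = []
    for i in range(n_bins):
        lo = i * n // n_bins
        hi = (i + 1) * n // n_bins
        chunk = signal[lo:hi]
        binned.append(sum(chunk) / len(chunk))
    return binned


# Toy track: a 300 bp element with flat signal 10.0, flanked by 2 kb of
# zero signal on each side (4300 bp total), reduced to 43 bins of 100 bp.
toy_signal = [0.0] * 2000 + [10.0] * 300 + [0.0] * 2000
mini_peak = downsample_signal(toy_signal, n_bins=43)
print(len(mini_peak), mini_peak[19:24])
```

A small, fixed number of values per cell type is what makes it practical to show hundreds of these snapshots in one sortable table.
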
So one of the newest things we've been working on, and this is still really in beta, is an integrated genome browser. It's still a little bit raw, but we are actually starting to become quite happy with how it's looking. This is available if you just click on the little circular box next to a CRE, and it is purely dynamic, all JavaScript. We're using some of the igv.js code, and the rest is ours. Hopefully we'll continue to work on this and expand it, and it might make you stay in SCREEN a good bit longer than otherwise. So that's something that you can play with, and let us know what you think of it.

We have a few other tools available. You can now upload your own text file in the normal BED3 format, and we'll give you back a shopping cart of the top 1000 CREs that intersect your regions of interest. So that's most of the CRE search application.

We have a few other things in SCREEN, though. One of them is the separate gene expression app, similar to what you've seen before in the CRE details page. Here, however, you can also start to filter down on the biosample types as well as on whatever particular cellular compartments you like.

Halfway done. A few things left. For another app, let's say we go over to the mouse world. We have an app that shows you differential gene expression for mouse. As you saw, we have some very nice mouse data sets, so we developed this tool. Basically, this is taking the mouse RNA-seq data, running DESeq2, and outputting the results. Here you have two different comparison cell lines, one versus the other. You can deselect these and look around and find whatever in particular you'd like to explore. On the main screen here, we have the fold change of expression. We then also have, shown as dots, all of the CREs in this region with enhancer-like and promoter-like signatures. They're also available in this table, which you can sort and search. You can also look for the genes involved.
Each of these boxes here is overlaid on the plot and represents a particular gene, with the red and green indicating which strand it's on.

One last thing is the GWAS visualizer. If you're interested in SNPs, a natural question to ask would be, well, what CREs overlap my SNPs of interest? Jill's been working on this for quite some time, and we have a wide range of studies. You can go to the PubMed link for more information, and you can go to the particular GWAS study you're interested in, look at a particular cell type you're interested in, and then look at the CREs that overlap the SNPs in that study. You can link out to Ensembl for the SNPs, to GeneCards for the gene, and of course to the good old UCSC browser. We do show you the total number of LD blocks and the percentage of LD blocks that actually overlap those CREs.
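
To make that last statistic concrete, here is a minimal Python sketch with toy data. The function name `ld_block_overlap`, the data layout, and the rule that an LD block counts as overlapping when any one of its SNPs falls inside any CRE are assumptions made for illustration; they are not taken from SCREEN's implementation.

```python
def ld_block_overlap(ld_blocks, cres):
    """Return (total LD blocks, percent of blocks overlapping a CRE).

    ld_blocks: dict mapping a block ID to a list of (chrom, pos) SNPs.
    cres: list of (chrom, start, end) intervals, half-open.
    Hypothetical layout, chosen only to keep the example small.
    """
    def snp_in_cre(chrom, pos):
        return any(c == chrom and s <= pos < e for c, s, e in cres)

    total = len(ld_blocks)
    hits = sum(
        1
        for snps in ld_blocks.values()
        if any(snp_in_cre(chrom, pos) for chrom, pos in snps)
    )
    percent = 100.0 * hits / total if total else 0.0
    return total, percent


# Made-up coordinates: two of the four blocks contain a SNP inside a CRE.
toy_blocks = {
    "block1": [("chr11", 5_000_100), ("chr11", 5_000_900)],
    "block2": [("chr11", 6_000_000)],
    "block3": [("chr2", 100)],
    "block4": [("chr2", 250)],
}
toy_cres = [("chr11", 5_000_000, 5_000_300), ("chr2", 200, 400)]
result = ld_block_overlap(toy_blocks, toy_cres)
print(result)
```

A linear scan over the CREs is fine for a toy example; with millions of elements one would sort the intervals or use an interval index instead.
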