Hello, everyone. Welcome back. This is workshop session 3. My name is Feng. I come from Penn State University. So, as Zhiping and Cricket have shown you, ENCODE has generated thousands of datasets and has made predictions on the order of hundreds of thousands. So, unless you have a full-time bioinformatician in your group, it's going to be a hard and daunting job to go through the predictions by yourselves. So, that's why in the next few hands-on and live demo workshops, we are going to show you some of the online tools we developed, and we hope those tools can make your life easier. And this workshop will be a live demo. So, we'll see how it goes. I treat this workshop more like a classroom, or more like a computer lab that I led when I was a graduate student. So, my job is to make sure you can follow what I do here. And probably at times the pace will be too slow, but just bear with me, just to make sure everybody is on the same page.
So, mainly, today I'm going to talk about two browsers developed in my group to help you go through the ENCODE datasets. So, let's see. Okay. The first website is called the ENCODE Element Browser. So, do you all still have access to the web portal, encodeproject.org? You do? All right. So, if you go to Encyclopedia, click Encyclopedia, About. Did anything happen? Okay. So, on this page, just like Zhiping introduced, there are some ground-level annotations, mid-level annotations, and high-level annotations.
So, what I want to show you first is gene expression. Because I can guarantee you, if you are a student or a postdoc, the most-asked question from your PI is: is this gene expressed in this tissue or cell type? Do you really want to download all the datasets from ENCODE to answer that question? Right? Hopefully not. Unless you're a big show-off. Go to the middle, ground-level, gene expression. There's a link to query. So, everybody just feel free to do it.
Unless it's getting really slow on my computer; then I may ask you to stop. Right? But, for now, we are still friends. Click query. It will bring you to this wonderful website. Of course, it's developed in my group. Click human. Okay. Four very simple options. You can search gene expression; search all the predicted TF binding sites and open chromatin for a given region or around a gene; and, option four, you can search for cis elements that are linked to a gene based on DNase I hypersensitive sites. Right?
So, in this text box, you can just type any gene you like. For me, my favorite gene is SOX2. So, I'm typing SOX2. It also has an autocomplete function. Click SOX2. Click submit. Oh, wow. The router is really performing well right now. Cool. So, this figure shows you the gene expression across 160-something cell types. Each bar is a tissue or cell type, and the value is transcripts per million. I also want to mention that all the data are actually processed through the same pipeline. And I would say the majority, if not all, of the data were actually generated by Tom Gingeras's lab and Roderic Guigó's lab. Tom is actually sitting at that table. He will give a talk later today. Okay. So, the tissues are actually organized according to their tissue of origin. It's manually curated by my student.
And if you go further down on this website... let me step back a little bit. So, this picture, actually, you can download it and directly use it for your publications. You can click here and save a JPEG or a PNG. See, it's working, right? Pretty good. But if you use it for your publication, there might be too many rows. You really don't need that many bars, right? So, that's why my student devised this very smart function. If you go further down, you can see you can choose which tissues you want to display in the query, right? You can just use these big categories. You can look at all the RNAs or only the nuclear RNAs.
You can use the total RNAs or the poly-A-selected RNAs. Here is where you can select all the tissues or deselect all the tissues, right? So, we know it's a gene highly expressed in stem cells and some other pluripotent cells. So, let's go further down. Let me search for ES cell or, I think it's H7 stem cell. Choose one, two, three, four, five, six. Choose six random datasets. Go back to the top. Here you can update the graph, okay? Click this. Update. Voilà. Now you're plotting the gene expression for these six cell types, right? You put your mouse over a bar and it shows you the value. These values are also normalized, so you can directly use them for comparisons, right? So far so good? Can you follow the instructions? People in the back, can you see the screen?
Can you speak up? Cut off? Oh, the color, it's according to their tissue of origin. I think my student just gave them colors according to how she felt that day. So, yeah, it's to distinguish them. Yeah. I think they look pretty.
So, that's gene expression. Okay, your PI comes back: okay, so this gene is expressed in stem cells. Now show me what TFs or what components are located near the gene or in a certain region. So, you can go back to human. Option two, search for cis elements in a given region. You could search a hundred million, but please don't. Let's just be civil. Let's start with just chromosome 1. Let me try one million to one point five million. Click submit. Probably should have used a smaller window. So, whoever got something displayed on their computer, raise your hand. Okay. Yeah. Good job.
So, on this page, there are several tables. The first table is all the DNase I hypersensitive sites. Of course, they can mark both enhancers and promoters, or in general any TF binding sites, right? There are two columns in this table. The first one is the coordinates for this open chromatin. The next column is in what tissues this DNase I site is present, right?
So, you can see this is open chromatin in many, many tissues. In kidney, in all that are K-2, all the cell types, all the tissues. If you go further down. Okay. So, there is another table called TF binding sites. This shows you all the TF binding sites in all the tissues surveyed by ENCODE. Probably we will have a more updated version soon, but this is the version we had as of, I think, late last year. This table has three columns. Again, the first column is the coordinates of the TF binding sites. The second one is which TF we are talking about here: Pol II, Rad21, or this guy. Sometimes, when you see "multiple", it means there are more than three or five TFs binding that region, so we don't specify all of them. And then column three is in what tissue this TF binding is present. So, you can see Pol II is binding in GM12878. K-2 has these binding sites, and so on and so forth, right? And then, if you do have a bioinformatician in your lab, you can save this file as a CSV file. It's a text file. You can directly manipulate it and do your comparisons with some other features you care about. Pretty straightforward function.
So, option three: say you forgot the gene locus, right?