Welcome back. For the next little while before lunch, I'm going to try to get through as much of network visualization as we can cover. I'm going to go over a few slides to introduce some concepts, give you some tips, and tell you a little bit more about network visualization. This is mostly based on the ideas presented in the how to visually interpret biological networks primer that was distributed before the class, so if you read that, this will be familiar. Then we'll move to an interactive demo of Cytoscape, which is free network analysis and visualization software developed by many people, including us. It's a fairly powerful tool, so I'll spend the rest of the time going through the basics and answering questions about specific things. I'm going to cover just a few topics, so the lecture is not going to be that long, and we'll be able to get into the demo pretty quickly. Then I think in the afternoon we'll go into even more detailed material. So the next part is going to be the basics. We heard about networks in various contexts. You can have molecular interaction networks. You can have functional association networks. And pathways are types of networks. There's a lot of network data in biology. Networks are generally useful, as we heard, for figuring out relationships in your data. They're particularly useful when you visualize those relationships as a visual network and interact with it. You can move the nodes around. You can see how things are connected, how pieces of data are connected to each other. There might be specific regions that are interesting, which we'll talk about. But for networks to be visualized, you have to lay them out, and there's this idea of automatic network layout. If you just took all the connections between all of your data and plotted them fairly randomly, it would look like this.
And it's not very easy to interpret. So this is before layout. And this is the same network after it has been laid out. The layout serves to reduce the overlap between nodes and edges, so it tries to arrange the nodes so that there's as little overlap as possible. In effect, most of the time it puts nodes that are highly connected closer together, and nodes that are not very connected, like this node and this node, end up further apart from each other. So that's the general idea. You can do this manually if you want, but it would take a very long time, so people have developed a number of different automatic layout algorithms to lay the information out. I'll show you those interactively in Cytoscape later. I should have shown some more pictures of this, but the layout algorithm that we used for this network is called force-directed. It's a general class of layout algorithm that goes by various names. The idea is that the algorithm treats the network as a physical system where nodes push away from each other, so there's a repelling force between them, and edges, the interactions between the nodes, pull them back together. The force-directed layout algorithm simulates this physical system of opposing forces: it calculates all the forces, updates the node positions, and iterates, and eventually you get a network where the forces are roughly balanced. It's sort of like if you had a real network of springs and magnets and you threw it up in the air: it would all jumble around and land, and it would probably end up organized similarly to a force-directed network.
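As a rough illustration of the repel-and-pull idea just described, here is a minimal sketch of a force-directed layout in plain Python. It is not the algorithm Cytoscape uses. The force formulas (repulsion proportional to k²/d, attraction proportional to d²/k), the preferred distance `k`, the step size, and the cooling rate are all assumptions chosen for illustration, and the two-triangle example network is hypothetical.

```python
import math
import random


def force_directed_layout(nodes, edges, iterations=200, seed=0):
    """Toy force-directed layout: every pair of nodes repels, every edge pulls."""
    rng = random.Random(seed)
    # Start from random positions, i.e. the "before layout" picture.
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # assumed preferred distance between nodes
    step = 0.1                       # maximum move per iteration (assumed)

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Repelling force between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[a][0] += dx / dist * force
                disp[a][1] += dy / dist * force
                disp[b][0] -= dx / dist * force
                disp[b][1] -= dy / dist * force

        # Attracting force along every edge.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force

        # Move each node along its net force, capped at the current step size.
        for n in nodes:
            dx, dy = disp[n]
            length = math.hypot(dx, dy) or 1e-9
            move = min(length, step)
            pos[n][0] += dx / length * move
            pos[n][1] += dy / length * move

        step *= 0.98  # shrink the step so the layout settles down

    return pos


# Hypothetical example: two separate triangles.
example_nodes = ["a", "b", "c", "d", "e", "f"]
example_edges = [("a", "b"), ("b", "c"), ("a", "c"),
                 ("d", "e"), ("e", "f"), ("d", "f")]
example_layout = force_directed_layout(example_nodes, example_edges)
```

In a layout like this, connected nodes should end up close together and unconnected groups should drift apart, which matches the highly-connected-nodes-closer-together behaviour described above.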
So there are other types of network layouts, like hierarchical, which I'll show you a picture of later. If you have data that looks like a tree, like the gene ontology networks that we talked about yesterday, or a phylogenetic tree, for instance, there are automatic layout algorithms that put the root of the tree at the top and give a more tree-like result. But in general, force-directed layouts are the general-purpose ones that work best for most networks. So just a couple of tips here for how to work with networks. Automatic network layout is good for networks that are not very, very big. Once the networks get too big, and you've seen some examples of this, like that big orange hairball network that Lincoln put up, which I actually made a long time ago and which is now the example of what not to do with network layout, you get a really big hairball and you can't see what's going on in it. So network layout algorithms work really well for smaller networks, up to 500 or 1,000 nodes. Something like this works really well, as long as there aren't too many edges connecting everything. If there was an edge connecting everything to everything else, you would never be able to get rid of the edge crossings, so all the edges would be crossing each other and it would look more like this. This is something that happens all the time: people use these automatic network layouts and they still get a hairball. That happens when there are too many nodes and edges and the network layout algorithm can't figure out a way of presenting the data clearly. So to avoid hairballs, it's a good idea to reduce the number of nodes and edges. One way to do that, and I'll show you how to do it in Cytoscape, is to just focus on a specific area of interest.
Or, if your edges carry a score, like in the functional association network that Lincoln showed, where the edges represent different types of relationships with a score associated with them, you can just look at the best-scoring edges. So if you have some score that represents the confidence of a protein interaction, and you're getting too many edges, then just remove the edges that are less confident, look at the top-confidence ones, see how the structure of the network looks, and gradually add information back. So there are a number of different types of layouts, which I'll mention. As I said, force-directed is probably the best general one, so that's probably always a good one to try first. And then there are other types of layouts, like hierarchical, as I mentioned, which work better for tree-like networks. Automatic network layout can do a pretty good job with many networks, but if you really wanna make a publication-quality network figure, once you've figured out how to visualize your data, it's a good idea to manually adjust the network layout by moving the nodes and edges around, and I'll show you how to do this. Also, when you're finished, you can load the network into a drawing program. In a lot of the figures that you see in papers, the network has been loaded into a drawing program, and then people add things like arrows or highlights, and they might move labels around. So that's a useful tip. This presentation is leading into a demo of Cytoscape. Cytoscape allows you to zoom in to a focused network, as I'll mention. The other thing Cytoscape allows you to do is color the network and visualize it in lots of different ways. One of the things that most people wanna do is visualize gene expression data on their network, and Cytoscape allows you to do that.
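The tip above about keeping only the best-scoring edges can be sketched in a few lines of Python. This is only an illustration of the idea, not how Cytoscape's filters are implemented. The edge list, the scores, and the function names are hypothetical.

```python
def filter_edges_by_score(edges, min_score):
    """Keep edges whose score is at least min_score, plus the nodes they touch."""
    kept = [(a, b, score) for a, b, score in edges if score >= min_score]
    nodes = {n for a, b, _ in kept for n in (a, b)}
    return kept, nodes


def top_scoring_edges(edges, n):
    """Keep only the n highest-scoring edges."""
    return sorted(edges, key=lambda edge: edge[2], reverse=True)[:n]


# Hypothetical scored interactions: (node, node, confidence).
scored_edges = [
    ("geneA", "geneB", 0.95),
    ("geneA", "geneC", 0.40),
    ("geneB", "geneD", 0.80),
    ("geneC", "geneD", 0.10),
]

# Start strict, then lower the threshold to gradually add information back.
confident_edges, confident_nodes = filter_edges_by_score(scored_edges, 0.75)
```

Lowering `min_score` step by step and re-running the layout each time is one way to see at which point the structure starts turning into a hairball.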
But to do that, you have to understand three basic ideas. One is that the networks that we talked about, nodes connecting to each other, are just the network information. In addition to that, there's usually a lot of extra information associated with the nodes and edges. For instance, if a node represents a gene, you can have gene expression data across multiple conditions, or protein peptide counts, or other information associated with the node. And the edges can be associated with information like the type of interaction, whether it's a co-expression link or a pathway link or a literature link. An edge could have a weight on it, as Lincoln discussed, and that weight can represent a confidence value, as I mentioned. So maybe very high confidence values indicate strong interactions and low confidence values indicate weak interactions. All of that data is associated with the network. You can pull in gene ontology terms, things like that. And then to visualize that, there are a lot of different types of visual properties. In Cytoscape at least, there are all these different types of lines that you can use to represent edges. You can have nodes that are circles or squares. For arrows, you can have different types that represent different types of information: an arrow represents a directed interaction, as Lincoln mentioned, or a little T symbol, like here, might represent an inhibition interaction in a network. These are all different types of visual attributes. Anything that you can imagine in terms of visual attributes, color, shape, size, borders, whether the thing is transparent or not, those are all visual attributes. And what you wanna do is think about these two types of things and ask, okay, how do I wanna represent node and edge attributes as visual attributes?
Figuring that out is really a creative process, and by using Cytoscape later, we'll see exactly how to do it in practice. But you have to say, okay, I think it'll be a good idea to represent gene expression data as color. And not just a color, but a color gradient from red to white to blue or something, where red indicates low, under-expressed, white means the expression isn't changing, and blue means it's highly over-expressed. Once you decide that, you can map your node and edge attributes to visual attributes. That mapping is something that you have to figure out. And what you can get is something that looks like this. Here's an example of a network that we mapped a number of different types of information onto. This was presented in that Nature Biotechnology primer that we passed around. In this case, we represented some aspects of gene function from the gene ontology, specifically cellular location information. All the nodes represent proteins, in this case from yeast, and the interactions are protein-protein interactions. We colored the nodes based on general functional categories, which relate mostly to complexes and parts of the cell where these proteins are found. So this is the kinetochore, nucleosome, replication fork. The lines between the nodes are just straight lines, but they have different thicknesses. The thickness represents some information from gene expression data that we had, which was how correlated the genes for these two proteins are in gene expression data across multiple conditions. So if you see a thick line between two genes, it means that those two genes are always co-expressed at the same time or not expressed at the same time, so they track each other. And then for the last visual attribute, we had data about how high the expression levels were at their maximum across an experiment.
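To make the mapping idea concrete, here is a small Python sketch of two continuous mappings of the kind described above: expression to a red-white-blue color gradient (red for under-expressed and blue for over-expressed, following the convention as stated in the talk) and correlation to line thickness. This is not Cytoscape's implementation. The value ranges, the width range, and the function names are assumptions for illustration.

```python
def linear_map(value, in_min, in_max, out_min, out_max):
    """Linearly rescale value from [in_min, in_max] to [out_min, out_max], clamped."""
    value = max(in_min, min(in_max, value))
    fraction = (value - in_min) / (in_max - in_min)
    return out_min + fraction * (out_max - out_min)


def expression_to_color(value, low=-2.0, high=2.0):
    """Map an expression value onto a red-white-blue gradient as '#rrggbb'.

    Assumes low < 0 < high: low maps to red, 0 to white, high to blue.
    """
    value = max(low, min(high, value))
    if value < 0:
        t = value / low      # 0 at white, 1 at full red
        red, green, blue = 255, round(255 * (1 - t)), round(255 * (1 - t))
    else:
        t = value / high     # 0 at white, 1 at full blue
        red, green, blue = round(255 * (1 - t)), round(255 * (1 - t)), 255
    return "#{:02x}{:02x}{:02x}".format(red, green, blue)


def correlation_to_width(correlation, min_width=1.0, max_width=5.0):
    """Map a co-expression correlation in [0, 1] to an edge thickness."""
    return linear_map(correlation, 0.0, 1.0, min_width, max_width)
```

The same `linear_map` helper could be reused for a third mapping, such as maximum expression level to node size.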
This was an experiment of gene expression over the cell cycle, I think, and certain genes were expressed really highly at some point, and certain genes didn't really get up to very high expression. So the bigger the circle, the higher the expression. Just by visually interpreting this, you can get quite a lot of information out of it, and there are three main patterns that we've found are useful with any type of network. One is the guilt by association idea that we talked about earlier, and you'll hear about it again from Quaid later. That is, things that appear close to each other in the network are more related to each other than things that are farther away in the network. You can see here that the guys that are colored the same are in the same area of the network. That's not really an accident: they're all functionally related and connected to each other by protein interactions. You can use this concept to predict gene function: if you don't know the function of one of these genes, but it's close to a bunch of other things whose function you do know, then you can predict its function. Quaid's gonna go over that in a lot more detail. So that's one idea, guilt by association, and it can often work with any type of attribute that's related to the interactions or edges that you're visualizing. Another idea is dense clusters. You can see here that there are a few different parts of this network. Not everything is connected to everything else. There's a dense cluster here. These things are all interconnected. These things are all interconnected. And in this protein-protein interaction network, those represent protein complexes. So dense clusters often mean something. Lincoln mentioned this before in his talk.
If this network represented a social network, like the Facebook network, dense clusters would be cliques, groups of friends that are all friendly with each other. In a protein interaction network, they often represent complexes or parts of pathways. And depending on your network, they could represent other things. So those are things to look for as well. And then finally, you might just be able to see global relationships, for example that the replication fork is not as connected to the kinetochore as the nucleosome is. Those types of relationships might give you some sense of how closely connected general processes are, and some of those might be non-trivial. So it gives you a good summary. Whenever you're looking at a network, you can use those three ideas to try to interpret it. There are actually a few different ways of representing networks. We're not really gonna go over this too much, but some of these might be useful for you. Typically, most people will work with networks in two ways. One way is a list of relationships. This is if you have a spreadsheet with gene A connected to gene B, gene B connected to gene C, and you just have a big list of those things. That is the default way that people store the information on their computer. And you can have weights or other attributes associated with those. So here we have A1 connects to A2 with a weight of one, and A1 connects to A3 with a weight of three. Most of these relationships are not directed. They've got an arrow on both sides, which in this case means undirected, but one of these is directed. So you can record a type for each relationship: directed, undirected, or other types.
The blue and green ones here are highlighted in blue and green in these other representations too, so you can see what they look like in different representations. But this is the core concept that you really have to understand to use network analysis tools: you represent the data as columns in a spreadsheet, column one, column two, column three, column four, and you can add more columns that represent additional attributes. In this type of format, where you have A connects to B, these additional attributes are associated with the relationship. So the three here is associated with A1 connects to A3. Three is an edge attribute in this case. You can also have a separate table of node attributes where you store your information about nodes. So this is what this network looks like as a list. Sorry, this is what this list looks like as a network. And obviously it's much clearer what the relationships are. This really illustrates the power of network visualization. Some people also might want to represent networks as a heat map. We're not really going to cover this too much, but you might see it, and it might be useful for you. You can think of the network as a list, you can think of it as this network picture, and you can also think of it as a matrix where on one side of the matrix you have all the nodes and on the other side you have all the nodes as well. You put a color or a number in the matrix where node one connects to node two. So in this case, A1 connects to A3, and you can see A1 connects to A3 here. The higher the strength of the connection, the darker the red in this heat map view. We've also clustered this heat map.
So we put columns that are similar to each other close together, and that helps you visualize things that are related in this heat map. A number of papers actually use this representation and find it useful. If you're interested in when to use this one versus this one: in general, 95% of the time people are interested in looking at networks, but networks are only good when the connections are sparse enough. If everything is connected to everything else, a network is not a good way of representing it, and then this heat map view actually becomes quite useful. Conversely, the heat map view is not great for sparse information because it wastes a lot of space. All of this light yellow color here means that there's no connection between those nodes. You can see there's quite a lot of it, and it's really just using up space that is not useful for you. So in this case, the network is the more efficient representation. So that's the relationship between them. Any questions so far? Okay, so that was very basic, just some general concepts behind network visualization. It's not really that complicated, so it didn't take that much time to go through all of the information. The key things: automatic layout is really required. It's the first thing that you need to do to visualize networks. Once you do that, you can start thinking about what the network is telling you, basically trying to find interesting relationships in your data. Hairballs, these highly connected networks that you can't really interpret, can be avoided by focusing your analysis. And you can visualize a lot of different information at once, which is useful, though I didn't really mention it.
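The list and matrix representations discussed above are easy to convert between. As a small Python sketch, this turns a weighted list of relationships into a node-by-node matrix and computes how dense the network is, which is one rough way to judge whether a network view or a heat map view is more suitable. The two example edges come from the talk (A1 to A2 with weight one, A1 to A3 with weight three). Treating every edge as undirected and using 0 for "no connection" are simplifying assumptions.

```python
def edge_list_to_matrix(edges):
    """Convert (source, target, weight) rows into a symmetric weight matrix."""
    nodes = sorted({n for a, b, _ in edges for n in (a, b)})
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0.0] * len(nodes) for _ in nodes]
    for a, b, weight in edges:
        matrix[index[a]][index[b]] = weight
        matrix[index[b]][index[a]] = weight  # assumed undirected
    return nodes, matrix


def density(matrix):
    """Fraction of possible node pairs that are actually connected."""
    n = len(matrix)
    possible = n * (n - 1) / 2
    if possible == 0:
        return 0.0
    present = sum(1 for i in range(n) for j in range(i + 1, n) if matrix[i][j] != 0)
    return present / possible


# The list representation: one relationship per row, weight as an edge attribute.
relationships = [("A1", "A2", 1.0), ("A1", "A3", 3.0)]
node_names, weights = edge_list_to_matrix(relationships)
```

A density near 1 means almost everything is connected to everything else, where the heat map tends to work better. A density near 0 means most of the matrix would be empty cells, where the network view is the more efficient picture.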
I forgot to mention this, but when you visualize multiple different kinds of data on a network all at once, it is very useful for seeing the relationships among not just the genes, but also all of the other data that you're visualizing. So it integrates a lot of different information together, and you can quickly, and hopefully easily, see the relationships. Okay, so any questions? I'm gonna switch to Cytoscape and go through a demo covering the basics. Actually, I'll talk a little bit about Cytoscape first and then go into the demo. In the slides that you have, I'll present a few slides about Cytoscape, and then all the rest of the slides in the book are just copies of what I'll be showing you live. They're there for your information so you can reference them during the lab. I tried to take screenshots of the menus that you have to use to access different features, so you can have them as a reference. Okay, so Cytoscape, as I mentioned, is network visualization and analysis software, and you've hopefully all installed it on your computer. This is what it looks like, and it provides a lot of different functionality for literature mining, gene ontology analysis, and searching for regulatory motifs in networks. There are a lot of different types of information available. It's made available by a number of different groups. There are actually nine academic and industry groups that have collaborated to build this software since about 2001. The basic idea for using this type of software, and network analysis in general, is that you wanna collect information about relationships among, say, the genes in your list from different sources: databases, literature, expert knowledge.
Expert knowledge is where you might know something about relationships among your genes that nobody else knows, so you put that in there along with your own experimental data, and you collect that all together. Usually you're collecting this in a spreadsheet like Excel, or in tab-delimited files, but you can also load it from different databases, and then you visualize it and analyze it as a network in Cytoscape. So Cytoscape allows you to manipulate networks. You can open them and visualize them. It provides different types of automatic layout. It allows you to filter and query your network, for example, just give me everything involving protein kinases. It allows you to search different interaction databases, which we'll talk about, so you can pull in protein interaction or pathway information from different sources. That's the basic functionality of Cytoscape, but there's a really big, active community around Cytoscape that makes available quite a lot of user information, tutorials, case studies, and documentation. Also, a lot of the analysis functionality comes from plugins. These are things that you can download that extend the functionality of Cytoscape. When you download Cytoscape by itself, it does a really good job of visualizing networks and it can pull in data from different sources, but if you wanna do a lot of network analysis, you have to download additional plugins, and we can talk about that. Okay, so that's basically it before we move to the demo. Does everyone have Cytoscape installed on their computers? How many people don't have Cytoscape installed on their computer? Has anyone not verified that Cytoscape was working on their computer and tried it out? Has everyone tried it out? Okay, so there's one caveat that a couple of people had problems with.
There might be a couple of different ways of starting Cytoscape, and if you double-click on the Cytoscape JAR file, it won't load any of the plugins. It'll look like Cytoscape is working, but the first thing that will tip you off is that this window at the bottom will be missing, because it is actually a plugin. So depending on how you installed it, you have to start Cytoscape using the Cytoscape icon or, if you're on Windows, the batch file, or something else that will load the plugins. Okay, so I'm gonna give you a tour of Cytoscape, and then in the lab we have some specific instructions about which files to load. I'll be loading similar files, but for now you can mostly watch me to see what Cytoscape can do, and ask questions if there's anything interesting. Usually once you start Cytoscape, you wanna load a network from somewhere. Probably the simplest way of loading a network, if you have one from some other source that you've been collecting and you have it available as a text file or an Excel spreadsheet, is to click on load a network from a table. Let me try this here. I'm gonna load files that are available in Cytoscape's sample data directory. On my machine I've got lots of versions of Cytoscape installed, but they all have a sample data directory, and that directory is filled with all sorts of little files that you can try out. A lot of them start with galFiltered. That is a network of protein-protein interactions and protein-DNA interactions that was derived from a paper almost 10 years ago that looked at how knocking out different transcription factors, or regulators of galactose metabolism, affected the network, and there's some gene expression data associated with it.
So this is just a sample, and it's a reasonably sized network, so you can use it to get an idea of how Cytoscape works. There are a lot of different versions of this galFiltered network in here. I'm gonna see if I can load an Excel file. Okay, I forgot that the Excel one here is only for nodes, so I'm gonna import a network from multiple file types and select the SIF format. SIF is the simple interaction format. It's basically just like that list representation that I showed you in the presentation. There's also GML, which is a more complicated format, and XGMML, which is an even more complicated format. Those are all network file formats, and SIF is the simplest. So I'm gonna import this here. It says that 331 nodes and 362 edges were loaded, and the first thing you see is that the network is randomly organized. As I mentioned, you have to go and lay the network out. The layout menu has lots of different types of layouts, so I can try the circular layout, which organizes everything in a circle. Let me just make this a bit bigger. There are quite a few of the force-directed layouts that I mentioned. Some of them are actually called force-directed, like this Cytoscape layout called force-directed. So that looks pretty good. Some of the layouts are not called force-directed, but they are actually force-directed. The yFiles Organic layout is one of the nicest layouts, and that looks pretty good. So there's a version of force-directed that's called organic, and the spring-embedded layouts are also a type of force-directed. So there are different types of force-directed. Here's a hierarchical network layout. This isn't really a hierarchical network, so it doesn't look great. I'm gonna go back to the organic layout. Okay, so those are your layout algorithms.
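Because SIF is just the list representation written to a text file, it is simple to read outside of Cytoscape too. Here is a minimal Python sketch of a SIF reader, assuming the usual shape of a SIF line: a source node, an interaction type, then one or more target nodes, with a line containing only a node name meaning an unconnected node. The sample lines below are illustrative and not copied from the galFiltered file. The `pp` and `pd` labels are assumed to stand for protein-protein and protein-DNA interactions.

```python
def parse_sif(lines):
    """Parse SIF lines into a set of node names and a list of edges.

    Each edge is a (source, interaction_type, target) tuple.
    """
    nodes = set()
    edges = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on tabs if the line uses them, otherwise on any whitespace.
        fields = line.split("\t") if "\t" in line else line.split()
        source = fields[0]
        nodes.add(source)
        if len(fields) >= 3:
            interaction = fields[1]
            for target in fields[2:]:
                nodes.add(target)
                edges.append((source, interaction, target))
    return nodes, edges


# Illustrative lines in SIF style.
sample_sif = [
    "YKR026C pp YGL122C",
    "YGL122C pd YOL123W YHR051W",
    "YDR009W",
]
sif_nodes, sif_edges = parse_sif(sample_sif)
```

Counting `len(sif_nodes)` and `len(sif_edges)` on a real file is a quick way to check that what you read matches the node and edge counts Cytoscape reports after import.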
The next thing you probably wanna do is interact with the network. If the network is fairly big, you won't be able to see much, so you have to zoom in to see what's going on. There are some mouse shortcuts for this, but I'm just gonna click on this zoom button here, and eventually as I zoom in I start seeing the nodes and edges, and you can actually see the labels on those nodes. I can click and move these things around. I can click to select a set of them and move them around. If I wanna move to different parts of the network, I can use this navigator window here. You can click this little box and move it around to see different parts of the network. So if I want to zoom up here to see these little guys that are separated, I can do that. If I wanted to rotate this, you can go to the layout menu and choose rotate, and you get a little box that pops up here that allows you to rotate the whole network. If I click on this little checkbox, rotate selected nodes only, you can rotate just those. You can scale them, oops, I need to check scale selected nodes only. You can basically make the edges in general longer or shorter. There's also align and distribute. This is gonna mess up the layout of this network, but I can just align them all. So these little options here allow you to work with manual layout if you're not satisfied with the automatic layout. Basically, manual layout means that you're gonna click and drag nodes. If you want to align them in a line, or scale them, you can use some of these tools. Some of those are useful. So I'm gonna turn this off and re-lay out the network. Okay, you can also select a bunch of nodes and then zoom in to the selected region.
Okay, so one of the things that you'll notice if you work with networks a lot, especially bigger networks, is that Cytoscape doesn't show you all the information that's being visualized when you zoom out to a high level, and it does that for performance reasons. That way you can interact with a network without it drawing all the details, because at this level you can't really see the details anyway. But as I zoom in here, eventually the node labels become visible, right there. If I zoom out a bit, the node labels disappear. That level of zooming and the level of detail that you get are controlled by different preferences in Cytoscape. Just one tip: if you wanna always show the detail, in the view menu there's an option called show graphics details, and it'll force the labels to be on all the time. That might be useful for you, especially with very big networks.

So in this view, the length of the bridges between the nodes, the length of the edges, is created automatically and arbitrarily to make it into a nice-looking mesh? And so here, by moving the nodes around, you're not in any way changing them?

Correct, correct. Yeah, that's a really important point I forgot to mention about automatic layout. Most automatic layouts by default don't represent any information in the length of the edges. But some do. If you have a weight associated with the edge, you can make edges with stronger weights shorter and edges with weaker weights longer. I'm not really gonna go into an example of that, but there is this Cytoscape edge-weighted force-directed layout and there's also an edge-weighted spring-embedded layout, and those can be useful if you have weights on the edges. Okay, so that's the basics of moving around this network.
Okay, so I mentioned that the basic information is about the network, so if I click on these, there's really not much more information loaded into Cytoscape right now other than nodes and connections. To load more information, I need to load attributes, and there is an Excel file available in that sample data directory that I'll load attributes from. In this import menu, I already used import network from multiple file types, so there are various network file types that Cytoscape loads. You can also load a network from a table or Excel file, and you can load a network from the web, which I'll go over later. This section here is for importing information about attributes. So I'm gonna load attribute information from a table or Excel file, and if you've ever loaded text files into Excel, this interface is very similar. Let me just cancel out of this first. You can load node or edge attributes here. You just have to select which one you're loading. You can even load network attributes. So I'm gonna load node attributes. The files that I'm actually using are listed in the lab slides in your books, but I'm gonna open this galExpData.pvals file, which stores a bunch of expression data about the nodes that I loaded. When I click open, you'll actually be able to see what the file looks like. This file has the gene, some common name of the gene, and then some expression data, and this is showing you what it looks like here. You can make that bigger. By default, importing attributes from tables in Cytoscape assumes a tab-delimited file format. In this particular case, the separators between these numbers and values are spaces. So I have to click this little box here, show text file import options, and then I get a bunch of options about whether the delimiter is tabs or something else. So I'm gonna click space.
As soon as I click that, Cytoscape recognizes the format and puts the values into separate columns. I also notice that the first line of this file has names for the columns, and by default this loading panel doesn't use them. So I'm going to click this button here, Transfer First Line as Attribute Names, and start the import at row one. As soon as I click that, those names jump up here. Now the columns are named something that I like. One more thing when you're loading this: there might be columns that you want and columns that you don't want to load. One of the problems that I'm going to face here is that some of these columns are named the same. So this is called gal1RG and this is also called gal1RG. If I try to import this, Cytoscape will give me an error. It says you can't have two columns with the same name. So you see these little check boxes here; you can click on a column and it will select it for import or not. So I can just click back and forth, and I'm going to get rid of these guys because they're named the same. This is the information about fold change that I want to import. This is information about the p-value of the fold change, how significant the fold change is. I'm not going to worry about that right now. If I wanted to import that as well, I'd have to actually rename these columns in a spreadsheet. Okay, so I think I'm done here. There is one other thing that's fairly important, which I don't need to do here, but I'll tell you why. I'm going to turn off the text file import options and click Show Mapping Options. So when you've loaded a network, your nodes have names; say those are gene names, gene symbols. Often people will use gene symbols for the names of the nodes. Those names have to be unique.
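The duplicate-name problem can be checked before importing. The following Python sketch is only an illustration of the idea, not how Cytoscape does it: it reports repeated column names and picks the first occurrence of each name, which matches the choice in the demo of keeping the fold change columns and unchecking the identically named p-value columns. The header list and helper names are hypothetical.

```python
from collections import Counter


def duplicated_names(column_names):
    """Return the column names that appear more than once, in first-seen order."""
    counts = Counter(column_names)
    seen = []
    for name in column_names:
        if counts[name] > 1 and name not in seen:
            seen.append(name)
    return seen


def first_occurrence_columns(column_names):
    """Return indices of columns to import, skipping later repeats of a name."""
    seen = set()
    keep = []
    for index, name in enumerate(column_names):
        if name not in seen:
            seen.add(name)
            keep.append(index)
    return keep


# Hypothetical header: expression columns followed by p-value columns
# that reuse the same names.
header = ["GENE", "COMMON", "gal1RG", "gal4RG", "gal80R",
          "gal1RG", "gal4RG", "gal80R"]
repeats = duplicated_names(header)
keep = first_occurrence_columns(header)
```

Renaming the repeated columns in a spreadsheet (for example by adding a suffix to the p-value columns) is the alternative when both sets are needed.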
You can't have two nodes with the same identifier visualized as two nodes. Any time Cytoscape sees a node with an identifier it already has, it will just consider them the same thing. If you want to have multiple names for the same thing, you can load those names as attributes and visualize the attributes separately. But the identifiers for the nodes in Cytoscape have to be unique. And if you want to load attribute information on nodes and edges, you need to understand what type of identifiers you used for your nodes and have those matched in the file that stores the node attributes. In this case I was lucky, because the first column in the file that I loaded has gene symbols, in this case these yeast gene symbols, and the node IDs in the network that I loaded use the same symbols. So Cytoscape can recognize that YHR051W over here is the same as YHR051W over here, and it will then link all of this data to that node. If the first column of your data is something else that's not an ID, then this import won't work immediately. If that's the case, you can click Show Mapping Options and select the column that is the primary ID that should be used to match your nodes up. So just to repeat: if you always have your primary ID in the first column, you don't have to worry about this. But if you don't, and you find that your attributes are not loading, this could be one of the problems: you just haven't matched up the names properly. So I'm going to leave that because it's fine, and I'm going to import. Okay, so everything imported. One of the problems with the current version of Cytoscape is that it doesn't tell you that everything imported correctly, so you don't actually get any feedback. This is a user interface bug that we have to fix.
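The matching step described above amounts to a join on the primary ID column. Here is a small Python sketch under that assumption; `attach_attributes` and the sample data are hypothetical. It also returns the keys that matched no node, which is the kind of feedback you would want when attributes silently fail to load.

```python
def attach_attributes(node_ids, names, rows, key_column=0):
    """Join attribute rows onto nodes by the value in the key column.

    Returns (attributes_by_node, unmatched_keys). The key column itself
    is not stored as an attribute.
    """
    known = set(node_ids)
    attributes = {}
    unmatched = []
    for row in rows:
        key = row[key_column]
        if key not in known:
            unmatched.append(key)
            continue
        attributes[key] = {
            name: value
            for index, (name, value) in enumerate(zip(names, row))
            if index != key_column
        }
    return attributes, unmatched


# Invented example: the ID is in the second column, so key_column=1.
node_ids = ["YHR051W", "YKR026C"]
names = ["COMMON", "GENE", "gal1RG"]
rows = [
    ["nameA", "YHR051W", "-0.034"],
    ["nameB", "YKR026C", "-0.154"],
    ["nameC", "YXX999W", "0.500"],  # no node with this ID in the network
]
attributes, unmatched = attach_attributes(node_ids, names, rows, key_column=1)
```

If `unmatched` contains every key, the wrong column was probably chosen as the primary ID, or the file uses a different identifier type than the network.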
And so I'm going to have to show you where that data went, so you'll know and you can look for it next time when you're loading your own attributes. So this panel here we haven't really used much. This panel here is pretty important because we're using it to navigate, and this panel I haven't talked about, sorry, this panel just has the list of networks that I've loaded. You can load multiple networks and they'll be shown here. Yeah? So if you load multiple networks, what happens? They'll be in different windows, and you can use this panel to go back and forth between them. Okay, so I'll show you that. Yeah. Do you have existing? No, you can do that if you want, but that's merging networks. That's a separate operation. So that's a good question. If you have lots of different networks up here, ideally, and I can test this, if you select the network here, well, actually the attributes are fairly global. There is a way, I can't remember the exact way of doing this, but if I can't figure it out very quickly, then I'll stop and figure it out later. Oops, that's the wrong import. I'm just trying to answer this question quickly. Yeah, I think that in general the attributes are loaded globally, and if you have the same IDs in multiple networks, all the networks will get all the attributes. There are different ways to avoid that, but that's actually something for a future version of Cytoscape, where you're going to have control over what you load. Yeah. Yeah, so I can talk to you about that later. That's the general way Cytoscape works: the attributes just get loaded. Okay, so I've loaded my attributes and you can't see them now. So I'm just going to talk about this panel here. This is the data panel.
This is where you can see the data that you've loaded. You can click on this button right here, which is meant to represent different columns, and if I click on it, I can see the columns that are available, and none of them are selected. It'd be ideal, I think, if they were selected automatically. I loaded the gene name, the common name, and I loaded these gal1RG, gal4, gal80 columns. I can click on those and on the common name as well, and then I click outside of the box to get rid of it, and there they are. You can see that they've been loaded here. That showed up because I had selected one of these nodes. By default, when I click on a node it selects it and colors it yellow. I can also select a bunch of different nodes, and as soon as I select them in the network, all of their attributes become visible here and I can scroll up and down and see information about them. You can also select nodes in here, and as you select them, they'll be colored green, so that's an additional selection mechanism. If you loaded the wrong attributes, you can delete attributes, so I can delete some of these and they'll be gone, and you can start again. You can also create a new attribute here, and this is completely editable. It's a little bit like Excel in that you can click on these things, and if you double-click you can actually change the name of things. So there I changed the name. I can change these numbers if I want. I can create a new attribute and just type information in. It's usually not very efficient to do that. Usually you're loading it in from Excel. And there are some other functions over here that are less used. Okay, so I have a network, I have attributes, and right now I have node attributes showing here. I can also select edges.
So when I select edges, in this case they turn red, and I can click on the edge attribute browser here, this little tab, and see attributes that I've loaded for edges. In this case, there's only one attribute, which is the interaction type that was in that SIF file, and there are different types of edges here. There's pp, which stands for protein-protein, and pd, which stands for protein-DNA. You can name your interactions anything. Cytoscape doesn't know or care what information you're loading in. You can load any kind of network. You can name your edges and nodes anything you want. It doesn't have to represent genes or proteins. It could represent anything, people or anything. You just make up these names. They just have to be consistent. Okay, so maybe I'll just do attribute mapping right now. I have the nodes and edges, and I have the attributes here. I can click back and forth between the edge and node browsers. This is important to recognize, because some people accidentally click on edges and then they can't see their node attributes, so they need to go back. And now I want to visualize these attributes as visual properties in Cytoscape. I want to do that mapping. This is pretty much the unique, most powerful feature of Cytoscape, this VizMapper. So if I click this tab up here, the VizMapper, I can manage visual styles. This mapper maps data attributes to visual attributes, but it also manages the default styles. You can save different styles. A style is a set of mappings, and you can define multiple mappings. The default visualization is shown here. So this is showing one interaction. If I click on this, I can change the default appearance. I can change node properties, edge properties and global properties.
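For reference, a SIF line has the form `source interaction target [target ...]`, and the interaction label is free text such as pp or pd. The Python sketch below reads that shape, assuming whitespace-delimited lines and node names without spaces; the function name and the sample lines are invented. Collecting nodes in a set mirrors the rule that a repeated identifier is the same node.

```python
def parse_sif(text):
    """Parse SIF-style lines into (nodes, edges).

    Each edge is a (source, interaction, target) tuple. A line holding a
    single name declares an unconnected node.
    """
    nodes = set()
    edges = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        source = fields[0]
        nodes.add(source)
        if len(fields) >= 3:
            interaction = fields[1]
            # One line may list several targets for the same source.
            for target in fields[2:]:
                nodes.add(target)
                edges.append((source, interaction, target))
    return nodes, edges


# Invented network; "pp" and "pd" are just labels the file author chose.
sample = "geneA pp geneB\ngeneA pd geneC geneD\ngeneE\n"
nodes, edges = parse_sif(sample)
interaction_types = sorted({kind for _, kind, _ in edges})
```

Nothing in the parser knows what pp or pd mean, which is the point made above: the labels only have to be used consistently.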
And this is a good place to play around with Cytoscape and just try and see what the different visual properties do. It shows you what they look like and what they all are. You can change them and then see that updated in this picture and also in the network. So I'm going to change the node fill color to something else, like green. That changes the color. Now all the nodes are green back here. I'm going to change the border color to blue, and now the borders are blue. You can change the node size, node label, all of these different attributes. You can even create a tooltip if you want, so when I mouse over the node I get these little tooltips. In this case it's not very useful if I just call them all "hello", but you can map your data to tooltips and then see that particular type of data coming up. Let me get rid of that one. And then you can change the node label position. If you click here, this is for editing the node label position. I can take my label and just drag it anywhere I want. So say I want the labels to be here; I press OK and now all the labels move. So this is very powerful, actually; you have quite a lot of control over all the different attributes. Same thing with the edges. What happens again when you change the node tooltip? I'll show you. I tried it and nothing's happening. You have to hover your mouse over the node and then you'll see the tooltip pop up. And that'll? Yes. Yeah. I'll show you. It's not very useful to change it here because that will apply to all your nodes. I'll show you how to change it in a node-specific mapped version, so you can pop up information about a node by hovering the mouse over it. And just one more thing. I'll show you some of the different edge, oops. Here are some different shapes that are available for arrows.
So in your notes and in the presentation I showed you a lot of different shapes that are available. We actually added new shapes in the latest version, Cytoscape 2.7, and you guys are all working with 2.6.3, which is fine. There's not that much that's new. I'll tell you what's new. But one of the things that's new is that there are a lot more line and node shapes. So if you go home and work with Cytoscape 2.7 you'll see more options here. Okay, so as I said, editing the default appearance is just a fun way of playing around with the network. I'm going to cancel out of there. Or I guess it's not cancelled, but now I'll explain the real utility of this panel, which is mapping your data attributes to your visual attributes. This is a fairly complicated user interface that I'll take you through. But basically, your active visual mappings are at the top here. And then these are all called unused properties. These are all properties that you haven't mapped. You can scroll down here and see all the different properties. There are node and edge properties. So by default, the only thing that's mapped in this default visual style that comes with Cytoscape is the ID. I didn't really mention this explicitly, but when I select nodes here and I see them in the data panel, the node ID is in this case these gene symbols. So this ID is mapped to node label, and that's why I can see the node ID here. I can change the label to something else. Say I want to change it to the gene expression data that I loaded in. If I click that, now all the node labels are these numbers here. That might not be that useful for you. Usually you want to have some name that's useful, but that's the general idea.
You can select a visual property, in this case node label, and you can select the type of data from here, and then they're linked together. It's a little bit more complicated than that because there are different types of mapping. This one is called a pass-through mapper. There are actually three types of mappings. The pass-through mapper just takes this data and passes it right through, and that makes sense for certain types of visual properties. For node label, pass-through is pretty much the only type of mapping that makes sense. You pass data through here to the node label. But if you have other types of information, and this will be clearer as I show you more examples, say I want to visualize gene expression data as a node color. So I go down here and I find node color, and it says double-click to create. So I double-click and it bumps it up to the top here, to this area where the active properties are. And it says please select a value. So I select one of my gene expression values here. These gene expression values are just fold change values normalized around zero, so negative is under-expressed and positive is over-expressed. I guess that's the normal fold change. And then I have to select a mapping type. So I select a continuous mapper. A pass-through mapper doesn't make sense because this is not a color. If I pass this number through and it becomes a color, it doesn't make sense. So normally for numbers, I will use a continuous mapper. So I click that, and let me deselect all these nodes. By default, you get a simple gradient here that goes from black to white. And if I put my mouse over this gradient, it says click to edit this mapping. So I click on that and I get a little window that pops up and shows me the gradient.
And it shows me that the minimum value of my attribute is this and the maximum value is this. It automatically selected these two values to be close to the minimum and close to the maximum as a default. And these little triangles here, I can move around. So if I like this color scheme, but I want to change the maximum and minimum where the colors change, I can just click these triangles and move them. So say I move this down so that everything that's close to zero is white; then you can see the network updates automatically. So I can move this and I'm changing the gradient live. Cytoscape is calculating this gradient automatically based on the numbers in this attribute. I do the same thing here. Anything that's below this, on this side of this triangle, is one color, then the gradient is defined between these triangles, and then above this triangle is another color. So you can say everything above a certain number is just going to be green, and that will tell me that it's an outlier. So I'm going to change the colors. So this would be a way to visualize your fold changes, for example. Yes. Anything above two or below point two is considered significant. Set up your gradient that way, and then you can just look at your network and say, these are my significant gene products. If you wanted to just see the significant ones as a single color, you could do that. But usually you want to show the significant ones as a color gradient and ignore the ones that are not significant. So maybe you'd set up more gradients here. I'll show you how to do that. That's a more complicated gradient that shows only things above two and below negative two as a color gradient. So this is meant to be a visual thing. I'll just show you two more things here. You can double-click these triangles, and if you double-click them, you can select the color.
So I'm going to select red here, and now the nodes are going from red to white. The last thing I'll show you is that you can add a point here. If I click Add, I get this other little triangle here, and now I'm going to set up a more complicated gradient where zero is white, high values are going to be blue and low values are red. So now negative fold change is red, positive fold change is blue, and things that are not changing too much are white. If you want to do what you mentioned, you can add another point here and define that point so there's a range of white in the middle. If it's white, it's not telling you much. Also, if you click on one of these triangles and you're not satisfied with the number that's selected here just by dragging, you can change the size of this window, and that might help you get a little more control over the dragging, or you can change the actual number here. So if I want it to be exactly zero, I just type zero in here, press Enter, and now this becomes exactly zero. Okay, so that's basically the complete functionality of this little panel. As soon as I close it, this thing updates, and now I see what this gradient looks like. I'm going to go back to the network panel here and move around. I can see the different node colors; the expression is now visualized on my network. And similarly, you can visualize any type of data; it's just up to your imagination. So what I showed you here is that I loaded a network and I loaded gene expression data as node attributes. I looked at those node attributes, I noticed that the gene expression data is all fold change, it's numbers, and then I mapped those numbers to a color gradient for node color using a continuous mapper.
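The red, white, blue gradient just described can be written down as piecewise-linear interpolation between control points, with fixed colors outside the outermost triangles. This Python sketch illustrates that idea only; it is not Cytoscape's implementation, and the function names, the control values of -2, 0 and 2, and the RGB tuples are assumptions chosen for the example.

```python
RED = (255, 0, 0)
WHITE = (255, 255, 255)
BLUE = (0, 0, 255)


def blend(color_a, color_b, fraction):
    """Linearly interpolate between two RGB tuples (fraction in [0, 1])."""
    return tuple(round(a + (b - a) * fraction) for a, b in zip(color_a, color_b))


def continuous_color(value, points, below, above):
    """Map a number to a color.

    points is a list of (attribute_value, color) pairs with strictly
    increasing attribute values, like the triangles in the editor.
    Values outside the outermost points get the fixed below/above colors.
    """
    if value < points[0][0]:
        return below
    if value > points[-1][0]:
        return above
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if x0 <= value <= x1:
            return blend(c0, c1, (value - x0) / (x1 - x0))
    return points[0][1]  # only reached when there is a single control point


# Negative fold change shades toward red, zero is white, positive toward blue.
gradient = [(-2.0, RED), (0.0, WHITE), (2.0, BLUE)]
color_at_zero = continuous_color(0.0, gradient, below=RED, above=BLUE)
color_mid_positive = continuous_color(1.0, gradient, below=RED, above=BLUE)
color_outlier = continuous_color(7.5, gradient, below=RED, above=(0, 128, 0))
```

Setting `above` to a color that is not part of the gradient, as in the last line, is the outlier trick mentioned earlier: everything past the top triangle gets one flat color.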
I'm going to show you that tooltip thing now, and then I'll ask if there are questions. So you can go down here and look at the tooltips. You can have edge tooltips or node tooltips. I'm going to double-click here and select the common name to come up as a tooltip. I haven't used the common name before; that was one of the columns in the attribute file I loaded. So I'm going to click that, and it is a pass-through mapper because it just passes the common name through to the tooltip. And now when I go to this node, I can see the node ID is the label, but if I mouse over this node, the common name comes up as the tooltip. So if I'm exploring this network, I can show additional information as a tooltip that only pops up when I put the mouse over it. That's just one of the features, and there are dozens of them here. If I had multiple gene expression values, or if I had significance values, I could do something else. One useful thing is to show the expression value as the node color, and the p-value you might want to show as the size of the node, so small nodes are not very significant fold changes and big nodes are much more significant fold changes. If I had the p-value loaded here, I could do that, but I'm just going to show you how that works with node size. So I'm going to take node size here, and I'm going to map another gene expression value, continuous, and I double-click to create. Now this size editor comes up and says node size will be 10 if it's a really low fold change and 30 if it's a really high fold change. But I can add another point here and drag this triangle, oops, drag this triangle here, and I'm going to make this kind of pattern where I have big nodes for high fold change, and if it's in the middle here, around zero, then the node is small.
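The size pattern set up here (large at the ends of the range, small near zero) is the same kind of continuous mapping, applied to a number instead of a color. A minimal Python sketch, with invented control points and a hypothetical function name; the sizes 10 and 30 echo the editor defaults mentioned above, and here the mapping reaches 30 at both ends of the range.

```python
def continuous_size(value, points):
    """Piecewise-linear map from an attribute value to a node size.

    points is a list of (attribute_value, size) pairs with strictly
    increasing attribute values; values outside the range are clamped
    to the nearest end point.
    """
    if value <= points[0][0]:
        return float(points[0][1])
    if value >= points[-1][0]:
        return float(points[-1][1])
    for (x0, s0), (x1, s1) in zip(points, points[1:]):
        if x0 <= value <= x1:
            return s0 + (s1 - s0) * (value - x0) / (x1 - x0)


# V-shaped mapping: strong change in either direction gives a big node.
size_points = [(-2.0, 30), (0.0, 10), (2.0, 30)]
sizes = {v: continuous_size(v, size_points) for v in (-3.0, -1.0, 0.0, 1.0, 2.0)}
```

With only a 10 to 30 range the differences can be hard to see when zoomed out, which is what happens next in the demo; widening the range of sizes in `size_points` is the equivalent fix.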
So I'm going to close that and see what happens. Let's see if that worked. So that one didn't work. That was a node that doesn't have any information about it. So that didn't look like it worked too well. I might need to make these a lot bigger. So the node sizes are slightly different, but maybe I'm zoomed out too much to really see the difference. Oh yeah, there we are. So now the ones with high fold change are bigger. I think the nodes are just too small. I didn't make the node size big enough to really see at this level of zoom, but you can see that the node sizes are different. So I could go back to this mapper and delete that mapping because it's not really useful. Yeah, I could make it a lot bigger and you would see the results. Yeah, no, it should work. It should work to make them bigger. There's the node size. So thanks for that. Let me zoom out a bit. Yeah, so now you can see that the size is changing a little more obviously. So that's what you'd have to do if it was really zoomed out. But say I tried this out and I wasn't interested in it; what I can do is delete these mappings if I want. Wait, wait, just a moment. There are colors on the nodes; does that mean you have both data sets? I have the node color representing these gal1RG expression values, fold change values, and I have node size representing these gal80R expression values, so I have them both at the same time. And, well, yeah, so now there are four types of information that I've mapped onto this network. There's the node label, which is the ID. There's the tooltip, which is the common name. The node color is the fold change of one of the expression experiments, and node size is the fold change of another expression experiment.
So this would be useful if you wanted to see the relationship between two expression experiments and you want to see what was upregulated in one and also in the other. You'd look for big nodes that were red, or something like that. So I guess this shows the power of even this simple visualization. You can quickly scan this network and see, oh, this section right here has a lot of big red nodes. So this must mean something, where these guys are highly expressed in both expression experiments. And you can immediately see that just by browsing around. You don't have to use some complicated formula to calculate that. So it's an exploratory mechanism. It's not giving you a value for that, but it is telling you where they are, so when you're exploring, you can see them. And if you put more types of information on here, you can see more of those types of relationships. You have to set it up so that it's answering the question that you want. There are about 20 different visual attributes for nodes. Not all of them are very quantitative. Size is quantitative, color is quantitative. The node border color is quantitative, because it's a color. You can also change the node border size. You can change the node width and height separately, so you can look for different patterns, like really skinny, thin nodes, or nodes that are big in one dimension and flat in the other, for another type of data. But you may run out of visual attributes that you want to map. So I'll show you one other thing later, a plugin that shows a different type of display for certain types of information. OK. So, Michelle, I think I was going to try and take up all the time until lunch with Cytoscape and then after, yeah. Maybe I'll let people play a little bit before lunch.
Can you invert the approach? Here you put the network first and then apply the data over the network. If you don't have a network to work with, can you put the data first and have Cytoscape find a network for you? No, Cytoscape won't find the network. By default, it's like Word: it just loads the data that you give it. But there may be plugins that help you, given a set of genes, find the network, and then you can pull in more data based on a set of genes. So Lincoln's going to talk about one of those plugins later, and then Quaid is going to talk about another one of those plugins later. So those are more advanced devices. Yeah. And then the information on that network is available from websites that we download from? Yeah, so I'll show you that next. Let me just show you quickly how to delete mappings, and I'll move a little bit quicker. If you right-click on one of these mappings, you can do different things, including delete the mapping. I have to right-click here, I think, on this section, to delete the mapping. But you can also generate different colors, and there are some shortcuts. So let me delete the node size mapping. You can create multiple visual styles, and you can use this little menu here to create a new visual style. That gives you a blank visual style, or you can copy an existing one and then work with it. And then you can switch back and forth between them. So here's one that's supposed to look like the universe, which is just too dark. Here's another one, a sample that shows a different visual style. So you can flip back and forth between them. OK, so I'm going to move through a little bit. That's really the basics of Cytoscape. There are a few other basic things that I want to show you. Let me go back to our default here. One is that if your network is really big, you might want to zoom in on it. I told you that that's important.
So just to show you how to do that, you can select a bunch of nodes and edges. And in the File menu, you can create a new network from the selected nodes, with all edges or just the selected edges; usually I do all edges. As soon as I do that, I get another network popping up here. It's a child of this network, and I can play with that one separately from the other one. I can make it fill my screen, for instance. And then I can click back and forth between these two. That's really only useful if you have some values to select by. So there's a filtering system in Cytoscape, and I'm just going to go through this pretty quickly, that allows you to create a set of filters that you can query your data by, like give me everything that's over-expressed in this tissue, if you have tissue information. You can have a fairly complicated set of filters. But the first thing I want to show you is this little thing up here, Search. By default, when you type something in there, it will search the node IDs. So if I type in Y, then I get Y-L-O-3, and I'm going to just press Enter, and it zooms in to the node that matches. But this little thing is quite a lot more powerful than just that. If you click the box next to it, you can configure the search options, and that search can be executed on any data that you load. So I'm going to click fold change here and Apply. And now instead of having a little box that I type in, I have a little slider bar, and I can select a range of expression values. So I'm going to select all the negative fold change guys, and they're automatically selected as I move this thing. Once I'm happy with that, I can move those guys to a new network. And this is the network of everything in my original network that has negative fold change. So you can do pretty fancy things with that.
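Selecting by a value range and then making a new network "with all edges" can be sketched in a few lines of Python. This is an illustration of the two steps, not Cytoscape's code; the helper names and the toy data are invented. "All edges" here means every edge of the parent network whose two end points are both in the selection.

```python
def select_by_range(attributes, name, low, high):
    """Return the nodes whose numeric attribute lies within [low, high]."""
    return {
        node
        for node, values in attributes.items()
        if name in values and low <= values[name] <= high
    }


def induced_subnetwork(edges, selected_nodes):
    """Keep every edge whose source and target are both selected."""
    return [
        (source, kind, target)
        for source, kind, target in edges
        if source in selected_nodes and target in selected_nodes
    ]


# Invented parent network and fold change values.
edges = [("a", "pp", "b"), ("b", "pd", "c"), ("c", "pp", "d"), ("a", "pp", "d")]
attributes = {
    "a": {"gal1RG": -0.8},
    "b": {"gal1RG": -0.1},
    "c": {"gal1RG": 0.6},
    "d": {"gal1RG": -1.4},
    "e": {},  # a node with no expression data is never selected
}
negative = select_by_range(attributes, "gal1RG", float("-inf"), 0.0)
child_edges = induced_subnetwork(edges, negative)
```

The child keeps a-b and a-d but drops the two edges that touch c, since c has a positive fold change and was not selected.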
If I'm interested in more complicated things, I can use this Filters panel here. I can create a new filter, take any attribute, like that gal1 one, add it to the filter, and then do the same thing. I can say, okay, add these guys, and then I can do Boolean combinations of those, so I can select just the most over-expressed stuff, apply that, and select everything. And you can create multiple filters. Now, when you're ready to save your Cytoscape session, you can save it. The file will be a .cys file, a Cytoscape session file. That's really a zip file that contains all the information that Cytoscape is storing. If you unzip it, you'd see what's inside there, and you can open it later to start up again. So the normal way to start with Cytoscape is to work with the Import menu, importing things, and then once you have stuff in there, you can save it as a session and open the session later. That's basically it for most of the functionality of Cytoscape. There's a Help menu, and there's a help desk that you can email, so there's quite a lot of help available. There's also this little error console. If you have problems with Cytoscape, you can click on that error console. All of the information you see here now is actually normal, but if you're having problems, there might be an error listed there that you'll be able to see. Okay, so just one more thing. Actually, sorry, two more things. I forgot to show you the web services. So, your question about getting network information.
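Since a session file is described above as really being a zip file, you can peek inside one with any zip tool. Here is a short Python sketch using the standard `zipfile` module. The entry name in the stand-in archive is invented; this sketch makes no claim about what a real .cys file contains.

```python
import io
import zipfile


def list_session_entries(session_file):
    """Return the names of the entries inside a zip-based session file.

    session_file may be a path or a binary file-like object.
    """
    if not zipfile.is_zipfile(session_file):
        raise ValueError("not a zip archive")
    with zipfile.ZipFile(session_file) as archive:
        return archive.namelist()


# Stand-in archive built in memory so the example runs without a real
# session file; the entry name is hypothetical.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("demo_session/network.txt", "geneA pp geneB\n")
entries = list_session_entries(buffer)
```

For a file saved from the demo you would pass its path instead of `buffer`, for example `list_session_entries("my_session.cys")`, where the file name is whatever you chose when saving.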
So this assumes that you have network information from somewhere. You can download it from different websites, but there is a way of getting network information from within Cytoscape by default, without adding any plugins. It's in the Import menu, Import Network from Web Services, and by default you'll get an option that says import from Pathway Commons. If you add more web services, which you can get as plugins, you'll find additional options here, but I'm just going to try the Pathway Commons one. So step one is you search for a gene. Unfortunately, and this is the reason why I'm not more positive about it right now for this course, even though we made this plugin, it doesn't allow you to give it a set of genes. You have to give it one gene at a time, and in a future version you'll be able to give it a set of genes. Some of the plugins that we'll show you later allow you to start off with a set of genes and grab networks. The only issue with getting these networks from these different places is, as Lincoln mentioned, that there are trade-offs of coverage versus depth. So Reactome doesn't cover every gene, but it might be quite useful if you're just interested in overlaying your gene expression data on specific pathways. So you can do that with this plugin. You type in a gene or a pathway name. So I'm going to just type in TP53. There are only a few organisms that you can select to search specifically, but I think if you choose all organisms, it will find anything that Pathway Commons knows about. So this is a database that Lincoln mentioned that brings together a lot of different pathway databases. So I gave it TP53 and said, okay, what do you know about TP53? It takes a little while and it downloads a whole bunch of stuff. I'm going to move this over here so you can see the results. I can select among the different TP53s that it found. So it found TP53 in mouse. It found TP53 in human.
And if I click on TP53 in human, there's a whole bunch of pathways that it finds from Reactome and NCI-Nature pathways and Cancer Cell Map. And there are also interaction networks. So there's 1100, almost 1200 interactions for p53 that I can download, and there are different types. I can filter by data source or by interaction type. I can get just the physical interactions or the complexes that it's part of. But I'm gonna go to the pathway one and I'm gonna check one of these first pathways here. So it says double-click pathway to retrieve. If I double-click that, it will go and download the pathway from Pathway Commons. It's the Reactome version of this cell cycle pathway, G2/M checkpoints. And when it's finished, well, it didn't break, but it looks like it didn't really work that well. I'm gonna move this over here. So it actually did load up the network, but it didn't lay it out. It just loaded up looking like a square. So this is some bug that we'll have to fix. But if I lay it out with a layout algorithm, then you can see the pathway there. These are all the relationships in the pathway. And there was a VizMapper visual style that is supposed to get applied here, but it didn't. I'm just gonna quit Cytoscape and start again so that I can show you that properly. Oops. So I'm gonna import network from web services and type in BRCA1 this time; maybe it's a little bit smaller. And I'm gonna get this Reactome pathway and double-click. And there, it worked this time. So now, even though it's not laid out, it shows you the visual style. I'm gonna lay it out using the organic layout. And now I see this pathway here that has a whole bunch of information about it. 
I'm gonna close this import network panel because otherwise I don't have that much room on my screen. And now I can click on these nodes and I actually get a little bit of information about them from Pathway Commons. This is a complex. These are proteins. There are also small molecules on here. And if I select a bunch of these nodes and look at the data that's associated with them, all of their IDs are just these numbers. Those are pretty much random IDs, but I can select a whole bunch of information about them, like synonyms and the Entrez Gene IDs. And now that information is being pulled in here. So these are the names of the proteins and the Entrez Gene IDs. In this case it's a human pathway, so if you have your gene expression data from human and it's associated with Entrez Gene IDs, you can load that up and visualize your gene expression data on these pathways. And that's actually very useful if you're looking at specific pathways. And what you'll find when using these tools is that the field hasn't yet completely solved this problem of gathering all the information that is known about a gene. So you can go to Reactome and you can get everything Reactome knows about pathways. I put a bunch of links in the workshop Wiki. One of them is iRefIndex and iRefWeb, which is a website made partly in Toronto, by Brian over there in Shoshana Wodak's lab, that collects protein-protein interaction data from almost every protein interaction database. But that has all the protein interactions and not all the pathways. And neither Reactome nor iRefIndex has information about functional associations, except for the human ones that Lincoln showed. 
So if you wanna collect all the information about a gene for any organism, there's no complete one-stop shop yet. NCBI is like a complete one-stop shop for publication information, and Entrez Gene is pretty good for any gene, but for pathways and interactions it still hasn't come together yet. So you might have to go to different sites and merge the data together and work with Excel if you're really interested in building the most complete network possible. And there are a number of papers that have come out that do that, where the authors actually spend a fair amount of their time just collecting network information about their genes of interest, and then they can work with it. Once they have that resource, it's their own little database. So that's usually the current activity. In the future, you'll hopefully be able to do that in one button press. Okay, any questions? Oh, okay, so if I have another network here, if I wanna load something else from Pathway Commons, like this other network, this ATM-mediated something. Yeah, so there is a plugin called Merge Networks that you can use to merge networks, and it will say, okay, do you wanna take the union of the networks, or the intersection, or the difference? And if the IDs in both networks are the same, like node A in network one is the same as node A in network two, then they'll be linked and they'll be merged. So if you have different network data, you can merge them like that. If you're loading in data from Pathway Commons, there's another way of merging: when you double-click on another pathway, it will ask you, do you wanna create a new network or merge with this one? And that will allow me to sort of build up a bigger network, and then I wanna lay it out. Okay, so let me lay it out. Yes, Pathway Commons, it knows how to merge things automatically. 
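To make the union, intersection, and difference options concrete, here is a minimal sketch in plain Python. It is not how the Merge Networks plugin is implemented; the helper names (make_network, nodes_of) and the placeholder IDs in net3 are made up for illustration, and each network is simply treated as a set of undirected edges keyed by node ID.

```python
# Toy illustration only: networks as sets of undirected edges.
# The helper names and the IDs in net3 are hypothetical, not Cytoscape's API.

def make_network(edges):
    """Store each undirected edge as a frozenset so (A, B) equals (B, A)."""
    return {frozenset(edge) for edge in edges}

def nodes_of(network):
    """Collect every node ID that appears in at least one edge."""
    return set().union(*network)

net1 = make_network([("TP53", "MDM2"), ("TP53", "ATM")])
net2 = make_network([("ATM", "TP53"), ("ATM", "BRCA1")])

union = net1 | net2          # every edge found in either network
intersection = net1 & net2   # only edges found in both networks
difference = net1 - net2     # edges in net1 that are not in net2

# A third network that uses a different ID scheme: no node IDs match net1,
# so merging them would just give two disconnected pieces side by side.
net3 = make_network([("gene_0001", "gene_0002")])
shared_nodes = nodes_of(net1) & nodes_of(net3)

print(len(union), len(intersection), len(difference), len(shared_nodes))
```

The point of the last few lines is that matching happens purely on the ID strings, so two networks only merge where their node IDs are literally the same.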
But if you have your own networks, you have to make sure they have the same IDs if you wanna merge them. Otherwise, they'll just look like two different networks. Any other questions? I want to show you one more plugin, but we can take more questions, yeah? Oh, the different Cytoscape version. Oh, the 2.7, you can actually use 2.7. I didn't mean to say that you shouldn't load it. It's just that in the instructions for the course, when we sent them out, we said 2.6.3. And the reason I did that is because the entire demo here was verified on 2.6.3 and I didn't wanna take any chances with the new version, which only came out a few months ago. But I've been using 2.7 quite a bit and it's pretty good. There are a couple of new features which I'll mention if there's time right at the end. I guess we're pretty much out of time now. So I just wanted to show one more, yeah? Well, as you guys have figured out with that, the IDs will continue to haunt you all the way through to the end. Yeah, so if you make sure you pick good stable IDs at the beginning and not just any name, that will help you a lot. So, okay, one more thing that I want to show you. If you wanna download plugins for Cytoscape, there is a link on the workshop page for where to get plugins and where to see descriptions of plugins. But if you click Plugins, Manage Plugins, and I think you guys have already done this to install your own plugins, you can go shopping for plugins here, and there's a whole bunch available that you can select. I've installed the VistaClara plugin. Technically there's no limit on how many plugins you can install, but sometimes a plugin that you install might break your Cytoscape for some reason. 
So if you just download all the plugins right away, and some people do this, they just download every plugin, then the plugins might interfere with each other, and then you can't get Cytoscape to work the way you want and you'll have problems. So you shouldn't do that. You should just download plugins one at a time and understand what they do. I'm just gonna load up the galFiltered session file. This is the same network that I showed you last time, but I just wanted to show you the VistaClara plugin. So all the steps that I showed you for mapping your gene expression data to color, you can do automatically with the VistaClara plugin. When I load up the VistaClara plugin, it adds this little tab at the bottom here, and if I click that, I have to start by clicking Sync. VistaClara then gets all of the gene expression data from the network and puts it here in its own little window, and it shows you the gene expression as a little heat map. Then what you can do is press this play button here and it will cycle through the gene expression and just show it to you as a movie. It's probably hard to see here because there's so much information. It's actually at the very end here; it's cycling through the three different expression experiments that are loaded up. So if you have lots of expression experiments and you wanna play a movie, you can do it, or you can just select one of these, and as you click here it will change the view. So this is like an automated way of helping you visualize your gene expression data, and it actually calculates all that visual style stuff automatically. You can also do one more thing here. You can right-click on this here and there's all sorts of interesting views that you can do. 
Like you can do an ink blob view, which is a different view if you like that one, and you can also display heat strips, which are little things that pop up on this network. You can zoom in and see that the gene expression is mapped as a little bar chart under each node, and so that might help you visualize multiple gene expression experiments at once. You can turn off the heat strips. So that's just an example of a plugin that does something interesting. It's a good plugin for gene expression visualization and exploring gene expression data. Here it was the data that I loaded up originally, the galFiltered example, but you have to load up your gene expression data first. So you load up your network and your gene expression data as I showed you with the galFiltered example, and the instructions for the files that I used are in your binder, and then once you have that you can press the synchronize button and it will gather all of that information. So, about the size of the file that you can put in, in terms of gene expression: we have a network of limited size, but your expression data may have way more. Right, that was a good question. Does it select only the things that it recognizes? By default, when you're loading expression data, it loads up only the expression data it needs to show on the network, and if you add new nodes later you won't have the expression data for them. But there is a way of forcing Cytoscape to load all of the expression data. When you load up attributes from table, I think there's this button here, import everything, and that will just make sure that everything's completely loaded up. So here, if you had a small network, it would give you the expression data for all of that network plus a whole bunch of others. It wouldn't show you the other information, though. You have to have a node or an edge to show the information. 
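The difference between the default import and the "import everything" option can be sketched in a few lines of plain Python. This is only a toy model of the behavior described above, not Cytoscape's actual code: the function name import_expression, its import_everything flag, and the expression values are all made up for illustration.

```python
# Toy model only: function name, flag, and expression values are hypothetical.

def import_expression(table, network_nodes, import_everything=False):
    """Return the expression rows that would be kept after import."""
    if import_everything:
        return dict(table)  # keep every row, even for genes not in the network
    # Default behavior: keep only rows whose ID matches a node in the network.
    return {gene: values for gene, values in table.items() if gene in network_nodes}

expression_table = {
    "GAL1": [2.1, 0.3, -0.5],
    "GAL4": [-0.2, 0.1, 0.4],
    "GAL80": [0.9, -1.2, 0.0],
}
network_nodes = {"GAL1", "GAL4"}

default_import = import_expression(expression_table, network_nodes)
full_import = import_expression(expression_table, network_nodes, import_everything=True)

# Add a node to the network after the import has already happened.
network_nodes.add("GAL80")
missing_after_default = network_nodes - set(default_import)
missing_after_full = network_nodes - set(full_import)

print(sorted(missing_after_default), sorted(missing_after_full))
```

With the default import, the node added later has no expression values attached; with everything imported, the values are already there and show up as soon as a matching node exists.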
And what if it's the other way, a big network and only 10 genes? Is it only those 10 genes in that big network? Yes, correct, yeah. Any other questions? So how many people were kind of playing along while I was doing things? Okay, so we have to deal with a little bit of timing. This lab was probably taking too long. So if you want to play along, we don't have the evening session tonight, but hopefully we'll have some time in the afternoon. I don't know, do we? I think we do, because I think I've covered everything that I wanted to cover, and there was an afternoon session for me as well, which is just a continuation of this lab. Yeah, so you can try it with some of your own data. It's a big package. What we tried to do here was give you the basics and the core concepts so that you understand how the system works. You can download new plugins now and you understand how to load up data, but there are a lot of different places you can get data from and a lot of different ways you can visualize things. And to learn more about Cytoscape, you should really try it out with your own data, and after lunch we have some chance to do that. But you can't do all of it in a day, because it's complex. What I'm trying to say is, hopefully the stuff that we've given you today means you don't have to worry about understanding everything, so that you can do it yourself later. You have a good grounding. Okay, so I guess we're at lunch now, and we'll come back after lunch and play with Cytoscape more, and then Lincoln will give us a demo of a particular Cytoscape plugin that does what he showed you in his presentation: given a gene list, download a network and do some network analysis on it, which is pretty cool. 
But during that time in the afternoon, just after lunch, you can ask everybody more questions about Cytoscape. Okay, and one more thing, sorry. I put a bunch of links in the module three section of the Wiki page that point to interaction databases that are cool, and I mentioned some additional plugins there with a brief description of what they do. Those are plugins that are particularly useful and that people like to use a lot, so you can check those out.