Welcome to this short introduction to the core concepts of network visualization. In this presentation, I will focus on visualization of the networks themselves, not on mapping data onto networks. It will thus all be about nodes and edges, and if you don't know what those are, you should go have a look at my short presentation on the core concepts of network biology first.
Why is it hard to visualize networks? Well, it's not always hard. If you have a small network like this, it can be quite easy. But when dealing with networks with, say, a hundred to a thousand nodes and a thousand to ten thousand edges, you will typically end up with something looking like this.
So how can we lay out big networks? There are two fundamental types of layouts, and to choose between them, you have to ask yourself this question: what is most important? The nodes? Or the edges? The bad news is that you have to choose, because the answer to this question decides the fundamental type of layout you want to use.
If the nodes are most important to you, you will generally want to use 2D layouts. That is, we spread out the nodes in 2D, allocating most of the real estate to showing the nodes and relatively little space to showing the edges, the structure of the network. The most common 2D layout algorithms are force-directed layouts, and the result is unfortunately often the type of hairball that you've just seen.
If the edges are more important to you, you should look into 1D layouts. This is an example of a 1D layout. And as you can see, most of the space is now allocated to the edges, showing the structure of the network much better. The one I showed is called a circular layout, because all the nodes are laid out on a circle. There are other 1D layouts, including hive plots. The problem with all 1D layouts is that because the nodes are not spread out in 2D, you have very little space for them, and you end up with tiny nodes.
It is therefore impossible to label all these nodes and tell, for example, which node is which protein.
A possible solution to this is network clustering. In network clustering, we first try to capture the structure of the network by identifying dense subgraphs. That may sound theoretical, but if you're working with a physical protein interaction network, it simply means identifying protein complexes. And if you're working with functional associations, it would give you functional modules. There are many algorithms for finding these. Two of the most popular are Markov clustering, also known as MCL, and MCODE. Regardless of which algorithm you use, you can subsequently use the clusters to help guide your layout so that nodes that are in the same cluster in the network are close to each other, thereby revealing the structure of the network in a 2D layout.
You can also use clusters as a starting point for network simplification. When you're trying to visualize a very large network, less is more. You may be better off not actually showing everything. The easiest thing to do is to show only intra-cluster edges, that is, to remove all interactions between proteins in different clusters. If we do that to the hairball network, we end up with something that looks like this instead. And the reason why it's possible to make this network look so much better is quite simply that there are far fewer edges now. We've removed the vast majority of edges.
Alternatively, we could turn each cluster into a node and show only the interactions between different cluster nodes. That way you get a network of clusters, which will have both far fewer nodes and far fewer edges and therefore be quite easy to show. The downside is, of course, that you're no longer showing the individual proteins.
Lastly, you can use clustering for sorting a 1D layout so that the nodes that are in the same cluster are next to each other on the circle, for example.
That will allow you to label the clusters, even if you can't label the individual proteins, so that you can, for example, point out where a given complex or functional module sits.
Finally, you can use layouts to encode information in the positions of the nodes. You can, for example, encode time. This is a cell cycle network, and the position of each node in this network reflects when in the cell cycle that node is expressed. Similarly, you can use position to encode the localization of proteins within a cell. You often see this in pathway charts. So the position of a node is now based not just on what it interacts with, but on where it is in the cell or when it is expressed.
This is all I have to say about network visualization. If you want to learn more about networks and how to work with them, have a look at this presentation. Thanks for your attention.
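As a companion to the talk, the three cluster-based ideas it describes (showing only intra-cluster edges, collapsing each cluster into a single node, and sorting a circular layout by cluster) can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the toy edge list, the `cluster_of` assignment, and all function names are hypothetical, and the cluster assignment is taken as given rather than computed by MCL or MCODE.

```python
import math
from collections import defaultdict

# Hypothetical toy network: two triangles joined by a single edge (C-D).
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("C", "D"),
         ("D", "E"), ("E", "F"), ("D", "F")]

# Assumed, precomputed cluster assignment. In practice this would come
# from a clustering algorithm such as MCL or MCODE.
cluster_of = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}


def intra_cluster_edges(edges, cluster_of):
    """Keep only the edges whose two endpoints are in the same cluster."""
    return [(u, v) for u, v in edges if cluster_of[u] == cluster_of[v]]


def cluster_network(edges, cluster_of):
    """Collapse each cluster into one node and count edges between clusters."""
    weights = defaultdict(int)
    for u, v in edges:
        cu, cv = cluster_of[u], cluster_of[v]
        if cu != cv:
            # Sort the pair so that (0, 1) and (1, 0) are the same edge.
            weights[tuple(sorted((cu, cv)))] += 1
    return dict(weights)


def cluster_sorted_circle(cluster_of, radius=1.0):
    """Place nodes evenly on a circle, with cluster members side by side."""
    order = sorted(cluster_of, key=lambda node: (cluster_of[node], node))
    step = 2 * math.pi / len(order)
    return {node: (radius * math.cos(i * step), radius * math.sin(i * step))
            for i, node in enumerate(order)}


simplified = intra_cluster_edges(edges, cluster_of)   # drops only C-D
collapsed = cluster_network(edges, cluster_of)        # one edge between clusters 0 and 1
positions = cluster_sorted_circle(cluster_of)         # A, B, C, then D, E, F around the circle
```

Because nodes of the same cluster occupy a contiguous arc in `positions`, one label per arc is enough to name a complex or module even when the individual nodes are too small to label.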