who is going to talk about this amazing library, and he is going to show us our company networks.

Good afternoon, everybody. I hope you're not here only for the comfortable chairs while you wait for the lightning talks. In the meantime, I will tell you something about NetworkX and Bokeh and how you can plot networks even though there's no direct support in Bokeh for plotting networks.

About me: I'm a junior software engineer at Blue Yonder. I do not use this at work; it's just a side project, so at the moment it's not used in our company. Yeah, that's it.

I hope most of you heard Fabio's talk yesterday. Did you hear it? So most people know what Bokeh is: it's a great visualization library. I will show you the basics: how you can handle data, how you can manipulate it, and how you can go back and change something or get feedback.

So why did I do this? During my master's thesis I was working with networks, some kind of social networks. We wanted to explore them, and the problem was we wanted to see them. We wanted more than just tables or some columns to read about them. So we wanted to visualize them and we wanted to see some properties. We wanted to see it in the browser, so that maybe we could include it in an app.

And I came up with this. I generated the networks and the properties and I stored them in a database. Then I wanted to visualize it with D3. D3 is a nice Swiss Army knife, or whatever it's called: it's powerful, but it's extremely complicated. So I had to provide the data from the database, and I created a REST interface for that. So it's a lot of overhead and a lot of programming just to do some visualization to explore.

So the question was: can we do this better? I was thinking, okay, I will try the same now with this library, and it's much easier. I do not have to handle any JavaScript code at the moment. I do not have to care about how I get the data to the client running in my browser.
Bokeh is doing this for me, and I can also start and explore my visualization app in a notebook, in a Jupyter notebook, and this is really great. On top of this I can change a network. I can change my graph, I can manipulate it, and I can get feedback. So if I select something, I can get this back.

I will now show you how it's done. I will create a network, I will show it to you, and all the code you need for it is part of these slides. I did not leave any code out. At the end there's a more complex example, but here you see everything that is necessary.

So I need some example data. I was thinking about using the usual example data like Les Misérables or something like this, but yesterday I had an idea. We have EuroPython, and people like to use Twitter. There are some nice Twitter modules like Tweepy. So I used it to get the information from the user EuroPython. The user EuroPython is linked to a lot of people, or an author mentions EuroPython and links to other people. So I can use this data and create some kind of social network. Authors are connected to each other, and maybe they tweeted more, so the weight on an edge might be higher. This will be useful for a network.

So I have my data now. What is the next step? I need a network. As I said, sadly at the moment Bokeh doesn't support it out of the box, but we can do it on our own. So we use NetworkX and we load our EuroPython data. I could have done it live here, but I was a little bit afraid of the Twitter limit and the Wi-Fi, so I did not do it and stored it in a GML file. I created the network using NetworkX, which has an export function to write a GML file, so I now import this file back and I get my network.

What do I do now? NetworkX can draw, but it usually draws with matplotlib and it's static. So I can use NetworkX to create a layout, and I can use this layout to fill in Bokeh, and there I get an interactive visualization.
So I put in my network and I put in some values: k just says how much distance you have between nodes, and it's an iterative algorithm, so I can give it a number of iterations. If you're a little bit more interested in what exactly it does, you can start with the Wikipedia page on force-directed graph drawing. What it basically does: it creates spring forces between nodes, then you have a 3D model and put it on a table, and then it tries in a few iterations to get rid of the friction, and then you have your nodes at some positions. So this is basically a spring layout, or force-directed graph drawing. I will use this layout later.

Now we have to do some workaround. Well, not a workaround: we have to get the data into a format we can use in Bokeh, and the cornerstone in Bokeh is, I would say, the column data source. It's one of, I think, three or four kinds of data sources, but I think it's the most important. It's the one you would probably see first. It's a class where you can store data column based. On the left you see there's an ID, of course, because these are usually all lists, and I store there the x coordinate, the y coordinate and the node name. So the first row says my Twitter handle is located at the position 213.

The nice thing about this column data source is that you can change it. You can add data, you can add columns, and you will also get feedback. So if someone selected a node in your graph, this is the point where you get information about which node is selected. You can use lists, you can use tuples, you can use pandas data frames to create those columns, but with NetworkX you usually have a dictionary first, so we have to do a little bit of transforming the data. This is a drawback at the moment: you have to copy the data. So I get the layout and I take its items. The key in the layout is the node name, and the value after the node name is a tuple with the coordinates of the node.
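The transformation described here, from a layout dictionary to column lists, can be sketched in plain Python. This is a minimal sketch, not the code from the slides: the `layout` values are made up, the helper name `layout_to_columns` is hypothetical, and the column names `x`, `y` and `name` are assumed from the description above.

```python
# Hypothetical layout, shaped like the dict a NetworkX layout function
# returns: node name -> (x, y) position. The values are made up.
layout = {
    "alice": (0.10, 0.80),
    "bob": (-0.40, 0.25),
    "carol": (0.55, -0.30),
}


def layout_to_columns(layout):
    """Split a {node: (x, y)} layout into column-oriented lists."""
    names = list(layout.keys())
    xs = [layout[name][0] for name in names]
    ys = [layout[name][1] for name in names]
    # One list per column; row i of every list belongs to the same node.
    return {"x": xs, "y": ys, "name": names}


node_columns = layout_to_columns(layout)
# With Bokeh installed, this dict could be handed on as
# ColumnDataSource(data=node_columns).
```

Because every column is a plain list, the row index is the only link between a node's name and its position, which is why the data has to be copied out of the dictionary.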
So we have to extract those values and put them in lists so that we can create our column data source. We just extract them, put them in a dictionary, and pass that to the column data source. So now we have our node source.

Now we can finally plot something. How is it done? Here is a little bit of code. You can ignore the hover code at first and just look at the figure and the plot. Figure just creates your drawing area. You define how big it is, and you say which tools you want: tap means you can click on a node, and with hover, when you move your mouse over a node, it will show the name, because I know that I have the column name in my data source, and I also have the ID, or the index. This is a property which is always there. So when you hover over a node you will see its ID and its name.

The next step is that I want to see my circles, and this is done by creating a renderer. It is r_circles, a circle renderer. Now I put my data source in here: you say source is my node source. Now I want to have x and y. So I say the first is the column name x, and then the column name y. They will be used for the positioning of the circles. I have some fixed values for the lines, and the level overlay just means the circles are drawn above the lines later. The size is 10.

Is it a network? It is just points. We need some more work. It is not so much, but we have to add some edges. To add the edges we have to prepare the data again. We just take the layout and the network, we extract the positions of the nodes, and we connect the nodes. What I do here is get the data of the edges. If I call network.edges with data=True, I will get the edges and the weight, which is the data attribute for every edge. Now I calculate the maximum weight because I want to do some alpha coloring of the lines, so I can calculate a value between 0.1 and 0.6.
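The alpha calculation mentioned last can be sketched as a linear scaling by the maximum weight. This is an assumption about the exact formula, which the talk does not spell out; the edge list and the helper name `weights_to_alphas` are hypothetical, and the edges are shaped like the `(u, v, data)` triples that `network.edges(data=True)` yields.

```python
# Hypothetical edges, shaped like the (u, v, data) triples that
# network.edges(data=True) yields in NetworkX.
edges = [
    ("alice", "bob", {"weight": 1}),
    ("alice", "carol", {"weight": 2}),
    ("bob", "carol", {"weight": 4}),
]


def weights_to_alphas(weights, low=0.1, high=0.6):
    """Scale edge weights linearly so the heaviest edge gets alpha `high`.

    Assumes positive weights; lighter edges end up closer to `low`.
    """
    if not weights:
        return []
    max_weight = max(weights)
    return [low + (high - low) * w / max_weight for w in weights]


# Edges without a weight attribute are treated as weight 1 (an assumption).
weights = [data.get("weight", 1) for _, _, data in edges]
alphas = weights_to_alphas(weights)
```

With this mapping, an edge between two people who tweeted about each other often is drawn almost opaque at 0.6, while rare connections fade toward 0.1.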
I put all of this in lists, and now I can put it back into a column data source for the edges, so I get a line source. Now I can plot multi-lines, and I do the same as with the circles. I put in the source and say the source is the line source. A line is defined by x and y for the starting point and x and y for the end point, so you have two points in those first two lists, and these are just the names of the columns. Here you already see that for alpha we use the column name alpha, so the alpha will be taken from the column data source. You cannot see it directly here, but the lines have different alpha values. Maybe we'll see it a little bit better later.

This was just a boring network. We want to see centrality, or maybe clustering, so we add this information to our column data source. It's not so complicated: NetworkX provides some really cool algorithms, so you can use, for example, multiple centrality algorithms. I have chosen the betweenness centrality here.
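The multi-line columns described above hold one small list per edge: the x values of the start and end point, and the matching y values. A minimal sketch in plain Python, assuming made-up positions and a hypothetical helper `edges_to_line_columns`; the column names `xs` and `ys` are an assumption as well.

```python
# Hypothetical node positions and edges; in the talk these come from the
# NetworkX layout and the network itself.
layout = {
    "alice": (0.0, 1.0),
    "bob": (1.0, 0.0),
    "carol": (-1.0, 0.0),
}
edges = [("alice", "bob"), ("alice", "carol")]


def edges_to_line_columns(layout, edges):
    """Build one [start, end] coordinate pair per edge for x and for y."""
    xs, ys = [], []
    for start, end in edges:
        x0, y0 = layout[start]
        x1, y1 = layout[end]
        xs.append([x0, x1])  # x of the start point, x of the end point
        ys.append([y0, y1])  # y of the start point, y of the end point
    return {"xs": xs, "ys": ys}


line_columns = edges_to_line_columns(layout, edges)
# With Bokeh installed, this dict could back a ColumnDataSource that is
# drawn with figure.multi_line("xs", "ys", source=...).
```

A per-edge alpha list of the same length can be added as a further column, so that row i of every column describes the same edge.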
It just means this: you have shortest paths in your network, and it marks a node that a lot of shortest paths have to pass through. Now I have a centrality. Again it's a dictionary, so we have to transform it a little bit, and then I can use it. I put in the values with a shift, a mapping to a range, because I want to use this value for the size of the circles. So I say, okay, the least important nodes have size 7 and the most important have maybe 17. It's just a range mapping. Then I say the new column for my column data source is centrality, and I add it to my node source. So my node source now has a centrality value for every node.

The next point is that I wanted to have some clustering: which people are maybe a little bit connected because they have tweeted about each other. So I use the python-louvain module in addition to NetworkX, and it creates the clustering for you. Clustering is NP-hard, so you will not always get the same result, and the calculation may need some time, but for this size it's still great, and even much bigger sizes will work. So I get a partition, and now again I split up the partition and get out the nodes and communities. First you have the nodes again; I don't need them. Then we have the communities, and now I can again add a new column to our column data source. It's community, and now I have communities in my data source. Now I just do a color mapping because I want to have different colors. I have a list of colors, and I use the modulo operator to give every group a color.

Now I can show another plot. But I missed something: you have just added the new columns, but you are not using them. The problem is the renderer. As I said, our r_circles renderer still has a fixed size and a fixed fill color. So I just change them to columns of my column data source: I say, now use centrality for the size and now use the community color for the fill color. Now I can plot it, and you can see different colors and different sizes, and there's
a big dot in the middle: it's EuroPython. Yes, I left it in there because I want to show you something. I want to remove it interactively, because I don't want to have a social network about people tweeting about EuroPython if EuroPython itself is in it. It doesn't make much sense. So we have to change it, and we want to do it interactively. I want to click on a node here. This is a little buggy because it's a slide show; usually it also works in notebooks. I can show it here: you can go here, click on something, and you mark it. Then I want to remove it, because I say, okay, it's a bad data set, I want to remove it and do some recalculation.

So what can I do? I can do interactions, and I can get out of the column data source which nodes I selected. It's a bit of a tricky data structure in here: you have 0d, 1d and 2d. 0d is just for line and patch glyphs, all other glyphs like circles are in the 1d key, and 2d is for things like multi-line drawings. So we just go there, use the 1d key, and we have the indices of all nodes currently marked in our plot.

What we can do now is remove it. This is just sample code; I think you can do it better. I get the index, and I use the index to get the node from my network. Now I can remove the node from my network, and I pop it out of my layout. But I have to restructure the data in my column data source, because currently they are not sharing the data. So I iterate over all columns and remove the row at that index. You could also remove multiple of them. Then you update the data, set the new data for every column, and add the dictionary for the updated edges. Then you can remove an edge and you can remove a node.

But there's a problem. Okay, it's great, but it still has some problems. Not everything is working in a notebook, and as you see, I'm still in a notebook; it's just a slideshow. You cannot redraw data sources, or rather, I cannot redraw
automatically if you change a column data source. You can push your changes there, or you can trigger a push and it will redraw. Or if you run it in a Bokeh server, it will redraw automatically, because it will iterate over it and check for changes, or you mark it as changed. Another problem is the selection I showed you here: it's not working currently in a notebook. The list will always be empty, so you have to do this in a Bokeh server. Okay, it's still great: I can use a Bokeh server to run my app, and it's not much of a problem. Another drawback, I have to say: if you want to add widgets, and you can add widgets like sliders, you can still stay with pure Python callback functions.

Good. Now I want to show you that you can do those interactions. As I said, this is the EuroPython account on Twitter, and I want to remove it. So I marked it, I can remove it, and you see now it's gone. There are some other connections; you see some strong lines. Those are connections between other people you might be interested in. And I can switch back. You see a problem: there's no central person in there anymore, because we removed the very central person. I still have to update the properties. I push a button, which calls an update function. It goes back to my network, does some calculations, I get the information and put it back in my column data source, and now I see more people who might be interesting to you because they are tweeting a lot. Here, I think, it's the OpenStack account; they have connections to other people.

But we still have the old layout. So we can update the layout. It takes a while, and now we get this layout. It looks a little bit weird at the moment because of the EuroPython network: I would say we have a lot of people who just tweeted about each other, but we still have a lot of connections, and you also still have nodes like here; they
don't have any connection, because we removed EuroPython but we did not remove nodes that have no other node attached. We can fix this, so we can remove them now. You see this one is gone, and I can reset the zoom, and I am back here, and I can update the layout again.

So I am looking around here. Here is my colleague, he is sitting over there, and I think he tweeted the most of our colleagues. I can zoom in here and see which people he is tweeting about. Here is another colleague. Cool. But I can now also explore what happens if he has a meltdown and decides to go to Java or something, and then I can again update the properties and so on.

So you see, I did mostly interactive network plotting in just a few minutes, and I think it's quite handy if you just want to explore. You can go further and do some more stuff. And of course you can swap out NetworkX. It's a great library, but you can switch it for other libraries if you want to use something else. Maybe you want to do some heat development and you want to plot it. Just think about it: you can do it. It's not so complicated to bring it to Bokeh and interactively change what you're doing and bring in some values you wanted to change.

I think that's it. I hope you have enjoyed it and maybe learned something. If you want to get the documents, the notebook, the data, and how I got the data, you can go to our company Blue Yonder documents. There are the presentations for this year and last year. Here are the links for NetworkX and Bokeh. That's it.

Is it working with all the layouts which are possible in NetworkX, or not? Can you customize the layouts more? There are like five or ten different network layouts; they have a random layout and a circle layout, but they are not so sophisticated. If I have a specialized one, like my own stuff, can I use it through this as well? Will it work, or have you tried it?

No, I did not try, but if you generate a network where you just
generate positions it should not be a problem. For example, if you want to have a spring layout where you can move clusters nearer together, I think you have to copy it: you create a NetworkX network, and then you can bring in some additional forces to draw nodes closer together. It should not be such a problem.

So it's like in pyplot, right? You first draw the nodes, then the edges, and then I can put it into this one as well. Okay, thanks.

I just wanted to understand a bit better the connection between Bokeh and NetworkX. Once you've done the initial graph with Bokeh, when you do some more things live, do you go back to NetworkX again or not?

When you do things at this point, yeah, I go back to NetworkX. Okay, say I want to see a different centrality here, closeness centrality: it goes back to NetworkX and calculates it. It's not pre-calculated. It's just Python callback functions. They go back to NetworkX, call algorithms, remove a node on the NetworkX network, and then you have to transform it back, and then you can use it.

Thanks for the talk, by the way. The buttons I see here, are they from Bokeh, or have you added them yourself?

Okay, this is something that is not in the slides. It's basically buttons from Bokeh, two lines. You say I want to have a button, then you add an update function to the button, and you bring it into a layout. Okay, it's three lines: two lines and maybe another. And then you have both buttons and they do something.

Any more questions? No? Give a big applause for bro