Well, enough of the easy math, let's get down to something more difficult. My name is Jim Vallandingham, and like she said, my day job is at a biomedical research facility, doing genomic analysis on worms and flies and other science stuff. But for fun, I like to look at data visualizations, especially interactive ones, and today I'd like to talk about the force layout in a little more detail and how to use it in a non-traditional manner. So we're going to be abusing the force layout. We just saw an example of your traditional force layout. It's typically used for your social network, or members of Congress voting on the same stuff, you know, network-based stuff. But we're going to abuse this force a little bit and come up with some novel ways to apply it to other data visualization techniques to make your life a little bit easier.
So first, I just wanted to get everybody on the same baseline about what we're talking about when we say force layout. I think most people would agree that, with the right number of nodes, the right number of bubbles in your layout, it's a pleasing way, a pretty way to draw a graph. But underneath the covers, it's working as a physical simulation. Each of these nodes is kind of a charged particle, and these charges work to repulse or attract the nodes in different ways. The links between the nodes constrain that movement. And the simulation works as a giant loop. In each iteration, each cycle of this loop, the charges impact the nodes, they move, and the visual display is updated. That looping occurs over and over again until it settles into a stable configuration. So that's the force layout in a nutshell. That's how it traditionally works.
I'm a big fan of this concept that everything is a remix, which I first latched on to from Kirby Ferguson's wonderful videos. If you haven't seen them, put in your headphones now and start listening to them, ignore me. His main thesis is that in the creative world, new works are created by simple modifications of existing work. So you take some existing stuff, you copy it, you transform it, you combine it to create new pieces. His videos are about that occurring in the arts, in music, literature, and movies. But I think it is also true in the data visualization world, and I think we've seen this in a number of presentations. So as part of this talk, I'd like to provide some remixable components, little nuggets of useful trickery that you could start thinking about applying to your own visualizations. That means we're going to look at some code, and hopefully that's not a terrible mistake.
So let's start with the force layout, but let's start out by looking at just the nodes. This first part will build up a simple node layout in D3. D3 is a powerful tool for this because, as I think we'll see, you can write very succinct code and start using the force layout immediately. And chances are you'll be using D3 in your other stuff, so it's good that it has this in the toolkit. Like everything in D3, the nodes are going to be represented by some data. So here we have our basic data set in an array: we have objects, and each object has some attributes. Here we just have the amount attribute. To get a force layout started, it's pretty easy. You create a new instance of the force layout, you pass in your data, your nodes, as the nodes parameter, and then you start it. We'll talk about this in a second, but you also want to listen to the tick event, which, as we'll see, gives you access to each iteration, each loop of that simulation.
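The setup just described can be sketched roughly as follows. This is a minimal sketch that assumes the D3 version 3 API (`d3.layout.force`); the `startForce` helper, the sample amounts, and the 600 by 400 size are made up for illustration and are not taken from the talk's slides.

```javascript
// Basic data: an array of objects, each with an `amount` attribute.
// The values are invented for this sketch.
const nodes = [{ amount: 10 }, { amount: 25 }, { amount: 60 }];

// Hypothetical helper. Assumes the D3 version 3 API; `d3` is passed in
// as an argument so the sketch does not depend on a global variable.
function startForce(d3, nodes, onTick) {
  return d3.layout.force()
    .nodes(nodes)          // hand the data to the layout
    .size([600, 400])      // assumed width and height of the drawing area
    .on("tick", onTick)    // called on every iteration of the simulation
    .start();              // kick off the simulation
}
```

With D3 loaded on the page, this would be called as `startForce(d3, nodes, tick)`, where `tick` is whatever function should run on each iteration.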
The important thing to remember about D3 in general and the force layout in particular is that you're not constrained to a particular visual representation, right? That's one of the big powers of the whole idea. But that does mean you'll have to do a little bit of extra work to get things going. So a couple more pieces of code. First you have to decide how you want to visualize your simulation. Here we'll just use some SVG circles, but keep in mind that it could be anything, and we'll see a couple of examples of other stuff. So you bind your data, in this case the force nodes, to that visual representation. Right now we won't use any attributes of that data, we'll just use a static constant for the radius. When the force layout starts, your data objects get injected with some more attributes if they weren't already present. Specifically, we'll look at the x and y attributes that get added to each element of your data array. These represent the current position of the node at every iteration of the layout. So you can use them to position your nodes, and this is the simplest example of the tick function. It gets executed on every iteration of the simulation, and we can use the x and y provided by the force layout to update the positions of the circles we're using to represent the nodes. So with all that, you get a couple of circles, right? But you can already see that this is a nice, visually attractive thing, and it only took a few lines of code. What's going on behind the scenes to help make this attractive are the forces we've mentioned, which are working on each of these nodes. I'd like to talk, briefly, about charge and gravity.
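First, though, the simplest tick function described above can be sketched like this. It assumes `circles` is a D3 selection of SVG circles bound to the layout's nodes; the parameter name is hypothetical.

```javascript
// The layout writes x and y onto each node object on every iteration.
// The tick handler copies those values onto the circles' center attributes.
// `circles` is assumed to be a D3 selection bound to the same node objects.
function tick(circles) {
  circles
    .attr("cx", function (d) { return d.x; })
    .attr("cy", function (d) { return d.y; });
}
```

In a D3 version 3 setup this would typically be wired up with something like `force.on("tick", function () { tick(circles); })`.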
Charge represents the repulsion or attraction between each node and every other node. Negative values cause the nodes to repel one another, more strongly the more negative they are, and positive values cause them to attract. So you can make cool stuff like that. Gravity, unlike physical gravity, doesn't pull down to earth. It's described as a spring-like thing attached to the center of the visualization. So higher gravity constrains the nodes to the center, and if you let go of gravity, everything floats out into space.
So with just this beginner's knowledge of the layout, we can start to abuse the force in our own way. The first abuse can be the example that Irene was talking about, the bubble chart. One of the first and best examples of this was the New York Times visualization that looked at the Obama budget proposal. I'm using the term bubble chart just to mean we're using the node size to represent some underlying data value. So how would we implement this type of visualization using what we know about our force layout? Well, the first thing is easy. Instead of using a static constant for our radius, we can scale the size of our bubbles by the data we want to represent. That gets us halfway there. But unfortunately, everything's overlapping. Nothing looks good anymore. The insight is that we can pass a function to our charge parameter. It doesn't have to be a static constant either. This function will get executed for each node in your layout when the force starts. So we can use the data itself to scale our charge, our repulsion value, by the amount that is being visualized. And with those two changes, we get this nice effect, right? Each node is pushing away relative to its size.
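Those two changes can be sketched as a pair of accessor functions. This is only an illustration: the square-root scale, the constants `MAX_AMOUNT` and `MAX_RADIUS`, and the divisor 8 are assumptions chosen for the sketch, not values confirmed by the talk.

```javascript
// Assumed extremes for the sketch.
const MAX_AMOUNT = 100; // largest amount expected in the data
const MAX_RADIUS = 40;  // radius in pixels for that largest amount

// Radius driven by data. A square root keeps circle area proportional to amount.
function radius(d) {
  return MAX_RADIUS * Math.sqrt(d.amount / MAX_AMOUNT);
}

// Charge driven by data: larger bubbles get a more negative charge,
// so they push their neighbors away harder. The divisor is a tuning constant.
function charge(d) {
  return -Math.pow(radius(d), 2) / 8;
}
```

With D3 version 3, these would be passed as accessors, for example `force.charge(charge)` for the layout and `.attr("r", radius)` for the circles.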
And the sizes are then scaled by the data that you're looking at. So it works pretty nicely. And this is the summation of the whole talk, so study it. I think Mike Bostock said it very eloquently when he said the force layout is an implicit way to do position encoding. I'm going to remix that quote slightly and say it's a lazy way to move nodes around. But it's a good kind of lazy. As we'll see in these other examples, we don't have to care about the individual location of each node. We instead impose simple rules on our simulation and allow the nodes to find the correct positions based on the simple rule set we've given them. Here we're just using the simple rule that charge is relative to size, and that's all it takes. So how else could we abuse this concept of applying simple forces? Yeah, it's cute, right? I like the Darth Vader. The evilest character in the book, the other movie.
Well, the original here had this cool feature where you could split apart nodes based on some categorical value. Here it's mandatory versus discretionary spending, which brings us to the idea of imposing our own custom forces on the nodes. I recreated parts of the New York Times visualization in this demo for a blog post. This one is about Gates spending, so the Gates Foundation, sorry, the title's cut off, and the grant sizes they've provided over the years. In this, the categorical splitting we can do is by year. You get this really nice organic feeling, and it can merge back together. So it's pretty, and it's useful. So how do we do that? Well, here's our basic tick function. The general idea of custom forces in this example is that we can modify it slightly: first add a function that jacks with your nodes somehow and modifies their positions slightly, and then use those modified positions to place our nodes.
So here's an example, and we won't go through all the code, but in this we have some way to represent the two centers that we want. We're going to split up nodes from left to right. And we have a function called move toward category center. This is a function that returns a function. We'll see why in a minute. So we grab the correct center for our particular node's value, remember, this is going to get executed for each node, and then move that node based on the center's location. So that almost works. Here are our nodes. When we apply that custom force, we're back to everything slamming on top of one another. So what's going on? We've lost that nice thing that I just said was the whole point of this talk, that we didn't have to care about stuff. Now it seems like we'd have to care about positioning all this stuff. Fortunately, we have access to a parameter called alpha. We can think of it as a blending parameter that can be used to combine multiple forces or charges on these nodes. It's accessible from the force instance, but you can also get to it in the tick function at each iteration, inside the event that's passed in there. If we were to print those values out, we'd see that alpha starts around 0.1 and gets decremented slightly on every iteration, every loop of that simulation. And then when it gets close to 0.005, the simulation ends. So the numbers are arbitrary. But the idea is that this is the actual mechanism by which the force layout settles down. It can be considered a form of simulated annealing, if you're into that kind of thing. This is what's causing the stabilization of the force layout in general. So we can use this parameter. Here's the same code, but now we get to pass the alpha into our move toward category center.
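A sketch of that custom force, with alpha passed in, might look like the following. The center coordinates and the `category` field are hypothetical. The `runTicks` loop is only a stand-in for the layout's own loop, written under the assumption that D3 version 3 multiplies alpha by 0.99 on each tick and stops once it falls below 0.005.

```javascript
// Two hypothetical category centers, one on the left and one on the right.
const centers = {
  mandatory:     { x: 200, y: 200 },
  discretionary: { x: 600, y: 200 }
};

// Returns a function, so alpha can be passed in without a global variable.
function moveTowardCategoryCenter(alpha) {
  return function (d) {
    const center = centers[d.category];
    // Nudge the node a fraction of the way toward its center.
    // Scaling by alpha lets this force blend with the charge forces
    // and fade out as the layout cools.
    d.x += (center.x - d.x) * alpha;
    d.y += (center.y - d.y) * alpha;
  };
}

// Stand-in for the layout's loop (assumed cooling rule, see lead-in).
// Returns how many times the custom force was applied.
function runTicks(nodes) {
  let alpha = 0.1;
  let ticks = 0;
  while ((alpha *= 0.99) >= 0.005) {
    nodes.forEach(moveTowardCategoryCenter(alpha));
    ticks += 1;
  }
  return ticks;
}
```

Inside a real tick handler, the event's alpha would be used instead of the stand-in loop, along the lines of `force.on("tick", function (e) { circles.each(moveTowardCategoryCenter(e.alpha)); })`, followed by the usual position update.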
This is why it's a function that returns a function: it allows us to pass in variables without creating a global variable. We can pass them in locally here and have access to them in our closure down here. And with that small tweak, things now work as you'd expect. Now the nodes no longer pile up when we separate them out. The alpha blending allows our charges, which are now relative to the size of the nodes, to be blended with our custom force. It works with a lot of nodes. And if you drag them around, they'll stick to their centers. But again, we're not really worried about where each of these nodes goes. We're just telling them, you guys clump here, you guys clump here, and they figure things out. It's kind of nice.
So let's look at another iteration of this. Right now, the repulsion, the charge that's associated with each of these nodes, is nice. It keeps the nodes away from one another, but it's kind of a gradual process. So what if you wanted a more constrained, more formal-looking separation? Well, we can implement a really simple collision detection in the same manner. I like remixing New York Times graphics, and by remix, you know what I mean. This was done in another great piece of theirs, visualizing the different words said at the national conventions during the elections. I implemented part of it in another visualization that just looks at word frequencies in some books. The point is that now, when we move these guys around, there's a nice hard line around each of them. They're colliding and they're maintaining that perimeter around them. To implement this, it's the same idea. We use another modification function. We can call it collide. And we'll just look at pseudocode here, because it's not difficult, but it's a little long. So again, this is being executed for each node at every cycle of that physical simulation, the node layout.
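A brute-force version of such a collide step could be sketched as below. It is an assumption-laden illustration rather than the talk's code: it loops over pairs of nodes in one pass instead of being called once per node, and it assumes each node carries `x`, `y`, and a radius `r`, with a hypothetical `padding` gap.

```javascript
// Brute-force collision check: compare every pair of nodes and
// push overlapping pairs apart. Cost grows with the square of the node count.
function collide(nodes, padding) {
  for (let i = 0; i < nodes.length; i++) {
    for (let j = i + 1; j < nodes.length; j++) {
      const a = nodes[i];
      const b = nodes[j];
      const dx = a.x - b.x;
      const dy = a.y - b.y;
      const distance = Math.sqrt(dx * dx + dy * dy);
      const minDistance = a.r + b.r + padding;
      // Skip exact overlaps (distance 0) to avoid dividing by zero.
      if (distance > 0 && distance < minDistance) {
        // Each node moves back by half of the overlap,
        // along the line that joins the two centers.
        const shift = ((minDistance - distance) / distance) * 0.5;
        a.x += dx * shift;
        a.y += dy * shift;
        b.x -= dx * shift;
        b.y -= dy * shift;
      }
    }
  }
}
```

Calling `collide(nodes, 2)` inside the tick handler, before the circles are repositioned, would keep a two-pixel gap between bubbles under these assumptions.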
So inside of this, we can loop through all the other nodes and then do a simple distance check. If they're too close, based on the data attributes that we already know about those nodes, then we just move them back by half. And with that slight modification, we get this nice look. It works with other custom forces, like the one we just looked at, and it works with any number of nodes that you put into it. This is a brute-force mechanism that you can optimize, but it's a starting point for this kind of visualization. And you can do cool stuff like the demo of putting an invisible node underneath your mouse cursor, and then all your other little nodes are afraid of it, which is fun, too. OK, one more. There's nothing that mandates that you have to move them back by half. If you start increasing that movement, you can end up with this nice but useless popcorn explosion. They're colliding with one another and getting scared and running away. Just fun. OK, so I think this is Dragon Ball Z, not really Star Wars, but that's OK. It's a force.
Let's add some links, finally. Links constrain where our nodes can move. Oh, yeah, here's the example with the links now. And as you might expect, the D3 links are also represented in data. The minimum two things you need in your link data are a source and a target, which represent the two nodes that the link connects. You can specify these either as an index into the nodes array or as the actual nodes themselves. If you use an index, then when the force layout starts, it will be translated into the node itself, which means you have access to both the source and target node data. You need to visualize the links in some way, if you so choose, so we can draw a simple line in this example. And then, like I said, you have access to the source and target data, and you can use their x and y coordinates to position the start and end of your lines.
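The link data and line positioning described here can be illustrated without D3 at all. In this sketch, `resolveLinks` only mimics the index-to-node translation that the layout performs on start, and `lineCoordinates` is a hypothetical helper; the coordinates are invented.

```javascript
// Nodes with positions filled in, as they would be while the layout runs.
const nodes = [{ x: 10, y: 20 }, { x: 50, y: 80 }, { x: 90, y: 20 }];

// Link data: the minimum is a source and a target,
// given here as indices into the nodes array.
const links = [{ source: 0, target: 1 }, { source: 1, target: 2 }];

// Mimics what the layout does when it starts:
// numeric indices are replaced by the node objects themselves.
function resolveLinks(nodes, links) {
  links.forEach(function (link) {
    if (typeof link.source === "number") link.source = nodes[link.source];
    if (typeof link.target === "number") link.target = nodes[link.target];
  });
  return links;
}

// Endpoints for an SVG line, read from the source and target nodes.
function lineCoordinates(link) {
  return {
    x1: link.source.x, y1: link.source.y,
    x2: link.target.x, y2: link.target.y
  };
}
```

In a tick handler, the four values would be applied to each line's `x1`, `y1`, `x2`, and `y2` attributes so the lines follow the nodes.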
And so the simplest example I could come up with is that. Links also have their own parameters. I'd like to talk about just link distance. As you might guess, link distance specifies the length of the links based on some value. So another insight is that you can make that link distance driven by data. Instead of being a constant, it can represent some data value. In this case, the link data has a distance value that you can use to expand or contract each link at a per-link level. And that's essentially all I've done in this particular visualization, where I was looking at neighborhoods that have a sharp racial divide. So here, this is Kansas City. Each node now represents a census tract. And like I said, you're not constrained to bubbles. Each census tract is connected to its neighbors by invisible links. This hasn't started yet, but the length of those links is relative to the difference in the proportions of white and black populations between that census tract and its neighbor. So if there's a sharp jump in white or black population, if you go from one census tract being 20% white to its neighbor being 20% black, then that link will be longer. So when we start this, it starts to break apart in areas where there's this really high division. This is something that you understand kind of emotionally in Kansas City, but I wanted to look at a way to represent how these racial boundaries affect the spatially very small areas in which they occur. So there's that. But underlying that, link distance is the only trick.
So we can start expanding these concepts. The first thing I thought of when I discovered the multiple centers idea was, what if you put them in a circle? Everybody likes circles. So here's my interactive hairball that I did for the people that I work for at Stowers. They wanted kind of an artsy representation of collaboration.
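Putting the category centers on a circle takes only a little trigonometry. A minimal sketch, where `circleCenters` and all of its parameters are hypothetical names chosen for illustration:

```javascript
// Place `count` centers evenly around a circle.
// cx and cy are the circle's middle; radius is its size in pixels.
function circleCenters(count, cx, cy, radius) {
  const centers = [];
  for (let i = 0; i < count; i++) {
    // Spread the angles evenly over a full turn (2 * PI radians).
    const angle = (2 * Math.PI * i) / count;
    centers.push({
      x: cx + radius * Math.cos(angle),
      y: cy + radius * Math.sin(angle)
    });
  }
  return centers;
}
```

Each group would then be pulled toward its own entry in the returned array, in the same way the two left and right centers were used earlier.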
One of the main things in science is that everybody collaborates with one another. So each of these groups is a lab. Each node, each circle, is an individual in that lab. And there are links between them based on how many papers they're on together. So it's fun, and it allows for exploration. Again, all I'm defining are the centers along a circle for each lab, not the positions of the individuals, which means we can do stuff like reordering the centers around the circle, which makes everything move around. Kind of fun. And if you change the radius that you're applying to the circle, you can make a pretty but, obviously, useless artistic spiral of collaboration. But you get to find out who's the coolest kid in the groups there. And if you want a little bit of a math introduction, and I kind of feel embarrassed for needing these kinds of reminders after that previous talk, Tom MacWright has a great piece called Math for Pictures, which I think is a wonderful introduction to using this kind of math to generate circles and stuff.
Another example is from a website called Let's Free Congress. It was used as an illustration of the idea they're looking at, the small number of people who contributed large amounts of money to election funding. But the visual was kind of interesting, so I thought I'd recreate it. You start out with a bunch of nodes. I kind of wanted Kill Bill music to be playing during this, so that's why the color scheme is such. As the center node expands, the nodes around it are repulsed in a very organic manner, right? I like it. And then when you come back down, they all jump back on. But this is trivial now that we know the previous examples. It's all building on them. We can tell the collision detection is in effect here. But it's also just a modification of the charge for that center node. So you get a value from your slider and then modify your charge function to utilize that value.
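One way to sketch that slider-driven charge is a small factory that builds a charge accessor from the current slider value. The `isCenter` flag, the `r` field, and the divisor 8 are assumptions made for this illustration.

```javascript
// Builds a charge accessor from the slider value.
// The center node's charge follows the slider (treated here as its radius);
// every other node's charge follows its own radius `r`.
function makeCharge(centerRadius) {
  return function (d) {
    const r = d.isCenter ? centerRadius : d.r;
    return -(r * r) / 8; // larger radius, stronger repulsion
  };
}
```

Assuming D3 version 3, where charges are evaluated when the layout starts, a slider handler would do something like `force.charge(makeCharge(value)).start()` to apply the new value.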
And that's all you need to do to create this kind of interesting visual with, again, not much work. The last one I'd like to look into in depth is a kind of derivation of Moritz Stefaner's work. He's got a GitHub repository called Grid Experiments. The idea is that you start out with a grid, here represented by these dots that are kind of hard to see. Then we apply a force layout, in this case just to random data. But the nodes have to follow two rules. One, they need to stay on a grid spot. And two, no two nodes can occupy the same grid spot. When you start it up, I think it creates these very interesting looking patterns. And you can modify the shape of the underlying grid. Here's more of a plain rectangular grid. You can see that they kind of move. There's a little tweaking that goes on as it sits there. The reason behind that is that there are actually two nodes associated with each of these points: the visual one, and then an underlying one that's still getting manipulated by the other forces in effect. So you have these two rules, and then the charge and the constraints are still being applied to these nodes.
The first thing I thought of when I saw this was subway maps and transit maps. I kind of wanted to create that for Kansas City. So here are Kansas City's subways. Works pretty good, right? We don't have any. I was further inspired, and by that I mean given easy access to data, when Fathom did a very similar experiment with the Boston subway system here. Fathom has a great post about how much geography we can get rid of in transit maps. It's still just a thought experiment, but I really liked the visual display. They didn't really define the algorithm they used to create that visual, so I thought I'd try it with this simple one, because now we know the algorithm here. Each node is a subway station, and we'll start them at their geographic, physical locations.
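The two grid rules described above can be sketched with a greedy nearest-free-cell assignment. This is an assumption about how such a snap could work, not the code from the Grid Experiments repository; `makeGrid`, `snapToGrid`, `gridX`, and `gridY` are hypothetical names.

```javascript
// Build a rectangular grid of cells. Each cell remembers whether
// a node is already sitting on it.
function makeGrid(cols, rows, spacing) {
  const cells = [];
  for (let i = 0; i < cols; i++) {
    for (let j = 0; j < rows; j++) {
      cells.push({ x: i * spacing, y: j * spacing, occupied: false });
    }
  }
  return cells;
}

// Greedy snap: in array order, each node takes the nearest cell that is
// still free. The simulated position (x, y) is left untouched, so the
// forces keep acting on it; the drawn position goes in gridX and gridY.
function snapToGrid(nodes, cells) {
  cells.forEach(function (c) { c.occupied = false; });
  nodes.forEach(function (d) {
    let best = null;
    let bestDist = Infinity;
    cells.forEach(function (c) {
      if (c.occupied) return;
      const dist = (c.x - d.x) * (c.x - d.x) + (c.y - d.y) * (c.y - d.y);
      if (dist < bestDist) { bestDist = dist; best = c; }
    });
    if (best) {
      best.occupied = true; // rule two: one node per grid spot
      d.gridX = best.x;     // rule one: the visual sits on a grid spot
      d.gridY = best.y;
    }
  });
}
```

Running `snapToGrid` on every tick and drawing at `gridX` and `gridY` would give one visual position and one simulated position per node, which matches the two-node idea described above.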
And then we'll let the simulation run and see what happens. And it's interesting. I think it's kind of artistic and fun. I don't know about the practicality of being able to read this. I'm not from Boston, so it just doesn't matter to me. But it's at least a starting point. It's something to conjure up more ideas, and I look forward to seeing some of those.
I've got three more quick examples of these same principles being applied in the real world. The first one is an older site, Is Barack Obama the President dot com, from the Guardian, right? These balloons themselves are being controlled by a force layout. They have a kind of buoyancy concept that makes them float. And if you hover over one of them, you get the same radius-expanding concept, to a lesser degree, which pushes the others away and focuses in on one balloon. I also like this one. It's The Shape of My Library by Sarah Groff-Palermo, a UX designer in San Francisco. I probably butchered her name. I thought it was interesting. She's quantifying, or making note of, her library, the books that it contains. And it's a simple concept again. We're just using the same idea of multiple center points, but now with a lot more of them. But it allows you to start quantifying these different categories in an intuitive way. A larger bubble group means a larger number of books there. And this, again, is part of a larger interactive that can be moved around and modified and explored. And finally, Disqus has a recent kind of dashboard. They call it Gravity, wherein each of these white nodes represents a topic, and the bubbles that start forming around it are comments and conversations around those topics. The more interesting part is that this is real time, and it gets updated while you watch it. But it's the same effect, really. The nodes themselves are part of a force layout.
And each of the nodes in a particular category is being drawn organically to the center, which happens to be the topic node. So with a little bit of work, they get all this flexibility just by defining these simple rule sets. So hopefully, we've seen a little bit behind the magic of the force, right? This is Frank Oz, controlling Yoda in school. And the point is to take away some of the magic behind this. I think it's important that we get to see the how of some of these visualizations that we're inspired by and try to break them down into little components that we can then abuse ourselves. So I look forward to your abuses. And thanks. Any questions?
Right. Yeah, yeah. No, I think some of these are, again, starting points, where you can start applying more simple rules to them to give them more finesse. But I have seen examples of, are you talking about x and y being part of the encoding? Yeah, yeah. So we're seeing examples where, well, in the Obama one, the first one here, there's something I didn't talk about that also makes it cool. Let me get to it. There. Color shows the amount of cut or increase, it says, but that is also double encoded by the layer that it's in. So I really like that they're adding that nuance, which I didn't talk about. So in this, I guess the y is at least helping that separation by keeping things layered. The implementation is that you just add another custom force. I cut it out for time. But the nuance is that they've multiplied by alpha twice for that particular force. So the impact of the tiering is less than the impact of the separation.
So you can blend them at your own discretion, however much you want one to affect the other, whether you want them both equal or whatever. I thought that was a nice example of where this is starting to occur.
I think it's the numbers that they've risen to have the links be shorter if there are more links, so that the more distantly related groups could actually be physically farther apart. Yeah, yeah, that'd be cool. And in my hairball, those links aren't impacting the locations of the nodes. But yeah, certainly an easy improvement would be to make the order of those nodes more meaningful. The reordering was an attempt at that. But utilizing the data inherent in that data set would be a much more powerful way to start looking at that process.
Well, I think it depends on the technology. We've seen this thing can bog down in SVG after a couple of hundred nodes. And it certainly seems, I'm not an expert, but it seems to depend on the browser you're using too. I mean, I'm doing this in Chrome for a reason. Most of these wouldn't fly as well in at least earlier versions of Firefox and some of the other browsers. I know there have been experiments in using Canvas, and I think the performance can be improved once you hit a certain threshold. But I might be incorrect about that. Bostock has an example comparing SVG and Canvas implementations of, not the force layout, but a bunch of nodes moving around. They perform similarly at certain numbers, but only at certain numbers. OK. Thank you so much, Jim.