a quiet but warm welcome for Mike Bostock, everyone.

All right, thank you. So if you've ever gotten frustrated trying to figure out why your code didn't work, or how someone else's code worked (maybe it was my code, and I'm so sorry), you are not alone, and this talk is for you.

As Max mentioned, for the last eight years or so, depending on how you count, I've been building tools for visualizing information, and the most successful outcome of this effort has been D3, a JavaScript library for visualization. I've spent far more work on this than I expected when I made that initial commit, but I mean that in a good way: it's been really exciting to see what people have done with it. But there's a danger when you spend so long designing a single tool, and that is that you may forget what the tool is for. The tool itself becomes the goal, rather than the value derived from its application.

So the purpose of a tool for visualization is to construct visualizations, but what is the purpose of visualization? In the words of Ben Shneiderman, "the purpose of visualization is insight, not pictures." This is actually an adaptation of a Richard Hamming quote: "the purpose of computing is insight, not numbers." So the point here is that visualization is also a tool. It is a means to an end: a means to insight, a way to think, to study, to understand, to discover, and to communicate something about the world. If we only consider the task of constructing visualizations, of assigning visual encodings, then we ignore myriad other challenges: finding relevant data, cleaning it, transforming it into efficient structures for analysis, designing statistical analyses, building models, validating those models, and ultimately communicating whatever it is that we learn in that process. These tasks are often performed in code, and coding is famously difficult.
Even the name "code" suggests impenetrability. Code originally referred to machine code: low-level digital instructions sent to be executed by a processing unit. Code has come a long way since then; it's much more human-friendly, but it's got a long way to go. I kind of love this image. To me it's the fantasy of the developer when they're writing code, but it's also a little bit strange, because it's a robot. Why is the robot typing? Can't it just plug in and send the bytes directly to the computer? It's like the robot is being forced to use the human interface rather than vice versa.

OK, to give a comically dense example, this is a bash command I wrote recently to generate a choropleth of population density of California. The rest of the talk will just be this slide, and I'm going to walk you through all 15 commands here. It's kind of funny. Look at this: what is it doing, exactly? You look at it and you think it starts with this geo2topo, which converts GeoJSON to TopoJSON, but that's not really where this program starts, because that takes this giant thing as input, and then that command actually takes these two things as input. So there are multiple levels of nesting here. There are these weird punctuation marks; what do they mean? There are these abbreviations like -p and -f, and of course the backslashes. And if you notice, there are actually two languages here: this is not just bash, there are JavaScript expressions awkwardly embedded within bash. Now, this isn't machine code; from the machine's perspective this is extremely high-level programming. But it's hardly natural language.

Bret Victor gives this concise definition of programming: "programming is blindly manipulating symbols." By "blindly" he means that we can't see the results of our manipulation. We can edit a
program, we can rerun it, we can diff the output, but programs are complex and dynamic, so this approach is not really a direct or immediate observation of our edit. And by "symbols" he means that we don't directly manipulate the output of our programs; we operate in abstractions. These abstractions can be powerful, but they can also be difficult to grasp. In Donald Norman's terms, these are the gulf of evaluation and the gulf of execution.

OK, but clearly some code is easier to understand than other code. One of the things I first think of as inhuman code is spaghetti code: code that lacks structure or modularity, where in order to understand one part of a program you have to understand the entire program. It can't be disentangled. This is frequently caused by shared mutable state: when multiple parts of a program write to the same piece of state, it becomes much harder to reason about what that state's value is. And indeed, how do we even know what programs do? If we can't track the complete runtime state of a program in our heads, then reading code is insufficient. So we use logs, we use debuggers, we use tests, but these tools are also limited. A debugger, for example, can only show you the value of a few symbols at a single moment in time, so we can't observe complex data structures or complex patterns of execution. We continue to have great difficulty understanding code, and in a way it's a miracle that anything works at all.

But despite these challenges, we're still writing code, for myriad applications, more than ever before. Why is that? Are we masochists? Maybe. Are we unable to change? In part, certainly; some of us more than others. Is there no better solution? In general (and that is the critical qualifier), no. Code is often the best tool we have because it is the most general tool we have. Code has almost unlimited expressiveness. So alternatives to
code, as well as higher-level programming interfaces and languages, can do well in specific domains, but these alternatives must sacrifice generality in order to achieve greater efficiency within their domain. If you can't constrain the domain, it's unlikely that you'll find a viable replacement for code. There is no universal replacement, at least not while humans primarily think and communicate through language. And it's hard to constrain the domain of science. Science is fundamental: it's studying the world, extracting meaning from empirical observations, simulating systems, communicating quantitative results. A medium to support discovery must be capable of expressing novel thought. Just as we don't use phrasal templates or Mad Libs for composing the written word, we can't be limited to chart templates for constructing visualizations, or a drop-down of formulas for statistical analysis. We need more than configuration; we need the composition of primitives into creations of our own design.

So if our goal is to help people gain insight from observation, we must consider the problem of how people code. Bret Victor (I will quote him several times in this talk) had the following to say about math, but it applies equally to code: "The power to understand and predict the quantities of the world should not be restricted to those with a freakish knack for manipulating abstract symbols." The point here is that improving the human experience of coding is not just about making your workflow more convenient or more efficient; it empowers people to better understand their world.

So if we can't eliminate coding, can we at least make it easier for humans, with our sausage fingers and our finite-sized brains? To explore this question, for the last six months or so I've been prototyping this thing I call the integrated discovery environment, D3 Express. It's for exploratory data analysis, for understanding systems and algorithms, for teaching and sharing techniques and code, and for sharing
interactive visual explanations. I do want to make visualization easier, and discovery easier, but first we need to make coding easier. Now, I cannot pretend to make coding easy; the ideas we wish to express, explain, and explore may be irreducibly complex. But by reducing the cognitive burden of coding, my hope is to make the analysis of quantitative phenomena accessible to a wider audience.

The first principle of D3 Express is reactivity. Rather than issuing commands to modify shared state, each piece of state in a reactive program defines how it is calculated, and the runtime manages its evaluation and propagates derived state. That's a lot of technical words, but if you've written spreadsheet formulas, you've done reactive programming.

Here's a simple notebook in D3 Express to illustrate this concept. It looks a little bit like the browser's developer console, except here our work is saved automatically so that we can revisit it in the future and share it with others. In imperative programming, c = a + b is a value assignment: it takes the value of a, adds it to the value of b, and copies that into the symbol c. So in imperative programming, if the value of a or b changes, the value of c doesn't update until you recompute that same addition. But in reactive programming, c = a + b is a variable definition, which means that, as here, when I change the value of b from two to three, the value of c changes to four. If I change a from one to four, again the value of c updates. If I change b to be Math.random(), again it updates. So the point here is that as programmers we now care only about the current state, because the runtime manages state changes. That's a small thing when you're only adding two numbers together, but when you have larger programs, this eliminates a substantial burden.

Of course, a discovery environment needs to do more than add a few numbers, so let's try working with data. We're going to load D3, and because this is csv,conf we're
going to load a CSV, and now we want to see what that is, so we can click and inspect. But already there's some cool stuff happening here. One is that requiring D3 and fetching this data are asynchronous operations. If you don't know what that means, it kind of doesn't matter, and that's the beautiful thing about this: you can treat this reactive code as if it were synchronous code. Anything that depends on D3 in this notebook doesn't get evaluated until D3 is loaded, and likewise any expression you write that refers to the data won't get evaluated until the data is loaded. This avoids the famous challenge of callback hell that you get with imperative asynchronous code.

So what does the data look like? We can click, we can inspect, and just open it up here. One of the things you'll notice is that these fields are all strings: the date is a string and the close field is a string. This is stock data, the last five years or so of Apple's daily closes. In order to work with this data, one of the things we're going to have to do is convert it from strings to more precise types: we want to work with numbers, we want to work with dates. So what I'm doing right now is passing in this row accessor function to specify what those types are. Notice that when I go in here, in d.close, I put the little plus sign in front. That is JavaScript's way of converting a string into a number; it's the unary plus operator. And the point here is that as I make that change, you're immediately seeing the result. So we're still doing abstract symbol manipulation here, but at least we're doing it less blindly, because we see the result immediately.

We can do the same thing with dates, except JavaScript doesn't have native support for parsing this date format. So what would happen if we just called a hypothetical function called parseTime? Well, we can call it, and
of course it returns an error, because that is not defined; it's not a built-in primitive. But this raises another point, which is that the errors you see in these notebooks are no longer global errors. They don't bring your whole program to a halt; the other cells in your notebook can continue to run. And these errors are temporary, so when I go in and define parseTime, the error automatically goes away. Again, this is one of the things that reactive programming gets you in terms of being more structured programming. It's a bit like when a formula in your spreadsheet just says "invalid value": Excel doesn't crash when you mistype the formula.

So here I've defined my date parser, and now I have nice green dates in my data set, and I can start to ask questions of my data. I'll use D3 for that, and I'll say d3.extent of the data, with d.date. But this raises another point, which is that I actually forgot to give the data set a name. I give the data a name, and then it automatically reevaluates. And this raises another interesting feature, which is that this notebook is now order-independent. Because it understands the references between these cells, you can write the code in any order you want. So I can refer to data up here in this cell, even though data isn't defined until down here. This is useful for having a cleaner structure to your code. It's also useful when you go to publish your results and communicate them to other people: you have total freedom in how you order the code and how you explain it. You're not required, for example, to put all of your requires or your imports at the top of the file; you can let the narrative determine how you order the code.

Like the developer console, the output of each cell is visible immediately below that cell. But unlike the developer console, we can have visual outputs. So now I want to take the same data and make a
little line chart out of it. I'm going to define the width, height, and margins as three different variables, then go up and look at our data, take the extents that we computed, and turn those into the domains of the scales. Scales are a D3 concept: they map these abstract dimensions of data to visual encodings, here position in x and y. So I'm just updating those definitions, and once those are defined, we can create an SVG element to contain the output of our chart. I'm going to use curly braces, because this will be a little more involved than the other definitions, which are expressions, so now I can put more JavaScript code in here. I'm using d3.select to do that, and I've got this DOM.svg. The details aren't really that important; this is just creating an SVG element of the given width and height and then returning it. That's the basics: we're just working with web standards here, the vanilla DOM.

The first thing I want to do for this chart is make sure that my dimensions of data look correct, so I plug in the axis here. By default, an axis in D3 is going to be rooted at the origin, which is (0, 0) in the top-left corner, so I want to shift it down so that it's at the bottom of my chart. I go in here and put the transform attribute on my G element and give it the right offset, height minus margin.bottom, and then it moves down. It's a little bit off the screen here, but you get the idea. Likewise, I can make the left axis here for the y scale and put that on the left side of the chart.
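
To make the scale and axis-offset ideas concrete, here is a minimal sketch in plain JavaScript. It is not D3's implementation and not the code shown in the talk: `linearScale`, the sample rows, and the chart dimensions are all hypothetical, chosen only to show how a data domain is mapped to a pixel range and why the bottom axis is translated by height minus the bottom margin.

```javascript
// A minimal sketch of a linear scale (NOT D3's implementation).
// `linearScale`, the sample rows, and the dimensions are hypothetical.
function linearScale([d0, d1], [r0, r1]) {
  // Map a value from the data domain [d0, d1] to the visual range [r0, r1].
  return value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

const width = 640;
const height = 400;
const margin = {top: 20, right: 20, bottom: 30, left: 40};

// Rows as they might arrive from a CSV file: every field is a string.
const rows = [
  {date: "2012-01-03", close: "100"},
  {date: "2012-01-04", close: "150"},
  {date: "2012-01-05", close: "200"}
];

// Unary plus coerces each string to a number, like the row accessor in the talk.
const closes = rows.map(d => +d.close);
const yDomain = [Math.min(...closes), Math.max(...closes)];

// The y range is flipped because the SVG origin is the top-left corner.
const y = linearScale(yDomain, [height - margin.bottom, margin.top]);

// The bottom axis is moved down from the origin with a transform like this.
const bottomAxisTransform = `translate(0,${height - margin.bottom})`;

console.log(y(100), y(150), y(200), bottomAxisTransform);
```

With these made-up numbers, the smallest close lands at the bottom of the plotting area and the largest at the top, which is the inversion that a flipped range is meant to express.
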
OK, so now we have our little axes, and now we want to actually draw the data. To see the data we're going to need a path element. Path elements in SVG require this d attribute, which tends to be a complicated thing; there's a whole micro-language in SVG for making these things. We don't want to do that by hand, so there's a d3.line primitive, and we're going to pass our data to that line function in order to construct the geometry for the chart. So I'll do that now. That's configurable, again, so we're going to pass in our x scale and the corresponding value from the data, and likewise for y. OK, so that shows up. Obviously it doesn't look correct, but that's because paths in SVG are filled black by default, and we want to stroke it for a line chart. So again we'll go in here and set the fill style to none and the stroke to blue. So now we have a basic little line chart here with a few different cells.

But even though this is a basic line chart, the program's topology is starting to become more complex. This is a directed acyclic graph of the references within this notebook, so it's showing you the structure of the program. This visualization was itself made in D3 Express, using Graphviz; there's a command you can run to produce this thing. You've got the require at the top; that's how you load libraries. It gives you D3, and then we use D3 to make our time-parsing function; we also use it to load and parse the data. You've got your width, height, and margins; those feed into the x and y scales; and then basically everything feeds into the SVG node, which doesn't have a name, so it's just number 93 at the bottom.

A few observations on this chart. One is that it is now trivial to make this chart responsive. The width, height, and margins are constants as we've defined them, but if we changed them so that they were the size of the screen or the size of the window, then the chart would update automatically. And likewise we
can replace the data definition: rather than a static definition, maybe we want a real-time chart, and that's just a question of replacing that definition with another definition, and everything else falls out of it.

But I want to look a little more closely at the code here, so you can get a sense of how reactive programming affects your code structure. This is typical D3 code that you might see on bl.ocks.org. The idea is that I declare the scale on page load, and then I have to wait until the data has loaded in order to set the domain. So there's a scale object here, but really my definition of that scale gets distributed throughout my program, with a whole bunch of unrelated code in between. This is obviously a very pared-down example; it's not a complete chart (it wouldn't fit on the slide, for one thing). But you can already get a sense of how the code ends up being harder to follow and more distributed because of the statefulness of this program. Whereas in reactive programming, we can localize those definitions, because it's now the runtime's responsibility to manage the order of execution. It knows that this x scale depends on the data, and it knows that it depends on D3 and the margins, so we can just define it in a way that makes more sense and let the runtime handle it.
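
As a rough illustration of what "the runtime manages the order of execution" can mean, here is a toy sketch in plain JavaScript. It is not how D3 Express is implemented: `defineCell` and `evaluateAll` are hypothetical names, and this toy simply recomputes every cell on each run instead of propagating changes incrementally. It only shows that a definition can name its inputs and be written before the things it depends on.

```javascript
// A toy reactive runtime (a sketch only; not how D3 Express is implemented).
// `defineCell` and `evaluateAll` are hypothetical names.
const cells = new Map();

// Each cell names its inputs and gives a function that computes its value.
function defineCell(name, inputs, compute) {
  cells.set(name, {inputs, compute});
}

// Evaluate every cell after its inputs, regardless of definition order.
function evaluateAll() {
  const values = new Map();
  const visiting = new Set();
  function evaluate(name) {
    if (values.has(name)) return values.get(name);
    if (!cells.has(name)) throw new Error(`${name} is not defined`);
    if (visiting.has(name)) throw new Error(`circular definition: ${name}`);
    visiting.add(name);
    const {inputs, compute} = cells.get(name);
    const value = compute(...inputs.map(input => evaluate(input)));
    visiting.delete(name);
    values.set(name, value);
    return value;
  }
  for (const name of cells.keys()) evaluate(name);
  return values;
}

// The domain is defined in one place, before the data it depends on.
defineCell("xDomain", ["data"], data => [Math.min(...data), Math.max(...data)]);
defineCell("data", [], () => [3, 1, 4, 1, 5]);

const first = evaluateAll();
console.log(first.get("xDomain")); // [ 1, 5 ]

// Replacing one definition is enough; dependents pick it up on the next run.
defineCell("data", [], () => [10, 20]);
const second = evaluateAll();
console.log(second.get("xDomain")); // [ 10, 20 ]
```

The design point is that the dependency list lives next to the definition, so nothing else in the program has to remember to set the domain once the data arrives.
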
Now, the last thing on charts is that you don't have to use D3 in order to make charts in D3 Express. I mean, it is called D3 Express (maybe that was a mistake), but you can use Vega-Lite, you can use Three.js, you can use whatever you want. All these things are just JavaScript and the DOM. So this is the same data set, with a log scale on the chart, but this is using Vega-Lite. To me this is also a really exciting opportunity, because as you make it easier for people to explore data sets, you can also explore these more domain-specific, higher-level abstractions and still have the benefits of reactive programming.

OK, so how about Canvas? I've got another notebook here, and I want to make a globe. I've loaded d3-geo and TopoJSON, and then the topology of world country boundaries. I'm going to create a canvas element, similar to what we did with DOM.svg, get the context from that, and then issue a bunch of canvas commands in order to get the world to appear.
So for that, again, you have this path object where you're passing in your geometry. What is that? Well, there's a d3.geoPath function, which takes GeoJSON and turns it into a string or a sequence of canvas draw commands. That requires a projection; here we'll use an orthographic projection. And so now it just appears. Let's say I wanted to draw the outline of the Earth as well. For that I'll need a sphere object, so I break that out into a separate definition here, plug it back in, and now we have a nice little globe.

But one of the powerful features, as I hinted at when we were looking at the directed graph, is that we can take one of these definitions, one of these variables, in this case the projection, and replace the static definition of this orthographic projection with a dynamic definition, something that's animated, let's say. So here's the Mercator projection, or equirectangular, but now I want to make it a rotating orthographic projection. I'm glossing over some JavaScript details here, but this is a generator. I guess a lot of people haven't used JavaScript generators (honestly, I only discovered them a few months ago), but they are remarkably cool for doing this sort of stuff. You'll see how it works in a second. Basically, it's a function that can yield a sequence of values. Normally a function just returns one value, but in this case it can yield an infinite stream of values. So this is going to create an orthographic projection, and then inside a while-true loop it's going to set the rotation angles for that projection, and then you get a rotating globe. Tilt it a little bit, because we're northern-hemisphere-specific. So yeah, generators are pretty cool.

Now, one of the things you may be wondering is how this works. Why doesn't it just go
into, like, a... why doesn't it hang the page? It's a while-true loop. The answer is that generators are a pull system: it's the runtime that's pulling new values from this generator, at 60 times per second, rather than the generator pushing new values whenever they get updated. Let me skip ahead a little bit here.

Now, you may be wondering, when you have these generators, what is it doing with the canvas? The answer is that it's just throwing away the canvas and creating a new canvas every time it needs to draw. That actually works just fine in this example, because it's not very expensive; it's a pretty simple geometry that it's displaying. But obviously that's a lot of overhead, and it would limit the sorts of things you can do. So the thing I'm showing you now is that you can change that behavior by just accessing the previous canvas that you used. Of course, when you do that, it starts smearing, but you can add this clear command so that it clears before you redraw, and then you fix the line width as well. (It's looped now, already too late; I hope you got it.) The point here is that you can opt in: you get the simplicity of the reactive model, but if you're willing to opt into a little more complexity, you don't have to pay a performance cost for it. There's negligible overhead compared to what you would write in vanilla JavaScript.

OK, so just to reiterate, to look at the code a little bit: this is our static definition of the projection, and this is our generator, which defines the rotating projection. Every time a new value is pulled from this generator, it sets the new rotation angles based on the current time and then yields that value.

So generators are good for scripted animations, but what about interaction? Well, it turns out we can use generators for those too. It's just that our generators are now asynchronous, and they yield a value whenever there's new input, rather
than just yielding at a fixed rate. The first thing I'm going to need in order to make this interactive is a little slider, and again this is just DOM: I'm creating an input element of type range with these values. It's not hooked up to anything, so of course dragging it back and forth doesn't do anything. But I'll give that a name, and then we can define a generator that emits the value of that range slider. So now I can see, OK, it's going from -180 to +180 as I'm dragging it. I give that a name, call it the angle, and then we feed that angle into our projection's rotation. So now when I drag the slider, it's interactive: I can spin the Earth around.

Because this is a very common case, where you're defining a user interface and you want it to drive something in your code, there's a viewof operator, which does exactly what I just showed you, but within a single definition. So there's the input slider that you're declaring here, which is the user interface, the graphical interface, and then there's the value of that, which is the programming interface: that's the angle that the code sees to drive the projection. OK, so again, this is the code. This is the long form, where I'm declaring my projection, and its rotation takes an angle; that angle is derived from this range input, and the range input just goes from -180 to +180. And this is the shorthand form using the viewof operator.

But we now have the ability to generate arbitrary inputs. You're not limited to a fixed palette of a range slider and a drop-down menu. In this case I'm making this table, and it's got three sliders: I'm making a color picker for the Cubehelix color space, and I want the output of this complex input to be a Cubehelix color instance. So that's the code I'm writing here, which basically takes the values from the sliders; it updates their
corresponding output, so as you're dragging you can see the hue angle changing there on the right, and below that you can see this color object, which is the output of our interface here. Then I'm using that to set the background color of this div. So the point is that you're just doing DOM and HTML here, but there's a really nice primitive for hooking that into the reactive programming environment.

Now, for visualization, this has even more interesting applications. This is a histogram. It's looking at the price of 500 or so stocks in January 2012 relative to their price in January 2011. You can see that there's a bell curve here, and the mode is slightly greater than one, because of course the average return on stocks tends to be positive. But there's also a long tail, where you have stocks that did really well and stocks that did really poorly. If you wanted to know exactly what those stocks were, in another environment you might have to write separate code to query that and look at the results. But here we can augment this visualization with a little bit of interaction, so that we can manipulate it directly and see the output. That's what I've done here: there's a d3.brush, so I'm brushing on it, and this chart yields values just like a range slider would, except it yields the data points that you've selected. So just by brushing back and forth here and using the default object inspector, I can see what these stocks are. That's the Priceline Group and some pharmaceutical, which I think went public at the start of this data set. But the interesting thing for me is all the ones down here on the left. Does anybody have any guesses what they are? This is from 2007, so it was shortly before the financial crisis, and so this is E-Trade Financial and all of the other financial firms
that basically collapsed and had to get bailed out by the government. But it's cool that you can just see that directly from this visualization, building it up incrementally.

Just to show that there's no real magic going on under the hood, this is the code I wrote to adapt your standard D3 brush, as you would write it today, to this generator-based system. Whenever there's a brush event, you look at the selection and use it to filter your data, picking the stocks that have a change value between the lower bound and the upper bound of your selection. Then you set that as the value and dispatch this input event so that it triggers the update.

Now, normally in reactive programming, and in this environment, your reactions are instantaneous. But sometimes it's beneficial for them not to be instantaneous: you actually want to observe the change from one state to another. So, similar to what we had with the canvas example, we can use the previous value of the cell to define your standard D3 transitions using the data join. Here I've got a staggered transition on a bar chart, and a data set that I'm going to sort based on this little checkbox, switching between descending frequency (these are letters in the English language) and lexicographic order. If I click the checkbox, it's just running that same code, and because it has access to the previous chart, it's not throwing it away, and it can do your standard D3 data-join stuff in order to make an animation.

OK, so inline visual outputs improve our ability to inspect the program's current state. But there's more we can do with interactive programming, to understand not just the current state but the behavior of a program. We can do that by poking: by changing, deleting, and reordering code and seeing what happens. And so in this
notebook I've got your typical force-directed graph, and I've got the simulation here, which is driving the layout. By commenting out the different forces, I can see what effect they're having on the layout. I turned off the charge force and everything collapsed, because the charge force is what's causing these nodes to push apart from each other. Likewise, if I change the strength of that charge, it all collapses on itself. That's because normally the charge is negative, so that the nodes repel each other; if you change it to be positive, they all pull each other towards the center, and there's no equilibrium; it just goes into chaos. So I can tinker with the forces here, and I can choose what's the right value. I can play with the link force and turn that off too, and it explodes, or turn off the centering force and it just floats away. You've probably seen little tinker toys like this before, where you have a force layout and there are some sliders, with dat.GUI or whatever, where you're playing with the parameters. But the thing that's cool here is that you didn't have to build any specific interface to do that. It came for free with the reactive programming model, just by tinkering with the code.

Now, a more explicit approach is to expose the internal state of our code as it's running, so that we can study it with visualization. Generators can help with that as well. So I'll give an example. This is a very simple function that computes the sum of an array of numbers. What we can do is turn that into a generator: that basically means we put a star here and then we add this yield of the value. So now the idea is that we have an extra channel where we can emit information from our code and use that to construct visualizations or animations to understand the behavior of the code. And that's important because it gives a cleaner separation between the
implementation of the algorithm and how we study it, how we explore it. There are two ways we can call a generator like this. One is to call it directly, and then you get an animation, just like we did with the rotating projection. The other way is to use this little ellipsis here, in an array, and then it's going to pull all of the values out of that generator at once. So you get a nice static data set, and you can then construct a static visualization of your program's behavior, rather than being limited to animations.

Obviously, understanding a running sum isn't particularly interesting, so I'm going to use a more concrete example. We're going to get deep into computational geometry. You probably weren't expecting that at this conference, but here it comes. This is D3's hierarchical circle-packing layout. It's a bit like a treemap, except it's not quite as space-efficient as a treemap, but you can see the structure of the hierarchy a little better. This technique is commonly used to understand where your disk space has gone, or how you're using a file system. In this case, this is Flare, which is another visualization toolkit, and we're looking at the sizes of the different source code files, organized by their package hierarchy.

One of the tasks here is that you have all of these circles and you want to pack them into as small a space as possible without overlap, like huddling penguins in Antarctica. Our job is to place circles one at a time until all of the circles have been placed. Since we want the circles to be packed as tightly as possible, it's fairly obvious that each circle we place must be tangent to at least one, and actually two, of the circles we've already placed. But if we just pick an existing circle at random as the tangent circle, then we're going to waste a lot of time trying to place the new circle in the
middle of the pack where it's going to overlap with the circles that we've already placed so ideally we only consider the circles that are on the outside of the pack but the problem is how do we efficiently determine which circles are on the outside so what D3 uses and other implementations of this layout use is called Wang's algorithm and it maintains this front chain which is shown in red and that represents the outermost circles so when it's placing a new circle it's going to pick the circle on the front chain that is closest to the origin to the center and then the new circle is placed tangent to that circle and its neighbor on the front chain so if this placement does not overlap with any circle on the front chain then the algorithm can just move on to the next circle but if it does overlap like in this example here this black circle is overlapping with these other circles on the front chain then you have to cut the front chain between the tangent circle and the overlapping circle and it sort of like expands the front chain out and that way after you apply that process a few times the new circle that you place won't be tangent to any other circle so I find this animation a little bit mesmerizing and the moments I like are when the large circles kind of like get forced out of the pack there's like a very quick animation of only a few frames where they kind of get squeezed out but more than just being kind of cool to look at this notebook was extremely useful for me for fixing a long-standing bug in D3's implementation where very rarely it would cut the wrong side of the front chain and the circles would end up overlapping and actually I discovered another bug just last week with a different visualization here but it's been great so once you pack the circles you're not totally done you also need to compute the enclosing circle of that pack so that you can then repeat the process on the rest of the hierarchy and the conventional way of doing that is to just scan
the front chain and pick the circle that is the farthest from the origin and that tends to do a pretty good approximation because these packs end up being roughly circular but it's not exact and I a year ago or so I discovered there's this other algorithm called Welzl's algorithm for computing the smallest enclosing circle in linear time and I think it's also pretty cool so I'm gonna show you how that one works so let's assume that we already have the enclosing circle for some circles and now we again want to do this incrementally we want to incorporate a new circle into the enclosing circle and that sounds a little bit circular that we already know the answer but it's like a proof by induction all right or any sort of like recursive process as you'll see but I'm not gonna give you like a rigorous proof of this it's not enough time and also I probably just couldn't do it frankly but I want to give you like an intuition so that you can get a sense of how this algorithm works so if the new circle is inside then we don't have to do anything we just move on to the next circle but again if the new circle is outside of our enclosing circle then we're gonna have to compute the new enclosing circle but we actually already know something about this new circle it's the only circle that is outside the enclosing circle and thus it must be tangent to whatever the new enclosing circle is which is in this case is this one so we don't really know yet what the other tangent circles are but we know what one of the tangent circles is and that means that we can apply this process recursively in order to find the other tangent circles okay so I'm glossing over a lot of geometry here there are also like boundary conditions that you have to worry about like you need to know what the enclosing circle is for one two or three circles and that last one is called the problem of Apollonius it has a cool MathWorld page with lots of pretty diagrams but the point is like with a little bit
of geometry combined with this intuition you can get a sense of how this process works and understanding that it's a recursive process we can now sort of see a more complete picture of this algorithm so the first one that I showed you was really just sort of the top level of the algorithm and now these are like up to four levels of the algorithm you can't ever get more than three tangent circles so that's why there can't be more than four circles that are drawn up here and as you are iterating over your circles and you find one of the circles that's outside of your enclosing circle I've said circle like 5,000 times it has to recurse it knows it has a tangent circle it has to move to the right so it's like heading one level deeper into the dream and then popping back up again until you finally get your results okay so in addition I'll show that again but in addition to showing how this algorithm works the algorithm gives a sense of how much time the algorithm spends or the animation gives a sense of how much time the algorithm spends at different levels of recursion so you can see that it converges very quickly on an approximate answer but when it encounters a circle that's outside it then has to rescan all the circles that it looked at previously in order to compute the new enclosing circle so it ends up being more expensive when it finds a new circle that's outside in order to validate the new result so one way to write less code is to reuse it and the 450,000 or so packages published to npm attest to the popularity of this approach but libraries are an example of active reusability right they must be intentionally designed to be reusable and this is a substantial burden it can be hard to design an effective general abstraction just ask any open source maintainer in contrast implementing one-off code like you see in many D3 examples is much easier you're only concerned with sort of the task at hand and not some general abstract class of tasks so what I'd like to
explore with D3 Express is whether we can have better passive reusability sort of something in between one-off code and sort of nicely packaged up reusable code where by leveraging the structure of these reactive documents we can more easily repurpose code even if that code wasn't carefully designed to be reusable so what I mean by this is for starters you can treat your notebooks like de facto libraries so I don't know if you saw that but in this notebook here I've sort of defined a color interpolator this is like implementing terrain.colors from R using d3-hsv it's just sort of used for elevation data or topographic maps sometimes so I've defined that in one notebook and what I want to do is use this color scale in another notebook and I haven't published that to npm but I can import that from the other notebook by just saying import interpolate terrain from the name of that other notebook and then I can start to use it and so this is nice for sort of reusing code that you wrote from another notebook I can also imagine this technique being useful if you for example you have a lot of different notebooks used for exploration and then you want to combine those together into your final write up you don't have to copy your code from those separate notebooks you can just import the symbols and then write around it and add extra explanation now more interestingly you can do rewiring of these definitions as part of the import process so I'll give you an example so this is a data set where I'm streaming data over web sockets and so whenever it gets a new event it's going to sort of like add a new datum to the array and shift the old one off so that it's a moving window again like the details of this code don't really matter I'm not imagining that people would write all of this code for all of their real-time datasets you'd probably have like an API or something to load these streaming datasets but I still want to show you that it's relatively straightforward in
order to construct these using the generators so this is what the data set looks like it's just an array of 300 things and you can see that it's sort of shifting off representing sort of a recent time window here okay so now you saw from before we had a chart that did a line chart from before and so this data set I mean it's real-time but it's basically the same structure as our old chart so the question is can we reuse our old chart to show this data set and the answer is yes as you'll see so this is the chart that I've imported from the other notebook it's actually a slightly different definition using Canvas rather than SVG and so now all I've done is I added a little with clause here to inject our data into this chart definition so we're just replacing that definition and you can see that it's now a real-time chart that's sort of ticking as I get new data from the server without having to change anything else about the chart definition but the cool thing is I can actually customize that chart definition a little bit more if I want to so I'm going to load D3 and one of the things that I want to do here is I want to fix the Y scale so that they don't sort of bounce up and down as the extent of my data changes I just want like a fixed value say it represents like what the expected values are for this data set and that way it won't sort of bounce up and down distractingly so in order to do that I need to import some other symbols I need to know what the size of the chart is but then after that I can inject my Y definition and so now the chart has got this fixed range and similarly I can do the same thing with the X scale so like let's say rather than again deriving the domain of the X scale from the data I just want to have like a fixed moving window so that it updates at 60 frames per second so I'll do that with the generator and here it's like a generator that emits the X scale where the domain of the X scale is based on the current time the import statement
and inject the X so now it's smoothly sliding rather than ticking every update okay so the last concept that I want to talk about is that because these notebooks run in the browser and not in a desktop app or on the cloud it's a web-first discovery environment like all the computation and rendering happens locally inside the client so what does that mean well a web-first environment has to embrace web standards including vanilla JavaScript and DOM it works with today's open source whether that's example code that you find in a tutorial or libraries that are published to npm and it minimizes the specialized knowledge that you need in order to be productive in that environment there is some new syntax in D3 Express for reactivity but I've tried to keep it as small and as familiar as possible such as by using generators so these are the four different ways that you can define cells or variables in D3 Express and these are just expressions block statement the funky block statement preceded by an asterisk which means that it's a generator and then your sort of standard function declaration here and the idea is that by having sort of minimal syntax it's very different from let's say using a reactive framework where there's a lot of boilerplate sort of API that you're wrapping on top of your code here I want to make the reactiveness feel like a more of a language feature something that's intrinsic to the programming environment rather than a layer that you add on top of that and that's important too if you want to take this code and pull it out and put it into your React app there's not really anything that you have to remove in order to plug that in there you're just trading one reactive environment for another reactive environment so another important principle is that the web-first environment lets you run your code anywhere right because your code is running in the browser there's nothing to install which means that it's easier for others to repeat and validate
your analysis you don't have to reproduce the exact environment or install the right set of packages in order to run your code if it's running in your browser it's going to run in your readers' browsers as well and that means that you can transition more easily from your exploration to your explanation you don't have to start over from scratch switching from one tool or one environment to another in order to communicate whatever it is you've learned in your discovery process so to give a comparison to what's commonly done today you know we might publish a model or some data sets up to GitHub and what I'm trying to do is to make this process a little bit richer to make it so that the ability for our readers to reproduce these environments and run this code is easier the bar is lowered we don't have to sort of read these long instructions in order to reproduce these environments we can just sort of run them directly and I'll end on like another sort of Bret Victor quote well I have a little bit more to say after this but the point here is making your code for analysis more portable can have a transformative effect on how we communicate so to quote Bret Victor again an active reader asks questions considers alternatives questions assumptions and even questions the trustworthiness of the author an active reader tries to generalize specific examples and devise specific examples for generalities an active reader doesn't passively sponge up information but uses the author's argument as a springboard for critical thought and deep understanding so the point is if the code is running in your reader's browser they have a much better ability to see how that code works to tinker with the code to change some of the assumptions that you make to fork it into another environment and start doing some new things or even just explore interactively because the code is again running inside of their browser okay now for the disappointment as much as I want to release D3 Express this is
quite different from sort of library stuff there's a lot more work here when you're building a platform a service for people to use in addition to the software and so it's not really ready for you to go yet and start using it I was trying frantically to release a part of it by the time of this talk but it'll still be a few more days so if you want to try it out you can go to this URL and you can sign up for it you can also get in touch with me if you want to help me build it if you're also looking for a job also get in touch with me I would love help doing this so thank you
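
Editor's sketch of the sum example from the generators part of the talk. The slide code is not in the transcript, so this is a guess at its shape in plain JavaScript, not the speaker's notebook code: the names `sum` and `sumSteps` are made up for illustration, and the notebook environment that animates a generator cell is not reproduced here. It only shows the two steps described (add the star, add a yield) and the spread form that pulls every value out at once into a static data set.

```javascript
// Plain version: returns only the final total, so the intermediate
// state is invisible from the outside.
function sum(values) {
  let total = 0;
  for (const value of values) {
    total += value;
  }
  return total;
}

// Generator version (hypothetical name): the star makes this a generator,
// and the yield is the extra channel that emits each running total.
function* sumSteps(values) {
  let total = 0;
  for (const value of values) {
    total += value;
    yield total;
  }
}

// Spreading the generator into an array pulls all of the yielded values
// out at once, giving a static data set of the function's behavior.
const steps = [...sumSteps([1, 2, 3, 4])];
console.log(steps); // [1, 3, 6, 10]
```

Calling `sumSteps([1, 2, 3, 4])` without the spread returns an iterator instead, which is the form a reactive environment could step through over time to drive an animation.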