Hello, I'm excited to be here at this fantastic conference. I think I have the longest title. Okay, where's the switch? I don't see the button. Okay, it's a little bit big. Better. Sorry. So the title could have been shorter. It could have been just "How I was seduced by a technology".
I was fortunate to work at Interactive Things. I'm no longer with them; I joined another company one month ago. But this talk is very much a result of our collaborative efforts at Interactive Things to find a tool stack and architecture which allowed us to build medium- and larger-scale interactive visualizations for the web. Jeremy Stucki especially put a lot of effort into that.
My background is computer science and software engineering. I started my career in the early 2000s. I worked mostly on UIs, web stuff and some data analysis. At some point I was so fascinated by one of Hans Rosling's videos that I decided to learn more about data vis, and I did a PhD with Enrico Bertini and Denis Lalanne. It helped me to learn some basics, on the theoretical side, of course. But the nitty-gritty of engineering data visualizations I had to learn by doing. Working at Interactive Things helped me to learn lots of things, and before that I participated in some projects. I will show you some of that work very quickly, not because this is a portfolio talk, but so that you better understand my background. I've used many tools: Java, Adobe Flex, which I kind of liked, JavaScript, D3, Backbone, React. Lots of D3, lots of D3. React, Backbone, mapping stuff. So, yes.
This is my current project. I work at a startup called Teralytics, where we are looking at visualizing people's mobility. I'm the only one doing the front end at the moment. By the way, if you want to join me, let me know. There are lots of big data analysts. It's big data about people moving in cities, and I'm building the front end.
So that's what I've built so far. It's a kind of dashboard where we have a slippy map showing how many people walk in different streets at different times. There will be more and more views and different controls, and the views are interconnected. So the architecture I had to choose for it has to be scalable and flexible and have a very good component model. I chose React for that, or rather an architecture enabled by React, and today I'll try to explain to you why that makes sense.
First, a little bit of history, so to say. I think that visualization is a lot about communication, right? We want to tell people what we know, or we want to help the data, so to say, to tell the stories that are hidden in it. And the medium we use to tell those stories is also very important. Our distant ancestors left us these rock paintings, which we can still enjoy, which is fantastic. Our not so distant ancestors used paper, and they could put more and more information on it. But that's not always a good thing, because if you put too much data into one picture, it becomes overwhelming and hard to understand what's going on, although it's potentially more informative.
Now we have all those interactive devices, and we have to take advantage of them because there is really a lot of potential. We can support interactive exploration. We can show one facet of the data at a time. And I personally believe that interactivity is key for addressing the challenges that big data poses. We don't have enough pixels to show all the data on the screen at the same time, so we have to summarize it. But if we summarize it, we reduce it, so we have to allow the user to summarize it in different ways so that it can be explored. WebGL will not solve all of the problems. Interactivity also helps to find personal stories.
Users explore the data and choose their own path. They choose what they are interested in. And of course it can be more engaging if it's designed well. But I can tell you, it's actually very hard to design a good interactive application. This is a simple network exploration tool which allows you to explore a network on separate levels and has different views. It cost me a lot of sweat to develop it and to make it work on different platforms. I mean, it's a simple thing. Why is it so hard?
Why is interactivity hard, in fact? It's because with every new interactive feature we add to the system, we add a new potential state, or potentially many states, which the system can be in. And the more states we have, the more transitions we have between those states. It's a combinatorial growth, and we have to model all those transitions. We have to make sure that we still have a consistent representation of the data after every transition has run. And because there are so many of them, it becomes very, very difficult to ensure that we are really doing it consistently all the time. Do we really have to do it this way? Is there a way around? Maybe.
This is the great "Grammar of Graphics" by Leland Wilkinson. What did this book teach us? It taught us that nearly any statistical graphic can be expressed declaratively by defining a mapping from the data to some visual elements. We just define how data elements are transformed and what the aesthetic properties are of the graphical elements, the geometric objects which are finally rendered by the computer. And it's beautiful because it's so simple and declarative. It's basically defining a function. If we are speaking about visualizations for the web, it's a function which maps data to DOM objects on our web page.
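The idea of a visualization as a function from data to markup can be sketched in a few lines. This is an illustrative sketch, not code from the talk: the `Datum` shape, the `renderBars` name and the layout constants are assumptions, and it returns an SVG string rather than real DOM objects so that it runs anywhere.

```typescript
// Hypothetical data shape: one labelled value per bar.
interface Datum {
  label: string;
  value: number;
}

// A pure function: the same data always yields the same markup.
function renderBars(data: Datum[], width = 200, barHeight = 20): string {
  // Largest value in the data (at least 1, to avoid dividing by zero).
  const max = data.reduce((m, d) => Math.max(m, d.value), 1);
  const bars = data.map((d, i) => {
    const w = (d.value / max) * width; // data value -> bar length
    const y = i * barHeight;           // data index -> vertical position
    return `<rect x="0" y="${y}" width="${w}" height="${barHeight - 2}"><title>${d.label}</title></rect>`;
  });
  return `<svg width="${width}" height="${data.length * barHeight}">${bars.join("")}</svg>`;
}

const svg = renderBars([
  { label: "a", value: 5 },
  { label: "b", value: 10 },
]);
console.log(svg);
```

Nothing in `renderBars` remembers earlier calls, so there are no transitions between states to model: a change in the data is handled by calling the function again.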
And we know that this model works really well because there are so many implementations which have been developed, and are still being developed, which are very successful. But does it scale to a more complex application with many components which depend on each other? Can we apply the same model? Does it work? Not directly.
This is an example of work we did for Neue Zürcher Zeitung, the Swiss newspaper. It tells a story about the events of the First World War. You can see that it's a pretty complex interface. There are lots and lots of components, and they are interdependent. There is another, separate view showing the armies of the different countries, their sizes, and when they entered the war. There are some additional views here. You can interact with the map as well. So it's a pretty complex application. It would have been much more difficult to build with D3 alone, so we used React to package the whole application.
So what does D3 have to offer for such larger applications? We have this reusable chart pattern, which is very nice. It can help us to package one particular chart into an encapsulated component and then reuse it in different places. But it doesn't really tell us how to handle the state of the whole application or how the data flows through it. It doesn't tell us how to compose components, which is really important for building those medium- and large-scale applications. And essentially it's always the same problem: we have many components which are interdependent, and there are too many dependencies between them to keep everything consistent.
So what has been the industry standard for developing large UIs for 30 years already? We have model, view and controller. We have the data in the model. The view is what we see on the screen.
And the controller controls the data flow between the model and the view. If we speak about using it for data vis on the web, we can put the D3 code in the view, and the controller takes care of sending it the right data. It works. It helps to encapsulate the logic and separate it from the view. But again, it's the same problem. We have lots of components, and they are interdependent. The models are interdependent. Once something changes, we have to update the right models and update the right views. Lots of dependencies. Also, here we have models which are designed specifically for the views, and that results in the data being scattered. Our application state is scattered across the models. There's no single source of truth. Sometimes it's even redundant, which, of course, doesn't contribute to consistency.
So it could be improved, maybe. What if we do it this way: we take the application state and encapsulate it, without thinking too much at first about how we represent it. Then we define a mapping from the application state to the actual representation. That's it. It could be something like this. We have one big application state and a component hierarchy, and the application state gets rendered here. And if we have some interactive thing, some button which the user clicks, we eventually propagate this event to the app state, change the application state, and then do a render. Again, a very simple model.
It's actually very much in the spirit of the grammar of graphics. There we have data and we map it to DOM objects. Here we have application state and we map it to DOM objects. Over time the application state changes, but the mapping from the application state to the DOM objects doesn't. It's always the same, right? That's important, so to say. This way, we can develop an interactive application.
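A minimal sketch of this model, written without React: one application state, one `view` function that maps it to markup, and events that only change the state and trigger a full re-render. The names `createApp`, `update` and `mount` are hypothetical, and the view returns a string instead of DOM nodes to keep the example self-contained.

```typescript
interface AppState {
  count: number;
}

// The view is a pure function of the application state.
function view(state: AppState): string {
  return `<button>Clicked ${state.count} times</button>`;
}

// Hypothetical helper: owns the single application state and
// re-renders everything whenever that state changes.
function createApp<S>(
  initial: S,
  viewFn: (state: S) => string,
  mount: (html: string) => void,
) {
  let state = initial;
  const rerender = (): void => mount(viewFn(state));
  rerender(); // initial render
  return {
    getState: (): S => state,
    // Events never touch the view; they only describe a state change.
    update(change: (state: S) => S): void {
      state = change(state);
      rerender();
    },
  };
}

// Collect every rendered frame instead of writing to a real page.
const renderedFrames: string[] = [];
const app = createApp<AppState>({ count: 0 }, view, html => renderedFrames.push(html));

app.update(s => ({ count: s.count + 1 })); // simulated button click
console.log(renderedFrames);
```

The mapping from state to markup stays fixed while the state changes over time, which is the property the model relies on.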
So it's a simple model, very nice. Why haven't we been doing this so far? There is one issue with it: it requires a full render on each state change, and that can be costly. But maybe there is a way around. And this is where React comes to the rescue, because with React we have this notion of the virtual DOM. It's a virtual representation of the actual DOM tree which eventually gets rendered in the browser. Why virtual? Virtual is easier to manipulate. It's not that costly. The actual DOM is very costly to change, because any change can cause a reflow, which is very, very costly. So it's basically slow. But if we manipulate the virtual DOM first and figure out what we have to change in the actual DOM, this can be affordable.
So how does React work? We keep track of the previous state of the virtual DOM. This is the previous state. Then something changes in the virtual DOM representation. We see that this node has changed. React then runs an algorithm which tries to identify a minimal set of changes. It's not optimal; it uses some heuristics so that it runs in predictable time. And it then mutates the actual DOM in the browser so that it represents our most recent state. So we basically apply a small set of changes to get from this state to this state. The virtual DOM is a way to describe the state of the visual representation. With this approach, a full render becomes affordable, and that actually enables the architecture I was talking about before.
So let's see some examples. This is the simplest hello world. What is going on here? We have a virtual representation of a div, an HTML or DOM element, with "Hello" as text and a class name "greeting". It's className instead of class because class is a reserved word in JavaScript. So here we create the virtual representation of this node, and here we tell React to render it into the actual DOM.
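The virtual DOM idea can be illustrated without a browser. The sketch below is a deliberately naive stand-in, not React's actual algorithm or API: `h`, `VNode`, `Patch` and `diff` are hypothetical names, and children are compared by index only, whereas React applies additional heuristics.

```typescript
// A virtual node is a plain object describing an element.
interface VNode {
  tag: string;
  props: Record<string, string>;
  children: Array<VNode | string>;
}

// Hypothetical element factory, playing the role of React's.
function h(tag: string, props: Record<string, string> = {}, ...children: Array<VNode | string>): VNode {
  return { tag, props, children };
}

// A patch describes one change to apply to the real DOM.
// `path` is the list of child indexes leading to the affected node.
type Patch =
  | { kind: "replace"; path: number[]; node: VNode | string }
  | { kind: "setProp"; path: number[]; name: string; value: string }
  | { kind: "removeProp"; path: number[]; name: string }
  | { kind: "append"; path: number[]; node: VNode | string }
  | { kind: "removeChild"; path: number[]; index: number };

// Compare the previous and the next virtual tree, collecting patches.
function diff(a: VNode | string, b: VNode | string, path: number[] = []): Patch[] {
  if (typeof a === "string" || typeof b === "string") {
    return a === b ? [] : [{ kind: "replace", path, node: b }];
  }
  if (a.tag !== b.tag) return [{ kind: "replace", path, node: b }];

  const patches: Patch[] = [];
  for (const name of Object.keys(b.props)) {
    if (a.props[name] !== b.props[name]) {
      patches.push({ kind: "setProp", path, name, value: b.props[name] });
    }
  }
  for (const name of Object.keys(a.props)) {
    if (!(name in b.props)) patches.push({ kind: "removeProp", path, name });
  }
  const common = Math.min(a.children.length, b.children.length);
  for (let i = 0; i < common; i++) {
    patches.push(...diff(a.children[i], b.children[i], path.concat(i)));
  }
  for (let i = common; i < b.children.length; i++) {
    patches.push({ kind: "append", path, node: b.children[i] });
  }
  for (let i = a.children.length - 1; i >= common; i--) {
    patches.push({ kind: "removeChild", path, index: i });
  }
  return patches;
}

const before = h("div", { className: "greeting" }, "Hello");
const after = h("div", { className: "greeting" }, "Hello, world");
console.log(diff(before, after)); // a single "replace" patch for the text child
```

Only the patches would have to touch the real DOM; building and comparing the plain objects is cheap by comparison.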
And it will be rendered as a child of document.body. Very simple.
React comes with something called JSX. It's basically syntactic sugar for JavaScript which allows us to use HTML, or HTML-like text, within our JavaScript code. It's not mandatory to use it, but I find it very useful because you see right away where the DOM objects are. They stand out in your code. So I'll use it throughout the examples. But it's doing the same thing as this: if you use it, it gets transpiled into plain JavaScript.
One very good thing about React is its very strong component model. Basically that's all you do when you develop applications with React: you create components, and each component is basically a function which defines how data is mapped to its visual representation. Here we define a component called Greeting. We say that it has a property "name" of type string, which is required. And this is the most important function in a component. It's called render, and it creates the virtual representation of the DOM which will eventually end up in the browser. And here's what we do to render it: we say Greeting, we pass a name as a parameter, and that's the result.
Let's do some visualization. This is a scatterplot. We have some data, and we have a Scatterplot component. We render the Scatterplot component, passing some data to it, basically this array. The render method creates an SVG document. React works nicely with SVG, so we can do graphics in the browser with it. And here we just use the standard JavaScript map function to map each data element to a circle. That simple. Forgive me, I'm using a bit of ES6 syntax. You need a transpiler for that, but I find it easier to read. This is basically just a function with two parameters. So yes, it's pretty much in the spirit of the grammar of graphics: we map data to visual representations. This could be improved a little bit. We could use scales for that, right?
Like in the grammar of graphics or in D3. And actually we can use D3 for that, because D3 has lots of visualization goodies which are independent of the DOM manipulation. We can use most of the functions from D3, even some of those which manipulate the DOM, as I will show you later. The only function we really can't use is append, or at least we shouldn't. So let's see. We use the scales from D3 here. We define scales for x, y, and the radius, and we just use them here. Plain and simple. I think it reads even better than D3. But that's not the point.
Let's make some changes. That's where it becomes more interesting. We have an update function which re-renders the whole thing every time, passing new randomly generated data to the same component. And here in the SVG we have an onClick handler which just calls the update function. So when I click, it re-renders. Great. Not very impressive yet; we could do the same thing with D3. But the cool thing is that right now this scatterplot is just one component, but it could have been the whole application and it wouldn't have changed much. We could re-render the whole application in just one function call without it costing too much.
Okay, animation. We make some changes, and maybe we want to animate them. This is where React does not excel that much at the moment. However, it still works. The straightforward approach is to introduce a parameter t as part of the state, which says what moment of time we are at in the animation. We tween the animation based on this moment in time and use the standard React re-render cycle. But that means that we have to do the diff operation on each change. And this kind of particle system is basically the worst use case you can apply React to, because here every single node is changing on every single update.
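Both ideas from this part, scales and an animation driven by a time parameter in the state, can be sketched as pure functions. This is an assumption-laden illustration rather than the talk's code: `linearScale` is a hand-written stand-in for a D3 linear scale, and `frame` is a hypothetical helper that computes the positions for a given `t` between 0 and 1.

```typescript
// Minimal stand-in for a linear scale: maps a data domain onto a pixel range.
function linearScale(domain: [number, number], range: [number, number]): (v: number) => number {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return v => r0 + ((v - d0) / (d1 - d0)) * (r1 - r0);
}

interface Point {
  x: number;
  y: number;
}

// Positions at animation time t: a pure function of (from, to, t),
// so t can live in the state and every frame is an ordinary re-render.
function frame(from: Point[], to: Point[], t: number): Point[] {
  const k = Math.min(1, Math.max(0, t)); // clamp t to [0, 1]
  return to.map((target, i) => {
    // A point without a previous position starts at its target.
    const start = i < from.length ? from[i] : target;
    return {
      x: start.x + (target.x - start.x) * k,
      y: start.y + (target.y - start.y) * k,
    };
  });
}

const x = linearScale([0, 100], [0, 500]); // data units -> pixels
const halfway = frame([{ x: 0, y: 0 }], [{ x: 100, y: 50 }], 0.5);
console.log(halfway.map(p => ({ cx: x(p.x), cy: p.y })));
```

Because every point moves on every frame, a diff between consecutive frames finds a change at every node, which is why this approach pays the full diffing cost without any benefit.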
And there is no point in doing the diff because we know everything is changing every time. It only costs something. If we do the same thing with plain D3, it's much smoother because there is no diff operation. These examples were prepared by my colleague Jeremy Stucki. You can find them on his bl.ocks page. So it still works, but it's slow if we do it this way.
You can take a pragmatic approach. What if we say: okay, for the application state as a whole we don't really care about the internal state of the animation. We can just use D3 transitions to animate between the states. And this works. It's a bit of a slippery slope because React is no longer in full control of the rendering. But it works. We just have to take a little bit of care.
So we have this componentDidUpdate. React has those so-called lifecycle events. For each component, you can implement those methods. And this one gets called each time the component is updated. At this point we actually have the real DOM node to operate on. In the rendering phase we don't have this node, because it has not necessarily been rendered yet. There we just create the virtual representation of our node. But in some of the lifecycle events we can manipulate the actual DOM nodes, and here we can run a D3 transition, which works.
We can even emulate the entering and exiting transitions which D3 has. This is a little bit more tricky, but it still works. Normally, if your render method stops returning some element in a subsequent render, React will remove it from the actual DOM straight away. So there will be nothing to run the exiting transition on. We somehow have to tell React to keep it for a while, while we are running the transition. And this is what this TransitionGroup does.
It basically adds two additional lifecycle events, which are not there by default, which we can use to run an animation and then call back when we are done, so that React knows: okay, now I can remove it. So it kind of works. But Facebook is actually working on better support for animation, and we can expect that it will be supported in the future. Even now you can do things. But you tend to develop applications with fewer animations. At least I tend to, at least at first. And then you add animations as needed.
Okay, here's a simple example from one project I worked on, if the internet permits. Yeah, I hope you can see that those flows are animated. The project is built with React, but this animation is a plain D3 animation running via d3.timer. So it's, again, the pragmatic approach. For the rest of the application it's actually absolutely irrelevant that these flows are animated. So it doesn't have to be a part of the actual state of the application.
Never mind. Let's do some interactive stuff. This is about interaction. This is the fun stuff. We can create a slider component which we can manipulate. What do we have? We have properties: title, value, and onValueChange. That one is a function, because the parent component wants to get notified when there is some change. The value actually is not part of this component. It's not part of its state. It will be part of the whole application state. The value gets injected, so to say, as a parameter, and the slider only knows how to render it. We have another lifecycle event, componentDidMount, which gets called as soon as the actual DOM node is created. And here we can initialize some non-React stuff. We can use the D3 drag behavior, which is very useful for creating something like this. We could also use standard React events for that, but the drag behavior also has support for touch, which is nice. Why not use it?
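The slider's contract can be sketched independently of React and D3. In this hypothetical version the property names follow the ones mentioned above (`title`, `value`, `onValueChange`), while `min`, `max`, `width`, `handleX` and the shape of `handleDrag` are assumptions made for illustration; the pointer position stands in for what a drag behavior would report.

```typescript
interface SliderProps {
  title: string;
  value: number; // injected by the parent; the slider keeps no copy of it
  min: number;
  max: number;
  width: number; // track width in pixels
  onValueChange: (value: number) => void; // how the parent gets notified
}

// Called with the pointer's x position relative to the track.
// Converts the position to a value and reports it; it never stores it.
function handleDrag(props: SliderProps, pointerX: number): void {
  const ratio = Math.min(1, Math.max(0, pointerX / props.width));
  props.onValueChange(props.min + ratio * (props.max - props.min));
}

// Where the handle should be drawn for the current value.
function handleX(props: SliderProps): number {
  return ((props.value - props.min) / (props.max - props.min)) * props.width;
}

// The value lives outside the slider, here in a plain variable
// standing in for the application state.
let density = 0.5;
const sliderProps = (): SliderProps => ({
  title: "Density",
  value: density,
  min: 0,
  max: 1,
  width: 200,
  onValueChange: v => {
    density = v;
  },
});

handleDrag(sliderProps(), 50); // simulated drag to x = 50px
console.log(density, handleX(sliderProps())); // 0.25 50
```

Keeping the value out of the component means the same slider can be dropped into any parent that supplies a value and a callback.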
So handleDrag gets called by the drag behavior, and here we just take the mouse position, calculate the value, and notify the parent component of the new value.
Now let's try to build an application which has an application state. This is a simple way to put it together. We encapsulate our application state. We have only one property, density, in this case. We have a getter and a setter. When the setter gets called, we render. And we now have a component for our whole application, not just the slider, and here we do compose. We take our slider component and add it here. Then we have a div with the actual density value, which you can see here. And the render function, which gets called when we change something, just renders everything, the whole application, into document.body. Cool. That's quite simple.
We created the scatterplot component before, remember? Why not reuse it? We just take it and pass the right data to it. I added something to the application state: the points, and the opacity, another property which we can change here. And it works. So that's how it works. It's really, really straightforward to compose an application out of components which you define and which are functions. It's important that, given the same data, they render the same result. If they are like this, it's really easy to compose. So, yes, basically we have built an example application using this architecture.
If you're getting really serious about this, you should look at Flux. It's a pattern proposed by Facebook for building applications with React, and it's very similar. There are a few new concepts, like stores. A store is basically application state with the business logic, but in a Flux application you can have multiple stores. In some cases you want to better modularize your stores. I'm not always sure it's a good idea. I also like the idea of one big application state.
But if you have multiple stores, then you also need a dispatcher. In Flux, actions are also made explicit: an action is an object which describes what happened, and a component can pass it to the dispatcher. The dispatcher figures out in what order the stores have to be updated if there are dependencies between them, so that nothing goes wrong. But the idea is very similar. The data always flows in the same direction. There are no dependencies between the components, except from parent to child.
Something really new, which was announced recently by Facebook but hasn't been open sourced yet, is Relay, an evolution of Flux. Here the idea is, as far as I understand, that each component specifies its data needs. You no longer have to pass the data from the parent components down the hierarchy to the lower-level components. You just say in every single component which data it needs, and the system figures out how to feed the components with the data on the client. It also fetches the data from the server in the most efficient way, so that there is no overfetching or underfetching. Which is really, really cool. But we have to wait until it's released.
So, if the internet permits, I hope. Does it? Does it? Maybe not. I want to show a little bit. It doesn't seem to work. So, a slippy map example. A slippy map, like Google Maps. This is a rewrite of an example for a D3 plugin called d3.geo.tile, which, for a Mercator projection with given settings, scale and translate, calculates the numbers of the tiles we would need to download, from OpenStreetMap for instance, and actually draw in the application. There should actually be two kinds of tiles here, which we can't see for some reason. These are raster tiles. These are just images. So I have a component for raster tiles here, and there should also be vector tiles.
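To make "calculating the numbers of the tiles" concrete, here is a sketch based on the standard slippy-map tile numbering used by OpenStreetMap. It is not the plugin's API: the plugin works from a projection's scale and translate, whereas this hypothetical version starts from a longitude/latitude bounding box, and the names `lonLatToTile`, `tilesInView` and `tileUrl` are made up for the example.

```typescript
interface Tile {
  x: number;
  y: number;
  z: number;
}

// Tile index containing a longitude/latitude (in degrees) at zoom level z,
// following the standard Web Mercator tile numbering.
function lonLatToTile(lon: number, lat: number, z: number): Tile {
  const n = Math.pow(2, z); // number of tiles per axis at this zoom level
  const latRad = (lat * Math.PI) / 180;
  const x = Math.floor(((lon + 180) / 360) * n);
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n,
  );
  const clamp = (v: number): number => Math.min(n - 1, Math.max(0, v));
  return { x: clamp(x), y: clamp(y), z };
}

// All tiles needed to cover a bounding box at zoom level z.
function tilesInView(west: number, south: number, east: number, north: number, z: number): Tile[] {
  const topLeft = lonLatToTile(west, north, z);
  const bottomRight = lonLatToTile(east, south, z);
  const tiles: Tile[] = [];
  for (let y = topLeft.y; y <= bottomRight.y; y++) {
    for (let x = topLeft.x; x <= bottomRight.x; x++) {
      tiles.push({ x, y, z });
    }
  }
  return tiles;
}

// URL of a raster tile on the public OpenStreetMap tile server.
const tileUrl = (t: Tile): string => `https://tile.openstreetmap.org/${t.z}/${t.x}/${t.y}.png`;

// Tiles around Boston at zoom level 12 (bounding box chosen for illustration).
console.log(tilesInView(-71.2, 42.3, -71.0, 42.4, 12).map(tileUrl));
```

A raster tile component then only needs a tile's URL and its position on screen, and the list of tiles is recomputed whenever the view changes.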
So, it should look something like this. Yeah, it seems that the internet is not working. I'm sorry. I'll try a different connection. A secret one. Let's see, let's see, let's see. Yes, yes. Okay. Great.
So we have our slippy map component with height and center position, in Boston in this case. Then getInitialState. Components can also have internal state. Usually you don't need that, but sometimes you have some state which only this component should know about, if it's irrelevant for the rest of the application. In the case of the slippy map it can be either, I think, depending on the application. Here we just keep it within the component. When it changes, the component gets re-rendered automatically. Our state is scale and translate, the projection settings. Then componentDidMount. Here we initialize our zoom behavior. Again, it's using the D3 magic. zoomed gets called when the user interacts with the map, and it changes the internal state of the component, which leads to a re-render. Then getProjection, getTransform, and the actual render. Here we use this geo tile plugin, which gives us the list of tiles we actually have to show.
There are two kinds of tiles. Raster tiles are the images, and vector tiles are basically tiles of geometry, GeoJSON geometries, which we can render as vector paths and apply some visual mapping to. Here they are rendered. So we have the simple raster tile component and the vector tile component. The vector tile component also fetches the actual geometries from the server when it gets mounted. Then we call setState to put the geometries into the internal state. That leads to a re-render, and then we can actually draw the geometries as paths. So this is how it works.
Slow. Is it the fault of React? Yes. The thing is that we always re-render, and here we have a costly operation, the tile path. It does the re-projection. Do we really have to do this every time we pan or zoom? No.
There is something called shouldComponentUpdate in React. It's basically a way to say: okay, this part of the virtual DOM tree, I don't want it to be re-rendered. I know that nothing has changed. By default, everything gets re-rendered, so the render method of every single node in the virtual DOM representation is called. Sometimes we know that we don't want that. Here we get nextProps and nextState. These are the new values of the properties and of the internal state of the component. We can compare them to the previous ones. If they haven't changed, we don't have to re-render, and we return false. By just adding this method, we can make it much smoother. So great, we could optimize something. It's this method, shouldComponentUpdate, which makes the difference. If you are having performance problems with React, look at this to see how you can improve it.
Immutable.js. Using immutable data structures is good. You know that. We use immutable strings in JavaScript and Java, and it's really good. Unfortunately, we don't use immutable data for everything, like in Clojure. It's my favorite language, and there everything is immutable. But there are libraries for that which we can use in JavaScript. Immutable.js was inspired by Clojure. What it offers is basically a library of collections. We have maps, sets, lists, and so on, all immutable. It means that when we change something in our map, we get a new object. This comparison will return false. It will be a new object, and the old object will still be retained.
So how does it actually help us here? The thing is that with this, we can implement shouldComponentUpdate very efficiently. Imagine that you have a large data structure and you want to check whether it has changed. What do you have to do in JavaScript? You have to do a deep comparison, which is very costly. What do you have to do with an immutable data structure?
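The check described above can be sketched in plain TypeScript. This sketch uses neither React nor Immutable.js: `shouldUpdate` and `withZoom` are hypothetical helpers, and readonly types with object spread stand in for a real immutable collection library.

```typescript
interface TileProps {
  readonly geometries: ReadonlyArray<number>; // stand-in for a large data structure
  readonly zoom: number;
}

// Mirrors the idea of shouldComponentUpdate: re-render only if
// some input is no longer the very same value or object as before.
function shouldUpdate<P extends object>(prev: P, next: P): boolean {
  const keys = Object.keys(next) as Array<keyof P>;
  return keys.some(key => prev[key] !== next[key]);
}

// Immutable-style update: never mutate, return a new object
// that shares every unchanged part with the old one.
function withZoom(props: TileProps, zoom: number): TileProps {
  return { ...props, zoom };
}

const a: TileProps = { geometries: [1, 2, 3], zoom: 10 };
const b = withZoom(a, 11);

console.log(shouldUpdate(a, a));            // false: nothing changed, skip the render
console.log(shouldUpdate(a, b));            // true: zoom differs
console.log(a.geometries === b.geometries); // true: the large part is shared, not copied
console.log(a.zoom);                        // 10: the old object is still intact
```

The comparison looks at each property once and never walks into `geometries`, so its cost does not grow with the size of the data.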
You have to compare the identities, which costs almost nothing. And that's the power of immutable objects. So you can structure your application state using immutable objects. Unfortunately, if you want to use it with D3, that's not always easy. Many D3 functions expect JavaScript arrays. So we need to convert, or we need to use plain JavaScript objects. There's no perfect solution for that unless we rewrite D3.
One other thing I'd like to mention is server-side rendering. You probably know that most of the visualizations which we create on the web with D3, or with other libraries which use JavaScript to render something into the DOM, produce something invisible to search engines, because search engines do not run our JavaScript and do not generate those DOM elements. And this is not perfect. Because React uses the virtual DOM, if you use something like Node.js on the server, we can use exactly the same code which we use on the client to produce the DOM representation of our components. We can use it to render our application into a string. So MyApp is a React component which you also use on the client. But here on the server we basically say: render this to a string and send it back to the user. So the search engines will see it. They will not have to run our JavaScript. And the ordinary users will also see the results earlier. They won't have to wait for our whole application to load and build those visual representations. So there are two advantages.
What's also very interesting and exciting is that there are other implementations of React which target other platforms for rendering. There are some libraries for rendering into canvas. There is a new library called React Native, made by Facebook, with which you can use React to develop iOS and Android applications. There is however no support for graphics in React Native.
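Rendering the same virtual description to a different target, such as a string on the server, can be sketched as follows. React ships its own function for this; the `renderToString` below is a hypothetical hand-written stand-in, and `VNode` and `Scatterplot` are illustrative names rather than the talk's code.

```typescript
interface VNode {
  tag: string;
  props: Record<string, string | number>;
  children: Array<VNode | string>;
}

// Escape text and attribute values so they are safe inside markup.
const escapeHtml = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;");

// Turn a virtual tree into an HTML string; no browser is involved.
function renderToString(node: VNode | string): string {
  if (typeof node === "string") return escapeHtml(node);
  const { tag, props, children } = node;
  const attrs = Object.keys(props)
    .map(name => ` ${name}="${escapeHtml(String(props[name]))}"`)
    .join("");
  const inner = children.map(child => renderToString(child)).join("");
  return `<${tag}${attrs}>${inner}</${tag}>`;
}

// A component is a function of its data, so the same code can run
// on the client and on the server.
function Scatterplot(points: Array<{ x: number; y: number }>): VNode {
  return {
    tag: "svg",
    props: { width: 100, height: 100 },
    children: points.map((p): VNode => ({
      tag: "circle",
      props: { cx: p.x, cy: p.y, r: 3 },
      children: [],
    })),
  };
}

const html = renderToString(Scatterplot([{ x: 10, y: 20 }]));
console.log(html);
// <svg width="100" height="100"><circle cx="10" cy="20" r="3"></circle></svg>
```

A server would send this string as part of the page, so the markup is already present before any JavaScript runs in the browser.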
There are only UI elements so far, but I'm sure it will come. There are even some experiments for WebGL, I think, although they don't look very promising at the moment. But it's pretty exciting. It doesn't mean that you can take the same application which you wrote for the web, producing SVG, and use it to render into canvas or on iOS, because you have other primitives there, other primitive components from which you build your application. But it's still the same approach, the same idea, and it shows that the idea is good, I think.
So, React developer tools. I'll briefly show that. This presentation is actually a React application. And React Developer Tools is an extension for Chrome which shows you your virtual DOM representation here. You see, this is my root component, the slide show, the player. I can even see the properties of this component which got passed by the parent, the current slide, and the internal state. It's pretty cool for debugging purposes.
So, hot code reloading. This is something cool, too. Okay. Yes, here it is. Here we have five sliders. I just set some values. Great. We see the average value, and here's the source code for that. Okay. Let's change mean to median. Did you see it change? A little bit, right? It was 69, it became 67. Okay, we can change the label so that you surely see it. Average: let's remove "value". Save. Okay, changed. So it got refreshed, right? This changed. Did you see the sliders change? No, they didn't change. Right. But it was a refresh, wasn't it? It wasn't really. It was a re-render. We re-rendered our application, because the whole application is just one big function. We just ran the render again. Even though we changed some source code, we changed the render logic, the state was not thrown out. We still kept the same state as we had before we made the change. This is pretty amazing.
Imagine you have a complex application where you have to click 20 times to get to the specific state which you are currently developing. You don't have to lose that with this approach. It only works with Webpack. But I highly recommend using it to package your application and manage dependencies. So, okay. Yes, this I showed you.
One last thing I want to show. This is from a highly recommended article by David Nolen called "The Future of JavaScript MVC Frameworks". He ran a comparison. He implemented an application with React and Om. Om is a React wrapper which he made in ClojureScript. It's basically still React at its base. And the same application, it was TodoMVC, in Backbone. This is from the Chrome developer tools. It's a flame chart. Basically, we see here functions which call other functions. This is time. This is a function which takes longer. It calls this function, this function, this function, and so forth. So we see how many functions get called and how long they take to run.
The first thing is that the Backbone version was slower than the React and Om version. I think it was three times or something. The second thing is the pattern of the function calls. You can see it's so very different. Here we have lots and lots of small functions which get called. We have one model which updates another model, which has another dependency. It leads to this very, very complex pattern of function calls. In React we basically have: okay, re-render, do the diff, and so forth. Which architecture is easier to reason about? You can guess, I think.
So, to summarize: React is fun to use, and it enables architectures which are easy to reason about because we avoid these unnecessary dependencies. It scales to large applications. Sometimes there are performance issues, but they can be dealt with.
But for some applications, like particle systems, it's maybe not the best choice. Although you can still develop a component and, within the component, do something which does not use React for rendering. It kind of works. So, thank you. If you want to look at the source code of this presentation, you can find it on GitHub. I'll certainly try to improve it in the next few days. Thank you. Thank you.