So, up next, an awesome speaker: Pete Hunt. Pete is head of web engineering at Instagram. He previously worked mainly at Facebook, on the main Facebook site. He's a core contributor to React.js. He is a guitarist and plays a bit of alternative rock. And he'll be talking today about React.js's virtual DOM. Let's give him a warm welcome: Pete Hunt.

Hey guys, how's it going? Whoo, everybody get pumped up. Come on! Nice.

So my name is Pete. I head up web engineering at Instagram, and I contribute to a library called React. I'm gonna talk to you about the secrets of the virtual DOM. This is the most dramatic title I could come up with for a presentation about software engineering, so work with me on this one.

React is the library that powers a lot of features on Facebook: Graph Search, Page Insights, a lot of data visualization on canvas, and a lot of stuff on our mobile site, including but not limited to photo uploads and search. And all of Instagram.com is one giant React component.

But I'm not gonna teach you how to build stuff with React today. This isn't gonna be a tutorial. Instead I'm gonna focus on the underlying ideas behind React rather than specific implementations. I'm also gonna talk about competing ideas, and try to stay away from their underlying implementations as well.

So does anybody here build user interfaces? All right, basically everyone builds interfaces, whether they can raise their hand or not.
I feel like they're really difficult to build, and they kind of feel like they're more difficult to build than other types of software. One of the reasons for that is that it just doesn't feel right when you have a bug. So a lot of times I'll try to get it as close as I can and hand it off to my PM or designer. We'll test it with some users, and there'll be a lot of complaints that we didn't find before handing it off to them.

The reason why it's very hard to ensure quality when you're building user interfaces is, first and foremost, that humans are in the loop. It's very difficult to write an automated test for "this looks correct". We can do Selenium tests and we can do unit tests, but it's hard to get the entire picture from a unit test. For example, we have a bunch of test cases covering our mobile site for photo uploads, and one time we started getting bug reports that photos weren't loading on our mobile site. We looked at our egress graphs, and JPEGs were being served. Our test cases were all passing. But it turned out that somebody had committed a line of code that set the height of the images to zero in CSS. It's very difficult to test for that unless you're either using a screenshotting system or you thought of it ahead of time.

Additionally, we've got designers who are working really hard to build user interfaces that seem simple to the user, but underneath that veneer of simplicity there's usually a lot of complexity as well, a lot of state that you need to manage to provide the right user experience.

And finally, the tools that we have today for building user interfaces aren't as good as they are for building backends. Unit testing I mentioned, but there are also things like static analysis. It helps you find some classes of bugs, but usually not the types of bugs that you find when building UI. You know, you got the math wrong on the layout: it's very hard to apply static analysis to that. And, you know, even type checking is
helpful, but it doesn't get you all the way. At the end of the day, user interfaces are just really complex things to build.

The good news is we're programmers, and our job is to organize complexity. Jeremy beat me to the Dijkstra quotes today, but I've got plenty of them: "The art of programming is the art of organizing complexity, of mastering multitude and avoiding its bastard chaos as effectively as possible."

One of the most important techniques we've found is to focus on being predictable. I want my app to break the same way in development that it does in production. And I want to be able to reproduce the state of my application without doing too many imperative operations that are difficult to test and reproduce. Another word for this could be reliable.

So let's focus on making our user interfaces more reliable. What do we need to do? Go back to Dijkstra and ask him how to do this, and he says we should do our utmost to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between program and process as trivial as possible.

Now I'm gonna try to illustrate that. Hopefully you guys are still pumped up from your coffee break. Here's a screenshot of our buddy list, and I'm gonna ask you what the state of this buddy list should be after I read off a series of interactions. OK, so go memorize what this looks like, remember who's in the list. You guys ready? Come on, you're ready. Let's get pumped up. All right: Alice went offline. Bob went offline. Steve went online. Bob went online. Charles is idle. And Charles is on mobile. What does it look like?
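The exercise can be replayed in code. This is a minimal sketch, not something shown in the talk: the starting roster is an assumption (the screenshot's contents are not in the transcript), and each update is treated as a simple status change where the last one wins.

```javascript
// Hypothetical starting roster: the screenshot's contents are not in the
// transcript, so these initial statuses are assumptions.
const initial = {
  Alice: "online",
  Bob: "online",
  Steve: "offline",
  Charles: "online",
};

// The six updates read out in the talk, as an ordered event log.
const events = [
  { who: "Alice", status: "offline" },
  { who: "Bob", status: "offline" },
  { who: "Steve", status: "online" },
  { who: "Bob", status: "online" },
  { who: "Charles", status: "idle" },
  { who: "Charles", status: "mobile" },
];

// Replaying the log: the result of each step depends on every step before it.
function replay(roster, log) {
  return log.reduce(
    (state, event) => ({ ...state, [event.who]: event.status }),
    roster
  );
}

// A snapshot states the end result directly, with no history to keep track of.
const snapshot = replay(initial, events);
console.log(snapshot);
```

Reading `snapshot` takes one look; reconstructing it from `events` means carrying the intermediate roster through six steps.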
It's very difficult for humans to visualize processes that evolve over time. You've got to keep all of this state in your head, and you've got to basically write down what happens and manually update this model. Instead, it would be a lot easier for us to visualize what the buddy list should look like if we looked at a consistent snapshot of the data at this point in time. So if you look at this, it's probably a lot easier for you to visualize what that list should look like.

The state of the art for doing this today is something called data binding. Does anybody here use data binding? Wow, I would say most people here are using data binding. Awesome. What data binding basically does is make the user interface, which is one dynamic process, look a lot more like a static program relative to this other dynamic process, which is your underlying data model or domain logic or whatever you want to call it, that's being influenced by user events and the network. At the end of the day, it's just syncing one data model with the user interface.

One way I like to think of data binding is as a polyfill for reactive JavaScript in the DOM. If you're familiar with reactive programming (there was a talk on RxJS earlier today), it's basically programming that responds to stimuli, and this is a polyfill for that.

Now, data binding is certainly a non-trivial abstraction, and all non-trivial abstractions are leaky to some degree. I don't think the data binding that we have today is simple. At Facebook we had a traditional kind of MVC data binding system, and as we started to build with it the complexity snowballed to the point where we couldn't maintain our applications anymore. So we started looking for another solution, but I'll get to that in a second.

Dijkstra also says that simplicity is prerequisite for reliability. The goal here is to make reliable, predictable software, but we can't predict what we don't fully understand. So we have to be able to fit it in our heads, and in order to
fit some software into our head, it has to be simple.

Now, simple is one of these words that's thrown around all the time and can take on a lot of different meanings depending on who you're talking to. This guy Rich Hickey invented a language called Clojure, and he has a really good, objective definition of what simplicity is: he says that simplicity is marked by the lack of interleaving. Think about those times when you're trying to solve a problem and you look at a piece of code. If it only does one thing, you only need to hold that thing in your head to understand what it's going to do and predict what's going to happen. If it has a bunch of different things intertwined into it, you need to start reaching into other parts of the program and understanding a bigger chunk of it before you're able to fix that bug or add a new feature.

Now, it's important to note that I'm saying simple; I'm not saying familiar. I'm going to talk about a couple of ideas here, and some are more familiar than others. Just because something might not be familiar doesn't necessarily mean that it's not simple, and something familiar may not be simple either. So I ask you to keep an open mind.

The first idea I'm going to talk about is something called key-value observation. This is the type of data binding implemented by Ember, Knockout, Backbone, Meteor, pretty much most systems out there, and it's not limited to the web either: this is how Apple implements data binding on iOS as well. It's built around the idea of observables and computed properties.

Now, the way I'm using "observable" here is not the same way it was used in the Rx talk; they have a slightly different definition. The way I'm using it, an observable is a value that can notify other things when it's changed, so it has an on-change event associated with it. Computed properties are very similar, except they depend on other observables. So a computed property could be combining two observables into a sum, and
that computed property would be the sum.

Let's walk through an example of how we'd build a system with this. Imagine that we have this data structure coming from the server, and we're going to build a review site for companies, like a Rotten Tomatoes type thing. I'm not a very good designer; this is the UI I came up with. What we're going to do is show the most popular companies, sorted by the total number of votes. We're going to take the top three by popularity, and we're going to show a rating based on the percentage of votes that were upvotes.

I'm going to show how we would implement this in KVO, and I'm going to use Ember as the example. The reason I chose Ember is that they're a really talented team and they're focused very, very intently on developer experience. They're starting to focus on performance as well. So the way I see it, if this team can't get KVO right and make it simple, then there's probably something wrong with the underlying idea.

So let's walk through this. This is implemented in a language called Handlebars, which is a domain-specific language for gluing Ember's notion of observables to DOM nodes. The way it works is that we have this each directive up here, and it iterates over everything in this top companies list and renders some DOM nodes. Now note that we're not able to use arbitrary JavaScript expressions here. This is all a custom domain-specific language, which isn't as powerful as JavaScript, because JavaScript is not reactive and the DOM is not reactive. So it's a leak in the abstraction that we have to build a domain-specific language to do this.

But let's move on. There's JavaScript underlying this view as well. We'll start by creating a function that sums the upvotes and the downvotes to get the total number of votes. Now, in Ember's implementation you use getters to pull values off of models, but that's just a specific of the implementation. We model each company, and each
company has a computed property, like I mentioned before, and that's the percentage score. The way we compute that is we divide the number of upvotes by the total number of votes and convert it to a percentage. Now take a look at this .property right there: that indicates that it's a computed property. This tells the system, hey, whenever the upvotes or the downvotes change, make sure that we recompute this score computed property. And when that score computed property recomputes, it will update the DOM as well, because it has things observing it.

So, moving on. We build our overall application model, which has that top companies array. The way we implement this is we have this raw companies array, we sort it by the total number of votes, and we slice off the top three. Finally, we add a little bit of glue code to glue the JSON data structure that we got from the server into Ember's system.

Here's a video of the test case I ran, where I deployed this application and randomly added a number of upvotes and downvotes to each of 100 companies every half-second or so. And here it is working. You can see the percentages live updating. But since I'm adding a random number of upvotes and downvotes to each of 100 companies and showing the top three by total number of votes, you would think that this list would be constantly re-sorting. So I really appreciate the code review you guys gave me, but we pushed it to production and we screwed everything up.

Geez. Here's the bug. You wouldn't have thought that this is a bug. The reason this is buggy is that we're only observing the list itself, so only when the length of the list changes will this be recomputed. We actually have to dive into each item and observe the upvotes and the downvotes inside of each item in this array.

Now, first of all, there wasn't a compile-time or runtime warning that could help me out with solving this. Additionally, this is a
domain-specific language for expressing these relationships, so no static analysis tool understands it. But what's most important is that I can't look at this function and figure out how to make this property expression work correctly. I have to dive into the implementation of that total votes function. That's more of the program that I have to load into my head to solve this bug. It's more interleaving, and it's not simple. I have to know how the data binding system works for any piece of code that touches my data. When I make that fix, everything re-sorts and it works fine.

Now, if you've worked with Ember before, you probably think that I don't know what I'm doing, because I should have made that total votes a computed property on the company. It reminds me of this quote from Einstein that says intellectuals solve problems and geniuses prevent them. With a traditional key-value observation system, you are not allowed to use traditional JavaScript functions to compose your application. They don't work. I don't like that.

So let's look at another system, which I think improves on key-value observation, which is dirty checking. This is implemented most popularly by Angular, but it's also implemented by Polymer, with Object.observe.

Now let's imagine we want to build this very common widget on Facebook.com, which is a profile picture with a username attached to it. The markup looks pretty great. We have a controller, and then we reference the reusable component we're defining. In Angular it's called a directive, and in Polymer
it's called a custom element.

So here's the initial boilerplate code we need to set it up in Angular. If you're not super familiar with Angular, the specifics aren't crazy important. We have this scope variable, which holds all of the data that's being tracked by the system, and when it changes it'll trigger re-renders. We then define the avatar component.

Now, on Facebook we have other pieces of user interface as well. One component is called a facepile, and it's just a bunch of profile pictures. It would be great if that component and my avatar component could share code. So rather than put all of the code to render that UI in this directive, I'm gonna split it into two directives: one that renders just the picture, called fb pic, and another that renders the avatar, called fb avatar.

Now notice, too, that in order to compose my application here, I have to create a string representing the parameters that I want to pass to this directive, and then Angular is gonna stick that into the DOM, re-parse the DOM, pull it back out, and then run the pic directive. The data doesn't flow in any way similar to how it flows in JavaScript. We set up these two-directional data bindings using these scope indicators here. It breaks all your static analysis tools, and it's just not the way that you compose programs in regular programming languages. It's a leak in the abstraction that you have to use directives to compose your application in Angular.

Now, the point isn't that we can't build applications this way, because people build them all the time. It's that these crazy talented teams at, you know, Google and Ember can't make this simpler. So I wonder if we can create the same things with drastically simpler tools.

Imagine for a second that we had a reactive JavaScript built into the browser today, and we wanted to build this example. It would probably look something like this. We'd have a little bit of boilerplate code to fetch the JSON, and we'd tell it which DOM node to render into. We
create a function to compute the total number of votes, and then we'd start creating DOM nodes. In this example, I'm gonna spare you the document.createElement boilerplate and simply call methods named for each HTML element on this DOM object. So we'll create a DOM node, and we'll sort the companies by the total number of votes. We'll take the top three, and then for each company we'll render a list item with the title, and we'll embed a JavaScript expression in there that calculates the percentage. The system will know how to keep that up to date magically, because this is a magical reactive JavaScript.

Now let's suppose that we don't like all this nesting here. We can pull it out and flatten it, just like you can with any sort of JavaScript program, by making a function and composing your application with functions. And oh, by the way, this is a lot easier to unit test too: we can now unit test the row component separately from the application itself.

Now unfortunately, we can't do this today. The reason is that JavaScript isn't reactive and, even worse than that, the DOM is stateful. We can't simply destroy the DOM and recreate it all the time, because it's gonna lead to poor performance and a bad user experience. Imagine you're scrolling and suddenly we destroy all the DOM nodes and recreate them: you're gonna have a bad time.

So over the past couple of years at Facebook, and now in open source, we've been building an abstraction around this that we like to call the virtual DOM. We think of it as a much less leaky polyfill for reactive JavaScript in the DOM than key-value observation and dirty checking.

So, spoiler alert: this is actually valid React code today, and it works in the browser. And by the way, I think the "today" count is gonna start going over the "future" count with this talk, because everything's available today. So I want you to look at this code and realize that we can use our full JavaScript
toolbox here. We can use all the tools of today to compose our application, and there's not a single data binding artifact here. You don't have any idea what sort of data binding system you need to use to keep things up to date, because you're just writing JavaScript.

Now this seems kind of magical, but that's because we treat your code like a black box, and here's how it works. Whenever anything changes in your application, the virtual DOM (and the reference implementation is called React) will re-render everything to a virtual DOM representation. Then the system diffs the current virtual DOM representation, computed after the data changed, with the previous one, computed before the data changed. We isolate exactly what changed in the virtual DOM, and we only update the real DOM with what actually needs to be changed.

This makes your applications a lot more expressive. And when I say expressive, I don't mean the breadth of ideas, like the traditional "expressive power" definition. I like to think of practical expressivity, which is a measure of ideas expressible concisely and readily in the language. So like I said, when we treat your code like a black box, it makes your code simpler. You don't have this data binding concern intertwined with the rest of your application.

So we've recognized that there are some problems with traditional data binding systems. But the reason nobody does this alternative approach is that, until recently, people have thought that it won't perform well. And if you think about it, it kind of intuitively seems like it won't perform well, because we're doing all of this re-rendering and diffing of things that may not have changed. But I've got great news. The quote's actually originally about Lisp programmers.
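The render-everything-then-diff cycle described above can be sketched in a few lines. This is an illustration under stated assumptions, not React's API or its actual diff algorithm: `h`, `render`, `diff`, the patch format, and the company data are all hypothetical, and the diff is a naive position-by-position walk.

```javascript
// Plain functions describe the UI; no observables or bindings are involved.
const totalVotes = (c) => c.upvotes + c.downvotes;
const percent = (c) =>
  totalVotes(c) === 0 ? 0 : Math.round((100 * c.upvotes) / totalVotes(c));

// h() builds a virtual node: just a JavaScript object, never a real DOM node.
const h = (tag, children) => ({ tag, children });

// Render the top three companies by total votes as a virtual <ul>.
function render(companies) {
  const top = [...companies]
    .sort((a, b) => totalVotes(b) - totalVotes(a))
    .slice(0, 3);
  return h("ul", top.map((c) => h("li", [`${c.title}: ${percent(c)}%`])));
}

// Naive diff: walk both trees in parallel and record where they differ.
function diff(prev, next, path = "root", patches = []) {
  if (prev === undefined) {
    patches.push({ op: "insert", path, node: next });
  } else if (next === undefined) {
    patches.push({ op: "remove", path });
  } else if (typeof prev === "string" || typeof next === "string") {
    if (prev !== next) patches.push({ op: "replace", path, node: next });
  } else if (prev.tag !== next.tag) {
    patches.push({ op: "replace", path, node: next });
  } else {
    const length = Math.max(prev.children.length, next.children.length);
    for (let i = 0; i < length; i++) {
      diff(prev.children[i], next.children[i], `${path}/${i}`, patches);
    }
  }
  return patches;
}

// Made-up data: one company gains votes, everything is re-rendered, then diffed.
const before = [
  { title: "Acme", upvotes: 8, downvotes: 2 },
  { title: "Globex", upvotes: 3, downvotes: 3 },
  { title: "Initech", upvotes: 1, downvotes: 1 },
  { title: "Umbrella", upvotes: 1, downvotes: 0 },
];
const after = before.map((c) =>
  c.title === "Umbrella" ? { ...c, upvotes: 9 } : c
);
const patches = diff(render(before), render(after));
console.log(patches);
```

Only the two list items whose text changed produce patches; the first item is untouched. The re-sort falls out of calling `render` again, so `render` needs no knowledge of what changed.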
Sorry, David.

But every system has constraints, right? Every one of these abstractions is leaky, and we like to think of the virtual DOM as the set of leaks that we would prefer to have as opposed to these other systems. With KVO, your app code is entangled with observables. You need to know how the observation system works at every single part of your application. With Angular-style dirty checking it's a little bit better, but you still have to compose with scope and watches and directives. The leak with the virtual DOM is that you need a signal to say, hey, something changed in the application. That's a much more manageable leak, because it's pushed to the edge of your system. You only need a signal from the outside, and the inside of your code base, all the meat and potatoes of your UI rendering, is unaffected by this. That signal can come from anywhere: it can come from Object.observe, it can come from a browser event, it can come from a network request. In Om, I believe they use requestAnimationFrame to signal this.

And there are some real benefits to avoiding these domain-specific language approaches that you have to use with KVO as well: static analysis, so linting, minification, and type checking. When you're putting symbols in strings, these systems won't understand that they're symbols and not strings passed to the user. So you start to have problems with, you know, Google Closure Compiler advanced mode, or trying to use TypeScript on an observable list.
It's very difficult.

So let's talk about performance a little bit. The way that we look at performance today is a little bit different from the way we looked at it a few years ago, because mobile is a big deal, and memory is just as important, if not more important, on mobile than it is on desktop. Or rather, it's more important than CPU on mobile. If you think about it, if you waste a lot of CPU on mobile, you're gonna have a sluggish application experience and your user is gonna be mad at you. But if you waste a lot of memory on mobile, the operating system is going to kill the browser process and you're gonna have no user experience. It's not gonna be fun.

With key-value observation you've got these observables and these computed properties, and you have to maintain that entire dependency graph. You need to know that this computed property depends on that computed property, which depends on this observable. It takes a decent amount of memory and CPU to maintain that representation.

Now contrast that with the virtual DOM approach, where we're re-rendering all the time. The render code you're using to render your user interface is usually very cheap, because if you think about it, there aren't really a lot of tight loops in your render code, and if there are, they're generally cacheable. More importantly, your view is almost always smaller than your model. In complex, performance-sensitive applications you may have lots of data, but you're only rendering a couple of items.

This is one of my favorite quotes. So I'm gonna dive into some actual performance numbers. Now, measuring performance is very difficult, and I just didn't have the time to do a comprehensive assessment across all platforms, frameworks, and ways of implementing things. So I'm gonna use these numbers mostly as evidence of big-O complexity rather than comprehensive, you know, millisecond-by-millisecond comparisons of these implementations, because we're talking about ideas, not implementations.

So let's go back to
this example from before. I implemented this in the production version of Ember, as an example of KVO, and the production version of React, as an example of virtual DOM. I didn't have time to build it in Angular, and I didn't think it was really worth it, because KVO is on this side of the spectrum, virtual DOM is on this side of the spectrum, and Angular is somewhere in the middle when it comes to performance characteristics and the differences in how it's implemented.

So when I ran the numbers at 25 items, 25 companies in that list (remember, we're taking the top three companies and displaying them, so there are 25 underneath that UI), it doesn't matter. These numbers are within a millisecond of each other, and they're all under a frame anyway, so who cares? When we bump it up to 10,000 items, we start to see a pretty big difference in performance. And it's not only in the initial render time; it's also warm updates and steady-state memory after garbage collection.

Now, the reason for this can be found in big-O notation. Is everybody here familiar with how big-O notation works? Everyone knows everything there is to know about O(n)? OK, it's linear. Well, it turns out it matters what this n is. So let's break it down into O(v), where v is the size of your view, what you're actually rendering, and O(m), where m is the size of your model, the data you've downloaded from the server or the representation you're maintaining behind the scenes.

What we've found is that for applications that are performance sensitive, you're rendering a lot less of the data model than actually exists. I'm thinking about the past couple of things that I've optimized at Facebook, which are our mobile search product and a big sortable data table on desktop. They're rendering a finite number of search results, but they may have thousands of search results cached.

So if we were to break it down in a kind of hand-wavy way with big-O notation, you would see
that KVO can update in constant time, and virtual DOM updates in linear time as a function of v. But one of the big differences here is memory usage. Maintaining that representation of all of your computed properties and observables gets to be really expensive when you use KVO, whereas with the virtual DOM we simply sort once and save those top three items that were rendered, because we don't have to track the observables throughout your entire application. So we at Facebook found that this O(m) here is very, very difficult to scale.

So that was a pretty favorable situation for React and the virtual DOM. Let's look at one that's a little less favorable. Let's look at one where we're rendering a lot more nodes than there are items in our underlying data model, and let's update a single one of them. This is kind of the needle-in-a-haystack case, right? You would think that re-rendering is generating a lot of nodes we don't have to generate, and then we're walking all of them, and that's a lot of diffing that we don't necessarily have to do, whereas with KVO all we have to do is trigger a single callback and update it. Let's see how they compare.

We're going to render 1,000 data items and then render an additional 10,000 DOM nodes. You'll see here that virtual DOM actually does OK in terms of initial render and steady-state memory, but just as you would intuit, the warm update is pretty slow relative to KVO. And this isn't just, you know, textbook slow. This is way bigger than a frame, so it's actually noticeable by the user.

So if we were using a data binding system and we had a performance problem, what would you guys do?
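The O(m) bookkeeping attributed to KVO above can be made concrete with a hand-rolled sketch. Everything here is an assumption for illustration: `observable`, `computed`, `makeCompany`, and the subscription counter are hypothetical and are not Ember's API or implementation.

```javascript
// Hand-rolled KVO for illustration only; this is not Ember's API.
let subscriptionCount = 0;

// An observable is a value that notifies its listeners when it is set.
function observable(value) {
  const listeners = [];
  return {
    get: () => value,
    set(next) {
      value = next;
      listeners.forEach((notify) => notify());
    },
    subscribe(notify) {
      listeners.push(notify);
      subscriptionCount += 1;
    },
  };
}

// A computed property caches a value and recomputes when a dependency fires.
function computed(dependencies, compute) {
  const result = observable(compute());
  dependencies.forEach((dep) => dep.subscribe(() => result.set(compute())));
  return result;
}

// Every company in the model carries its own observables, rendered or not.
function makeCompany(title, upvotes, downvotes) {
  const up = observable(upvotes);
  const down = observable(downvotes);
  const total = computed([up, down], () => up.get() + down.get());
  return { title, up, down, total };
}

const companies = Array.from({ length: 1000 }, (_, i) =>
  makeCompany(`Company ${i}`, i, 0)
);

// To stay sorted, the top-three list has to observe every company's total,
// so the bookkeeping grows with the model size m, not the view size v.
const topThree = computed(
  companies.map((c) => c.total),
  () =>
    [...companies]
      .sort((a, b) => b.total.get() - a.total.get())
      .slice(0, 3)
      .map((c) => c.title)
);

console.log(subscriptionCount); // subscriptions held for a three-row view
companies[0].up.set(5000); // one change ripples through total into topThree
console.log(topThree.get());
```

Dropping the per-company dependencies from `topThree` would reproduce the stale-sort bug described earlier in the talk: the list would stop re-sorting when votes change, with no warning.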
Anybody? Sorry? Caching? Well, in most data binding systems that I've used, which is Angular, the advice is to use something called bind-once, which is effectively caching the value at the initial state. I like to think of that as disabling data binding, because if you're caching the value and never updating it, it's not really dynamic. But with virtual DOM you would use memoization and traditional caching techniques to get this performance. Let me give you a little illustration of that.

With React we provide a hook that says, hey, memoize this component. It's implemented as a function that takes the previous version of the data and the next version of the data, and returns whether it should update or not. So if we add these six lines of code to our application (all it does is say, hey, if the upvotes or the downvotes have changed, return false, or, yeah, whatever, I'm jet-lagged), we can bring the update time down to below a frame. This is still ten times slower than KVO, but it doesn't matter, because this is a worst-case scenario and it's still under one frame of difference.

So we're talking about losing a lot of really attractive properties, like the fact that O(m) memory usage is just not as good as O(v) memory usage, and the simplicity of pushing that data binding concern to the edges of your program. And you'd be doing it just to gain under a frame of time.

So the secret of the virtual DOM is not about performance. It's about simplicity. We want to build sophisticated applications, applications that are very difficult to hold in your head. In order to make it possible to do anything with them and predict what they're going to do, we need to keep things simple. I think the virtual DOM leads to simpler architectures, and because it leads to such simple architectures and treats your code like a black box, it's the most expressive way to build an interface in JavaScript, bar none.

So remember our example from before: we had to know about this data
binding abstraction all throughout our code base. We're not allowed to use functions to compose our application. We have to use a domain-specific language that was developed, you know, over the course of a year to model the binding between the DOM nodes and the JavaScript. Or you could use dirty checking, where you would use this unit of composition for your application rather than this unit of composition.

Now, I don't like getting into the lines-of-code, code-golf wars, but I do think that orders of magnitude can illustrate fundamental properties about ideas. If you look at the bindings for Angular to a popular library called Firebase, there were about 1,018 lines of code when I checked this morning. The equivalent React bindings are 78 lines of code. Now again, arguing about lines of code isn't really productive, but the orders of magnitude here suggest that there is an impedance mismatch between the way that Firebase thinks about data, in terms of regular JavaScript, and the way that Angular does, in terms of its observable, directive-based model.

So performance is not the main goal of what we're doing here; simplicity is. But when it comes to performance, most workloads are just fast out of the box.
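The memoization hook described in the talk (a function that receives the previous and next data and says whether to update) can be sketched without React. In this sketch the predicate returns true when the votes have changed, which is the behavior the speaker evidently intended. `memoizedComponent` and `votesChanged` are hypothetical names; React's own hook for this purpose is `shouldComponentUpdate`, which this example does not call.

```javascript
// Hypothetical stand-in for a component with a memoization hook.
// React's real hook is shouldComponentUpdate; this sketch does not use React.
function memoizedComponent(render, shouldUpdate) {
  let previousData;
  let previousOutput;
  let renderCount = 0;
  function component(nextData) {
    // Render on the first call, or when the predicate reports a change.
    if (previousData === undefined || shouldUpdate(previousData, nextData)) {
      previousOutput = render(nextData);
      renderCount += 1;
    }
    previousData = nextData;
    return previousOutput;
  }
  component.renderCount = () => renderCount;
  return component;
}

// Re-render a row only if its upvotes or downvotes changed.
const votesChanged = (prev, next) =>
  prev.upvotes !== next.upvotes || prev.downvotes !== next.downvotes;

const row = memoizedComponent(
  (c) => `${c.title}: ${c.upvotes} up, ${c.downvotes} down`,
  votesChanged
);

row({ title: "Acme", upvotes: 8, downvotes: 2 }); // first call always renders
row({ title: "Acme", upvotes: 8, downvotes: 2 }); // unchanged: cached output
row({ title: "Acme", upvotes: 9, downvotes: 2 }); // changed: renders again
console.log(row.renderCount());
```

The predicate sits at the edge of the component, so the render function itself stays an ordinary function of its data.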
You don't have to think about it. But in the worst case, it's easy to optimize to within one frame after you're done building your application, without breaking out of the fundamental abstraction.

So I ask you: please don't trade simplicity for familiarity. Try out some alternative ideas, see how you like them, poke holes in them, and we'd love to hear feedback about it. Thank you.

Hello. Yeah, this is new for me, the virtual DOM, and my question is: do you do all DOM manipulation with the virtual DOM, or do you go down to the real DOM to do animations or change an attribute over time?

So the question was at which point do we drop down to the real DOM, the real browser DOM. We render these virtual trees, which don't actually use the DOM API themselves. They're basically just JavaScript objects that say, hey, I'm a tag of this type and I have these attributes or properties. What React will generally do in most cases is just walk that representation and find what changed. That is executed at whatever time you see fit: Om executes it on requestAnimationFrame; React out of the box executes it synchronously.

Now in the case of animations, you can implement animations in terms of that. But a lot of people want to use jQuery plugins or something like that to implement whatever existing animations they want to do, so we provide lifecycle hooks that let you say, hey, at this point I'm going to update, do your custom DOM manipulations. So we provide a way for you to do that in a managed way. Does that answer your question?

Yes. Thank you.
Cool. One right at the back corner there.

Hi. One of the things when you're managing state: you see state going into one view, but some of the complexity in a single-page app comes when you've got lots of views talking to each other. Normally people handle this through pub/sub, up to a point. But how does React deal with this kind of managing state across things of an equal tier in your view hierarchy?

So the question was how do you manage the state and communication of lots of components on the page. The beauty of this abstraction is that it's a fairly low-level abstraction. It's more like a declarative jQuery. Even though the word declarative doesn't mean much, it's an easy way to think of it: something that sits more at the jQuery level. So you can implement any sort of communication paradigm you want. What we tend to do is have a high-level, coarse-grained pub/sub that communicates to the various, you know, apps on the page, and then within those we use composition to build the rest of the application. And composition with React basically lets you, you know, automatically update the UI whenever the data is changed. Did I answer your question? Cool.

Got one last one? No? All right, brilliant. Thank you. Thanks.