All right, good morning. So, a great start to the morning, actually: two very inspiring talks. I hope I can keep up that level of quality. I'm Dan, and I live and work in Amsterdam. Everyone knows that city for the wrong reasons. We have about, I don't know, 800 years' worth of cultural history: the Golden Age, the VOC, painters like Rembrandt and Van Gogh. But every time a tourist asks me for directions on the street, it's something like, "Hey, man, do you know the shortest way to a coffee shop?" So as an engineer, I saw a little problem there, and I started a little guerrilla warfare. Every time someone asks me that, I give them directions to the closest museum. Get some education, punk.
So first of all, I want to speak my heart here. Yesterday was really great, in my opinion. There were some really inspiring talks. But I had a lot of trouble focusing, because, I don't know if you can see it, do you guys see that little string right there, moving in the wind? It's driving me insane. And as soon as you see that one, you see the other one as well, and you just can't unsee it anymore. So Adam, is there something we can do about that? All right, that's too bad. Well, just try to ignore this stuff. Thankfully, my presentation is in a 16:9 ratio, so you won't be bothered by it in my presentation. But just try to keep that in mind, all right?
So, the Backbone tango. Is it flickering? Not for me. Oh wait, that's the remote, maybe. Just tell me if it's flickering again and I'll disable the remote. But: the Backbone tango. I work on a web application called MapKit, and let me jump straight in. What's MapKit? I'm going to show it to you. This is MapKit. It's a geographical information system, which basically means it's a program that uses a map to present data to its users. In the case of MapKit, it's used by public utilities to manage the maintenance of their water networks and sewer systems.
So it allows them to quickly see what the problems are in their water network, which parts need repair, and which parts are in need of an emergency response, for example. In addition to that, we do some advanced network calculations to show bottlenecks in the water network and to try to predict future problems. And, well, you can click everything in this application. Let me open up the legend here. You have the blue lines, which are water pipes. The round things are fire hydrants and the triangular things are valves. You can click on any of them to get some additional data. For example, a mechanic can go to this particular fire hydrant, which is blue right now, and just add another report and say, all right, there was something wrong. Excuse me, that's Dutch. I wasn't able to translate everything into English on such short notice. But you can just say, well, this is solved. You press Save and it turns green. So you can just move around and select everything.
Now, regarding the bottlenecks I was talking about: say, for example, you click on this valve right here. If I zoom out again, you can see an entire part of the network just turned black. What I just did was select a part of the water pipes and say, all right, I want to close this part off. So the red part, I want to drain the water from that. I want to work on that. And MapKit is going to tell you, well, are you sure? Because you're about to put 1,600 families without water right now. Is that really what you want to do? Or is there maybe another way we can do this? So they can use MapKit for that kind of thing.
Now, the most important thing for us: we use Backbone for pretty much everything in MapKit. We use it for routing, for rendering, for, well, everything. And especially for data, because we have a lot of data.
We have millions and millions and millions of these valves and fire hydrants, because we manage the water network for pretty much the entire country now. So we needed some ways to improve performance. We can't just load everything in at once. On top of that, MapKit is mostly used by users who are out in the field. They're our mechanics, with a laptop bolted inside their van and a mobile internet connection, and they just drive around. And, well, the connection is spotty at best. So we needed to squeeze every little bit out of performance optimizations to make it feel snappy.
So what I did for this talk: I selected four topics that I want to talk to you about today, and they're all about optimizing communication. The first three are about how we can make it faster, and the last one, the queue, is about improving reliability. I'm going to start with indexing. This is a bit of a preliminary topic. I'm going to explain a bit about how we load in data, how we get that stuff, and how that paves the way for lazy loading, to minimize the amount of data we actually have to load in, and for grouping requests together. And I am going to conclude with the queue, which is designed to minimize data loss and improve the reliability of sending data out.
So let's start with indexing. Like I said, MapKit has to manage a lot of data, and everything is displayed on a map like you see here. These are the fire hydrants. But if you think about it, it's actually just a table. It's just plain tabular data. And if you want to lazy load a table, well, it's pretty straightforward on the back end. Not on the front end, but on the back end it's actually pretty straightforward. Let's say this is a table and this is batch one. The user scrolls down and you load batch two. You can number these batches of data in the order in which they appear. Now for a map, that's a different story, because the user can move in any direction he wants.
So you cannot number these batches by the order in which they appear, and we need to come up with a different solution for that. How do we number these batches of data? Let me show you. This is MapKit again with only the fire hydrants. And, well, you might have guessed from the scheme I showed, but we divided the entire world up into a grid. Let me zoom around on that. You can see everything is in a tile, and immediately you can see that every hydrant has a one-to-one relationship with one of these tiles. And since every tile has a unique X and Y coordinate, you can calculate a unique ID for that specific tile. So every hydrant belongs to exactly one tile, and you have a unique number for that tile. When you move around, you just see the IDs coming in, and you can ask the server: give me all the data for that tile ID. It's pre-computed, so it's kind of an index, really. And that's how we did that.
That paved the way for a few optimizations we did. The first one is lazy loading. This one is really important for us because it's a lot of data and, well, obviously we need some way to lazy load and grow our collections. We start with nothing, and as soon as the user moves the map, we slowly grow that collection. Now let me show you how that looks. Here we have the same tile index. All the tiles are green, and when you move the map around, MapKit knows which tiles are new and also knows which ones it already has in cache, so it doesn't have to load those. So if I just move the map a little bit, these ones are green. MapKit knows: all right, I don't have to do anything. The same goes when you zoom in, right? You don't get any new data in. And when you zoom out, you will have to load 56 new tiles.
So how did we implement that? Well, the most important thing, actually, is that we just added another fetch method. Normally you do a fetch; we do a fetch tiles.
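As a rough illustration of the tile index described above, here is a minimal sketch of how a unique tile ID could be derived from a tile's X and Y position. The talk does not say how MapKit computes its IDs, so the grid size (`TILE_SIZE_DEG`), the row-major formula, and all function names are assumptions made for this example only.

```javascript
// Hypothetical grid: the talk does not give MapKit's real tile size or ID formula.
const TILE_SIZE_DEG = 0.25;                 // width and height of one tile, in degrees
const COLUMNS = 360 / TILE_SIZE_DEG;        // number of tiles in one row around the globe

// Convert a latitude/longitude to the X and Y position of the tile containing it.
function tileCoords(lat, lng) {
  return {
    x: Math.floor((lng + 180) / TILE_SIZE_DEG),
    y: Math.floor((lat + 90) / TILE_SIZE_DEG),
  };
}

// Row-major numbering: every (x, y) pair maps to exactly one ID.
function tileId(x, y) {
  return y * COLUMNS + x;
}

// The tile ID a single asset (for example a hydrant) belongs to.
function tileIdFor(lat, lng) {
  const { x, y } = tileCoords(lat, lng);
  return tileId(x, y);
}

// All tile IDs that intersect the visible map bounds.
function tileIdsInView(south, west, north, east) {
  const min = tileCoords(south, west);
  const max = tileCoords(north, east);
  const ids = [];
  for (let y = min.y; y <= max.y; y++) {
    for (let x = min.x; x <= max.x; x++) {
      ids.push(tileId(x, y));
    }
  }
  return ids;
}

console.log(tileIdFor(52.37, 4.9)); // 820099 with the grid assumed above
```

Because the ID depends only on position, the server can precompute which assets belong to which tile, and the client only has to send the list returned by something like `tileIdsInView`.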
But before I get to that, I need to explain a little bit about how the map works. The map consists of three parts. It's just a usual MVC structure. The map itself is Leaflet, which is open source; that's the view layer. You have your controller and you have your map collections: a hydrant collection, a valve collection, a pipe collection. These are just globally available singletons. Everything starts when the user moves the map. That triggers a move-end event, which in turn triggers the process map move method. The map controller then goes by every map collection to ask: hey, I have some new tiles in view, do you want to fetch them? And it's entirely the responsibility of the map collection to get that data, cache it, and then send it off for rendering.
So what we could have done is just: we have our tiles, we have our data, send it off for rendering, call the display models method. But we decided not to do that. Instead, every time the user moves a map, we create a little fetch controller. That is just a little object with three methods, and we pass it along as a second argument to the fetch tiles method. Now, why did we do that? Well, the most important reason is that you can have multiple maps inside the application. You have the main map you already saw, but there are several smaller maps, each of which is its own instance of the map together with its own instance of the map controller. So if you wanted a constant connection between the controller and the collection, you would have to maintain several permanent connections, and that's really cumbersome. We also could have just passed the map controller as an argument. We decided not to do that because this solution provides clarity of intention. It's very obvious what the purpose of this is. The map controller is saying: fetch me some tiles. This is the list of IDs I want. And these are your options for interacting with me.
So whenever a map collection is interested in fetching tiles, it'll say: okay, I want to fetch. When it's done, it calls fetch ready. And when it wants to display something, it calls display models, which gets relayed to the map eventually.
Okay, so like I said, the most important part of this is the fetch tiles method, so I want to talk a bit more about that. I'm going to focus on the hydrant collection, just as an example. First let's look at some code, because every map collection for us is a subclass of MapKit.Collection, which is itself a subclass of Backbone.Collection. The most important thing we do is create a tile index object, which is just a cache. Then we listen to the add and remove events. Whenever a model is added, we try to index it. So we read the tile ID I showed you previously and then we try to add the model to the tile index object. Now, this is simplified code. I mean, there's some stuff happening in between, but this is the most relevant part. You have a remove index entry method as well. I assume you can guess what that does. And like I said, the hydrant collection is just a globally available singleton, which is an instantiation of MapKit.Collection.
Now with that in mind, let's go back to fetch tiles. Like I said, you get a list of tile IDs and your fetch controller. The first thing it's going to do is indicate that it wants to fetch, right? It wants to get those models. Next it's going to call prefetch, which looks in that tile index to see which tiles are already loaded. If it finds some of those tiles, it sends them off for rendering. And it returns a new list of tile IDs: the ones that are not available yet. These are the ones that get sent off to the server. If that list is empty, we're done. If it's not, we do a request to the server, and when we get a response, we send that stuff off for rendering and we're done.
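The slides with the actual code are not part of this transcript, so the following is only a sketch of the flow just described, not MapKit's implementation. It uses a plain class instead of a Backbone.Collection subclass to stay self-contained, and every name in it (`TileCollection`, `requestTiles`, `tileIndex`, `prefetch`, `wantsToFetch`, `fetchReady`, `displayModels`) is a guess at how the spoken names might be spelled. Remembering empty tiles is also an assumption added here.

```javascript
// Sketch only: a plain class standing in for a Backbone.Collection subclass.
class TileCollection {
  constructor(requestTiles) {
    this.requestTiles = requestTiles; // hypothetical: (tileIds) => Promise<model[]>
    this.tileIndex = new Map();       // tile ID -> models already loaded for that tile
  }

  // In the talk, indexing happens in a handler for the collection's "add" event.
  add(models) {
    for (const model of models) {
      if (!this.tileIndex.has(model.tileId)) this.tileIndex.set(model.tileId, []);
      this.tileIndex.get(model.tileId).push(model);
    }
  }

  // Render tiles that are already cached; return the tile IDs still missing.
  prefetch(tileIds, fetchController) {
    const missing = [];
    for (const id of tileIds) {
      if (this.tileIndex.has(id)) {
        fetchController.displayModels(this.tileIndex.get(id));
      } else {
        missing.push(id);
      }
    }
    return missing;
  }

  async fetchTiles(tileIds, fetchController) {
    fetchController.wantsToFetch();
    const missing = this.prefetch(tileIds, fetchController);
    if (missing.length === 0) {
      fetchController.fetchReady(); // everything came from the cache
      return;
    }
    try {
      const models = await this.requestTiles(missing);
      // Assumption: also remember tiles that turned out to be empty,
      // so they are not requested again.
      for (const id of missing) {
        if (!this.tileIndex.has(id)) this.tileIndex.set(id, []);
      }
      this.add(models);
      fetchController.displayModels(models);
    } catch (err) {
      // On error there is nothing to display; fall through to fetchReady.
    } finally {
      fetchController.fetchReady();
    }
  }
}
```

A map controller would create a fresh object with those three methods on every map move and pass it as the second argument, which keeps the collection free of any permanent reference to a particular map.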
Now if you look at that in code, it is another method added to MapKit.Collection. Fetch tiles: you get a list, you get the controller. We indicate that we want to fetch. Normally this would happen under certain conditions, but I just simplified again. Then we do a prefetch, which returns the new list of tile IDs that are not available yet. It starts with an empty list. It loops through the tile IDs it got and tries to get the models from the cache. If they are available, it sends them off for rendering. If they're not, that tile ID is added to the list, and we return the list. If it's empty, we're done. If it's not empty, we request those tile IDs. When that's successful and we get the models back, we add them to the collection. And this part relates to what I showed you previously: we listen to the add event, so these models get indexed automatically, and the next time we want those tiles, they come from the cache. Then we do a display models and a fetch ready, and in case of an error, just a fetch ready, because there's nothing to display. And that is a bit how that works.
Now the most important thing is of course: what do we gain? Let me show you that in another demo. Is that readable at all? The R is pretty legible. So what I did for this demo is I pre-recorded a set of movements, and I'm going to replay that recording a few times with settings enabled and disabled. First I'm going to disable the lazy loading collections, and I am going to move around. So right now MapKit is really stupid. It doesn't know where it is. It doesn't know which tiles it has. So it's going to keep requesting the same tiles over and over again. As you can see, it's pretty slow right now and it is loading a lot of data, because if you look here, you can see 529 kilobytes for just a few movements.
Now let's type that in, and now we are going to enable lazy loading for the collections, clear the network log, and do that again. This time it's going to cache the tiles. You can see that when it's zooming in it's already faster, because those tiles are already loaded, and we're done. This time we only loaded 150 kilobytes, which means that just by enabling lazy loading we have a reduction of 72%.
But we can do better than this, because we can also apply lazy loading to models, which is actually pretty simple. Let's take, for example, the defaults of a hydrant. Every time the user moves the map, we only load a subset of the data. For all the hydrants you can see on the map, it's only this part: we only need a location, which is the point, and we need a status to draw the little shape. Whenever the user clicks on the hydrant, that's when the additional data gets loaded. But how can you tell whether a hydrant is fully loaded or only partially loaded? Well, in the case of a hydrant you could just check for an address, for example, but other data types might not have an address. Even more so, some hydrants don't have an address. So you would have to add specific code for every different data type to check whether it's completely loaded or just partially loaded. Instead, we just add another attribute, which is sparse. It's true by default, which indicates that this model is only partially loaded. Because it's true, we fetch the additional data from the server, and the server responds with the additional data together with sparse set to false. The next time we need the additional data we can see: all right, it's already false, we already have it, we don't need to download it again.
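To make the sparse flag concrete, here is a minimal sketch using plain objects rather than Backbone models. Only the `sparse` attribute itself comes from the talk; the other default attributes, the `ensureFullyLoaded` helper, and the `fetchDetails` callback are hypothetical names introduced for this example.

```javascript
// Hypothetical defaults: only `sparse` is named in the talk.
const hydrantDefaults = { location: null, status: null, sparse: true };

function createHydrant(attributes) {
  return Object.assign({}, hydrantDefaults, attributes);
}

// Load the remaining attributes only when the model is still sparse.
// `fetchDetails` is an assumed callback: (id) => Promise<object>, where the
// server's answer includes `sparse: false` alongside the extra attributes.
async function ensureFullyLoaded(model, fetchDetails) {
  if (!model.sparse) return model;            // already complete, no request needed
  const details = await fetchDetails(model.id);
  return Object.assign(model, details);       // merging flips `sparse` to false
}
```

The check is the same for every data type, so hydrants, valves, and pipes would not each need their own "is this complete?" logic.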
Now if we add that to the same demo: we go back, and this time we enable it, we clear the network log, and we do the same movement. This time it's even quicker, because the queries are actually simpler now, and we load 102 kilobytes. That gives us a total of, let's say, 102 out of 529, so that's a total reduction of 81%. We only have to load 19% of the data we would normally load, and that speeds things up.
So that is lazy loading. I talked about lazy collections: we steadily and slowly grow the collections when the user moves the map. We use a tile index cache property and a custom fetch tiles method that checks the property to see which parts are already loaded. On top of that, we apply lazy loading to the models, and for that we use a sparse attribute to check whether an object is fully loaded or not.
Now this presents another problem. It's perfectly fine if a user clicks on a hydrant or a valve and you have to load the additional data, but in some cases MapKit is going to need the additional data for several items at once.
For example, when clicking on water pipes. Now, before I show you that, I'm going to talk a little bit more about these water pipes. So consider this. This is a part of the water network. These are water pipes, water is running through them, and there's a problem. We need to work on this pipe, we need to repair it, and for that we need to drain it. It needs to be dry, because otherwise it would be a very uncomfortable job. So it would be perfect if there was a valve on either side of that problem, so we could just close the valves, drain the pipe, and get to work. The problem with this is that you get an insane number of valves in your network. This would be a complete nightmare to maintain. So instead they introduced the concept of sections. Now if you want to drain that pipe, you have to drain the entire section, and you can use MapKit to show you: what is that section, and which assets do I need for that? MapKit can highlight that stuff for you. And you can probably already see that we are going to need the additional data for several items at once.
Let's look at that in a demo. I am going to disable grouping and select one of these water pipes. Let me clear the network log. So I'm going to click on one of these. Nice. Why was that?
That's always the tricky part of live demos. All right, let me do that again: grouping off, clear the network log, and select that section. Oh, I certainly do. Yeah, normally this is because... yeah, right, see, that is the great benefit of client-dictated, high-security automatic logout functionality for you. Lovely. Sorry about that. Oh, really? All right, I am probably able to fix that by doing that. There you go. Just pretend like nothing happened, and we are going to select that section.
All right, so now you see nine requests. Let me zoom in on that. You see, we need the extra information on that section, we need some more information on a valve, a section, a valve, a section. You see the pattern. It needs additional data on about nine different things. If you enable grouping, clear the network log, and select the same section, it will group those things together, resulting in only three requests. Now, this is mainly important, like I said, because our application is used on spotty 3G connections, and you get huge round-trip times, which can really add to the waiting time eventually.
So how did we do this grouping thing? Well, for that we added another method to our collection, which is called fetch model. And we had to make a decision: whenever we get additional data, we can only use collection dot fetch model. We don't use model dot fetch, we don't use collection dot fetch; we can only use this method. The most important thing is just a simple timeout. We delay the actual request by fifty milliseconds, and every other request that comes in during that time just gets appended to a pending list. Now, from the top: we need a model, so we get the model if it exists, and we create it with no attributes if it doesn't. Then we check whether it's already pending. We use a cid for that. If it's already pending, well, we can just wait until the delay resolves and we get that stuff.
If it's not pending, we make it pending; we have two attributes for that. Then we set the timeout, we set the delay, and we return the model. At this point you may be returning an empty model, so on the other end you probably need something like a promise or a deferred or whatever. That's not the focus of this talk, so I'm going to ignore that for now. When the delay times out, we splice the models out of the list, so we get all the models that currently need to be fetched, and we generate a query string based on those models. Let's say you have four valves that you want: we generate a query string like this. So every time you do a fetch models, it's going to do a fetch models with a unique URL. We don't have a "fetch all models for this collection" URL; we only have a "fetch models with specific IDs" URL. Eventually we just do a fetch with that query string, and we empty the pending list. That's about it.
Now let me show you what that means in terms of the time needed to do the selections. I disabled grouping again, and I'm going to do another one of these recordings. Next to that, I'm going to enable the 3G emulation, a very nice new feature in Chrome. Excuse me. Clear the log, and I'm going to make a few selections, now with grouping disabled. All right, so there you go. As you can see here, that's about a hundred and twenty requests we had to make, and it took us 3.6 seconds to do that. Now we redo that same thing, but this time with grouping enabled. Remember: one hundred and twenty requests. We do it again, and this time we only needed sixty requests, or sixty-two rather, and it only took us, well, "only", it took us 2.9 seconds. So we shaved a little less than a second off. Now, this maybe sounds like not very much, but the hundred-millisecond round-trip time you can see on top here is actually pretty optimistic.
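The grouping idea described above (delay the request briefly, collect everything that is asked for in the meantime, then send one request) could be sketched roughly as follows. This is not MapKit's code: it uses promises, which the talk only mentions in passing, a plain class instead of a Backbone collection, and an assumed `fetchByIds` callback in place of the query-string URL shown on the slide. Only the fifty-millisecond delay and the pending list come from the talk.

```javascript
// Sketch of request grouping with a short delay. All names are hypothetical.
class GroupedFetcher {
  constructor(fetchByIds, delayMs = 50) {
    this.fetchByIds = fetchByIds; // assumed: (ids) => Promise<object[]>, one request for many IDs
    this.delayMs = delayMs;       // the talk mentions a 50 ms delay
    this.pending = new Map();     // id -> { promise, resolve, reject }
    this.timer = null;
  }

  // Ask for one model; the actual request is shared with other callers.
  fetchModel(id) {
    if (this.pending.has(id)) return this.pending.get(id).promise; // already pending
    const entry = {};
    entry.promise = new Promise((resolve, reject) => {
      entry.resolve = resolve;
      entry.reject = reject;
    });
    this.pending.set(id, entry);
    if (this.timer === null) {
      this.timer = setTimeout(() => this.flush(), this.delayMs);
    }
    return entry.promise;
  }

  // Send one request for everything collected so far and empty the pending list.
  async flush() {
    const batch = this.pending;
    this.pending = new Map();
    this.timer = null;
    try {
      const models = await this.fetchByIds([...batch.keys()]);
      const byId = new Map(models.map((model) => [model.id, model]));
      for (const [id, entry] of batch) {
        if (byId.has(id)) entry.resolve(byId.get(id));
        else entry.reject(new Error(`model ${id} missing from response`));
      }
    } catch (err) {
      for (const entry of batch.values()) entry.reject(err);
    }
  }
}
```

The trade-off is that every lookup waits at least the delay, which pays off when round-trip times are much longer than fifty milliseconds.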
What we see in practice is more like two hundred or three hundred milliseconds, and at a certain point this is going to add up for us. So that is grouping.
And now I get to my final topic, which is the queue. This is an important one for us, because these mechanics who are driving around in the field with their spotty internet connections actually provide very important data for us. The data that they add to MapKit is essential for the calculations that we perform, so it's unacceptable if that data doesn't arrive. What we wanted was some sort of request queue that would keep track of all the requests that are happening, and as soon as they fail, it should hold on to them and retry them until they succeed.
So we had a few requirements. The first one, obviously: retry failed requests until they succeed. Very important as well: it should survive page refreshes. We don't want to bother the mechanic with stuff like, "Hey man, your request failed, could you please wait?" He couldn't give a fuck, really. He just wants to get on with his work, and he just thinks: all right, this is just a stupid computer. I mean, these guys just slam around with these laptops. They don't care. So if it fails, it should retry later on, because for us it's not really important that the data arrives at that specific moment. It's perfectly fine if it arrives three days later. It just has to arrive; that's the most important thing. So if it fails on Friday and it succeeds on Monday, that's just perfectly fine. And finally, it should be invisible to Backbone, because we didn't want to alter the behavior of the actual Backbone library.
Now, we searched around for this, but we couldn't find any open source software that did this, so we developed it ourselves. Let me show you how it looks. This is a bit of a fabricated demo, because, like I said, it's invisible, so I had to come up with something.
On the left side you see a few buttons. These perform actual requests to the server, but this one, you know, succeeds immediately. This one is going to fail on the first try and succeed on the next try, and so on. So this one, as you can see, is going to retry in two seconds. Now the next one is going to retry in three seconds, and after that it's going to succeed. I can just add whatever, just do this. I can refresh my browser; it doesn't matter. It just keeps track of everything. Refresh that again, and sooner or later it's going to send them all out. So, wait for the last one. Yeah, all right.
So how did we do that? Well, for that we obviously had to override the sync method, because that's where the actual communication happens. We only needed to replace the Backbone dot ajax call in that method, and we replace it with a queue dot add. The first thing that does is create a request object. Normally you would create an XHR or something like that, but we create a request object, which is just a simple plain JavaScript object. The first thing in it is a GUID, a globally unique identifier, which makes every request unique. This gets sent along to the server, so the server also knows which is which. With this ID you could, for example, implement some back-end code that prevents replays and stuff like that. The next three things are just what Backbone creates: params, options, and a model. Then we have a status: did it fail, are we working on it, or is it doing nothing? And we store the last and the next timestamps: when was it last retried, and when should we retry it next? Then we record how many times it has been tried and how many times we should retry it. So there are two scenarios: you can get data and you can send data, obviously.
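The request object described above might look roughly like this. The field names, the status values, and the default for `maxTries` are assumptions (the talk lists the fields but the slide is not in the transcript), and the GUID helper uses `Math.random()` purely to keep the sketch dependency-free.

```javascript
// Version-4-style identifier built from Math.random(); fine for a sketch,
// not suitable where unpredictability matters.
function createGuid() {
  return "xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx".replace(/[xy]/g, (c) => {
    const r = Math.floor(Math.random() * 16);
    return (c === "x" ? r : (r & 0x3) | 0x8).toString(16);
  });
}

// A plain object describing one queued request. Field names are hypothetical.
function createRequest(params, options, model, now = Date.now()) {
  return {
    guid: createGuid(), // sent to the server so it can tell requests apart
    params,             // what would otherwise have been passed to Backbone.ajax
    options,
    model,
    status: "idle",     // "idle" | "busy" | "failed"
    last: null,         // timestamp of the most recent attempt
    next: now,          // earliest time of the next attempt
    tries: 0,           // attempts made so far
    maxTries: 10,       // assumed default; the talk gives no number
  };
}
```

Because the object is plain data, it can be serialized with `JSON.stringify`, which matters once the queue has to survive a page refresh.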
We're not very interested in the get part, so for that we just go straight to execute. I mean, that can fail, but the user will probably retry it. So we're going to focus on sending data. The first thing there is not a run: it's going to store that request, and it stores it in two places. We store it in browser memory as well as in local storage, and with that we can survive page refreshes. So we need to synchronize those two: whatever is in browser memory needs to be in local storage. Not so much the other way around, because there can be multiple tabs open, so there can be more in local storage than what is in memory. So we added a layer of abstraction to take care of all that synchronization and stuff.
The first step we have already done, so right now the request is in memory, and now we can call run, which starts a simple loop. We loop through all the requests there are, we get the additional information, and we try to execute each request. If it's already busy, status busy, we can just ignore it. If it's not, we set it to busy, send it off to the server, and go on to the next one. As soon as the server responds, it can either be successful, in which case the request gets removed, or it can be an error, in which case we rerun the loop with a little delay. That's the next timestamp: if we have tried it two times, we take the current timestamp plus three seconds, for example, and we rerun the loop with that delay. What this means is that it is not always running. The loop only runs when there are actual requests to be made.
Well, that is how the queue works, really. For this I am not going to show you any code, because we open sourced it, so you can look at it yourself. It's called cucumber, it's available on GitHub, and there are a few caveats.
You should remember it's still a work in progress. I mean, we're still working on this; we're trying to find the best way to do this. There is some documentation. There are not a lot of tests, so if you have some tomatoes left from Henrik, you can throw them now. But other than that it works pretty well, actually, and we very much welcome critique or ideas or bug fixes or anything. Just have a look. But the most important thing is that it requires some back-end code, because things are going to get shuffled. Failed requests might arrive later than other requests. Everything is going to be turned around, so you cannot trust the order of the data anymore. You will probably have to perform some back-end synchronization.
And with that I conclude my topics. We talked about indexing, as a little bit of an intro; lazy loading, to optimize what we send and receive; the same thing goes for grouping; and a queue for improving reliability. And that's all I had to say. Thank you.
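As a closing illustration of the queue's run loop described earlier, here is a minimal sketch. It is not the open-sourced library, whose API is not shown in this transcript. The class name, the `send` callback, the storage key `"queue"`, and the linear backoff are all assumptions; the backoff is chosen so that, with a base of one second, the first retry comes after two seconds and the second after three, as in the demo. Multi-tab synchronization and a retry limit are left out.

```javascript
// Sketch of a persistent retry queue. All names are hypothetical.
class RequestQueue {
  constructor(send, storage, baseDelayMs = 1000) {
    this.send = send;               // assumed: (request) => Promise, rejects on failure
    this.storage = storage;         // anything with getItem/setItem, e.g. window.localStorage
    this.baseDelayMs = baseDelayMs;
    // Restore requests that were still queued before a page refresh.
    this.requests = JSON.parse(storage.getItem("queue") || "[]");
    for (const request of this.requests) request.status = "idle"; // nothing is in flight after a reload
  }

  // Mirror the in-memory list to storage.
  persist() {
    this.storage.setItem("queue", JSON.stringify(this.requests));
  }

  add(request) {
    this.requests.push({ ...request, status: "idle", tries: 0, next: 0 });
    this.persist(); // store first, so a refresh right now loses nothing
    this.run();
  }

  // One pass over the queue; only reschedules itself when a request fails.
  run(now = Date.now()) {
    for (const request of this.requests) {
      if (request.status === "busy" || request.next > now) continue;
      request.status = "busy";
      request.tries += 1;
      this.send(request).then(
        () => {
          // Success: remove the request from memory and storage.
          this.requests = this.requests.filter((r) => r !== request);
          this.persist();
        },
        () => {
          // Failure: keep the request and try again after a growing delay.
          const delay = (request.tries + 1) * this.baseDelayMs;
          request.status = "failed";
          request.next = Date.now() + delay;
          this.persist();
          // Pass the due time explicitly so a slightly early timer still retries.
          setTimeout(() => this.run(request.next), delay);
        }
      );
    }
  }
}
```

In a browser, calling `new RequestQueue(send, window.localStorage).run()` on page load would pick up whatever was left behind by the previous page, which is the "survive a refresh" behaviour from the demo.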