Okay, hi everyone. So this is going to be about the Online app, obviously. What I'm going to try to do is talk to developers and engineers, people who are interested in more in-depth detail. Obviously we don't have enough time to go into all the interesting bits, but hopefully I'll give you a taste of how things work in the background and what interesting technologies are involved, and feel free to come and talk later with all the collaborators who work on Online.

This is essentially a quick snapshot of the different applications that we have, and Olivier just demonstrated some of them, which is handy. A quick overview of what I'll be talking about: briefly, I'll give some background on why Online is really important, then we dive into the architecture, and we wrap up with a quick note on how we tackle scalability and performance and all those nice things.

The idea of moving online should be fairly accessible and obvious to everyone, but it might be less obvious that, along with all the benefits of being able to access and edit your documents anywhere, you also get a lot of challenges. There are a lot of performance-related issues you want to solve for your customers, because everybody is used to having all these features on the desktop, and the desktop is obviously snappy and local, so you come to expect the same thing on the web. That is the main challenge we need to solve so that we have a product that's actually usable and at least as good as the desktop. But then we add the extra features, the cherry on the cake if you like: things you can't do with the desktop, like versioning, and collaborative editing where you can edit the same document with your colleagues and friends at the same time, and extend from there to even richer features.
So this is an ingenious view of the architecture; you can see it's a very accurate diagram, millimeter precision here. Let me spend a couple of minutes on this, because I think it's important to visualize how the system really breaks down. What you see here is, on the one hand, clients connecting to a server, the web services daemon, WSD, in blue, and these are just regular HTTPS connections. So you have SSL, you have certificates, you have all the nice security that brings you. Beyond that, the server is actually self-contained: it serves all the files we need on the web, all the JavaScript and all the HTML we generate. We have full control over what the clients get from us and the bits we exchange with them, and we don't have extra dependencies to worry about in the form of another server.

Beyond that, you have this green process, forkit, which is a helper that makes sure we can very quickly and efficiently fork the actual kit instances that host the documents and do all the magic you see in Online. Forkit essentially just forks itself whenever we ask it to, and beyond that it does nothing else, so it's a very lightweight approach. Once we spawn a kit instance, it loads the core libraries and connects back to WSD, so you get a dedicated WebSocket. From there on it has a dedicated channel to talk with WSD, and WSD can utilize its resources the next time a client comes along. As you can tell from my verbiage, this actually happens in advance in many cases, so that we are prepared: there are always extra instances running before clients connect to you. Once we've exhausted our instances, forkit revs up and spawns some more to match your configured number.
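To make the pre-spawning idea concrete, here is a minimal Python sketch of the pool bookkeeping. All the names are hypothetical; the real forkit is a native process that forks kit instances, which then load the core libraries and connect back to WSD over a WebSocket. Here the fork is abstracted into a `spawn_kit` callable:

```python
class ForkitPool:
    """Bookkeeping sketch of forkit pre-spawning: keep `prespawn`
    idle kit instances ready before any client connects.
    (Hypothetical names; the real forkit forks native processes.)"""

    def __init__(self, spawn_kit, prespawn=2):
        self.spawn_kit = spawn_kit      # stands in for fork + core load
        self.prespawn = prespawn        # configured number of spares
        self.idle = []
        self.replenish()

    def replenish(self):
        # Top the pool back up to the configured spare count.
        while len(self.idle) < self.prespawn:
            self.idle.append(self.spawn_kit())

    def take(self):
        # Hand an already-running kit to a new client connection, then
        # immediately spawn a replacement so the next client never
        # waits for a fork plus a core-library load.
        kit = self.idle.pop()
        self.replenish()
        return kit
```

The point of the design is that the expensive work (forking, loading libraries) is paid before a client shows up, so connection latency only covers the cheap hand-off.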
So once we get a client connection, things start getting interesting, and I'll get to that. Briefly, what happens is that internally you have the core, and that is exposed to us through the LOKit API that Miklos talked about yesterday. Every document has its own dedicated process, which incidentally happens to be jailed: there is a chroot going on that jails that process into essentially its own world, so it can't see the actual file system, it can't do anything privileged, and it is dedicated to a single document. So when you are sharing a document and working on it with somebody else, you have implicitly given them access to the document and everything it contains or represents; that is fine, so they share that process with you. Between you there are no firewalls, so to speak, but with respect to everybody else working on other documents you are jailed, and you won't be able to break out if we've done our job right.

We have designed collaborative editing into this, and it is the latest feature we've been working on very hard over the past several months. We also integrate with major document storage and networking platforms; currently we have ownCloud and Nextcloud integration working, and there's more to come. On the web, the technology we're using is fairly standard, in that you have all your JavaScript doing the heavy lifting for you, a single platform that's essentially portable and compatible with a large number of browsers. That is powered by a mapping library called Leaflet, which gives us the ability to render the document as tiles, very similar to what you see when you're browsing or navigating a map. We're going to see a little bit about that as well.
And obviously, when we do the integration with the document hosting platforms, we need a way to interface with them, so that you have your UI elements, you click on the document, and internally all the magic happens that lets you load the document from them. We currently support the WOPI standard, for example, which gives you the ability to access any document with a token that can expire, and the token can be authenticated with OAuth or whatever that platform supports. All of this happens in a pretty standard and abstract way, so it's not too complex to add new platforms.

The tile rendering is really as simple as you can see. This is one tile, and one can visualize that there is one to its left, one to its right, and one below it as well. So the browser is really seeing an array, a matrix of PNGs right next to each other, and the JavaScript does all the magic of interacting with it. When you click somewhere, you will see that the browser is responsive and there is a cursor, but all that happens thanks to the JavaScript: that cursor is rendered in the browser, while in reality your actual document is rendered to you as an image.

The tile rendering itself is quite an important part of all this, as you can imagine. It affects your performance, because if you're typing something we need to render the part of the document that just got modified, prepare a PNG, and push it as fast as we can to the browser, and the browser needs to essentially flip the old tile for the new one so you immediately see your character pop up; that's when you know you've typed something in the document. To do that we have a lot of performance-sensitive design elements. One of them is that we cache all the tiles and make sure we render each one only once, until it is invalidated.
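The "matrix of PNGs" view boils down to simple tile arithmetic: given a modified rectangle in document pixels, work out exactly which tiles it touches so only those get re-rendered. A small sketch, assuming a hypothetical 256-pixel tile size (the real tile dimensions are a configuration detail of Online, not this number):

```python
# Hypothetical tile size in document pixels; Online has its own
# configured tile dimensions.
TILE_SIZE = 256

def tiles_covering(x, y, width, height):
    """Return the (col, row) index of every tile that a modified
    document rectangle touches, so only those PNGs are re-rendered."""
    first_col = x // TILE_SIZE
    last_col = (x + width - 1) // TILE_SIZE
    first_row = y // TILE_SIZE
    last_row = (y + height - 1) // TILE_SIZE
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]
```

This is also why typing at the edge of a tile is more expensive than typing in the middle: the invalidated rectangle straddles a boundary and two or four tiles come back instead of one.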
So once you've rendered the document, if you're doing nothing, other people who join in can see the document pretty much immediately, because all the images are essentially just downloaded; it's like visiting a website, so the document just pops right out at you. Incidentally, if multiple clients ask for the same tile at the same time and it's not rendered yet, we're smart enough to know that: we know this tile is already being rendered, we keep track of who's asking for it, and once it's rendered we send it back to all of them, broadcasting to all of them at the same time. We also have the ability to combine tiles into larger blocks: we render a much larger block, then split it up into tiles, cache them, and again send them to the clients. And obviously the clients might ask for a big chunk of tiles and then change their mind, because the user just scrolled away from that page and is now going to ask for another bunch of tiles. So what they do is say: cancel all the previous ones, give me these new ones. We know not to send, or even render, the tiles that weren't ready when the cancel arrived; we drop all that, don't worry about it, and just start working on the latest stuff.

This is the second and last diagram I have today, I think; this talk is very light on diagrams. This is what actually happens when you're loading a new document. Again, very quickly, for those who are interested in the technical details: the client connects to WSD and says hello there. Just the fact that you've connected to us is reason enough for us to make sure we have a kit process running, and if we don't have one, we tell the forkit process to spawn one and create a session for us. That create-session is what tells the kit instance to connect back to WSD and start the process of preparing to load the document.
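The tile de-duplication and cancellation logic described above can be sketched in a few lines. This is a toy model under assumed names (`TileCache`, client ids as strings), not Online's actual implementation: one render per tile key, a subscriber set per in-flight tile so the finished PNG is broadcast, and cancel simply unsubscribes a client from everything still pending.

```python
class TileCache:
    """Sketch of tile de-duplication: render each tile at most once,
    record every waiting client, broadcast the PNG when it is done."""

    def __init__(self):
        self.cache = {}        # (col, row) -> PNG bytes
        self.pending = {}      # (col, row) -> set of waiting client ids

    def request(self, client, key):
        if key in self.cache:
            return self.cache[key]          # instant hit, no render
        if key in self.pending:
            self.pending[key].add(client)   # render already underway
        else:
            self.pending[key] = {client}    # first request: start render
        return None

    def cancel(self, client):
        # The user scrolled away: drop this client from every tile
        # that has not finished rendering yet.
        for waiters in self.pending.values():
            waiters.discard(client)

    def rendered(self, key, png):
        # Render finished: cache the PNG and return the remaining
        # subscribers, i.e. the clients to broadcast it to.
        self.cache[key] = png
        return self.pending.pop(key, set())
```

A client that cancelled and later scrolls back simply gets the cached PNG on its next request, which is the cheap path.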
The connection itself, when the client says hi there, must include some sort of valid URL to a document. We need to have an idea of what you're trying to do here, because we do checks and there is a security layer: we don't let you just reference anything. For example, referencing local files and things like that would be blocked and sanitized. Once the kit is connected to WSD, there is a handshake where we match this kit instance to this particular document, and that document to all these clients, potentially; we start with one and then add more. So there is this structure we need to maintain across these processes and connections. Once we've established the connection, the client is free to send the load-document request, which can have options: for example, you can pass all the nice options you have on the desktop, like hiding the white space between the pages. You can customize some things, essentially, and once you've done that, you load the document, and from there on it's all communication going back and forth between the kit and WSD, with WSD forwarding it to the correct clients.

The protocol itself, between the client and the server and internally and back, is pretty much minimalist. Most of the commands and responses are just a single line of text, nothing fancy: a command name that can have multiple arguments. You have a simple structure where the arguments are essentially a space-separated list that we expect to be structured in a specific way, so it's very strict formatting; and then you have a flexible, extensible format, which is JSON, and JSON is essentially only allowed for certain commands and responses.
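The two shapes of the protocol, strict space-separated lines and JSON payloads for the commands that allow them, can be illustrated with a small parser. The message names and fields below are illustrative examples, not a specification of the exact wire format:

```python
import json

def parse_message(line):
    """Split a single-line command into a name plus key=value args.
    Strict form: one command name, space-separated arguments."""
    name, *tokens = line.split()
    args = {}
    for token in tokens:
        key, _, value = token.partition("=")
        args[key] = value
    return name, args

def parse_json_message(line):
    """Flexible form: certain commands carry a JSON payload after
    the command name, allowing a dynamic structure."""
    name, _, payload = line.partition(" ")
    return name, json.loads(payload)

# Strict form, e.g. an illustrative tile request:
cmd, args = parse_message("tile part=0 width=256 height=256 tileposx=0 tileposy=3840")

# Flexible form, an illustrative command with a JSON body:
cmd2, payload = parse_json_message('statechanged {"state": "modified"}')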
So again, there is tight control over what the protocol allows, but it is flexible enough that, where we say JSON is allowed, you can have a very dynamic structure, provided the client, or whoever receives it, knows what to do with it. Tiles are the exception to all this, because tiles are by nature binary; that's the only case where we have a single-line response, a header if you like, followed by the PNG binary. That is the only exception to the above.

The communication happens between the client and, ultimately, the kit instance, but then the core reacts to all these events by issuing its own events, saying: this happened, this changed. When you add that page break or a new line and a new page is added to your document, your document size has changed, so all this information needs to be sent back, and that happens in essentially the same way the commands do.

I'm going to talk briefly about the events here. The way we integrate with the core is through the LOKit API, and the API essentially has two parts, if you like: one is the commands you want to invoke on your document, and the other is the events you want to receive back. Those events come through registered callbacks, and we have two kinds of callbacks. One is for global events that happen at the document level; that includes the status indicator update, and whether there is a password on the document or not.
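The split between document-level and per-view callbacks amounts to a routing decision: broadcast to every session, or deliver to exactly one. A minimal sketch, with hypothetical names (`DocumentBroker`, plain lists standing in for per-client WebSocket channels):

```python
class DocumentBroker:
    """Sketch of callback routing for one document: global events fan
    out to every session, view events go only to the session whose
    registered callback fired. (Hypothetical names.)"""

    def __init__(self):
        self.sessions = {}              # session id -> outbound queue

    def add_session(self, session_id):
        # Each connecting client gets a unique session id and its own
        # outbound channel back through WSD.
        self.sessions[session_id] = []

    def on_document_event(self, message):
        # Document-level events (status indicator, password state, ...)
        # are broadcast to every connected client.
        for queue in self.sessions.values():
            queue.append(message)

    def on_view_event(self, session_id, message):
        # Per-view callback: forward only on that client's channel.
        self.sessions[session_id].append(message)
```

The session id is the pivot: it maps a core-side view callback back to exactly one client connection.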
These are very common things. But then you have view callbacks, and these are per client: every client that connects, when we create the session for it, gets a unique session ID, and we know that session ID belongs to this document. Ultimately that session ID has its own callback registered with core, so when core issues an event that a particular client needs to receive, that callback is invoked, we figure out which session it belongs to, and the forwarding back to the client is done individually for every client connection. Some things are broadcast to everybody, some things are dedicated to a specific client; currently core does the broadcasting, and Online doesn't really need to care about that just yet.

To make things even more efficient: those who know the internals of the core code will know that many events are fired at different points in time, and some of them are redundant, because others supersede them. We noticed this could be a really big problem. For example, if you make a modification you might get multiple invalidations, so multiple redraws. On the desktop it takes microseconds to render a small piece on your device, but if you invalidate that same tile repeatedly, or multiple tiles if you're at the edge of one, then you're going to render multiple tiles, make the clients do more work, and generate a lot of traffic, and you can't sustain that. So what we did in core is add an event-handling queue that is flushed on idle, and on top of that we make sure we compress and deduplicate any redundant events, so that we reduce them to the bare minimum.

Quickly, what happens when somebody makes a change: the user input is processed by the JavaScript, then gets forwarded to WSD, and WSD forwards it yet again to the kit. The kit invokes the API on core, and core does its magic; at this point the client doesn't need to do anything, and it doesn't know what's going to happen next. Then core issues an event if there is a modification, and that event goes back through the callback, through WSD, and is forwarded to the client. The client figures out that, okay, there is an invalidation, say, as a response to the event it issued earlier, and it requests tiles, and the tiles then get processed separately. So there is a bit of back and forth, because essentially the same thing that happens on the desktop is now happening across processes, across connections, and really across geographic domains.

Threading is very critical as well, because we cannot afford many threads per connection, so we have to work with the bare minimum. Internally there is a single core instance, and we need to synchronize all our operations around it. With multiple views, you take a lock and set the view to tell core which client is really making this modification, then you call the API functions you need, then you release the lock, and everybody is happy. So you kind of minimize things.

Finally, scalability. What we need is a way to measure things, and what we really need is a comparable benchmark, so we've come up with a stress-testing tool that does two things with an overlapping purpose, if you will. One is a pure benchmark: we run a bunch of invalidations and tile requests and see how fast our server responds, how fast we can render tiles, and how fast we can send them back. You get two numbers for the tiles; there is a sample here. You have the rendering power and the cache power. The cache power is for cached tiles: it is how fast your round trip is to read a PNG off the disk and push it into the WebSocket back to the client, and on this
particular machine, it was this fast in terms of megapixels per second. You also have the rendering power, which is purely how many megapixels you can really render given your hardware and your current code base. On top of that you get latency numbers for a complete round trip of a command you've issued, like an input, and the response you get back.

In parallel, this particular tool can do something else that's really interesting, which is to replay any session. By enabling a flag in the config and giving a path to a file, we can record all the commands we've received from all the clients for a given server instance; it's instance-wide. That recording can be replayed, and you can replay it without timing, so you can essentially flood the server with a recorded session of multiple hours immediately and see how it responds, or you can replay it with the same timing as it happened, including opening documents, closing documents, new views coming in and leaving, and so on.

With that, I want to quickly try to showcase what I'm doing here. This is a Writer document that I'm going to be editing, and you can see that right now it's not me, because my cursor is actually here, and this is me typing. You'll see that if I make a selection here, I'm perfectly happy to have my own cursor, my own world, and Pranav, who is sitting right there, is doing the typing for the other session. I can see his selection; his login is undercover at the moment, so he's going by bugzilla, and he will see my selection with my name. So we know who's who: he will see a different color for my selection and my cursor, and everybody gets their own unique color, so you can recognize people, you can tell exactly who's doing what, and we can edit the document in parallel. With that, thank you, and questions.