Well, thank you for coming. Look at you all, you wonderful people. If your friends miss this talk and feel mortally wounded, I have a quite similar one in another track in half an hour, an installment with slight variations and different emphases. But here we go.

So what is Online? I'm going to talk a lot about Online, but basically it is documents, spreadsheets and slides. It looks like a number of other online office suites: it does viewing of documents, collaborative editing too, and increasingly we're targeting it to run on tablets and mobiles as well. I think Tor has a wonderful talk on this. Has that happened already, Tor? You missed it, but never mind: it's available in this magic box of tricks here, and you can see it later. Essentially we are reusing all of the goodness we're doing for mobile, because if you have a web service, people expect it to work nicely on their phone and on their tablet. So why not just run it natively on the phone and tablet, and use all of that wonderful multi-threaded, browser-y stuff and our nice architecture underneath?

So Online is quite important. It's got loads of nice import filters, so if you want to read your PDF document, instead of downloading it and viewing it in your browser you could in theory render it on the server and then see it, which seems like a totally mad thing to do, but I'll get to that. All these other random file formats too: Publisher, Keynote, Microsoft Works, the Swiss Army knife of file formats, thanks to the Document Liberation Project. I guess the on-premise thing is why lots of people like this, because you can put all your data on your own server, in your own home, your own company. You can bundle it into your product and sell it to other people to host where they like. You can put it in a proprietary cloud or a hosting service, wherever you like. And that's really cool.
One thing that's important to understand, and that will become more obvious as we go through, is that by itself Online is really not very useful. It needs lots of missing pieces from other software to do authentication and, particularly, storage, but we'll get to that. Luckily, nice people are everywhere and they've done integrations: Nextcloud, ownCloud, Pydio, Seafile, and there is a very long list of other integrations out there with many of our partners.

Our architecture is essentially a bet on CPUs getting cheaper and having more threads. I've noticed a trend throughout the computer industry of this occurring. CPUs may not get faster, but they seem to get a lot more threads. And actually, we work quite a lot with AMD doing cool stuff with lots and lots of threads. So far, so good.

One of the things that's cool about this is that we instantly reuse all of the LibreOffice work. That has two particularly nice effects: we get all of the goodness of LibreOffice, and every time we fix something, it's fixed not only for Online but also for the PC version. And so we get really good WYSIWYG rendering. Now, you may think that WYSIWYG is not a novel concept. I think Word 6 was probably the first Word that did it. Is that right? Before then, Word 5 was text-based, and what came out on the page was totally, totally different. But WYSIWYG has become less and less popular, particularly among people trying to write online office suites, because it's kind of hard to do layout. If you look at some of the competition from very large companies today, you'll discover that they kind of opt out. Footnotes? Well, who knows. Page breaks? Who knows. Positioning images? You can't really do that, because it's not really a WYSIWYG thing; there's not really a page you can position things on. It's become a bit optional.
Which is quite nice. So here's a good picture of it in action. There are a few things here that you would think you might see in, for example, Microsoft Word Online, such as redlining. Many people like to change-track their documents and see who changed them. You can't actually do that in Word Online. It's a bit embarrassing. Inserting tables, for example, or charts: again, another missing feature in Word Online. So I think you could make a clear argument that this is really a stop-gap product and Microsoft has something better in the works. We can do those things because of our architecture, ultimately. We're using LibreOffice for rendering lots of bits in here; I'll show you how that works in a bit. That's cool. It's like a motion-activated Internet of Things device or something. Anyway, some things we pull out, like these comments, which are a sort of native commenting thing. And because of how we do this, we have really, really good interoperability with all of the random file formats I mentioned.

So assuming we can do all of these cool things, how can you use it inside your application? How can you get all of this goodness and, by combining things, make one and one equal lots? Well, it seems scary, but actually there's a very, very simple WOPI-like interface. I think in a person-week, or whatever you call it, it's easy to do this integration, almost from a standing start, if you're a web developer. We basically put an iframe in your web page. The key concept is that Online is essentially an office suite in a box. It doesn't really do storage, so there's nothing persisting in the box. And it doesn't do authentication, so it doesn't really know or care who you are either. That's all up to the WOPI host, the integration. And why I think that's a good thing is because all of these things, storage and authentication, seem easy.
I mean, you can type your password in and log into your laptop. How hard can it be? And then, of course, you discover the world of SAML and hierarchical, cached, Active Directory encrypted nightmares, and you're very glad that you don't have anything to do with it. That's someone else's problem. And you think: storage, how hard can that be? I save it to my disk and there it is. And then you discover object databases with global replication and scalability and metadata and la, la, la. And so it goes on: at-rest encryption. And you think, I'm glad I don't have to do that either. So that's pretty nice. We rely on all of these Nextcloud, ownCloud, Pydio, Seafile types to do all of that; Alfresco too. It's hard work, which is cool, so we can focus on the interesting bit, which is basically the office suite bit. We love that.

So how does this work? At the bottom of the browser piece we have an iframe. Usually your existing web application is authenticated to your server in some way: there needs to be some kind of token that tells the server that it can give this client information. That's usually there already if you've logged in. So all we do is send that token into the iframe, perhaps not the same token but a different subset of it. Then we set up a WebSocket to the server and we send that authentication token and a document identifier back here. These can be free-form strings of any kind you like. We don't care: we get the string and we pass it on. Having got that, Online goes and asks the WOPI host, which could be, say, Alfresco or Nextcloud or some such, and says: do you like this guy? Here's this authentication token. Is it kosher? Is it going to work? And what access rights does this person have?
And if it checks out (this is usually an encrypted connection inside a data center that's nice and secure, not over the internet; the internet sort of exists at this boundary here), then we just do a GET for the file. We do a GET with this document ID and we get the file. Strange. And then we PUT it back to save it. So that's already quite complicated, and that's pretty much it. If you want complicated things, we can do a PutRelativeFile, which is effectively a "save as" that creates a new name. But that's basically the whole API surface for your host. You have to be able to create an iframe that connects to our server, and have an authentication token that will go around here and let us get and put your data. Okay? Not tough.

Of course, you can make anything simple complicated if you try, and so we've tried. There's a whole lot of other things. It's quite nice if, instead of just having an arbitrary file with a random hex string, it's got some kind of user-visible file name, so we have a base file name for that. It's nice to know how much data is going to come down, so we have a size. The last modified time allows us to detect conflicts: when we're editing, we know when we last saved and what the modification time was. If, when we try to save again, there's a different modification time, we're not happy; we wave and say, by the way, something bad is going on here. Then we can let the user choose what to do. You don't have to provide this, but it helps. There are owner IDs, user IDs and user-friendly names, so we can collect the names of all the people who connect and you can see who is in the document. You can even get avatars for them and render pretty things. And then, of course, there are a number of permission things, such as "user can write", which I guess is just a read-only flag. I say it's WOPI-like: it's based on the WOPI protocol that comes from Microsoft, and so it's relatively well understood.
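To make the host side of that exchange concrete, here is a minimal sketch in Python of the two pieces described above: the file-information answer and the modification-time check on save. It is an illustration under assumptions, not the actual host code of any integration. The `StoredFile` record and both function names are hypothetical, and the JSON property names follow WOPI-style naming, so check them against the protocol documentation before relying on them.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class StoredFile:
    """Hypothetical record a host might keep for each document."""
    name: str
    content: bytes
    modified: datetime
    owner_id: str


def check_file_info(f: StoredFile, user_id: str, user_name: str, can_write: bool) -> str:
    """Build the JSON body a host could return when asked about a file."""
    return json.dumps({
        "BaseFileName": f.name,            # user-visible name instead of a hex ID
        "Size": len(f.content),            # how much data will come down
        "LastModifiedTime": f.modified.isoformat(),
        "OwnerId": f.owner_id,
        "UserId": user_id,
        "UserFriendlyName": user_name,     # shown in the list of people in the document
        "UserCanWrite": can_write,         # False means read-only
    })


def put_file(f: StoredFile, new_content: bytes, last_seen: datetime, now: datetime) -> bool:
    """Save only if nobody else changed the file since the editor last saw it."""
    if f.modified != last_seen:
        return False  # conflict: report it and let the user choose what to do
    f.content = new_content
    f.modified = now
    return True
```

A host would call `check_file_info` when the office server asks whether a token is valid for a document, and `put_file` when the document is saved back; a `False` result is the "something bad is going on here" case.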
And there are a few lockdown bits. Some people like to tweak the UI. They prefer, instead of having a user list in our application, to have it around the side of their shell, so you can hide that feature, or maybe printing or exporting, these kinds of things. You can just hide them. Or you can enforce those on the server, so that you simply cannot copy out of the document because there's no text in it. It's just tiles. Or you can watermark the text so each tile has someone's name over it. So pixels go to the client; you can screenshot it, but it's got your name all over it. You could OCR it, but it's getting pretty difficult now for people to casually, accidentally copy and duplicate information. So the sales team can't screw over the product development team ahead of time too easily. Good.

What else? There's a whole lot of other tweaks, of course. It would be too simple if it was all automatic. We like to use memory. We use about 50 megabytes per open document, and we pre-allocate a great chunk of memory that's shared between every document. But at some point you want to stop using memory and start closing documents. We try to leave documents open, in a saved but still available state, for quite a while; they can expire after an hour, or more rapidly if you like. But clearly you don't want to use all your memory, so you need to balance these somehow. Threading is a particularly nice thing to tune. We use threading to accelerate spreadsheet calculation or image scaling, this kind of thing. But if you're running for 1,000 users on a 50-CPU machine, wouldn't it be nice if you didn't have 50 threads for each of 1,000 users? Maybe 4 will be okay. The default of LibreOffice is to use all the resources, and that's not necessarily very good in a multi-user setting. And then there are various other configurable options to try to reduce load.
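The arithmetic behind those two knobs can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the function names are made up, the 50 MB per document and the cap of 4 threads come from the talk, and the size of the shared pre-allocated chunk is not given, so it is left as a parameter.

```python
import os


def threads_per_document(cap: int = 4) -> int:
    """Limit worker threads per document instead of using every core."""
    cores = os.cpu_count() or 1
    return max(1, min(cap, cores))


def estimated_memory_mb(open_documents: int, per_document_mb: int = 50, shared_mb: int = 0) -> int:
    """Rough footprint: one shared pre-allocated chunk plus a per-document cost."""
    return shared_mb + open_documents * per_document_mb
```

With these assumptions, 1,000 open documents cost roughly 50,000 MB on top of the shared chunk, which is why idle documents are expired rather than kept open forever.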
So: maximum seconds before dimming and stopping. I don't know if you're like me, but I create more and more browser tabs and more and more browser windows until my hardware fails, and then I start again. I think browser developers have finally got used to people like me by actually throwing away the stuff underneath. But even so, these things linger for a while, and wouldn't it be nice if your server weren't piling up with the tabpocalypse side effects? So basically we dim the screen and close these connections after various limits. If it's not focused and not active, we just tear down the server-side resources, and when you come back, you click on the thing, it goes vroom, and it's back again. Luckily you can do that because you've got a nice fast server. It's a heck of a lot faster than your phone, for example, where competing products often take 40 seconds to load the JavaScript to render and lay out the blank document that you're trying to view; we can do that on the server practically instantly. What else? Yes, unloading idle documents, all of this good stuff.

The logging is fun, particularly because there's so much of it. We log the protocol as hexadecimal and dump the WebSocket contents and all of that good stuff. If you really want to, you can go tracing wild, and performance suffers just a little bit. You might want to consider the problems of concurrently appending to one big log file for all these users. So trace is the developer's friend but probably not the sysadmin's friend. It's very good for debugging.

Monitoring is another key piece. What's going on? Why is no one using the server? Should I do more advertising to get more users? There's a pretty web view there where you can see what's going on and who's online. If you have the power, you can kill them dynamically as they're typing. Isn't that great?
All of that stuff. And it's got a very simple WebSocket protocol that will tell you all sorts of information about each document: whether it's been saved recently, whether it's unsaved. You can even shut the thing down and restart it from here, all of which happens very nicely and transparently. All the documents are saved, it shuts itself down, and it comes back quite cleanly.

For clusters: Philippe here, from Arawa, is doing some Alfresco integration, and Alfresco has a whole load of locking stuff built in. I'm not a big fan of locking at all. I think it's a nightmare. Distributed locks are an unsolved academic problem, for the obvious reason that people take their laptops, lock a document and go into a tunnel, and the laptop is thrown into a lake. What do you do? Where does the lock go? Does it time out, or do you go into the lake and try to get it back? Anyway, I hate locking, but other people want to compensate for my inadequacies in this regard. So there's a way that we can connect out: if you have a large cluster of these Online instances, they can connect out to another server. You can still have that monitoring protocol. You can see what the documents are, what keys they have, whether they're loaded, whether they're live, modified, active or not, and get asynchronous, pretty rapid notifications when that changes. So you can knock yourself out and build your own locking system, and you can support it too. It's probably a good idea in some cases. We love Alfresco, and Arawa is awesome, but they're solving hard customer problems. Put it that way. Good.

So this is how it works inside, pretty much. This is from before a designer looked at my slides, as you can see. Essentially there's a thing called the web services daemon. It's like a web server, pretty much. It's all written in C++, and it's all relatively small.
And we talk everything through WebSockets, even internally, which gives some kind of symmetry. The web services daemon is pretty nice. It ends up with a thread per document, something like that, and that thread runs a sort of polling main loop that deals with all the clients. And then we have a kit process. There is a LibreOfficeKit API, which is a C++ API around LibreOffice, and there is a single one of those for each document.

Now, one of the things you really want to do is make this efficient, so you can launch a lot of them and share a lot of memory. In order to do that, it's necessary to have a forkit, because you cannot safely fork if you have threads, for various reasons. The other threads don't turn up in the children, so locks they held are never released properly, which is really not good if that lock is in your memory allocator or something like that. And this does happen. So the forkit basically loads LibreOfficeKit, initializes everything, allocates all the fonts, reads all the configuration, staticizes all the strings, blah, blah, blah. When it's completely ready, it shuts all its threads down, and then its whole job is just to spawn lots of processes, all of them ready to load a document. So it's the parent of all these kit processes. It also puts them in chroots, so it has capabilities, powerful capabilities.

So Online is pretty good for editing documents, but it's also really good for format shifting, or thumbnailing, or providing previews of documents. This is a very common use case. I think Nextcloud is using it for thumbnailing all the documents, which is quite nice, because you look in a folder that hasn't been thumbnailed before and you see them appear one after another, like this. And behind that there's a LibreOfficeKit being spun up somewhere and some amazing file being loaded, thumbnailed, sent back and stuck in a database. Anyway, it makes me feel good because I know what's going on behind the scenes.
Presumably the user goes, "slow". But what can you do? At the moment, many people use LibreOfficeKit directly through small helpers, or they use soffice directly with its --convert-to switch. There are a number of significant advantages to using Online instead. The memory sharing is a very significant performance win, and it reduces your memory footprint. You save a lot of CPU because lots of work is shared through the shared initialization and the forking. We have these chroot jails, so every document lives in its own jail. If there's something nasty in that document, it has to beat Caolán's expert Coverity-cum-crash-testing work, but it is plausible that that could happen. But of course, when you break out of the document, say you manage to run some random binary code inside it, you discover you're in a chroot, and only your document is in that chroot. There's no code in that chroot. There's no shell. There's a whole lot of fonts, if you like fonts, but they're typically all open source. And when you try to get out of it, with some nasty ptrace or whatever, you discover that seccomp-bpf has killed almost all of your system calls, particularly any that have had security problems recently. And then of course you can put this inside other things if you're worried about that. So it's pretty secure. We don't promise anything. Well, we do actually: we support it, and it's Collabora, at least. So it's much nicer than just running this thing on the command line with Lord knows what isolation on your server, which is really not something I'd recommend.

Another nice thing is that we know when we've finished loading a document. Because we launch a new LibreOffice instance to load every document, we don't bother doing nice things like shutting down or freeing memory when we're done. We just hard-kill the process and let the kernel do its stuff, which is kind of nice.
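The pre-fork idea and the hard kill can be illustrated with a small Python sketch. It is a simplified stand-in and not how the real forkit or kit is written (those are C++): the names `PRELOADED`, `spawn_worker` and `hard_kill` are hypothetical, the jail and seccomp steps are only marked with a comment, and it assumes a Unix-like system where `os.fork` is available.

```python
import os
import signal
import time

# Expensive state loaded once in the parent (stands in for fonts, configuration, ...).
PRELOADED = {"fonts": ["A", "B"], "config": {"lang": "en"}}


def spawn_worker() -> int:
    """Fork a child that inherits PRELOADED copy-on-write and waits for work."""
    pid = os.fork()
    if pid == 0:
        # Child: a real worker would enter its jail and load one document here.
        while True:
            time.sleep(1)
    return pid  # parent: remember the child so it can be killed later


def hard_kill(pid: int) -> int:
    """No graceful shutdown: SIGKILL the child and let the kernel reclaim it."""
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    return status
```

Because the parent did all the initialization before forking, each child starts almost instantly and shares the preloaded pages with its siblings until it writes to them.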
So you have instant off, as it were: kill -9. And then we do these things right. It's possible to load documents that trigger layout loops, which take an infinite time to lay the document out. We have fixed a lot of these, but there are still some there, hypothetically. So wouldn't it be nice if someone was there to watch the layout and say: look, if it takes 100 seconds, just give up and go home. We also profile it and optimize it, so some of these conversions are quicker than others. We don't do a word count before converting to PNG, for example: if there's no word count in the output, it's perhaps not worth getting it right before you convert. And there's a very simple REST API. You just dump your data in, your ODT or DOCX or whatever, and you get what you like out, PNGs and so on. So that's pretty cool.

Aha, JavaScript debugging. Debugging was mentioned in my talk description, so let me just play with this and try to show you some things. Here is LibreOffice Online, essentially, and it's reasonably responsive. I can fool around selecting different bits of text and scrolling about and so on, partly because the selection is just a layer on top of the document. We're not re-rendering anything here; it's just drawn on top. As is the cursor, you'll be pleased to know. Many remoting protocols don't know about cursors, so they send pixels each time it blinks. Not so Collabora Online: the cursor is rendered on top, and it knows lots about the document structure. Similarly, as you're scrolling here, there's pretty much no rendering going on. It's all in the client; it's all pre-cached. If I could pinch to zoom for you, again, that happens very interactively with the pan, and then it re-renders later. So that's pretty nice. And of course you can type too, I don't know, "lots of benefits" or whatever. You should never type in public.
It's one of those sort of shameful things that you should hide, but anyway. So what else can you do? There's a whole lot of features I'll show you later, but let's just look at some debugging bits first. So this is Chrome. It's just great. First of all, I'll show you the built-in debugging. If you go to Help, About and press D, magic things happen. Don't let your users know. Here you start to see that there's a whole lot of tiles underneath this, and these blue lines are at the boundaries of the tiles. As I type "hello", you can see that some of these tiles are being refreshed and some of them aren't. You can see the blue ones there: as I type, the blue ones aren't actually changing at all. I'm sorry, I've wrapped the text in a silly way, so maybe you can't see that. But can you see that these blue ones on the side here are not actually getting refreshed, because it's the same content? We've re-rendered it on the server, but it turned out to be the same thing. This is, of course, just down to the invalidation area. If I come down here (I haven't got a simpler case), you see the invalidation area get smaller as we get to the end, which is quite nice. So we re-render those bits and then we can send new PNG images for them.

You can see all sorts of lovely things at the bottom. Of course, here I have a very good ping time: 16 milliseconds worst case. I'm pleased about that. But we can cope pretty nicely with transatlantic ping times these days, which is good. We'll talk about that in a bit. So that's nice, but some people don't believe that it's all pixels. With all this selection stuff I can copy and paste text out of this, but actually the copy and paste is some kind of text pushed to an edit field that's hidden behind the scenes, after some delay. So the text doesn't really exist in the page.
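The tile behaviour visible in the debug overlay can be sketched as two small steps: work out which tiles an invalidated rectangle touches, re-render those, and send only the ones whose pixels actually changed. The Python below is an illustration under assumptions, not the real server logic: the function names are made up and the 256-pixel tile size is a placeholder, since the talk does not give the real value.

```python
from typing import Dict, List, Tuple

TILE = 256  # assumed tile edge in pixels; the talk does not give the real size


def tiles_for_rect(x: int, y: int, width: int, height: int, tile: int = TILE) -> List[Tuple[int, int]]:
    """Return (column, row) of every tile overlapping an invalidated rectangle."""
    if width <= 0 or height <= 0:
        return []
    first_col, last_col = x // tile, (x + width - 1) // tile
    first_row, last_row = y // tile, (y + height - 1) // tile
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


def changed_tiles(old: Dict[Tuple[int, int], bytes],
                  new: Dict[Tuple[int, int], bytes]) -> List[Tuple[int, int]]:
    """Only tiles whose rendered bytes differ need to be sent to the client."""
    return sorted(k for k, v in new.items() if old.get(k) != v)
```

A tile that was re-rendered but came out identical is skipped by `changed_tiles`, which corresponds to the tiles in the demo that stay unrefreshed even though they sit inside the invalidation area.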
It's just pixels with an SVG overlay on top. Let me persuade you of that some more. Perhaps if I turn off this overlay you can see that this really is a PNG, which is quite fun. And when you've got the debug mode on, you can also start to see all of the network traffic coming out on the console here. So the scrolling, I guess, is just telling the server where your visible area is, perhaps. Is that right? No, this is some debug output I added myself to try to work out some problem. Let me get rid of that too. So as I type in here you'll see key presses coming out and tiles being processed and so on in the background. So that's quite fun. But the punchline is that it works, in case that scares you. The particularly nice thing about this is that not only do you get a WYSIWYG editor, but you get exactly the same document here as you get on your PC, if the fonts are the same, or if they're embedded in the document, which we do a rather good job of rendering. We actually support that feature, which is kind of good. I think that was the JavaScript debugging.

Easy to build. I keep getting told it's impossible to build, which is annoying to me. Yes, the steps are quite complicated. It helps to compile LibreOffice first, which is fantastically easier today than it was a long time ago, and that's going to be in some path like, say, master. Then you run autogen.sh. I don't know why that doesn't do the configure as well; I should probably merge the configure into that. The silent rules option is probably optional; you can take that out, it's just what I use. Then you tell it where the LibreOfficeKit path is (if they match nicely there's no problem and you don't need that), and the LibreOffice path, which says where the binaries are that it's going to run against. And if you enable debug, your life is nice as a developer. I would recommend it.
Otherwise you try to bundle and compress your JavaScript all into tiny one-character names or something, and then it's harder to debug and read. Then you run make run and it runs. That's what I'm doing over here. You can control-click on the links in your terminal and it just loads up, and it's fun. And if you'd like to package it, there's a sample spec file and Debian rules there too. It should be easy. Excellent.

Right. WSD and kit debugging. One of the nice things about this is the multi-process model that makes everything efficient, but that can be a bit of a pain when it breaks. You can run the web services daemon under the debugger easily. Sometimes you need to get it to sleep as it forks these children so you can catch badness going on down there, and there are environment variables for that. You need to be root to debug the forkit and the kit. Although the kit drops all of its capabilities, it needs to chroot first, and it is then treated as privileged forever more, unfortunately. And there are loads of unit tests there.

So I'm just going to flick through some of the features that we've done recently. They're in version 6.2. First, reduced latency. You can see the bad red line and the good green line. It can depend on how many characters you type and how quickly you type. As we saw, the text is sent as bitmaps. Previously we were doing a round trip for every character: you saw the invalidation rectangle; when you pressed a key, that was sent to the client, and the client then asked again for tiles. Now we just push changes where we can, and we remember what the client sees. This takes one round trip out of the picture, which is not a lot here, but if it's transatlantic it makes a huge difference. We've also made the toolbar much prettier. Previously it was kind of gray and now it's kind of lighter. It's brilliant, isn't it?
And you would think that this... anyway, this is a major improvement for many people, it turns out. So that's good. Thanks to Andreas Kainz we have beautiful icons to use there, and we now have this click-to-rename thing up at the top: you click on the file name, type another name, and it does a "save as", if your back end supports it. And we have this revision history thing. Some of these may be familiar to you from other products. They seemed like a good idea, but convincing people to change can be hard, can't it? Let's face it. Anyway, what else? Prettier icons and little helpers for formatting things. Some more functionality in dialogs, making better charts. A CSV import: you would be amazed at how many people haven't made the jump from CSV to XLS, even. The number of RTF people we meet is still quite surprising, given that that's the previous failed standards war for open documents, just 20 years ago. So, lots of nice new things, just improving rich document functionality really.

Another nice thing is making shape editing interactive. When you select something, it's nice if you have a preview that moves with it rather than just an outline. Previously we had a rectangular box, and now we have prettier handles, rotation handles and previews, so you can see the change as it happens. Shape insertion, custom shapes, and signing documents. There's a company called Vereign, and we are blockchain enabled. Look at that. You can say woo-hoo, awesome. I have a smiling gentleman at the back. Woo-hoo, Vereign. Vereign, I'm sorry; I can't pronounce my own name, don't worry. So you can effectively save to them and get the document signed, and it all uses the awesome crypto power of LibreOffice and blockchain-ness. We've also done a huge amount of work for mobile devices to make it easier to use.
We load in a view mode straight away, so it's decluttered, much prettier and easier to use, and then you click to edit, so we can defer some of the work associated with editing, toolbars and so on. There's much better multi-touch. There are full dialogs on mobile, and you can pinch to zoom in and out, so you get all of the functionality that you might want and it still plausibly works on mobile. High-DPI rendering for all of these people with more pixels than sense and a faster GPU than they need; that's kind of cool.

What else? Lots of other stuff. Big scripting API work. And when we hit Internet Explorer's WebSocket limit of six: six WebSockets should be enough for anyone, shouldn't it? 640K is enough for everyone. Anyway, it now tells you that you're using a lame browser and how to find the obscure setting. GDPR log anonymization: after adding your massive logging infrastructure, it's important to totally destroy the data as you're writing it, otherwise you're going to infringe some ridiculous regulation. Sorry, an important social improvement. But you can now anonymize your logs so they don't come and bite you later. And tile watermarking.

So here we are. This is my conclusion. Am I going to be beaten up by Mike? To the second. I'm not going to read this out to you because I'm slightly late, but do let me encourage you to grab an image of CODE. We make lots of easy-to-use images of CODE, and Collabora is privileged to pay for almost all of the work here. We'd love to work with you and help make it better. So thank you for your patience.