Welcome, everybody, to the second talk in the office track. This one will be different from the one before and different from the one after, because it's a very different project. This project is not more than a decade old, like the other ones. It doesn't have a million lines of code, only 3,000 at the moment. And it all runs in the browser, so it's quite different from the others. I'd like to, by the way, thank FOSDEM for giving me the opportunity to speak here. I really love this conference, and I feel honored to be allowed to speak here.

So, let's start with an overview of the talk, so you know what's coming. I'll start with a small demo of how you can use WebODF on your website; because it's so very easy to use, I'm just going to start right off with that. Then I'll go a bit into the history. It's a young project, about half a year old, and I'll tell you how it came to be and why we thought of this idea. Then I'm going to explain a bit about what ODF is, what the file format looks like, and how we can put the information from the file inside an HTML file. Then I'll go into how to write JavaScript for WebODF, because we're quite strict about how to write JavaScript. It's not as easy as you might think. Then, at the end, I'm going to show you how to use WebODF not on your website, but in your program.

So, what's the goal of this project? The goal of WebODF is to make a JavaScript HTML5 library that makes it easy to add ODF to your website or your software. So, if you want to have a way of viewing ODF files, you are not dependent on a cloud service. You can put it on your own website or your company server, and then it's easy to look at your ODF files. We will also, in the future, add editing support, and then you can also edit ODF documents collaboratively in the browser, and they will be exactly the same documents which you can open in Microsoft Office, LibreOffice, OpenOffice or Calligra; it doesn't matter.
And just because we're using JavaScript and HTML technology doesn't mean we are limited to the browser. In fact, it's very easy to use this technology to make an ODF application which runs on the desktop, on a mobile phone or on a tablet. I will give an example of that at the end of the presentation.

So, how do you use WebODF? What do you need? You don't really need a lot. You just check out the GitHub repository, which is quite small, and you need a web server. That's all you need. WebODF contains a small script called http-server.js, which you can use with Node.js to start a small web server. I'm going to use that for a small demonstration, but you can just as easily use Apache or any other web server. So, you take a web server, or use the given script, and you copy these files. That's all of them. There is odf.html, which is basically the application; a CSS file with the default styling; odf.js, the basic startup of the program; and the rest of the functionality is put into classes in separate JavaScript files. But it's just a couple of files you copy into the same directory. Once you've done that, you have a web server with ODF viewing capability. What's missing then is just your files. You take your ODF files, put them in the same directory or somewhere relative to the JavaScript, and you're good to go. You can then go to a URL which has a hash and then the name of the file. So, if you have a content management system and you want ODF support, just copy the files in there, link the documents which the users upload to odf.html with a simple hash, and you're done.

So, I'm going to demo that now. In fact, I was demoing it all along because, as you can see, I'm running a web browser. This is Chrome. And what you see here is a URL. So, I'm running this on localhost.
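As an illustration of the "URL with a hash and then the name of the file" idea, here is a minimal sketch of how a content management system might generate such links. The function name `odfViewerLink` and the example file names are hypothetical, and the sketch assumes the viewer page is called odf.html and that the part after the hash is a document path relative to that page.

```javascript
// Build a viewer link of the form "odf.html#<document path>".
// Assumption: the viewer page resolves the text after "#" as a path
// relative to itself. The function name is made up for this example.
function odfViewerLink(viewerPage, documentPath) {
    // encodeURI leaves "/" intact but escapes spaces and other unsafe characters
    return viewerPage + "#" + encodeURI(documentPath);
}

// Example: turn a list of uploaded documents into viewer links.
const uploaded = ["myreport.odt", "slides/talk 2011.odp"];
const links = uploaded.map(function (name) {
    return odfViewerLink("odf.html", name);
});
console.log(links.join("\n"));
```

Whether the path needs escaping depends on how the viewer reads the hash, so treat the `encodeURI` call as a cautious default rather than a requirement.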
The previous speaker had an objection to using HTML technology, because the web might go down and then you can't use your software. I'm just running this on localhost and it's not going down. It's completely safe. And this is just a file, so I'm presenting from this file. I'll go here to show you. This is the server running here, in the demo directory. I'm calling the node executable with the HTTP server script, which does just one thing: if you send a GET to it, it will give you the file. It's a very simple server. And here are the files. Oh, this is just a JavaScript. So, these are all the files in the directory. Basically, the files are just listed, including MyReport.odt, the OpenDocument standard, which is something I will show you later, and a presentation and a spreadsheet. I'm working with a presentation now, but we're also supporting spreadsheets and text files.

This is a different version here. I wrote a small UI into it. This is not a standard part of WebODF, but it's an example of how you can use it in a more sophisticated application. So, this is just Ext JS. It's a JavaScript library with which you can build UIs. And here it's listing some files. So, you can open one, and here's a different presentation. You see it's got styles. Here's a big spreadsheet. It takes a bit longer to open, but not really a lot, if you consider the size of this file. It has styles, colors for the cells, and basically it looks pretty good. So, that's how you use it.

Now, how did it come to be? Why did we think of writing an application which easily allows you to view ODF documents in your browser? Oh, the spacebar stopped working. Well, for that I have to start with WebKit. I guess many people know WebKit. The browser I'm using now is Chrome, which is also built on WebKit. It has a long history. It started in the KDE project, where the browser engine was KHTML. And that was a very popular browser.
It was pretty good at the time. So good that at some point Apple decided to fork it and make it into the Safari web browser. At that time, Mozilla was the most popular open source browser, and they didn't choose Mozilla because they liked the design of KHTML better. Initially, Apple wasn't behaving very much as a nice open source citizen, in the sense that when they made a new Safari release, they just dumped the code. But after a couple of years, and complaints from the KHTML developers that they wanted a more open development process, Apple started the WebKit project. And so it became a really proper, nice project. From that point on it gained more ports. That led to Chrome, GTK WebKit, Qt WebKit and many other adaptations of WebKit. Now, nearly every computer has a version of WebKit running. And by computer, I also mean mobile phones and tablets. So it's pervasive. It's a huge success. And within our company, KO, we were thinking: can we repeat this for office suites?

To see why WebKit is so successful, we have to go back a bit and look at how it works on the inside. There are a lot of environments for which there are ports, and you can look at, for example, the string class for each of them. Most browsers are written in C++, but C++ is different on every platform. In KDE, you have a QString. Mac OS has an NSString. There is CF, and I don't actually know what environment that one belongs to. wxWidgets has a wxString and Haiku has a BString. And people can get very obsessed about what type of string they're using. And it's not just the strings, it's also the vectors, and it's also the graphical interface which you're talking to. The clever thing Apple did with WebKit, when they ported it from QString and the whole Qt environment to Mac OS, is that they abstracted away all of these things. So they made it possible to implement a few abstract platform-dependent classes and then have a native application, a native WebKit, in your environment.
And that's why the adoption was so great. Now, we would like to repeat that. At the time, this was the inspiration for a project which we called ODFKit. Most free and open source office software is also written in C++, but the libraries are different. LibreOffice, Calligra and AbiWord use different string types and different widget sets. We thought, well, why don't we take the WebKit approach and build on that? So ODFKit was born, and it used the WebKit approach. The initial scope was server-side handling of ODF documents. We didn't want to go the graphical route at first. First we wanted to see whether this idea of loading and saving ODF documents on different C++ frameworks is feasible.

We started working on it, and one of the excellent things about WebKit is that they have a very good test environment. They have tons and tons of tests. And these tests are all written in HTML with some JavaScript to run them. So we were also writing a lot of unit tests, and our unit tests were reading ODF files. At some point I was thinking, well, why do we need to have this inside the ODFKit part? Can we move it to the JavaScript side? And in fact, there are GZIP libraries written in JavaScript. So I started playing, and at some point I didn't have any C++ left, I just had my JavaScript. And I thought, well, okay, so it's possible in a browser, and I tested it not just with WebKit but also with Opera and Firefox. It's possible to unzip a file in JavaScript, then take the custom XML and put it into a web page, and that's how WebODF got started.

So now I'll tell you a bit about what ODF looks like and how you can put it inside an HTML page. ODF is a great standard. All three office suites in this office track today actively use it as their main file format. It reuses many technologies: XML, ZIP, URLs and, recently, since version 1.2, RDF, and I think that's a great addition. I won't go too much into what it is right now.
Scalable Vector Graphics, X queries, and it's very active. There's a weekly call of the people working on the standard, and they are all from different vendors. There's a plugfest where implementers of different ODF software come together and see if their ODF documents actually work together well. There are many implementations: Calligra, LibreOffice, well, most of these. The ones on this side are actually cloud services. Microsoft Office also has a cloud service, and the growth of cloud services was actually one of the reasons why we said that WebODF would be important, because more people want to work on documents in the browser, but they don't always want to put that document on a Google server or on a Microsoft server.

So, how does WebODF work? Well, you start with an HTML file, and in your HTML file you figure out what the path to the zip file is and you start loading it. The loading is usually done with an XMLHttpRequest, which gets the binary data, and it doesn't get the whole file at once. If you have a file with large images, it will not get that at once, because that would be too slow. So it will first get the index of the zip file, then it will get all the bits which are important: the content, which is the most important file, the settings XML and the styles XML file. Then it will put the information which is in there in the DOM tree which forms your HTML document. Then you've got a lot of XML added to your document, and the HTML standard says that if the browser doesn't recognize a tag, it will do nothing with it. So, any normal text which is in there is just shown as plain text, and it doesn't look like an office document at all, let alone a spreadsheet. So, what we need to do is use CSS with namespace support, and I'll show you later how that works, to hide and show the important bits and to format them on a page, on a slide or in spreadsheet cells.
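To make the "first get the index of the zip file" step more concrete, here is a minimal sketch of how the index of a ZIP archive can be located from only the last bytes of the file. This is not WebODF's own code; the function name `findZipIndex` is made up for the example. It relies on the ZIP format's "end of central directory" record, which sits at the end of the archive and says where the file listing starts and how many entries it has.

```javascript
// Locate the ZIP "end of central directory" (EOCD) record in the tail of a file.
// The record starts with the bytes 50 4B 05 06 and is at least 22 bytes long.
// `tail` is a Uint8Array holding the last bytes of the archive.
function findZipIndex(tail) {
    const view = new DataView(tail.buffer, tail.byteOffset, tail.byteLength);
    // Scan backwards: the record is at the very end, optionally followed by a comment.
    for (let i = tail.length - 22; i >= 0; i -= 1) {
        if (view.getUint32(i, true) === 0x06054b50) {
            return {
                entryCount: view.getUint16(i + 10, true),      // total number of entries
                directorySize: view.getUint32(i + 12, true),   // size of the index in bytes
                directoryOffset: view.getUint32(i + 16, true)  // index position from file start
            };
        }
    }
    return null; // not a ZIP file, or the tail was too short
}

// An empty ZIP archive consists of nothing but a 22-byte EOCD record.
const emptyZip = new Uint8Array(22);
emptyZip.set([0x50, 0x4b, 0x05, 0x06]);
console.log(findZipIndex(emptyZip));
```

With the offset and size from this record, a loader can request just the index, and after that only the entries it needs, such as the content and styles, leaving large images for later.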
Then we look at what the custom styles are, the bold, the italic, the font used, the size of the font, and we also convert that to CSS, so that the HTML file in itself looks good. After that we load the images. So, those are basically all the different steps, and we'll go into each of them in more detail now.

Most ODF files are zip files, not all of them, and they contain XML files and pictures. HTML has one DOM, and it's usually an XML file, or at least, if it's not a proper XML file, it will be read as one and corrected. So the problem is that HTML has one DOM, but in an ODF zip file there are many XML files. Which of the XML files will you actually put in the HTML DOM, and how will you put them in there? Well, there's a solution for that, because the ODF standard has a different serialization which is just one XML file. Well, not really, but it's still useful. The fact that it's in the standard gives us at WebODF a good idea of how to put the different parts, which are normally in a zip file, into one DOM tree.

This is what that looks like. Here's the HTML. This is the live tree. I've collapsed the head, and you see the body and then the document tag there. That's an office tag; it's in the office namespace. It's not shown like that, but that's just something typical of the way Chrome shows this. Then you see different tags, and each of those tags contains parts from the different XML files in an ODF file. So by following the standard we can put all the components inside one DOM tree, and once we have it there we can do lots of stuff. Suppose you would like to do some editing or custom scripting: you could just go wild with JavaScript, like you do with HTML pages, and modify it to your heart's content. It's all available through the DOM API now.

So that's step one. We've got the whole document loaded in the DOM tree, but it doesn't look very good yet. So what do we do?
We want to use the styling information, but the styling used by ODF is not the same as in HTML. HTML uses CSS, CSS 3 by now, and that's quite different from the styles in ODF, which are based on XSL-FO. So we need to convert it, but there are two issues. ODF uses style names: whenever you have a style, you just say this element has this style and that element has that style. In CSS you have selectors: you say this is a div, and all divs which are inside a paragraph, or which have an ID like this, should be green. So that's a bit different. And the properties themselves are different, so bold might have a different name than bold in CSS. For bold it actually is the same, but there are many tweaks and conversions you have to do.

So here's an example. This top part is an extract from styles.xml, and it defines a style called myBold. Basically what it says is: text property, this is bold. Below that, you see how we would do this so that the browser shows it properly. You see text|p. The pipe is the equivalent of the colon in XML; it's a divider between the namespace prefix and the local name. Then you see the square brackets, and the square brackets say: if the text style name is myBold, then this applies. So we have directly translated the selector based on just a name, above, into the selector based on a paragraph with that name, below. Of course, there are not just paragraphs. You also have headers and lots of other elements, and all of these elements have to have the same rule. So in practice we have to repeat this quite a few times to cover all of the styling, and you'll see that later when I show you some introspection in the browser.

Okay, so that's basically how the conversion works. I should go back one step. What we're doing here is changing the styles into CSS, but when we're editing, we will first edit the top part, so it will change the actual ODF, and then we do the translation. So when we're editing, which we're actually
not supporting yet but which we're planning, we will be editing the actual ODF, and then we calculate what the CSS looks like.

This brings us to a point. A problem with many office suites is that they usually try to warn you away from using ODF, or from saving a document in a format which is not their native format. Microsoft doesn't like you using OpenDocument formats. Well, it doesn't even matter if they like it or not: their runtime model is quite different from what ODF looks like, and that's why some features may be lost when you're saving. Unfortunately, they never tell you exactly which features, so you can't make a reasoned decision about whether or not to use ODF. But the same thing goes for LibreOffice, for example. When you open something in ODF and you want to save it as PowerPoint, they will give you the same warning. They'll say: we're not sure if PowerPoint has exactly the same features, so are you really sure that you want to use it? Well, in WebODF our runtime model is ODF, just like the model which we're saving, so there is very little difference. The XML on the disk is the way it looks in the browser.

I want to give a small demonstration of that by going and looking inside this browser now. So this is the live document. By the way, do you know what this is? In an office text editor you would call this an underwater screen. It was popular in WordPerfect. Basically, in the browser it's back: you can actually see what your document looks like. You see the XML here, and if you select an element, for example the first page, you see what CSS applies to it. So when you're developing or improving WebODF, you can directly see if the CSS translation is correct, and you can even see where it was defined. But that's just the underwater screen.

Okay, the next part will be a bit about JavaScript, because writing JavaScript is something many people like to do but few people like to learn. And the problem with JavaScript is it's quite
flawed. There are many problems with it, but you can avoid the problems, and that's what I want to go into a bit right here. You have to be very careful about writing JavaScript, but with the right tools it's a doddle. You just have to be aware of them, so I want to talk a bit about that now.

Here's a list of practices which I want to talk about quickly. You have to use the good parts of JavaScript; there are many bad parts. You want to use JSLint. You want to use a runtime abstraction, and we really need that here, because WebODF might be running in a browser, on a command line or in a native application, so we are abstracting that away in our project. You want to use callbacks for fast I/O, and you want to compile your JavaScript using the Closure Compiler. And of course, use a lot of unit tests.

So, the first part. I saw there's a stand for O'Reilly here, the biggest sponsor of the event, so I think it's good that I say this is an excellent book, and you should probably buy it if you want to write JavaScript. It's called JavaScript: The Good Parts, and it explains which parts of JavaScript are bad and which you should avoid. It's a very thin book and a great read. I think you should really have a look at it. The author of the book also wrote a program called JSLint, and JSLint will tell you if you're using a bad part of JavaScript, so you can avoid it.

Then the runtime. WebODF runs in different runtimes, and the only thing all the runtimes have in common is that they're all JavaScript. Some may have a DOM. If they don't have a DOM you cannot do everything, but you can still do the unzipping, for example, and you can still do base64 encoding. The runtime is a thin abstraction layer that gives you access to the file system, logging, the use of timers and the window object. We currently have runtimes for the browser, for Node.js, which is the server currently running this presentation, and for Rhino, which is a JavaScript
implementation.

You should also use callbacks. This is a very cool feature of JavaScript. I/O is often the bottleneck in your application. Instead of waiting for an event that may be slow, you should pass a function. So here's a function, loadxml. We want to read the file myfile.xml. Instead of waiting for the result, we just pass along a function. The function has two arguments, error and data, and when the loading has been done, this function will be called with the error message or the data. Because the function loadxml itself returns immediately, just sending a request to the file system and hanging this function on it, it can immediately continue with all the rest of your application. So your application becomes quite a bit faster.

It's very important to write unit tests. It's tricky to write a program which runs in all browsers. They're getting much better, but they're still very different, so you really need to test a lot. And I have to admit, I didn't mention this yet, but we are not even supporting Internet Explorer at all right now, because it's not worth it. I looked at Internet Explorer 9 and we might support that. It's a lot better, but I'm not sure it follows the standards well enough. When we're in doubt about how to develop something, we look at the standard and use that, and if Internet Explorer adheres to that, we may start supporting it. I don't see that as a big problem for adoption of the project, because if you want to use a native application you're free to choose any browser you like, a WebKit component for example, and if you want to use it on an intranet, you control which browser people are using. So selecting on quality of browser is not really an issue, I think.

We also test with command line programs, just so that we can do unit testing easily. We don't have to press reload in all the browsers; we can just run something on the command line. Then instrumenting the code: there's a very cool tool called JSCoverage. What you do is you run an
executable over your JavaScript code and it will instrument it. That means it will add monitors in all of your code, and then when you run your tests it will tell you how often every line of your code was executed. That's very good, because you can check if you are actually testing all of your application.

Okay, Node.js. I want to say a few things about that, because you want to run unit tests on the command line. Node.js is built on V8, the JavaScript engine which is in Chrome, and it uses callbacks extensively, so it's very good for a server. If you want to implement a server in JavaScript, this is the thing you need. It's really up and coming. We're also using Rhino. It's a very slow JavaScript engine and it doesn't use callbacks. The reason why we're using it is that it's so different, because callbacks do need special attention: you need to make sure that if you pass a callback, you don't need it before you actually leave the current execution loop. And lastly, we're using Qt WebKit, because neither Node.js nor Rhino has a DOM, and we would also like to do tests on the DOM on the command line. So we have a Qt WebKit which runs with no user interface, which we can do more testing with.

Okay, next in the list of tools which you want to use with JavaScript is the Closure Compiler. What it does is take all your JavaScript files, put them into one big file and optimize that. In itself that's not really too important for this project, because the code isn't so huge, but it also does syntax checking and type checking. So instead of waiting for your browser to give an error, you can already see on the command line if your code is any good. How does this type checking work? Below here is a small code fragment. Basically, you add comments saying what type every argument is, and even if you pass a callback function you can say what the arguments to the callback function should be. It's surprising how many type errors this will catch.

Okay, that was most
of the talk already. At the end, I would like to show how to use WebODF in your own program. If you download the code, you'll see two examples there. One example uses WebODF in a Qt application, where you create a canvas in which you can load ODF documents. The other example is an Android application which can show ODF documents. At the moment there's no decent application on Android to show ODF documents, and this one is also just a demo, so it's not released in the marketplace as a decent solution yet. But it's very small, because most of the code is in the JavaScript, and it would also be very easy to make, for example, an iPhone or BlackBerry version of this application.

This is a small excerpt of its code. It's basically two classes, and here you see some of the magic happening. What you do is load this odf.html file in your WebKit widget, and when it's loaded you change the current runtime: you change the read function and you change the function for getting the file size. Once you've done that, you instantiate a new ODF container. So we're overloading two functions, and this gives our application the ability to read any file on the file system. So the advantage of this application is that it can actually show you files, and I'll run it now.

It's here. This is the simulator. It's an older version of Android; I purposely used an older version to show that it also works there. You don't need all the new features in Android to get this running. So here it goes. Let's open a spreadsheet. The emulator is a bit slow; the device will be faster. Let's open a small text file. You see that it's really just an HTML page, but Android also provides you with a nice UI here. And it's a very small application; most of it is just the JavaScript, which is shared.

So what are the current activities in the project?
We've been going for half a year. We're being sponsored by NLnet, and we still have about three months of funding left; not full-time, but part-time funding. During this time we want to improve the rendering of your documents, because if you open a document in WebODF it certainly doesn't look completely like it would in LibreOffice or Calligra right now. So there are still quite some improvements to be made there. We're making an API so that you can control your documents. If you want to write your own custom JavaScript HTML5 application which has an ODF widget in it, it would be nice to have a nice API for that. We do have something with which you can zoom in and out and exchange parts of the document programmatically, but we want to extend that a bit to make it nicer, and of course suggestions from people who are using the code are welcome. We want to have write support, so we can actually save a file back; this is partially done already. And we want to support limited editing in the user interface. Not complete editing: you won't be able to completely modify a whole table, because that's quite complicated. If somebody would like to write this code, they're more than welcome, but this is not our initial focus. And if you're interested in this code, it's only 3,000 lines of JavaScript right now, so check it out and become creative. Just put it on your web server and see what's missing, and I'm sure whatever is missing for you is easy to fix, unless it's some big feature.

Okay, summarizing. OpenDocument Format is great; I guess all three speakers in this track will agree with that. The community for ODF is very active. WebODF is great for bringing ODF to websites and to many devices. And WebODF is also great because it doesn't mess around with your ODF: it just keeps it in your document as ODF and doesn't convert it to some internal runtime model. So WebODF makes ODF easy and fun. Thank you.

So we now have actually 50 minutes for questions.

Have you considered turning WebODF into a
Firefox extension, so that you can preview ODF files on sites which don't have WebODF installed?

You can read ODF on what?

So there are lots of ODF documents on the web, right? But in this system the owner of the website has to install odf.html on their domain in order to be able to preview them. If you took this code and made it into a browser extension, then the owner of the browser could preview any ODF document.

That sounds like a very good idea. I haven't thought about that, because I wasn't aware that an extension might be able to handle a MIME type.

I don't know about other browsers, but certainly in Firefox it would be very easy to make this. I think this has actually been tried before; there was a Firefox extension that did some very simple stuff, but you seem to have got a whole lot further, so it would be really interesting. I personally think that it would be a cool feature for Firefox to have in the core eventually. A preview mode for ODF files would really drive ODF adoption, and if your code could get to a certain level of fidelity and a certain level of reliability, then why not?

Also, if it's an extension, when somebody clicks on a document you can open it and show it, but at the top have a big button saying "actually open it on your desktop".

Good suggestion. Are there any more questions? All the way up there. Where are you again?

Maybe you have said it, I was a bit late, but what is the added value of WebODF compared to native HTML?

Well, native HTML isn't ODF. So if you have office documents which people are creating with LibreOffice, Microsoft Office or Calligra, and you also want to publish them on the web, how are you going to do that? You need something for that. So I think WebODF is a very nice way of doing that, as true to the original document as the HTML runtime allows.

Any more questions? Yeah, Dan, here's a question.

Sorry, I'm not sure if I remember correctly, but you said something like this: since you directly modify the ODF of the document model, then you
re-render it into the HTML5 DOM, and this could be a problem for the speed, the performance of the program. It's probably unfeasible to do it the other way around, to keep the two in sync, or to use another document object model in the middle of what you are doing. Currently it will be all right, so it's not visible, but do you think that, at least for the simple editing capabilities that you are planning now, it won't be a problem? I mean, I'm just asking.

I completely see why you think that there might be a performance problem, and for small documents there won't be a performance problem, because loading is pretty fast as well. I didn't actually show a benchmark here, but if you want to load, for example, the ODF specification, which is a 600-page document, it takes twice as long as LibreOffice does. It was about as fast as OpenOffice, but the latest LibreOffice release was quite a bit faster, so unfortunately I can't say anymore that it's comparable in speed; it takes double the time right now. But it's pretty fast. That's of course the unzipping and then the rendering. If you just update the rendering, that certainly takes several hundred milliseconds for a decent document. Your suggestion that, if there's an edit in one place, you update only a fragment of the CSS makes sense, I think. But I don't think you should change the CSS and then, only at the point when you start saving, go back to the styles, because then the programming logic becomes quite complex. It makes more sense to say that the styles XML is sort of the real value and the CSS is a reflection of it, but updating it can be done in small parts to make it faster.

Thank you. Any more questions?
There's a question behind you, Jonathan.

Yes. You said we can use WebODF with Qt. Can you list which Qt modules you may use for that?

Sorry?

The Qt 4 modules, for example WebKit or some other modules.

WebODF itself doesn't need Qt, but you can use the Qt WebKit module to display WebODF. So in WebODF we have a small demo application, which is just a couple of hundred lines of Qt C++, which basically embeds the JavaScript files, starts a Qt WebKit page and then says: okay, now just render this. All the application does is provide file access. The web component is usually not allowed to access the file system, especially not via XMLHttpRequest, so it just goes back to the Qt code. There's a binding for that, and then the Qt code will read the file and pass it on. It's just a little bit of Qt code; most of it is shared. So even if your application is Android, GTK or Qt, the native layer should be quite small. Most of it is all JavaScript code; the native binding is the only thing for which you need to write dedicated code.

I'm currently publishing some ODF documents just by exporting them to HTML from OpenOffice. It's just a few, but why would I move to WebODF? What would be the advantages?

Well, you would do less work. You don't need to convert your documents. I don't know what your web server looks like, but if it's just a page saying, okay, these are my documents, here are links to the HTML versions and here are links to the real documents, which is probably what it looks like, then you could remove the HTML link, or at least change it to go to this odf.html with the hash and then the link to the file. And you're saving a tiny bit of disk space, because you don't need to store the HTML version. I also think that this version might look better, but I don't know what the converter output looks like. Is it something which creates bitmaps? Which filter are you using?

Well, there are just a few pictures. That shouldn't be a problem, but I'll give
it a try.

Okay. By the way, if it's pictures, you can't select text anymore, and here you can.

Well, you had one slide briefly mentioning a project named ODFKit in the context of WebKit. Is that superseded now by WebODF?

Sorry, you're quite hard to understand. Can you speak up a bit?

Is that better? Okay. You mentioned a project on one slide called ODFKit.

Yes.

And that was in the context of WebKit, so I assume it's C++ or C. And then you continued to talk about WebODF. So is WebODF now superseding ODFKit?

No, they're separate projects. ODFKit came first, and ODFKit is basically a large patch on WebKit to give WebKit support for ODF files. So we added some logic there to do unzipping, to read either of the XML formats, to load it into the DOM tree and then do some JavaScript work on that. However, when we were writing this, we saw that most of the things which we were patching WebKit for you could just write in JavaScript, and that would be easier. So I do think that WebODF is a more elegant solution, but if you really want raw speed, then ODFKit is nicer, because then you have, for example, the unzipping support in the browser itself. That being said, we might actually submit a patch to the WebKit people saying: why don't you add unzip support, as a faster version, just by having it in C++ code? It would require an extension of their interfaces, and I'm not sure if they would accept it, but it would benefit everybody using WebKit, and WebODF would then also be faster than when it implements that itself. The way JavaScript handles binary arrays is basically non-existent; it's just an array of numbers, so that's really quite inefficient. But it's fast enough.

All right, are there any more questions? I don't see any. Well, thank you all for your attention, and thanks to the speaker.