So I thought, of course, of sex, drugs, and rock and roll, but I'm not sure I could get away with that. So I can summarize it in one thing: it's money. So this might be interesting to you. The P is for the Prototype Fund, and it's about funding open source projects. Let's start with the catch: you have to live in Germany, but it's a nice place to live. You can be from any nation. I was able to win in the second round of the Prototype Fund, which at that time still gave only 30,000.

And now, and that's quite funny, there's a group of six young women, I believe, who are handling it. That's their idea. They work with the Ministry of Research and get the money from them, but they sit in between. And it's very simple: you just fill in one form with your idea, it has to be open source, and it goes through very easily. And within the 50,000 there's about two and a half for consulting for you, for coaching. But yes, even 30,000 was great. I was planning to be here with a friend of mine, but he started to do a fishing app, and so I had to do it alone, and that's as far as I wanted to go. So take a look at it, prototypefund.de; it's worth jumping into.

And the second thing, and this is the good news, not bad news: ODF is about a standard, a file format, and to me, open source alone is not enough. Open source systems, different software, have to be interoperable, have to communicate with each other; otherwise there's still the chance of lock-in. And yes, standards are very slow: the ISO process was made for material, for pages that never change, and software needs evolution, right? There's a contradiction here. But you have to keep in mind that ODF, the blueprint of ODF applications like LibreOffice, is very, very important. But I often hear: oh, standards, I don't like them, I like open source, that's sufficient.
No, I don't think so, and I'm pretty sure that the user needs freedom of choice to switch between ODF applications.

So, without further ado, some history of the incubating project. I believe it was 2005, I must admit that's just a good guess, when we came together at Sun Microsystems. We had StarOffice at that time, maybe even OpenOffice, but we thought about bringing together all these software solutions that we had made for the server, these tiny things: oh, I just want to unzip the package with the XML and add something. Most of us had something in Java, and we put it all together in one place, and because it was opportune at the time, we made it open source as well. IBM had the same kind of code fragments, so we came together. Later it was used and pushed as the back end of a web office, right? You have a browser and an HTML document that you're editing, and basically it's ODF on the server; it's being unzipped, transformed to HTML, and sent back and forth.

So, do you know anything about the toolkit? You might have used it before without knowing it. Recently, LibreOffice sponsored the website for the validator. We have a front end, it's a JSP, and it runs out of the box: you just get a WAR file from the project. You can use it as a standalone version as well.

And these are basically the main modules of the project. At the top is the generator, which for me is the most interesting part, because it generates the source code from the schema, right? One of the principles I learned from software development is: the more you generate, the better, right? Otherwise you have to do the work over and over again, or you make mistakes; it's a horrible lot of work. And ODFDOM: the DOM, as the name indicates, is like the HTML DOM in a browser. Every element has its own object.
The advantage is that from the start you have no information loss. You load the full document into the DOM, you can edit it, adjust it, and save it back. The idea of generating it is that the schema is quite complicated; I'll give more details later. So the more you generate, the less the developer has to know about the schema, and they can be guided by, let's say, typed classes. There's even an element class for the paragraph called TextPElement. The disadvantage is that the memory footprint is larger than with, let's say, binary representations and bit-level optimization. But I think for the toolkit, in the first place, if you want to do research, you have to improve the generation. Then later on you can generate source code not only in Java, maybe later in Rust, and of course maybe a binary representation, moving away from the DOM.

Beside this, using the DOM, there is the XSLT runner, which enables you to run an XSLT script directly on the ODF document without the need to unzip the content and the styles and so on, right? So those who love XSLT will happily run their scripts directly on the document. The other thing, which we just saw, is the ODF Validator. These are both important pieces of work built on ODFDOM. And the last thing was donated by IBM and is no longer supported by IBM; as soon as they chose to move on to something else, they abandoned it. I marked it in red because I also think it could be done better. Yes, it was very quick work they did back then.

So let's take a closer look at ODFDOM. There's a package layer, which takes care of zipping, unzipping, and the manifest. It's totally independent; it can be used by any other software as well. EPUB, by the way, uses the same ODF 1.1 packaging format. Unfortunately, they forked it, for no reason I am aware of. Yeah, we don't talk to each other.
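As a rough illustration of what a package layer deals with, here is a minimal sketch in Python (not the toolkit's Java API). It builds a toy ODF-style ZIP package in memory and reads the entries listed in `META-INF/manifest.xml`. The function names `build_tiny_package` and `list_manifest` and the placeholder `content.xml` are assumptions made for this example only.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"
ODT_MIME = "application/vnd.oasis.opendocument.text"


def build_tiny_package() -> bytes:
    """Build a toy ODF-style package in memory (content is a placeholder)."""
    manifest = (
        f'<manifest:manifest xmlns:manifest="{MANIFEST_NS}">'
        f'<manifest:file-entry manifest:full-path="/" '
        f'manifest:media-type="{ODT_MIME}"/>'
        '<manifest:file-entry manifest:full-path="content.xml" '
        'manifest:media-type="text/xml"/>'
        "</manifest:manifest>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # The mimetype entry goes first and is stored uncompressed.
        package.writestr("mimetype", ODT_MIME, compress_type=zipfile.ZIP_STORED)
        package.writestr("META-INF/manifest.xml", manifest,
                         compress_type=zipfile.ZIP_DEFLATED)
        package.writestr("content.xml", "<placeholder/>",
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def list_manifest(package_bytes: bytes) -> dict:
    """Return {full-path: media-type} for every entry in the manifest."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("META-INF/manifest.xml"))
    entries = {}
    for entry in root.findall(f"{{{MANIFEST_NS}}}file-entry"):
        path = entry.get(f"{{{MANIFEST_NS}}}full-path")
        entries[path] = entry.get(f"{{{MANIFEST_NS}}}media-type")
    return entries


package_bytes = build_tiny_package()
manifest_entries = list_manifest(package_bytes)
```

Nothing in this sketch knows about paragraphs or tables, which is the sense in which the package layer is independent of the XML layers above it.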
And the EPUB people invented their own signing and encryption and have their own fork of the package format, which is nonsense, of course, but yeah, we didn't have dinner together.

The next thing on top of that is, as I said, the generated layer. And this layer can be split again into two different areas. One is totally generated, the implementation detail of the XML, and the one above, we call it here the document API, is the way the user knows it. My mom would say there's a paragraph, there's a table; she doesn't know anything about how it's implemented in XML. And the funny thing from the user's perspective: many office documents look the same. DOCX, ODT, if they're loaded in the same office, most documents look quite the same, but the XML is totally different. So on the abstraction of the user level, they're very much the same.

These layer concepts can also be found in the specification, which in ODF 1.2 consists of three parts. You've seen the lowest layer here: part three is the package format, which can be used by others as well, as I said. The first part specifies the XML, and the second one is just the formula language for Calc. It might be used in Writer as well, but usually only in Calc. So there's also this separation of concerns, or modularization, given by these three specifications.

And as we said earlier, in the first part, the XML has the schema, the grammar that tells you what is allowed. This is quite complex. I'm not sure if you can read it here, but I would say it's ugly from a usability perspective. For me, it's ugly because we have about 600 XML elements and 1,300 XML attributes, and this is quite a lot. Some would say that even writing an office with only the paragraph is quite complicated, then adding styles and so on; but embracing everything is nearly impossible without a way of generating and making this easier to understand.
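The idea of generating typed classes from the schema (such as a `TextPElement` class for `text:p`) can be sketched as follows. This is a minimal illustration in Python, not the toolkit's generator: `SCHEMA_MODEL` is a hand-written, hypothetical stand-in for the information a real generator would read from the schema, and the generated classes are deliberately trivial.

```python
import re
from string import Template

# Hypothetical stand-in for the schema model: element name -> allowed attributes.
SCHEMA_MODEL = {
    "text:p": ["text:style-name"],
    "table:table": ["table:name", "table:style-name"],
}

# Text template that is filled in once per schema element.
CLASS_TEMPLATE = Template('''\
class $class_name:
    """Generated wrapper for the $element element."""

    ELEMENT_NAME = "$element"
    ALLOWED_ATTRIBUTES = $attributes

    def __init__(self):
        self.attributes = {}

    def set_attribute(self, name, value):
        if name not in self.ALLOWED_ATTRIBUTES:
            raise ValueError(name + " is not allowed on " + self.ELEMENT_NAME)
        self.attributes[name] = value
''')


def class_name_for(element_name: str) -> str:
    """Turn 'text:p' into 'TextPElement'."""
    parts = re.split(r"[:-]", element_name)
    return "".join(part.capitalize() for part in parts) + "Element"


def generate_source(model: dict) -> str:
    """Render one class per schema element from the template."""
    return "\n\n".join(
        CLASS_TEMPLATE.substitute(
            class_name=class_name_for(name),
            element=name,
            attributes=repr(tuple(attributes)),
        )
        for name, attributes in model.items()
    )


# Usage: compile the generated source and use a typed class.
namespace = {}
exec(generate_source(SCHEMA_MODEL), namespace)
paragraph = namespace["TextPElement"]()
paragraph.set_attribute("text:style-name", "Standard")
```

The developer using `TextPElement` is told immediately when an attribute is not allowed, without having to read the schema; with roughly 600 elements, writing such classes by hand is the repetitive work that generation avoids.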
I've chosen the table here, and you see there are a lot of references and so on. I won't go into detail. We started this generator in 2005, as I said, and in the beginning, on the first try, we simply used XSLT. Christian Lipke, a former colleague, did some XSLT transformations that read this XML file directly and turned it into Java, which was quite a lot of work, and it only covered a subset, of course. We couldn't use JAXB, the Java XML binding, because the Sun standard for mapping XML to Java classes only works for W3C XML Schema and not for RELAX NG schemas. Well, the nice thing about standards is that there are so many to choose from. So no, there's no interoperability, as I said.

So instead, and that's what we are currently using, we use two different open source technologies. One is the Multi-Schema Validator from Sun, which takes care of the parsing. You can reuse it; you don't have to invent it or write it yourself. It gives you an internal model, and you take this and fill it into templates, text files, where you can create anything. We create HTML documentation, there's some Python, I believe, and mainly Java, which has been tested.

All the information we wanted to use was sucked out of this model into lists and maps, and I realized that was quite difficult. When I tried to improve this, I couldn't find things in the lists, and it was very hard to extend. I thought it would be much better if we could directly treat the RELAX NG as a graph, right? Because every XML is basically a tree, yes, but as soon as you have references, like a style ID pointing to a style, you get cross-references within it, and you end up with a graph.
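The step from tree to graph can be made concrete with a small sketch. The fragment below is a simplified, made-up grammar in RELAX NG XML syntax (not the real ODF schema), and the Python code is an illustration only: it collects which named definition references which other one. As soon as one definition is referenced from two places, it has two incoming edges, which a tree cannot have.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"

# Simplified, invented fragment in RELAX NG XML syntax (not the real ODF schema).
SCHEMA = f"""
<grammar xmlns="{RNG_NS}">
  <start><ref name="table"/></start>
  <define name="table">
    <element name="table:table">
      <ref name="style-name"/>
      <oneOrMore><ref name="row"/></oneOrMore>
    </element>
  </define>
  <define name="row">
    <element name="table:table-row">
      <optional><ref name="style-name"/></optional>
    </element>
  </define>
  <define name="style-name">
    <attribute name="table:style-name"/>
  </define>
</grammar>
""".strip()


def reference_graph(schema_xml: str) -> dict:
    """Map each <define> name to the set of definitions it references."""
    root = ET.fromstring(schema_xml)
    graph = {}
    for define in root.findall(f"{{{RNG_NS}}}define"):
        targets = {ref.get("name") for ref in define.iter(f"{{{RNG_NS}}}ref")}
        graph[define.get("name")] = targets
    return graph


def incoming_counts(graph: dict) -> dict:
    """Count how many definitions point at each definition."""
    counts = {name: 0 for name in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


graph = reference_graph(SCHEMA)
counts = incoming_counts(graph)
# "style-name" is referenced by both "table" and "row": a graph, not a tree.
```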
And with graphs, as you might know, with the success of social networks like Facebook, where graph theory is part of the daily work, the main focus, the research in this area has enormously expanded, and the algorithms to use graphs and alter them have become much, much better. So when I said, okay, I want to load the RELAX NG into a graph database, the question was: which graph database do I use? And the nice thing is the TinkerPop API, Apache TinkerPop, which again hides the implementation detail of the graph database, so that you can use any graph database. They have a language called Gremlin, a scripting language for traversing the graph, which is then translated for whichever graph database you're using. So I feel pretty safe being on an interoperable level here again, right?

So what I did, let me first say, I put this in the notes there: I've stolen this from a Chaos Computer Club presentation, where they did source code analysis with graph databases. And so I thought, if they can do it with source code, which is much more complex, I can do it with RELAX NG as well. Because with RELAX NG, if I ask anyone here: please tell me, what is the minimal document that is possible in ODF? Simply go to the root, take all the mandatory elements, and put them together. You will basically not know, but this is an easy query for a graph database: start here, and now give me the minimal document. So I thought I need to reverse engineer the RELAX NG, or have better tooling to understand it and to control it. And that was the reason why I came up with this. So I started with the Multi-Schema Validator.
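The "minimal document" question can be sketched as a traversal that keeps only what is mandatory. The example below is plain Python over a small dictionary, used here as a stand-in for a graph database query; it does not use TinkerPop or Gremlin. The element names are real ODF names, but the content models in `GRAMMAR` are invented for illustration, and the sketch assumes the mandatory paths contain no cycles.

```python
# Invented content models for illustration; each pattern is a tuple:
# ("element", name, children), ("ref", name), ("optional", p),
# ("zeroOrMore", p), ("oneOrMore", p) or ("choice", [p, ...]).
GRAMMAR = {
    "document": ("element", "office:document", [
        ("optional", ("ref", "meta")),
        ("ref", "body"),
    ]),
    "meta": ("element", "office:meta", []),
    "body": ("element", "office:body", [
        ("choice", [("ref", "text"), ("ref", "spreadsheet")]),
    ]),
    "text": ("element", "office:text", [("zeroOrMore", ("ref", "paragraph"))]),
    "spreadsheet": ("element", "office:spreadsheet", [
        ("oneOrMore", ("ref", "table")),
    ]),
    "paragraph": ("element", "text:p", []),
    "table": ("element", "table:table", []),
}


def minimal(pattern):
    """Return the smallest list of XML fragments that satisfies the pattern."""
    kind = pattern[0]
    if kind in ("optional", "zeroOrMore"):
        return []                            # may be left out entirely
    if kind == "oneOrMore":
        return minimal(pattern[1])           # a single occurrence is enough
    if kind == "ref":
        return minimal(GRAMMAR[pattern[1]])  # follow the reference edge
    if kind == "choice":
        # Pick the alternative that produces the least markup.
        return min((minimal(p) for p in pattern[1]),
                   key=lambda fragments: len("".join(fragments)))
    if kind == "element":
        _, name, children = pattern
        inner = "".join(f for child in children for f in minimal(child))
        return [f"<{name}>{inner}</{name}>" if inner else f"<{name}/>"]
    raise ValueError(f"unknown pattern kind: {kind}")


minimal_document = "".join(minimal(GRAMMAR["document"]))
```

Doing this by reading one huge schema file by hand is error-prone; as a traversal starting at the root it is a few lines.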
Instead of reading the RELAX NG myself, I build on top of this as well. I simply dumped its in-memory model into a text file, line by line, and then wrote, just for fun, an ANTLR grammar to generate a parser that reads it, and mapped it to GraphML, which is simply a graph format that is quite interoperable. And with this, I could visualize a graph for the first time.

So, are there any questions at this point? Because I'm speeding up a little bit on this, and it's an essential idea. Why am I doing this? Because the RELAX NG is so big, it's one huge text file, and we want to improve it and work on it. Like Stefan using Clang compiler plugins to traverse the C++ source code, which is huge, I want to use a graph database to traverse this tree of the RELAX NG to answer my questions in an automatic way, right? And to be able to do refactoring later, because otherwise it's too huge for manual editing, okay? We just need better tooling to embrace this complexity.

So what I did is: please, graph database, give me, starting from table:table, all the child elements and everything in between, all nodes in between; and there are nodes in between like choice, sequence, and so on. So you just see a picture, like a star; you don't see the details, right? This is table:table, and all the elements around it. I have a viewer here, and the red things are the attributes, right? So, do you see some structure there? Okay, I will zoom in a little bit: yes, the attributes, and then we've got this here. And I will explain a little. There's a sequence, okay? A sequence of one, two; it means there's an order. First you have to use this, and then you can use that. At the top, there's an element text:soft-page-break, and after this, you can use the table row, okay? This here is boilerplate at the moment, right?
And this here, epsilon, means nothing, so you have the choice to have nothing of this. In other words, it just means it's optional, okay? So the next step, and that's what I'm currently working on, is refactoring and improving it by replacing this with an optional, and whenever this name is the same as that one, I remove it as well, just to simplify.

Okay, I've got five minutes left, so I'm going on. What I'm trying to do now: there are a few things like choice and sequence that I need to generate; that's not yet in the code. Also, when there's a parent, like styles, that has many styles, each of which has an ID, I want to have a map in there. I just want to generate it out of the box. I want to generate as much of this DOM layer as possible, because I don't want to write it over and over again. Another thing: when there's a reference, the XML schema says, oh, there's a reference, there's a start of a reference and an end of a reference, but it doesn't say that style ID and style name are connected, always connected. That's missing information. So the next thing is, I want to annotate and enhance the schema with additional information, so I can generate more.

And the last thing, and that's the most important thing, why I'm doing all of this: there are user changes that we're not specifying in the schema. The schema says, oh, you can put in anything, as long as it's valid, but the users among us are just doing the same things in all offices.
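The refactoring of "a choice with an epsilon branch" into "optional" can be sketched as a small rewrite over a pattern tree. This is an illustrative Python sketch, not the speaker's implementation: the tuple representation and the name `simplify` are assumptions, and the sample pattern mirrors the sequence from the talk (an optional `text:soft-page-break` followed by a table row).

```python
EMPTY = ("empty",)  # the epsilon pattern: matches nothing at all


def simplify(pattern):
    """Rewrite choice(epsilon, X) into optional(X), recursively."""
    kind = pattern[0]
    if kind == "sequence":
        return ("sequence", [simplify(p) for p in pattern[1]])
    if kind == "choice":
        branches = [simplify(p) for p in pattern[1]]
        rest = [p for p in branches if p != EMPTY]
        if len(rest) == len(branches):
            return ("choice", branches)   # no epsilon branch, keep the choice
        if not rest:
            return EMPTY                  # a choice between nothing and nothing
        inner = rest[0] if len(rest) == 1 else ("choice", rest)
        return ("optional", inner)
    return pattern                        # elements and other leaves stay as-is


before = ("sequence", [
    ("choice", [EMPTY, ("element", "text:soft-page-break")]),
    ("element", "table:table-row"),
])
after = simplify(before)
# after == ("sequence", [("optional", ("element", "text:soft-page-break")),
#                        ("element", "table:table-row")])
```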
We are adding tables, adding paragraphs, adding characters, and this is the high-level document layer, the high-level API, the user API, which I need to implement for collaboration. Because if we collaborate on a single document the way we do now, it's no longer possible, it's broken: we cannot merge. If I give you a document and you give it back, it cannot be merged. We need changes, like in a Git commit. I want to ask you: what have you changed? Give me your changes, right? So I want to have user changes on the high level, and I want to be able to answer this question.

So, my work on the Prototype Fund was, oh, wait a minute, I forgot the slide. So this is just to say that the user changes are implicitly a standard, but they're not documented anywhere; they're in our minds, but not written down. We have to start writing them down, and we have to annotate these insert, delete, and modify changes for all these user components in the schema.

So my work for the Prototype Fund is that I promised to put an ODT into the ODF Toolkit, use it as a black box, and have it transformed into a sequence of changes, like a cooking recipe, where you can say: insert the first paragraph, insert "hello world", in the second do an image, in the third do a table, right? These high-level changes are totally equivalent to the document. And the other thing is, it should be able to accept new changes and merge them in, right? To have a proof of concept of this and to see how it works. And the new thing here is that I want to generate as much as possible to avoid redundancy. So the user changes are de facto not a standard yet, right? So we need to enhance the RELAX NG to generate them, right? Otherwise, why should I write it by hand if it's the same thing for all applications? It's much better to have a way to annotate it. And how we do it, that's easy, but unfortunately I'm running out of time. Okay, any questions?
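The "document as a recipe of changes" idea can be illustrated with a toy model. This Python sketch is not the ODF Toolkit's API: the document is just a list of components, and the change format (`op`, `position`, `component`, `attributes`) is a hypothetical one chosen for the example. It shows the two properties described above: a document can be turned into an equivalent sequence of insert changes, and further insert, delete, or modify changes can be applied on top.

```python
import copy


def to_changes(document):
    """Describe a document as the list of inserts that would rebuild it."""
    return [
        {"op": "insert", "position": index, "component": copy.deepcopy(component)}
        for index, component in enumerate(document)
    ]


def apply_changes(document, changes):
    """Apply insert/delete/modify changes in order and return a new document."""
    result = copy.deepcopy(document)
    for change in changes:
        position = change["position"]
        if change["op"] == "insert":
            result.insert(position, copy.deepcopy(change["component"]))
        elif change["op"] == "delete":
            del result[position]
        elif change["op"] == "modify":
            result[position] = {**result[position], **change["attributes"]}
        else:
            raise ValueError(f"unknown operation: {change['op']}")
    return result


document = [
    {"type": "paragraph", "text": "hello world"},
    {"type": "image", "href": "picture.png"},
    {"type": "table", "rows": 2, "columns": 2},
]

# Round trip: the change sequence is equivalent to the document.
recipe = to_changes(document)
rebuilt = apply_changes([], recipe)

# A collaborator's changes, applied on top of the shared state.
incoming = [
    {"op": "modify", "position": 0, "attributes": {"text": "hello ODF"}},
    {"op": "delete", "position": 1},
]
merged = apply_changes(rebuilt, incoming)
```

Real merging also has to handle concurrent edits whose positions shift relative to each other; that is outside what this sketch shows.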
Thank you, first. Okay. Yes, please.

Because I clearly hear you saying that sequence is important: why not stay in the XML tree model with XQuery instead of r2?

Yes, good question. So the sequence, by the way: if you and I are working on the same document, we again have branches and we are again in a graph, right? Like in the Git model. But the graph is because it's the natural reservoir.