Hi everyone, I'm Svante Schubert, and I'd like to give you an update on the ODF Toolkit. About myself I might say more during the talk, but together with Michael Stahl I'm involved in the OASIS ODF TC, where I'm a co-editor and a co-chair, and, again with Michael Stahl, I'm a co-maintainer of the ODF Toolkit.

So, the highlight first. It was back in November last year, but I wasn't able to tell you then: we did two releases last year, the last one for JDK 8 and then one for JDK 11. There was a gap in the tooling, and finally I was able to re-establish the code generation, which had been unused and broken for 11 years. I also worked myself completely into this area and made a release of the Multi-Schema Validator, which is an underlying library. Sometimes it feels like you're working through old ruins of antique technology. In the flowering time of XML, validators were flourishing and a lot was going on. This one is from that time, and we took it over. I moved the build system from Ant to Maven, so it builds out of the box, without JARs inside, and is enabled for newer JDKs. Michael did the first fix for it, and we continue working on that.

We still haven't got a 1.0.0. I always thought that would have to be something special, but nowadays I believe we should simply keep the API stable, because there's no tooling in Java where you can say: here are these API changes, apply them to the code that uses the library, and everything is fine.
Usually you just use the new version, and suddenly the build breaks. So even though we have package names with "incubator" in them, which is some IBM relic from when we were working together with them, we just kept them and might do the refactoring on a different branch. We are also doing the final icing of the in-document search API for NGI Zero, which, besides the prior projects that were mentioned, was basically the sponsorship of the work that was ongoing: the generation and all my time here in the ODF Toolkit. It is also the reason why I rejoined the ODF TC.

So what's the ODF Toolkit? Obviously, it's not a silver bullet; it's a silver hammer here. It's for ODF, the file format, and it's for developers, mostly in Java.

Just a brief refresher on ODF: it's just a zipped file with XML. There's something called META-INF/manifest, which is like a table of contents. In the early days the package was meant to be reused by PDF, and it was used by EPUB, but unfortunately they forked, or we forked; I'm not sure, there was miscommunication, and they used their own signature, which was different from the ODF 1.2 one. We released ODF 1.3 last year, on the 27th of April, and what was special about that release was that we bundled all the tooling for the ODF release, and this is where the toolkit can help as well. So think about the ODF specification:
It's a blueprint from which all ODF applications are derived. Everything that is able to read and write ODF, like Microsoft Office or LibreOffice, is an ODF application.

On the history: we started very early, in the early 2000s. StarOffice is where most of us, or some of us, came from, and StarOffice, as you might know, is the ancestor of OpenOffice and LibreOffice, of the source base. There we came to the idea that it might be nice to have some Java. Some of us were the inventors of Java libraries for the server, just to have a low memory footprint or some tooling on the server. Then we had a joint venture with IBM, and they did a fork, the Simple API, which was brought back but unfortunately never merged again. So finally we dropped it last year, because there was a lot of copied code in there.

What I might add is that since I left Oracle in 2011 I have continued working as a freelancer, and I worked for Open-Xchange on a web office, which used the toolkit as a back end. The reason is this: you put a document in, it is unzipped, and the XML is extracted. Earlier, at Sun, we transformed the XML to HTML, but if several users collaborate this is a bad design, because if you get a lot of HTML back you don't know what the others have done; you have to find the changes yourself. So what we did at Open-Xchange was to transform the document into changes. I'll show you more about this a bit later.
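
As a small illustration of the "zipped file with XML" description, here is a minimal sketch that uses only the JDK's java.util.zip classes, not the ODF Toolkit API. The class and method names (OdfPackageSketch, buildSamplePackage, readEntries) are made up for this example, and the package built in memory is a simplified stand-in for a real .odt file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class OdfPackageSketch {
    /** Builds a tiny ODF-like zip in memory (a simplified stand-in for a real .odt). */
    static byte[] buildSamplePackage() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            put(zip, "mimetype", "application/vnd.oasis.opendocument.text");
            put(zip, "META-INF/manifest.xml", "<manifest:manifest/>");
            put(zip, "content.xml", "<office:document-content/>");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    private static void put(ZipOutputStream zip, String name, String content) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        zip.write(content.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    /** Unzips the package and returns entry name -> entry content, in package order. */
    static Map<String, String> readEntries(byte[] pkg) {
        Map<String, String> entries = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(pkg))) {
            ZipEntry entry;
            byte[] buffer = new byte[4096];
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                while ((n = zip.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                entries.put(entry.getName(), new String(out.toByteArray(), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }

    public static void main(String[] args) {
        // Prints each entry name, the "content table" view of the package.
        readEntries(buildSamplePackage()).keySet().forEach(System.out::println);
    }
}
```

The same readEntries loop would work on the bytes of a real document; the toolkit itself hides this step behind its own package handling.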
OK, but it's a toolkit, so there are more things in it. The only thing that has a GUI is the validator, which you can browse at odf-validator.org. I don't think the LibreOffice logo is in there, but it's very simple. It came from a small funding. Around 2011 the only ODF validator, very similar to this one, sat within the firewall of OpenOffice, but as soon as everyone was dismissed we had no online validator any more. So there was a small funding for me to build something there and to clean this up.

So there's the online validator, but you can also run it from the command line. We also have something where you can run XSLT directly on the XML within the package. Those are not the most prominent use cases, though. The most prominent is of course when you'd like to edit the ODF. You don't do it with a layout, as in LibreOffice, but you can insert and delete XML. Later on I got a prototype funding for this, and I merged in the earlier work for Open-Xchange, where I worked for about three years on this topic. Whenever you now call java -jar with the all-included JAR and an ODT file, it drops out a list of changes, as if the user had created them.

I'll simply show you now how this looks. You see this? No, you don't see it. Wait a minute, just a second. I don't do presentations often. I think I have to move this over. All right, here we go. So if you have a simple file, then... let me move this over. Here we go.
You have java -jar, this JAR file, and this document as a parameter, and as the documentation of these changes you get JSON, because it's a web thing, and every line is a change by the user.

More on this later, but we are still living in the floppy disk paradigm. It's pretty stupid: we exchange a document attached to an email or via the cloud. But if you really want to collaborate with other people, you have to ask them the main question: what have you done, what have I changed? Only then can you merge the changes into the gold master. This is a total change of the design paradigm of working.

I'm not sure if you can see it here. I just tried; I won't do this for long. One second... no, I didn't get it, so I'll continue the slide show. Every line is a change, basically a user change, so the semantics are something different. Here we go.

So this is what was needed to have collaboration. In the back end it's working, and I use it for the search engine as well. It's not my search engine; I use Edward Zimmerman's search engine for the project. I don't deal with the XML directly, but on a higher level with these JSON files, and create searchable content from them for the search engine.

So basically, what's the architecture? There's the ODFDOM, which is the core part: it loads the document, an ODT or whatever, and unzips it. It is generated by a source code generator, so that is just a compile-time dependency. This source code generator was completely reworked by myself in the past months and improved a lot, and it can be used for other things as well, as I'll show a little later.

So what is the source code generator? There's always the grammar.
It's a RELAX NG grammar, and it is loaded by the Multi-Schema Validator, which is called that because it can also load XSDs and DTDs. Then there are templates, Apache Velocity templates, which are text files with iterations in them; for instance, for every child element a child method is generated, and likewise for attributes. So the idea is a simplification by code generation: users get a typed DOM tree, and they don't have to know or read the grammar, because most of it has been generated. This is very helpful.

You can use the grammar for other things as well, for questions like: can a text:p ever be nested in a document? The funny thing is, yes, it can, but then it is not laid out like a regular paragraph; that only happens on the first level. Questions like this can be answered by loading the grammar into a graph.

Something we constantly have to deal with is complexity. In ODF 1.2 there were about 600 elements and 1,300 attributes, and this number was raised a little bit. It's very difficult to read this XML if you just skim through it. So the first step, for ODF 1.3, was that I used a little trick and a transformation: now we have the grammar file as HTML with links, so it's a bit easier. You see here, this is RELAX NG: there's a definition, an element, an optional, and you find choices and such things.

What really became clear to me last year, or the year before, is that RELAX NG is really simple.
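
The nesting question above (can a text:p occur inside a text:p?) becomes a reachability query once the grammar is held as a graph. The following is a minimal sketch under stated assumptions: the grammar excerpt is hand-written and heavily simplified for this example, not extracted from the real ODF schema, and the class and method names are invented; the toolkit itself loads the RELAX NG grammar through the Multi-Schema Validator.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class GrammarGraphSketch {
    /** Hand-written, simplified excerpt (an assumption): element -> allowed child elements. */
    static Map<String, List<String>> sampleGrammar() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("office:text", Arrays.asList("text:p", "table:table"));
        g.put("text:p", Arrays.asList("text:span", "draw:frame"));
        g.put("text:span", Collections.<String>emptyList());
        g.put("draw:frame", Arrays.asList("draw:text-box", "draw:image"));
        g.put("draw:text-box", Arrays.asList("text:p"));
        g.put("draw:image", Collections.<String>emptyList());
        g.put("table:table", Arrays.asList("table:table-row"));
        g.put("table:table-row", Arrays.asList("table:table-cell"));
        g.put("table:table-cell", Arrays.asList("text:p"));
        return g;
    }

    /** True if 'target' can occur somewhere below 'start' (breadth-first search). */
    static boolean canContain(Map<String, List<String>> grammar, String start, String target) {
        Deque<String> queue =
                new ArrayDeque<>(grammar.getOrDefault(start, Collections.<String>emptyList()));
        Set<String> seen = new HashSet<>();
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (current.equals(target)) {
                return true;
            }
            if (seen.add(current)) { // visit each element only once, so cycles terminate
                queue.addAll(grammar.getOrDefault(current, Collections.<String>emptyList()));
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = sampleGrammar();
        System.out.println("text:p inside text:p? " + canContain(g, "text:p", "text:p"));
    }
}
```

In this toy excerpt the nesting goes through a frame and a text box, which is one way such a paragraph-in-paragraph path can arise.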
If you take a look at RELAX NG, the specification takes just 21 pages and the tutorial 22. If you take a look at XSD, there are 360 pages for 1.0, or even more than 380, for the grammar. Personally, I can really only read the primer; for everything else I have to jump around. To me it's nearly unreadable. I will try again, though. In addition, there are papers stating that RELAX NG is more expressive than XSD and DTD. So I really invite you to take a look at it and play around a little with the generation and what you can do with it. You can create your own RELAX NG grammar for whatever you like and create source code or documentation from it.

There's one more thing that is very impressive; look at the bold one. It's called regular expression derivatives, the invention of Brzozowski; I don't know how to pronounce it. It's a paper from the 60s. Usually people think that for pattern matching with regular expressions you have to use a nondeterministic finite automaton with backtracking search, or a transformation to a deterministic finite automaton. But this is pretty simple and very intuitive. Say your grammar is "foo" or "frak". Then you get an "f" character, and your regular expression changes: it is derived from what you found. So the state you have during parsing is changing. This is the algorithm James Clark used for Jing, and KK as well. I can't pronounce his name; I should have practiced this earlier.
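
The derivative idea described above can be sketched in a few lines. This is an illustrative toy matcher over characters, written for this transcript; it is not the code of Jing or of the Multi-Schema Validator, which apply the idea to RELAX NG patterns over XML. All names are invented, and the sketch does no simplification of the derived expressions, which a real implementation would add.

```java
class DerivativeMatcher {
    /** A regular expression that can tell whether it accepts "" and derive itself by a character. */
    interface Re {
        boolean nullable();   // does this expression match the empty string?
        Re derive(char c);    // what remains to be matched after reading c
    }

    static final Re EMPTY_SET = new Re() {   // matches nothing at all
        public boolean nullable() { return false; }
        public Re derive(char c) { return this; }
    };

    static final Re EPSILON = new Re() {     // matches only the empty string
        public boolean nullable() { return true; }
        public Re derive(char c) { return EMPTY_SET; }
    };

    static Re chr(final char expected) {
        return new Re() {
            public boolean nullable() { return false; }
            public Re derive(char c) { return c == expected ? EPSILON : EMPTY_SET; }
        };
    }

    static Re alt(final Re a, final Re b) {  // a | b
        return new Re() {
            public boolean nullable() { return a.nullable() || b.nullable(); }
            public Re derive(char c) { return alt(a.derive(c), b.derive(c)); }
        };
    }

    static Re seq(final Re a, final Re b) {  // a followed by b
        return new Re() {
            public boolean nullable() { return a.nullable() && b.nullable(); }
            public Re derive(char c) {
                Re first = seq(a.derive(c), b);
                // If a can be empty, c may also be the first character of b.
                return a.nullable() ? alt(first, b.derive(c)) : first;
            }
        };
    }

    static Re star(final Re a) {             // zero or more repetitions of a
        return new Re() {
            public boolean nullable() { return true; }
            public Re derive(char c) { return seq(a.derive(c), this); }
        };
    }

    static Re str(String s) {                // a literal string as a sequence of characters
        Re re = EPSILON;
        for (int i = s.length() - 1; i >= 0; i--) {
            re = seq(chr(s.charAt(i)), re);
        }
        return re;
    }

    /** Derives the expression once per input character; a match means the rest accepts "". */
    static boolean matches(Re re, String input) {
        for (char c : input.toCharArray()) {
            re = re.derive(c);
        }
        return re.nullable();
    }

    public static void main(String[] args) {
        Re fooOrFrak = alt(str("foo"), str("frak"));
        System.out.println(matches(fooOrFrak, "frak"));
        System.out.println(matches(fooOrFrak, "fo"));
    }
}
```

After reading "f", the expression for "foo" or "frak" has become one that accepts "oo" or "rak"; there is no automaton construction and no backtracking, only the current derived expression as state.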
KK is the nice Japanese guy who was working for Oracle and who also invented Jenkins. He basically handed this over to me; he is now a CTO at some company. Anyway, this algorithm is very simple. I really have to wrap my head around it a little more, and around why it is so cool, but I just wanted to point it out. It's quite genius.

What we currently have in the spec is something like this. You see a form property, which is already fine: you see the element, you see the child elements, and sometimes you see the properties. We have links to them in the HTML; with the hash you can generate jumps, which is quite fine. And now we have a generation, still an alpha version, to make the RELAX NG easier to understand: I now generate the parent elements, the attributes, the child elements and the child relationships.

What's coming next, at the end of the year or really next year, is this. Look here: you see that the element has certain attributes, but it doesn't mention that they have to come in pairs, right? This should be made obvious: the RELAX NG shown in something very similar to a regular expression, so that developers can understand the relationship between the child elements.

The reason why we're doing this is that the documentation should be easy to use out of the box, but also, I think, because it's a blueprint.
We should be able to generate improved source code from it. For instance, you can have a choice of things in the DOM. And there are other things in the specification, like "only if this attribute exists, this value will be evaluated". I would call that a trigger attribute, but it is just written text; we have not annotated it, so we cannot extract these semantics. Where I'd like to drive this is that we should be able to generate as much source code as possible from blueprints. That's where we're heading.

Coming back to the graph: you see here, for instance, the definition of a table. Blue is the element, the attributes are not shown here, and the yellow things are definitions, like intermediate names. I create a graph from the grammar and put it into a graph database. The viewer is an open source tool; you have to pick your rendering and play around with it a little. You see it's quite complex, and if you zoom in you see the attributes and some "choice" and "optional" nodes, meaning a choice, or possibly nothing, which is optional here. This is something I'd like to move further; it's exactly what I was trying to do earlier in the documentation.

Long story short: the source code generator has been completely reworked. There are still some features in the spec that cannot be generated; that's ongoing work. And it doesn't have to be only Java. I would really love to do this in Rust some time, and maybe C++; there are parts that can be generated from the schema as well. So the source code generator now works with the Multi-Schema Validator and with TinkerPop, which is optional, just to have a different angle on the grammar.
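
To make the "one method per child element" idea of the generator concrete, here is a minimal sketch. It is an assumption-laden illustration: it uses plain string building instead of the Apache Velocity templates the toolkit uses, takes a hand-written element-to-children list instead of the RELAX NG grammar, and the naming convention and generated method shape are invented for this example rather than taken from ODFDOM.

```java
import java.util.Arrays;
import java.util.List;

class CodegenSketch {
    /** "table:table-row" -> "TableTableRow"; a hypothetical naming convention for this sketch. */
    static String className(String qName) {
        StringBuilder name = new StringBuilder();
        for (String part : qName.split("[:\\-]")) {
            if (part.isEmpty()) {
                continue;
            }
            name.append(Character.toUpperCase(part.charAt(0))).append(part.substring(1));
        }
        return name.toString();
    }

    /** Emits the source of a typed element class with one factory method per allowed child. */
    static String generate(String element, List<String> children) {
        StringBuilder src = new StringBuilder();
        src.append("public class ").append(className(element)).append("Element {\n");
        for (String child : children) {
            String type = className(child) + "Element";
            src.append("    /** Creates and appends a <").append(child).append("> child. */\n");
            src.append("    public ").append(type).append(" new").append(type).append("() {\n");
            src.append("        throw new UnsupportedOperationException(\"stub\");\n");
            src.append("    }\n");
        }
        src.append("}\n");
        return src.toString();
    }

    public static void main(String[] args) {
        // Prints a generated class for text:p with two typed child factory methods.
        System.out.println(generate("text:p", Arrays.asList("text:span", "draw:frame")));
    }
}
```

The point of the design is that a user of the generated classes only sees methods for children the grammar allows, so reading the grammar is no longer necessary for basic use.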
Derived from the ODFDOM is the ODF Validator, which uses it, and of course the XSLT Runner, because the XSLT Runner just takes care of accessing the streams of the zip. As I mentioned before, we deprecated the Simple API and dropped it, because there was too much duplicated code.

So the architecture, or what we aspire to, is to have a user API that is meant to be the high-level API. That's ongoing work, and it should abstract from the XML. Currently the XML is plainly visible, and this is something I would like to move away from, because nobody wants to know what the XML syntax is. You just want to say: hey, open a text document, insert an image into the table at three, three, then add a paragraph "Hello world" and make the third to the seventh character red. Something like this: very high level, very close to user semantics. That's what I meant.

What I was trying to show earlier as well is this paradigm that has changed, and I want to make the point that we really need to think differently. We all still define documents, and especially for change tracking it would be nice to define a change first. As I said earlier, it's a merge problem: we live in a world where we collaborate with laptops by sending emails or putting files in the cloud. It's just a faster floppy disk, and we are unable to merge. This change should be part of the standard, and only if we do this will we be able to have collaboration functionality. We have to think of changes like commits in a branch.

This is what I showed earlier: the idea is that documents are no longer passed around. Initially the document is dispatched, like when you clone a GitHub repository, but afterwards you only send patches, only changes, and you are able to merge them.
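
The "changes like commits" idea can be sketched as a list of small operations applied to a document model. This is a toy illustration only: the document is reduced to a list of paragraph strings, and the operation names (addParagraph, addText, deleteParagraph) and the Change class are invented for this example; they are not the change format produced by the toolkit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChangeSketch {
    /** One user change; operation and field names are hypothetical. */
    static final class Change {
        final String name;
        final int paragraph;   // index of the paragraph the change refers to
        final int offset;      // character offset inside that paragraph
        final String text;     // text payload, may be empty

        Change(String name, int paragraph, int offset, String text) {
            this.name = name;
            this.paragraph = paragraph;
            this.offset = offset;
            this.text = text;
        }
    }

    /** Applies the changes in order to a copy of the document and returns the result. */
    static List<String> apply(List<String> paragraphs, List<Change> changes) {
        List<StringBuilder> doc = new ArrayList<>();
        for (String p : paragraphs) {
            doc.add(new StringBuilder(p));
        }
        for (Change c : changes) {
            switch (c.name) {
                case "addParagraph":
                    doc.add(c.paragraph, new StringBuilder());
                    break;
                case "addText":
                    doc.get(c.paragraph).insert(c.offset, c.text);
                    break;
                case "deleteParagraph":
                    doc.remove(c.paragraph);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown change: " + c.name);
            }
        }
        List<String> result = new ArrayList<>();
        for (StringBuilder sb : doc) {
            result.add(sb.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Change> changes = Arrays.asList(
                new Change("addParagraph", 0, 0, ""),
                new Change("addText", 0, 0, "Hello"),
                new Change("addText", 0, 5, " world"));
        System.out.println(apply(new ArrayList<String>(), changes));
    }
}
```

Replaying the full change list on an empty document rebuilds it, which is the sense in which a document and its list of user changes are equivalent; merging concurrent change lists from several users is the hard part and is not covered by this sketch.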
That's the link I showed earlier: you put an ODT document in and you get a transformation into the user changes. It's totally equivalent. That's currently working and is being used in the back end of web offices, or for the search engine, for instance. And new changes, single things like "add a paragraph" or "make it red", can be passed in and are transformed into a new DOM. By this you're really able to do collaboration. All right, that's it, basically.

So what's next? Of course, the generation is still just a pull request. This will be finished, and then there will be this famous 1.0.0.