So, hello everyone. Welcome to a little update on what's happening in and around the validation story, which was another TDF tender that CIB was doing. Now, let's see if people are still coming in. My name is Thorsten Behrens. I'm happily working for CIB, and I'm standing in for one of my colleagues, Vasily Melenchuk, who did most of the work here. It's not quite done yet; there are smaller bits and pieces missing, but hopefully I can show you at least the big picture and give you a demo of how it works.
The story started with Markus Mohrhard, Moggi, who in March 2014, so more than two years ago, hacked up some validation and added it to the unit tests. To give you a bit of motivation (most of you, I think, will know what I'm talking about, but to outline it for the video audience): LibreOffice has tons of unit tests, and growing. Some of them write files. Those files can, of course, be post-processed, and Moggi came up with the idea to run validators on them, so that a unit test would break if a file was not valid ODF or not valid OOXML.
After that was running, he had another side project of his own, which was crash testing. That's a massive set of documents loaded and saved again on a host on a regular basis. And there, of course, it would also be interesting whether the files that LibreOffice was writing, vastly more than what's written during the unit tests, are valid as well.
Right. So that was working, but it was a bit of a special setup: you had to download, compile, and deploy those validators, and then you had to set the path. It was a bit of setup work.
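
The idea described above, a unit test that breaks when an exported file does not validate, can be sketched roughly as follows. This is only an illustration, not the code used in LibreOffice: the extension sets, the command names in `VALIDATOR_CMD`, and the function names are assumptions made for the example.

```python
import os
import subprocess

# Hypothetical mapping from file extension to validator kind (not exhaustive).
ODF_EXTS = {".odt", ".ods", ".odp", ".odg"}
OOXML_EXTS = {".docx", ".xlsx", ".pptx"}

# Hypothetical wrapper commands; the real scripts and their arguments differ.
VALIDATOR_CMD = {"odf": ["odfvalidator"], "ooxml": ["officeotron"]}


def pick_validator(path):
    """Return 'odf', 'ooxml', or None depending on the file extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext in ODF_EXTS:
        return "odf"
    if ext in OOXML_EXTS:
        return "ooxml"
    return None


def check_exported_file(path, runner=subprocess.run):
    """Run the matching validator and fail loudly if it rejects the file."""
    kind = pick_validator(path)
    if kind is None:
        return  # no validator known for this format, nothing to check
    result = runner(VALIDATOR_CMD[kind] + [path], capture_output=True, text=True)
    if result.returncode != 0:
        raise AssertionError(
            f"{path} is not valid {kind}:\n{result.stdout}{result.stderr}"
        )
```

A test would call `check_exported_file` right after the export step, so an invalid file fails the test run instead of going unnoticed.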
So the EC last year decided, among a number of other projects, that it would be worth funding an improvement of the developer experience here, with the goal of making this validation the default: not an opt-in feature, but something that runs with every build, for every developer who runs unit tests. We started with that in June, and these are the two guys behind it: that's Markus, with a fuzzy picture from Hamburg, and Vasily, also from a Hamburg hackfest.
Okay. So, the goals. First, make export validation the default for almost all formats, basically for ODF and OOXML. There's another validator, BFFValidator, for the Microsoft binary formats. That one is still optional because it's a bit of a nuisance: it doesn't work on Mac, but it works on Windows, and it works on Linux when you have Wine installed. Secondly, the developer experience was to be improved. That includes a number of things, first and foremost that you shouldn't have to do anything extra when you build LibreOffice. It should just work. And the third goal, at least that's my hidden agenda, is to improve cooperation between the ODF TC and LibreOffice development, so that it's easier both for people on the TC to get new features into a form that can be submitted to the TC, and perhaps also for the TC to check what the heck LibreOffice is doing there, and why.
So, yeah, export validation by default. As I said, that's already the case in master for ODF and OOXML. The MS binary validator we can't really distribute, the download URL keeps changing, so it's hard to automate, and it requires Wine in a certain version. All of that together makes it a bit of a nuisance, and since this was all about developer experience, we decided not to enable it. But you can: if you add (I will get to that) --with-bffvalidator and then the path, configure will pick that up and run it.
Hello? There's a lag. Okay, developer experience. No extra steps: you just call autogen, then you call make, and it works.
And also error reporting. When those validators say "not valid", you probably want to know what the problem is, not just that there is a problem. Maybe not how to fix it, that should hopefully be obvious, but at least the line and the column number, and ideally also what modern C/C++ compilers do, which is to print part of the input and then some pointer: there's the problem. Also, instant feedback: if you commit or change something and then run the tests, which you should do all the time, it should break immediately, and not just on Jenkins or, I don't know, a week later on the tinderboxes.
And fourth, true bisectability. If something breaks and you discover that later, you add a test, and you really want to be able to bisect it down to one commit. The problem with the previous solution was that it relied on external, out-of-tree versions of the validators, so you would have needed some way to switch validators and schema files along with the commit. So the goal was really to have everything that affects the validity of the document in-tree, in the core repo.
Someone needs to bug-fix this. That looks like the right one. Yeah, TC and development. There are two people, I think, three people from the project on the TC. Regina, Michael, no, four: Regina, Andras, Michael, and yours truly. In the past, people usually had to invest time and effort to extract or set up or write schema changes, to write prose, and then take that to the TC. There was usually nothing immediately useful that we could use. At least now it should be possible for the schema changes to be a side effect of developing LibreOffice when you touch formats. And the other way around as well, and we committed to doing that, is to provide updated schemas.
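
The compiler-style error report mentioned earlier (line, column, part of the input, and a pointer to the problem) can be sketched like this. It is a minimal illustration of the output shape, not the validator's actual code; the function name `format_error` and the exact message layout are assumptions.

```python
def format_error(source, line, column, message):
    """Format a validation error with an excerpt and a caret pointer.

    `line` and `column` are 1-based, as validators usually report them.
    """
    lines = source.splitlines()
    # Show the offending input line if the reported line number is in range.
    excerpt = lines[line - 1] if 0 < line <= len(lines) else ""
    # Put the caret under the reported column.
    pointer = " " * max(column - 1, 0) + "^"
    return f"{line}:{column}: error: {message}\n{excerpt}\n{pointer}"
```

Printing the excerpt next to the line and column spares the developer from unzipping the document and counting characters in a single-line XML stream.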
Right now, ODF 1.3 is in the making. There's some kind of hidden repository at OASIS, with several branches, and there are schema changes and prose changes. Distilling that out and providing the current development version of OpenDocument Format 1.3 in a way that a LibreOffice developer can consume would also be nice. That way, if there's a problem, LibreOffice would know early on, and vice versa.
Who wrote that? Okay, so hopefully that slide stays now. So that's the setup: it's really autogen.sh and make. Previously you had to check out two repositories, ODF Toolkit and Officeotron. Both are Java, so you had to install some Java build nonsense, and then you had to build them. Then you had to stick that into your path, not the jar file, but some script that was calling java with -jar and a kiloton of arguments. What happens now is that there are pre-built versions of those (I really didn't like that at all, but well, it's just for testing; it's not shipped anywhere). Those are pulled from dev-www and put into the download directory, where all the other tarballs end up, and are used from there. It should be reasonably seldom that those need updates, because all the relevant schema files are now in the core repo. There will be the occasional update for whatever reason: someone discovers a bug, not in the schema, but in the extra, non-schema validation, which would still need to be changed in the out-of-tree repos. But beyond that, it should just work.
I think I'm going to fail profoundly at switching slides. Okay, BFFValidator. It's a bit more involved, and that's why it's not the default. Well, you have to install Wine. If you're on Windows, you don't have to install Wine. Well, you can, but it wouldn't make a difference. On Linux, you install Wine. On the Mac, I haven't tried it.
Perhaps it works, perhaps it doesn't. But it's not the default anyway, and I don't think we have very many developers who run Mac as their main platform. Hands up? Who? Tor? No? Okay, so it wouldn't matter anyway, I suppose. It really only matters if this is your primary platform, and if it's not, then why would you care? You're hopefully running unit tests anyway, and it would never hit master as it is.
So, right. You pull the installer from Microsoft, you extract it, you run msiexec on it, and then it's installed in your home directory, or wherever your .wine directory is. Then you can run configure with --with-bffvalidator and the path to that, and it will run just the same for a set of file format extensions that are Microsoft binary files.
Unfortunately, we had to disable chart and Writer validation because BFFValidator said: no, ain't valid. And it's kind of a pain in the rear, because we can't change the error reporting in BFFValidator, it's closed source. The error reporting basically says "chunk something, there's a problem", plus some error message that you have to Google for, and "chunk something" doesn't really translate into anything that I would be immediately able to use. What's worse is that it only reports one error. So if there are, I don't know, 20 errors, you have to fix them one by one, and you never know if you're done tonight or next year. There's perhaps something that could be done about that; I will get to that later.
Maybe these are repaint problems. Is that possible? Because of this really wonderful NVIDIA Optimus. Okay, so that's the slide I wanted. Sorry. Okay, I've said most of what's on this slide.
There's been quite some cleanup and consolidation going on, especially for the baseline requirements. It turned out, and it was the tinderboxes that told us in the end, that there are a number of build systems out there that only have Java 1.6, and at least Officeotron, or was it ODF Validator, wanted 1.7 because of one dependency. So that was breaking, and we consolidated that, fixed it, and inserted this target version all over the place.
We added better error reporting, at least to ODF Validator, so it now outputs part of the input and an ASCII pointer to where the error happened. I'll show that in a moment. We're used to that, so we just forked the ODF Toolkit project to have some fixes there, but I told Svante, and he, I think reasonably happily, took most of the changes. Officeotron's error reporting is still to be done, but it's not bad: it tells you line and column, so it will just be improved a bit more. So, yeah, irritating.
So, demo time. Let me see if that works. What I will show now is the current state. Is that visible, hopefully? If not, just come to the front. But I can perhaps just increase the size a bit. So, this is ODF Validator, and here it is calling the current version that's on dev-www. That still has the old schema files, the standardized, rubber-stamped versions, and I let it run with extended conformance over the very keynote that I was giving on Wednesday. So, let it run, and it's very unhappy, because there's this dreaded draw:fit-to-size, which is invalid ODF. Yeah, that's a shame. So, what do we do about that? Either we fix the code: maybe we didn't know what we were doing there, and we just revert that and find another way, or put it into another namespace. Or we change the schema.
So that's the relevant part here in the ODF 1.2 schema. It says draw:fit-to-size is supposed to be a boolean. And what LibreOffice has been doing since the dawn of time is writing some more values: true, false, shrink-to-fit, and all. So, with that schema, we let it run. It's now using the trunk of ODF Validator as it is in the forked repo, which can load schema files from the command line, so we can override what's built in. We run it again, and there are far fewer errors. And you see there's this nice part of the input and this pointer. It says attribute office:version has a bad value. Apparently it's unhappy about the 1.2 here, because it's a 1.3 schema. Right.
So this is one of the more annoying problems, because even under extended conformance it's invalid. We need to do something about it: either make sure that it's in ODF 1.3, or put it into an extension namespace. But what you can see is that it would be reasonably easy, once you have perhaps watched this talk, to make sure that however you tamper with the ODF import and export, what is written is valid. And if it's for some reason not valid, then you can change the schema, and people can monitor the schema changes reasonably easily, because it's just one place to watch and not, I don't know, six million lines of code. And then it can be discussed whether it's a problem or not. The same is true for the TC changes, which can very easily be brought back into the core repo.
Another thing that could be done, at least around ODF updates, is to say: let's maybe go for strict conformance, which means no extensions. We also have various format export settings, 1.2 strict, 1.0/1.1, and those could be checked against the suitable schema files, which are also in the tree now. The same is true for OOXML.
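
The fit-to-size mismatch from the demo can be illustrated with a small check. This is a sketch and not how ODF Validator works (it validates against the RELAX NG schema); here the two allowed value sets simply stand in for the stock 1.2 schema and the modified one. The function name `invalid_fit_to_size` and the sample markup are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# ODF drawing namespace, which the draw: prefix is bound to.
DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
ATTR = f"{{{DRAW_NS}}}fit-to-size"

# Stand-ins for the two schema variants discussed in the demo.
BOOLEAN_VALUES = {"true", "false"}                            # stock ODF 1.2
EXTENDED_VALUES = BOOLEAN_VALUES | {"shrink-to-fit", "all"}   # what gets written


def invalid_fit_to_size(xml_text, allowed):
    """Return every draw:fit-to-size value that is not in `allowed`."""
    root = ET.fromstring(xml_text)
    values = (el.get(ATTR) for el in root.iter())
    return [v for v in values if v is not None and v not in allowed]
```

Running the same document against `BOOLEAN_VALUES` and then `EXTENDED_VALUES` mirrors the two demo runs: the first reports the extra values, the second accepts them.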
There, the schemas already have extensions. Moggi and, I think, Bubli fixed a few bugs there or added some missing pieces. And in principle the same is true there: if LibreOffice finds problems with the schemas, we could report that to ISO, as they maintain those. I'm looking at the time. So, right, let's go back to the slides. Okay.
I was already mentioning a few things that could be done going forward. Clearly, what's missing from the tender is integration with crash testing, so that what's happening here also runs with the crash tests. That needs a bit of tweaking there with the external scripts. I mentioned strict validation, which is quite helpful, because with that, even extended or private namespaces would cause validation to fail. So at some point in time we would have all the changes that we did upstreamed through OASIS, and it would be a standard or a draft standard. Then you could say: okay, from now on we switch on strict. And if someone really, really, really needs another extension, they need a 1.3 extended, and someone would need to add that to Tools, Options. So it's a nice way to notice when that happens and to make people aware that they need to do something.
I was referring to this earlier, so this whole binary... Out of time? Are you serious? There's five minutes left. Okay. So, binary formats. There is something called binschema from Jos van den Oever, which is an XML description of the binary formats, which are reasonably structured, with this funny OLE file format. For the three major office formats there are already schemas there, so we could use that. Perhaps it gives us better debuggability, because if we run it, we get pointed to the schema line, and we get pointed here, and it might be easier to then go back to the code and find the place that's writing that.
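
The strict-validation point made above, that extension or private namespaces would make validation fail, can be approximated with a namespace scan. This is a deliberately simplified sketch: real strict conformance is decided by schema validation, and ODF legitimately uses some namespaces outside the OASIS prefix (XLink and Dublin Core, for example), which this check would wrongly flag. The names `ODF_PREFIX`, `namespace_of`, and `foreign_names` are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Simplification: treat only namespaces under this prefix as "standard ODF".
ODF_PREFIX = "urn:oasis:names:tc:opendocument:xmlns:"


def namespace_of(qname):
    """Extract the namespace URI from ElementTree's '{uri}local' notation."""
    return qname[1:].split("}", 1)[0] if qname.startswith("{") else ""


def foreign_names(xml_text):
    """List element and attribute names that live in a non-ODF namespace."""
    found = []
    for el in ET.fromstring(xml_text).iter():
        for name in [el.tag, *el.attrib]:
            ns = namespace_of(name)
            if ns and not ns.startswith(ODF_PREFIX):
                found.append(name)
    return found
```

An empty result would correspond to "no extensions"; any entry is a candidate to either standardize or keep behind an extended export setting.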
And another nice thing: maybe we do not want to tamper with the standardized, rubber-stamped, wonderful, never-changed ODF 1.2 versions. If we don't, we can use schema includes. So we can have a LibreOffice extended schema that includes the 1.2 one, and this LibreOffice extended schema is just an include statement plus a list of those five additions or overrides that we have. That will be even more obvious. I mean, it's obvious enough if you know how to parse Git history, but now you can just have this one file, send it to the TC, and say: this is what we would like to have standardized.
Okay. That's the end of my time, or more than that. Thanks for your attention. Thanks so much to TDF for sponsoring this. Thanks to all the donors who make that possible. I love you. Okay, that's it. Any questions? Okay, so everyone's happy. Great. I'm happy as well.
Is there a validator for RTF? Because we do RTF export too. You said that you have validators for all export formats.
I wouldn't rule that out. There's probably something. I don't know if it's open source, but I would be surprised if there ain't. It's a rather structured format, but honestly, I haven't looked into that. Okay. Any other questions? No. Thank you. Great.