community, and even build a better product in the future. So, a bit about myself. I'm the interop team leader at CloudOn. You might have seen Shahal's presentation yesterday. I love new technologies and always have to have the latest gadgets. And here's my email in case anybody wants to contact me; feel free to just send an email. So what is interoperability? I wanted to start with DOCX interoperability, but then I figured there's an even better example, one that has nothing to do with files. Let's say you come to a hotel here, and this is the power outlet, and you have no idea how to use it. So you go ahead and buy this nice adapter, and then you can use the power outlet here. It's the same thing with DOCX interoperability. You've got a person using a Microsoft file on one computer, and he sends the file to a person using LibreOffice on a different computer. You'd like the two people to be able to communicate through this file without losing any information. Both users should see the same thing on the screen, and they should be able to work with the objects and manipulate them in LibreOffice the same way they do in Microsoft Word. So interoperability is the ability to communicate with a different application, a different OS, a different platform, without any data loss and without any communication problems. Now a small introduction to the DOCX file format, in case some of you don't know what the file is. It's the Microsoft file format for Word documents. Basically it's just a zip file that contains a series of folders and XML files, each one describing something else. You've got one file that describes the main document and one file, or maybe a couple of files, that describes the headers and footers.
And you've got a file for the settings, for example which zoom level to open the document with, or whether spell checking is turned on or off for the document, things like that. We've also got web settings, footnotes and endnotes. Each of these files is in XML format. So that's the DOCX file. What we're going to talk about today is, like I said, the different types of interoperability problems, some improvements we've made in the last year, the gaps that are still left open, and something called the matrix, which is my baby. I've been working on it for the past two or three months, alongside the guys from Synerzip, who have also done a great job on it. Then some testing tools that we've developed, and some of the beta observations we've seen with our new app, our new authoring tool that is based on LibreOffice, which Charles presented yesterday. We've had some observations and some feedback from users about interoperability when using LibreOffice, so I'd like to share that with you also. So let's start with interoperability problems. Last year I had a slide showing four layers, right? Since then we've developed it further, so today we have six. When I say interoperability problems, we classify them into six different layers. The first one is called Corrupt. That means you open a DOCX file in LibreOffice and save it back without making any change whatsoever, and when you open it again in Microsoft Office, Microsoft Office complains that the file is corrupted. It says: I cannot open this file, something is wrong with it. That can be due to a lot of reasons. It can be that something in the XML itself is broken: you started a tag and didn't end it properly. Or maybe you have the wrong value inside the XML: a negative value where only a positive one is allowed.
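To make the "zip file full of XML" picture and the "you started a tag and didn't end it" failure concrete, here is a minimal Python sketch. It is an illustration only, not one of the tools from the talk: the helper names `build_package` and `malformed_xml_parts` are hypothetical, and the check covers only XML well-formedness. It would not catch the other case mentioned above, a value Word rejects even though the XML parses.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_package(parts):
    """Build an in-memory zip package from {part name: XML text} (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, text in parts.items():
            package.writestr(name, text)
    return buffer.getvalue()


def malformed_xml_parts(package_bytes):
    """Return the names of XML parts in the package that fail to parse."""
    bad = []
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        for name in package.namelist():
            if not name.endswith((".xml", ".rels")):
                continue  # skip images and other binary parts
            try:
                ET.fromstring(package.read(name))
            except ET.ParseError:
                bad.append(name)
    return bad


# Toy package: one well-formed part and one with an unclosed tag.
demo = build_package({
    "word/document.xml": "<w:document xmlns:w='urn:demo'><w:body/></w:document>",
    "word/settings.xml": "<w:settings xmlns:w='urn:demo'><w:zoom>",
})
print(malformed_xml_parts(demo))
```

Running the same function on the bytes of a real round-tripped file would list any part that is not well-formed XML, which is one of the simplest reasons a file can be reported as corrupted.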
So a lot of interesting things can cause Microsoft Word to treat the file as corrupted, even though LibreOffice is sometimes a little more forgiving. These are the worst kind of problems that we at CloudOn see, because the user doesn't necessarily know about them. He uses LibreOffice. He updates the DOCX file, or he might not even update it. He saves it back. He doesn't know there's any problem. But the next time he tries to open it in Word, Word complains. So this is the most troubling problem. Thankfully there aren't a lot of these. We've managed to bring it down from 25% of the files getting corrupted to somewhere around 0.1%, so that's a lot of work that we've done. The next type of problem is called Crash. Meaning we open a DOCX file in LibreOffice and LibreOffice crashes, or it crashes when we try to save the file back. This is also a bad problem for usability, because obviously we don't want the user to use our product and have it crash. But at least here the user knows it happened. He knows something went wrong. He might have lost his information, but he knows something happened. The next one is called Hang, which is just what it says. The user tries to open the file or save it back, and LibreOffice just hangs. It might be because the file has a huge 20,000-row table, or because LibreOffice goes into some kind of infinite loop. But the bottom line is that LibreOffice hangs. The next layer is called Preserve, and this means we have some kind of data loss. It can be that you open the file, save it back, and you've lost a table. Or it can be something to do with styling, like the border color of the page changing from red to black, or the width of a table column changing. So even inside Preserve we categorize a problem as either data loss or style loss.
Data loss is obviously more important than style loss, because the user usually cannot recover from data loss, while with style loss he can see that the table color has changed and change it back. But if he has lost a table, he usually can't do anything besides re-entering it. The next layer is called Render. This means that LibreOffice does not render the DOCX file the same way Microsoft Word renders it. It could be because of a bug. It could be because the feature is not supported in LibreOffice, since it's some kind of proprietary feature that only Microsoft uses, like SmartArt. Or it can be because LibreOffice doesn't yet support the feature, for example some chart setting such as the position of the title, which we ought to support and will add at some point but don't support yet. So currently the title isn't displayed in the same position as it is in Microsoft Word. These are render problems, which only affect the way the user sees the file in LibreOffice. And the last layer is called Manipulate. These are the usability bugs, meaning the user cannot work with the object the same way he works with it in Word, or he cannot create the object in LibreOffice. An example of this is again SmartArt. In LibreOffice, when we open a DOCX file that has SmartArt, we try to do a good job of rendering it, but we render it as an image. We don't allow the user to start moving the shapes inside, because it's more important for us to preserve the object itself. So we simply save the entire object in some kind of grab bag and render it as best we can as an image, but the user cannot move the shapes. So this is a manipulation problem: the user cannot change the text inside the shapes of this SmartArt, cannot move the shapes, cannot change their size. That specific example exists because it's something that's not supported in LibreOffice.
But there are other examples. So these are the six layers of problems we see in interoperability. And the most important for us is the preservation problem. When I say us, I mean us at CloudOn. When we want to fix interoperability problems, because there are thousands of them, we first focus on Preserve. Obviously that's after we have fixed all the crash, hang and corruption problems, because there aren't so many of those: there are maybe 17 crash, hang and corruption problems together, and maybe 500 preservation problems. So we focus mostly on Preserve, because we believe the user's experience is affected most if he loses data, if he loses information. If he can't change the size of the SmartArt, that's less important to us. We're more concerned that when the user uses LibreOffice, or the CloudOn product that's based on LibreOffice, he knows that he doesn't lose any information, that he can count on this application, on this project, to lose the minimal amount of information. So that's just an overview of the different problems that we have. The next thing I want to talk about is something I mentioned before, called the Matrix. Basically it's a huge set of features belonging to Microsoft Word that we've compiled. For each of these features, we've tested all of its different settings, and we've created a synthetic file, a small DOCX file that contains only that feature. We've also tested these features in three different layers, Preserve, Render and Manipulate. Meaning we open the file in LibreOffice, save it back, and check whether the feature was preserved. Then we open it and check whether the rendering looks the same as in Microsoft Office. And the third thing we test is: can you change this feature the same way you can in Microsoft Office?
So for example, if the feature is line spacing, then you should see that the line spacing is preserved after round-tripping, that it looks the same as the line spacing in Word, and that you can change the line spacing values. Then we went and did all of that against Google Docs as well, so that we could have a kind of comparison between how good LibreOffice is against the benchmark, if you want, which is Microsoft Word, and how good it is against Google Docs. To some extent we also tested against Apple Pages. And what we've seen is that LibreOffice is much, much better than Google Docs or Apple Pages. There are certain areas where LibreOffice blows them out of the water. Google Docs and Apple Pages convert a lot of objects to images when you import the file into their website, and when you download it back, it's an image. So you've lost your chart, you've lost your SmartArt. Sometimes shapes get converted to images. LibreOffice does a lot better than both of them on all of these things, and in other cases it does as well as them, if not better. Just to give you an idea of how big this matrix is, we've got 1,700 features in it, and I remind you that for each one we've got a synthetic file and we've tested it on all these different levels. We've divided the features into categories, more or less aligned with what you have in Word. So for example the text category has eight subcategories and 47 features. Paragraph has 36 features. For things like diagrams, which is SmartArt, we test 260 different features. For shapes we test 230 different features. We also test some legacy features like VML, which is the older way to describe shapes in a DOCX file, and legacy forms, things like that. So I don't think you can find a feature in Microsoft Word right now that we haven't covered. And we keep it live all the time.
Every time we fix a bug, or every time we notice that somebody in the community has fixed an interoperability bug so that something is now fully supported, we update the matrix, so that we know at any given point what the status of LibreOffice is. So here's a quick screenshot of some of this. Here you can see the text category in a simplified version. It's not even the entire spreadsheet; you only see something like 40 different rows. You can't really read it from there, so I'll just tell you what you see here. You've got, for example, bold, italic, underline color, underline style, subscript, superscript, small caps, all caps. These columns are for LibreOffice and this column is for Google Docs. You can see that LibreOffice is doing a better job in all three columns, which are Preserve, Render and Manipulate. In the category called OpenType Features, we now preserve all of them, while Google Docs just loses everything. Or take character spacing: we do a pretty good job with it, while Google Docs sucks. So this is just the text category. Here you can see the paragraph category, and again you can see that the LibreOffice side is much greener than the Google Docs side. Can you go back one slide? Sure. No, no, okay, let's do that one. What's the red rectangle down in the middle of our... This is the worst one. This is small caps. It means that we don't render small caps properly. Specifically, in this case small caps is rendered as all caps, which means that all the letters are capitalized at the same size. Small caps should mean that only the first letter is big and the rest are capitals at a smaller size. So it just renders it wrong. And I think it's a bug, because in LibreOffice you can select small caps; it's a feature that is supported in LibreOffice. So this basically is a bug.
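For readers who want to picture how a matrix like this could be held in code, here is a small Python sketch. It is an assumption about the shape of the data, not the actual spreadsheet: the names `Status`, `MatrixRow` and `tally` are hypothetical, the three statuses are a guess at a reasonable encoding, and only the three cells mentioned above are filled in; all other cells are simply left out.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    YES = "yes"
    PARTIAL = "partial"
    NO = "no"


LAYERS = ("preserve", "render", "manipulate")


@dataclass
class MatrixRow:
    category: str
    feature: str
    # Maps (application, layer) to a Status; untested cells are simply absent.
    cells: dict = field(default_factory=dict)


def tally(rows, app, layer):
    """Count how many features have each status for one application and layer."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return Counter(
        row.cells[(app, layer)] for row in rows if (app, layer) in row.cells
    )


# Only cells stated in the talk are filled in here.
rows = [
    MatrixRow("Text", "Small caps", {("LibreOffice", "render"): Status.NO}),
    MatrixRow("Text", "OpenType features", {
        ("LibreOffice", "preserve"): Status.YES,
        ("Google Docs", "preserve"): Status.NO,
    }),
]

print(tally(rows, "LibreOffice", "preserve"))
print(tally(rows, "Google Docs", "preserve"))
```

Keeping one row per feature and one cell per application and layer makes it cheap to update a single cell when a bug is fixed and to recompute the totals per layer.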
By the way, for each of these features we've created a case in our ticketing system. We track it, so we know when we're going to fix it, if we're going to fix it, and what we're going to do with it. Is it important to fix or not? Is it used a lot? Obviously, features that are used less by our users are not going to get high priority over other features. So things like legacy form controls we're not going to spend a lot of time on, while things like shapes and text obviously get much more emphasis. So this is the paragraph category. And in case it wasn't clear: red means no, it's not supported. Green means yes. And yellow means partial support, which means it works in some cases and not in others. So this is the paragraph sheet. Here you can see lists. Here you can see tables. Again, in tables you can see that you've got different kinds of borders, which Google doesn't do a good job with, and you've got deleted cells and merged cells. This is just four sheets out of the 29 category sheets that we have in the matrix. We use it as our reference. It's kind of our Bible: how good are we, and what is our progress at any given point? We also keep a kind of historic reference, so that we know what the values in this table were a year ago. We want to share this with the community so that everybody can know at any given time what the status is. We still need to figure out the best way to put all this online, but we want everybody to enjoy this work. So after talking a bit about the matrix, the features and the different kinds of problems, I want to talk a bit about the workflow: exactly what happens when a DOCX file goes in and out of LibreOffice, and what the different ways are of solving the interoperability bugs we've spoken about. So, yeah. Just one question: do you test both the Transitional and the Strict variants?
So we decided to focus on DOCX files created by Word 2010, which is the Transitional variant. Obviously, this already has two or three dimensions: each sheet is two-dimensional, but you've got different sheets. So if you add another layer of complexity, which is different file formats, like 2013 DOCX versus 2007 DOCX, it just gets too crazy. This took three months. I guess if you wanted to do it for each and every format, you'd have to create a sample file for each one, round-trip it and test it visually. And all of this has been manual work; there's no practical way to test these features automatically. So we felt we should focus on one format and know it best. And thankfully, Microsoft doesn't make too many jumps between versions, so 2010 is still almost what everybody uses today. So this is basically the workflow. You have the DOCX file going into the filter detector, which detects that it should be handled by the DOCX import filter. Then the DOCX import filter uses the UNO API to set up the data model objects. And the DOCX export filter has direct access to the data model and writes back to the DOCX file. So an interoperability problem can be in any of these links. It can be in the import filter, because maybe we don't handle this kind of XML node. Maybe we are handling it, but in the wrong way; maybe we're not interpreting a value properly. It can be in the mapping between the import filter and the data model. It can be in the export phase; maybe we're reading the model wrong, and we've seen a lot of bugs in the export filter too. So it can be in any of these phases. And it might be something that's not supported altogether. There are basically three ways we can solve interoperability problems. The first way is just going ahead and fully implementing the feature in LibreOffice.
And for that, there's a kind of online wiki page, a shopping list of what you have to do. You basically have to go through a lot of steps. You have to add the feature to the data model, add the mapping to UNO, handle it in the import filter, handle it in the export filter, and add documentation. You have to update the UI and the rendering, and add the dialogs for manipulating the feature. It's basically a ton of work just to add maybe a small feature that nobody really uses. An example is next style. It's a setting of a style that determines, when you use a style and press Enter to go to a new paragraph, which style the next paragraph should use. This was something that was supported by LibreOffice, but there was a bug in the import filter, so when you round-tripped a DOCX file, that attribute of the styles was lost. So we had to make sure that the entire chain works, which meant fixing the import filter, and then it worked fine, so that was good. The second method of fixing interoperability problems is called grab bag. It's something that was introduced roughly a year ago, I think. It basically means that you take your feature, import it, and put it inside some kind of property bag that's attached to an object in LibreOffice. It could be attached to a paragraph, to a shape, or to the document itself. It goes all the way through the import filter and stays in the data model with that object. And when you export, you just see: oh, I've got this property bag, let's write whatever is inside it to the file. So you don't have to handle the UI, and you don't have to handle the rendering. The drawback is that the user doesn't see the feature, but at least it's not lost. At least the feature is round-tripped without losing information. And we do it with different things.
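The grab-bag idea can be illustrated with a toy model in Python. This is a sketch under stated assumptions, not LibreOffice's actual API or filter code: the `Paragraph` class, the `SUPPORTED` set and the two functions are hypothetical names made up for the illustration.

```python
from dataclasses import dataclass, field

# Hypothetical set of properties this toy "application" really understands.
SUPPORTED = {"bold", "italic"}


@dataclass
class Paragraph:
    text: str
    props: dict = field(default_factory=dict)     # understood, rendered, editable
    grab_bag: dict = field(default_factory=dict)  # opaque, carried along untouched


def import_paragraph(text, raw_props):
    """Import filter: map known properties, stash everything else in the grab bag."""
    paragraph = Paragraph(text)
    for name, value in raw_props.items():
        target = paragraph.props if name in SUPPORTED else paragraph.grab_bag
        target[name] = value
    return paragraph


def export_paragraph(paragraph):
    """Export filter: write known properties plus whatever the grab bag holds."""
    return {**paragraph.props, **paragraph.grab_bag}


source = {"bold": True, "glow": {"radius": 5, "color": "FF0000"}}
paragraph = import_paragraph("Hello", source)
print(paragraph.grab_bag)                     # the unsupported effect is kept
print(export_paragraph(paragraph) == source)  # and survives the round trip
```

The trade-off shows up directly in the model: nothing in `grab_bag` can be rendered or edited by the toy application, but nothing is lost on export. A real implementation also has to decide what to do when the user edits the object in a way that makes the stashed data stale; that case is outside this sketch.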
We do it with the SmartArt that I described before, though that's an example where we do render. With text effects like glow, shadow, depth and things like that, we don't render them, because LibreOffice doesn't support them yet, but at least we save them. The same goes for style attributes. So there are a lot of examples. We use it more and more now, because we figured out it's best to preserve as much as possible rather than spending an entire three months on full support for one feature. So that's the second approach. The third approach I call filter behavior change. That means it's not full support and it's not grab bag; it's something else. We have two examples of that. One is the work that Collabora has done for us on DrawingML handling. DrawingML, in case you don't know, is the newer way to describe shapes in DOCX files. It was introduced in Office 2010. Up until a year ago, LibreOffice used to import only the older format, VML. LibreOffice did support DrawingML, but only in Impress and Calc, not in Writer. So basically what Collabora has done is change the way Writer imports the data. Instead of importing the older VML, it imports the newer DrawingML, using the same classes that are shared between Impress, Calc and Writer. In the export phase they also adapted the filter so that instead of exporting only VML, it now exports both the newer DrawingML and the older VML. So it was a matter of changing the filters: not exactly adding support, just changing the way they behave. The other example is shapes with content, meaning shapes like a triangle with a table inside, or a circle with a chart inside. Anything that had content inside was imported into Writer as a text frame, which meant the shape properties were lost, but at least you didn't lose the content.
What changed is that the shape itself is now imported by the drawing layer, while the content is imported by Writer. So you've got a circle with all the shape properties preserved, the size, the line width, the color and everything, and the content itself is imported by Writer as a text frame inside the shape. So you don't lose the content and you don't lose the shape. This is another example of a filter being changed without actually adding new support or adding anything to the data model. So these are the three different approaches you can take if you want to fix interoperability problems. Does anybody have any questions about anything I've said so far before I go on? Yeah. In the test documents, how do you check the rendering? Do you use computer vision? No. Like I said, all the work on the matrix was done manually. You open the file in LibreOffice and in Word, put them side by side, and see whether they look the same. Not a lot of times, but sometimes, the differences are really subtle. So you really need to know your shit. You need to know what to look for. The person doing this work has to know the features really well; he has to know what each feature means. There are some strange features, like use spacing, but only if the previous paragraph is different from yours, and it's just a simple check box, but you need to know the context of the feature you're testing. If I show these two documents to somebody who doesn't understand that, he might look at them and say: it looks the same, I don't see any difference. But if you understand the feature in its context, you might figure out that it doesn't look exactly the same. So it's manual work. But the good news is that 99% of the work is already done, and all that needs to be done now is maintaining it.
So if a feature gets fixed, you just rerun the test on that feature and update that line, and then you're good to go. You don't have to do the entire matrix again, unless of course the work on compatibility speeds up exponentially and a lot of things change at once, but currently that's not the case. Anything else? Great. So, a bit about things that have been fixed. What time is it? I don't have a clock. So now a bit of eye candy: things that have been fixed in the last year. The groups we've worked with that have done this amazing job are basically Igalia, where it's mostly Jacobo's work. Synerzip, which has had an amazing, huge team of engineers working for us; we've got two representatives here today, Sushi and Dushin. And Collabora, where the work was mainly done by Miklos, some by Markus, some by Tomaž, right? Miklos and Tomaž have done some of the work. Yeah, so maybe even Kendy did some work for us. And Kohei did some chart fixes? Yeah, Kohei also, thanks. So basically this is the team. It's a team of all-stars that have done this amazing job. And I'm just going to show examples of things that have been fixed in interoperability in the past year, since the last conference. The first thing is alternate content. Like I said, DrawingML was not being imported; only VML, the older way to describe shapes, was being imported. So after you round-tripped the file, what happened was this; I'll show you a picture. This is the original file. You can see here at the top that you've got the Microsoft Word menu with all these features, like the different colors, and effects and gradients that you can use, things like that. After round-tripping the file through LibreOffice, this is the menu that you got at the top. You lost all the effects that you could use.
You can only use two-dimensional shades and kind of two-dimensional rotations, but you won't have all the newer DrawingML features. And after the fix, you can see that after a round trip the menu at the top is exactly the same as for the original file, which means that if you have DrawingML, you didn't lose the ability to use all the newer features. Work has also been done on word processing groups, which basically means grouping of shapes. I have a couple of cool examples here. Here's the original file, and this is how it round-tripped through LibreOffice a year ago. You can see that a lot of the grouping here is a bit messed up and colors have changed. And this is how it round-trips today. It looks exactly the same, one to one. Here's another example: this is the original file, and this is the round-tripped file. It's exactly the same. And the last example I have here. You can see that the document title is behind the other shape, you lose some of the fields here at the bottom, the lines that go from top to bottom are lost, and the images appear differently. And this is what it looks like today. So it's almost one to one. Relative size is something that has been fixed recently. You can tell a shape to have not an absolute size, but a size that is relative to the page dimensions or the margin dimensions. Here's an example. These are the shapes that have a size relative to the page and the margins, and this is what it looked like before, after round-tripping. And this is what it looks like today. The shapes keep their relative size; it's not that the size is computed and saved as absolute after round-tripping. The file actually keeps the relative-size feature, the size relative to the page. Text effects. This is something that's not actually being rendered in LibreOffice, but this is what the file used to look like after round-tripping.
So you lost all the glow effects, the reflection, the shadows, the different outlines with dashes, things like that. These are all text effects, and this is what it looks like today after round-tripping. You don't lose anything. Everything is preserved. This is another example. You have a transformation on the text, and here you can also see the glow and the shadow, again preserved perfectly. Then there are shape effects. The same effects you can have on text, you can have on shapes. Here's an example. This was the original file and the round-tripped one a year ago. You can see that the shape is 3D here, with effects on it, and you have glow. And this is what it looks like today. Artistic picture effects: I'll just move on from this because nobody really uses it. Shapes with content. That's what I talked about before. What happened is that if you had a shape with content inside, it would get really messed up. You lost the shape itself; like I said, the triangle was converted to a text frame. And like you see here, there's a table inside, where you lost the coloring. In some cases you lost the chart as well, because charts weren't supported when you round-tripped this a year ago; they were lost after import. And this is what it looks like today: almost perfectly preserved. The chart has rounded edges, but basically it's almost perfectly preserved. Content controls. Content controls are things like document fields, check boxes, date pickers, form controls, and reference objects like a table of contents. Things like that were lost. SmartArt, like I said, used to be converted to simple shapes. So you'd actually lose the SmartArt functionality, and it would just be converted to different triangles and rectangles on the document. But now it's being round-tripped perfectly. This is the original document, and here you can see the round-tripped one.
When you click on it in Microsoft Word you get the SmartArt entries in the context menu, so you didn't lose anything. Themes are something that is not yet supported in LibreOffice. When I say themes, I don't mean the theme of the LibreOffice application itself but themes of the document. That means you can type in text and then change the theme of the document, and the fonts change, the sizes change, the colors change, even the table border colors change. So the theme controls all of it. Up until recently, after you round-tripped the file you'd lose both the theme inside the file and the theme references on the objects themselves. Today they are preserved after round-tripping, and these are different examples of theme attributes: font type, font color, paragraph color, shading, table style. Okay, charts. Like I said, charts were being lost, and today charts are preserved after round-tripping. There's still a lot of work to be done on charts, because a lot of features inside charts still need work: the legend position, the title position, the size of the plot, axis labels, things like that. Yeah. What about the text on the charts, the descriptions? I'd need to check the matrix to tell you which text is supported, because I'm sure it's in there, but I don't know by heart whether it's being preserved or not. The problem here was that the chart itself was being lost. That happened during the import itself, meaning when you opened the file you didn't see the chart anymore. So today at least you preserve it, but it doesn't look exactly the same. The size is a bit different. Specific chart types aren't supported yet, like radar charts, things like that. But at least for most users it's good enough. This one is not interesting because nobody uses it, but it's working. Cropped images are being preserved now.
And besides all the things I've mentioned, we've done a lot more that I just didn't have the energy to put screenshots in for. So here you've got embedded objects, table styles, and things to do with track changes on tables. Latent styles, which nobody knows the meaning of. Citations, bibliography. A lot of other cool stuff is now supported that was not supported a year ago. So like you see, a lot has been improved in the last year, but there are still gaps remaining. This is where we need you. We need the community to add more support and put more effort into this, because the gaps that remain are still pretty big, even though it's much better than it was a year ago. The first one, like I said, is charts. These are different things in charts that are not being preserved perfectly. You see that the chart size is lost. Different effects like the ones we've mentioned before, glow or reflection or things like that, whether on the chart itself or on the title, are being lost. Customizations like the position of the axis, the legend, the title, whether it's on top of the chart or inside the chart, are not yet being preserved. And these different chart types are being lost or converted to other chart types. The second one is forms. I don't know how many users use them today, but it's important to let you guys know that they don't have good interoperability, meaning a lot of these legacy form controls are either being converted to simple text or just being lost. ActiveX controls are being lost as well. Fields. Yeah, I did put a picture in for that, so here's a picture of the fields. You've got different document fields that are being converted to simple text, and they don't preserve their fieldness. Like title, author, and other fields that nobody really uses, but still, they're not supported. Review. Two things fall under review. The first one is track changes; in LibreOffice we call this redlining.
So there are apparently a lot of different types of redlining in DocX, in the OOXML format. It's not just inserted text and deleted text. There's also the ability to move text from one place to another. There's the ability to move paragraphs. You can delete content controls and insert content controls, and funnily enough they have their own way of describing that this was inserted or deleted. It's not the same as with inserted and deleted text. There's different redlining inside the math controls. So inside a DocX document you can have a math formula, and if you had track changes turned on and you deleted some of the formula, it should be represented in the file. And today these changes are lost. So I think these are the most difficult ones to support right now, because they require kind of deep work inside the data model. But I think it's important even for LibreOffice itself, not just for compatibility with Microsoft Office, for LibreOffice users to be able to track changes not only in text but in all of these different features. It's important to have the support. Content controls, like I showed before: things like combo boxes and different document properties are being lost. Group shapes: group shapes are losing a lot of information inside. So not only the attributes of the group shapes themselves, but also if you have complex content inside the group shapes, you have different kinds of problems with that, so things are getting lost inside. And besides all the things I've said, we have other gaps like mail merge, restricted editing, signatures in the file, canvas support, old-style VML shapes, page background. So these are all areas that any of you guys can jump in and do a fantastic job in. So I'm nearing the end of my presentation.
I want to talk about two testing tools that we've developed and would like to share with the community. We need to figure out the best way to do it, but hopefully they will be used to better find problems with each release or each daily build. The first of these two testing tools is called the visual comparison tool. Basically what this tool does is it gets a bucket of files. You can define which files; we've got 15,000 files that we use as the buckets for these tools. And what the tool does is it automatically takes each file and round-trips it through LibreOffice, meaning it does open and save as DocX. Then it takes the original DocX file and the round-tripped one, opens both of them in Microsoft Word, creates XPS images for each page of these files, converts them to PNG images, and then for each page it runs an image matching function to check whether these images are identical or not. And we're using this tool, first of all, to understand how many pages match, meaning how good the preservation, the round trip, is. We also use it to find trends: are we improving with time? So if we find out that a month ago we had a 72% match and today we have a 74% match, we know we're doing a good job, we're improving with time. And the last thing we use the tool for is finding regressions. If somebody added a new feature that wasn't supported so far, he might have mistakenly added some kind of corruption problem, storing the file wrong, and we suddenly see a spike in corrupted files. So this tool also enables us to catch these regressions as soon as possible, not waiting a month or two before we know about them, because it constantly tests 15,000 files. It's not perfect, because there are a lot of small differences between images; sometimes on the same machine the same file generates an image that differs by one pixel. But we use it basically to see the big picture: are we doing better, are we doing worse? Is there some kind of regression?
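A minimal sketch of the per-page matching step described above, not the actual tool. It assumes each rendered page has already been decoded into rows of grayscale pixel values, and the names `pages_match`, `match_percentage` and the `max_diff_ratio` tolerance are hypothetical. The tolerance is one possible way to absorb the stray one-pixel differences mentioned in the talk.

```python
from typing import List, Sequence

# Hypothetical in-memory form of one rendered page: rows of grayscale values.
Page = Sequence[Sequence[int]]


def pages_match(a: Page, b: Page, max_diff_ratio: float = 0.001) -> bool:
    """True when at most max_diff_ratio of the pixels differ between two pages."""
    # Pages with different dimensions can never match.
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return False
    total = sum(len(row) for row in a)
    if total == 0:
        return True
    differing = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return differing / total <= max_diff_ratio


def match_percentage(original: List[Page], roundtrip: List[Page]) -> float:
    """Percentage of pages that match; extra or missing pages count as mismatches."""
    total = max(len(original), len(roundtrip))
    if total == 0:
        return 100.0
    matched = sum(pages_match(o, r) for o, r in zip(original, roundtrip))
    return 100.0 * matched / total
```

Aggregating `match_percentage` over the whole bucket of files gives the kind of overall figure (72% a month ago, 74% today) that the talk uses to follow the trend.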
Also, this tool doesn't allow us to understand if there's any kind of functionality problem. It only checks the visual appearance: does the round-tripped file look the same as the original file? The other tool we use is called the feature extraction tool, which is a tool that, instead of looking at the images, looks at the contents of the file, the XML itself. It has a list of approximately 1,500 lines, and each line represents a feature. It's largely based on the matrix. Each feature has a description of the XML that comprises that feature. And what the tool does is, for each feature, it goes over the 15,000 files and checks: did this feature exist in the original file? And if so, did it exist in the round-tripped file? And then we are able to know, for each feature, for example line spacing, that out of the 15,000 files it existed in 750 of them originally, and in the round-tripped ones it existed in 729. So we know that more or less we do a good job with line spacing. But if we look at digital signature, for example, which is represented by an XML node called, for example, ASIG, we see that it existed in 520 of the original files, and in the round-tripped ones it existed in zero. So we know this feature is not supported. It's not being round-tripped at all. So again, the same as with the other tool, this tool has problems also, because the tool does not check the context of the feature. It does not check whether the feature in the round-tripped file is in exactly the same place it was in the original file. And there are also different kinds of things like default values. Sometimes the feature is not in the original file but it is in the round-tripped one, and it's not a problem because it's a default value. So again, with this tool, you have to take it with a grain of salt. But we use the tool to know that anything above 90% is probably working perfectly fine, and anything under 10% is probably not preserved.
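The feature check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual implementation: the `FEATURES` table, its two entries, and the helpers `has_feature`, `preservation` and `make_minimal_docx` are all hypothetical, and a real feature list would have around 1,500 entries. The only facts relied on are that a DocX file is a zip of XML parts and that `w:spacing` with a `w:line` attribute and `w:ins` are elements of the WordprocessingML namespace.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical feature table: name -> (zip part, element tag, required attribute or None).
FEATURES = {
    "line spacing": ("word/document.xml", f"{{{W}}}spacing", f"{{{W}}}line"),
    "inserted text (track changes)": ("word/document.xml", f"{{{W}}}ins", None),
}


def has_feature(docx, feature: str) -> bool:
    """True if the XML describing the feature occurs anywhere in the given part."""
    part, tag, attr = FEATURES[feature]
    with zipfile.ZipFile(docx) as archive:
        if part not in archive.namelist():
            return False
        root = ET.fromstring(archive.read(part))
    return any(attr is None or attr in el.attrib for el in root.iter(tag))


def preservation(pairs, feature: str):
    """pairs: (original, roundtrip) files. Returns (count in originals, count kept)."""
    in_original = kept = 0
    for original, roundtrip in pairs:
        if has_feature(original, feature):
            in_original += 1
            kept += has_feature(roundtrip, feature)
    return in_original, kept


def make_minimal_docx(body_xml: str) -> io.BytesIO:
    """Demo helper: an in-memory zip holding only word/document.xml (not a valid DocX)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        archive.writestr(
            "word/document.xml",
            f'<w:document xmlns:w="{W}"><w:body>{body_xml}</w:body></w:document>',
        )
    return buf
```

With the numbers from the talk, `preservation` would return `(750, 729)` for line spacing, about 97%, and `(520, 0)` for digital signatures. Like the tool it imitates, this sketch checks only presence, not position or context.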
And we also compare the results with previous ones, like a week ago or two weeks ago, so that we immediately know if there's some kind of degradation. We see that something that was 90% supported has suddenly dropped to 20%, so there's probably some kind of bug that was introduced in the last week or two that caused this feature to suddenly not be supported as well. So these two tools are almost fully automatic, and we would very much like to share them with the community and somehow see if there's a way to add them either to the tinderboxes or to some kind of daily build machine that will be able to test the daily build and see on any given day how well things are being preserved. Is there any kind of regression? That way the entire community can benefit from it and know if there's suddenly a problem: let's go fix it immediately instead of waiting a week or two. So those are the two tools. The last thing I want to talk about is the observations we had from our beta with the CloudOn product. So basically what we did is we added analytics into our product that track all the unsupported features, or almost all of them, features that we know are being lost in LibreOffice and subsequently in our product. And we used these analytics to understand how many of the files being opened contain these unsupported features. We wanted to know which features should be prioritized over others, so that we know these are the features that users are using the most. And you can see that the ones we found were, for example, content controls inside headers and footers, which appeared in 4% of the files. Shapes in headers and footers, which we didn't know were being used that much, appeared in two percent of the files and were being lost. So we've given these things high priority, and since I made the slide we've already fixed them; it's already fixed in master.
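The week-over-week comparison of feature results mentioned above could look something like this minimal sketch. The function name `find_regressions`, the dictionary layout and the 20-point `drop_threshold` are assumptions made for illustration; the talk does not say how the real comparison decides what counts as a degradation.

```python
from typing import Dict, Tuple


def find_regressions(
    previous: Dict[str, float],
    current: Dict[str, float],
    drop_threshold: float = 20.0,
) -> Dict[str, Tuple[float, float]]:
    """Return features whose preservation percentage fell by at least drop_threshold points.

    previous and current map a feature name to the percentage of files in which
    the feature survived the round trip, for an older and a newer run.
    """
    regressions = {}
    for feature, old in previous.items():
        new = current.get(feature)
        # Features missing from the newer run are skipped rather than guessed at.
        if new is not None and old - new >= drop_threshold:
            regressions[feature] = (old, new)
    return regressions
```

Run against each daily build, a check like this would flag a feature that went from 90% to 20% on the day the bug landed rather than weeks later.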
And we can see other examples, like deleted cells in tables, embedded OLE objects, document protection, different kinds of track changes. What you can see is that the percentages go down really fast, so that this one is under a tenth of a percent. Another thing that we've noticed is complaints from users. I personally expected a lot of complaints, because users were used to Microsoft Office and we switched them to LibreOffice. So I was expecting complaints about interoperability, but funnily enough we've only received a handful of complaints out of tens of thousands of users that have used the product. And these complaints were, for example: why don't you have hyphenation yet? We're losing content control items, we can't select and deselect them. And here we have complaints that citations and tabulation aren't working properly. So I guess it's good news, because I expected people to lose a lot more, or to complain about losing a lot more information, and I guess the work that has gone into interoperability in the last year was probably focused on the things users use the most, so that we saw almost no complaints. That also goes to say, for the LibreOffice product itself, that users would use it, since they don't suffer as much from interoperability as I thought they would. So that's it. Does anybody have any questions? Just let me put this slide on. Yeah. Anybody have some questions? Yeah. When you did the last comparison, which version was it for the users? For our product? Yes. We try to use the latest version, so master. That's what we use. Anything else? Yeah. The one in the back first. Do you have any way to get the users to switch to ODF? So we're not switching anything. We're letting the user open the file that he wants to open. He has access to his Dropbox, his Google Drive, OneDrive, his Box account. So the user can open DocX files with it.
I'm not sure if he can open ODT files. I'm not sure, but I definitely don't see any problem. There's no technical problem with the user opening them. I don't know if it's available yet in the app itself, if that feature has been added, but technically there's no problem opening them. Can I add to your answer? Yeah. Here's the product manager, so you know. So if you want to have people use the product more and more, and this is one of the things I discussed in my presentation, we want to keep this an open environment. Asking people to now convert or export their documents to different formats will just push them away, okay? So if they have the format they're used to working with, if their company is working with DocX, let them use that format. ODF, ODT, we also want to support it, and there's no reason why not to. But we just don't want people to have to export and import. That pain is not needed. So basically what he says is, we don't want to force the users to convert one file format to the other. We want to let them use whatever they want. If they want to use DocX, they'll use DocX; if they want to use ODT, they can use ODT. Yes, but can you create a new file in the editor? Of course you can create a new file. Yeah, but when you create a new file, can you choose whether it's a Microsoft Word file or a LibreOffice file? Yeah, so I'm not sure how it handles that. Well, that's a good question. We'll have to answer it once we complete the switch to the editor. Right now, the default is DocX. There's no reason why the default couldn't be something else, or why we couldn't give the user the option to choose. Yeah, I think what happens is that if you have a desktop environment, say an organization that is switching to LibreOffice.
And one of the problems it might have, and there was an interesting session in the morning saying that Microsoft probably supports ODF not as well as the other way around, is in mobility. Such an organization might want to ensure that once people are opening files, even within the organization but on mobile devices, they're not having real problems because they're opening them in one of the tools that does not support ODF well. And so I think it's very good that once LibreOffice is coming to mobile we can support ODF on the mobile devices, and give people the opportunity to save in this format there and not just on the desktop. So definitely I think that's a good point. I think this is the kind of thing that we want to do and that can help LibreOffice spread, because the more it's being used in an organization, the more they need to make sure that nobody else is breaking the file and the compatibility of the ODF itself. Yeah. Okay, two questions. You mentioned that you ran the round-trip testing. So you take DocX files, read and write them in LibreOffice, and compare, all in Microsoft Office? Yeah. You're talking about the matrix. Yeah. Did you consider just checking the display in LibreOffice? Just the first half of this: compare the Microsoft original and the LibreOffice rendering? Yeah. So we do that when we want to test the rendering. We compare the rendering of the original file in Microsoft Office with the one opened in LibreOffice, the way LibreOffice renders it. So we do test the rendering. Only when I say preservation, we test the preservation itself in Microsoft Office, because we want to do an end-to-end check to make sure that nothing is lost going through LibreOffice. Okay. Okay. The first question. And when you compare the original and the processed one, you generate bitmaps? The PNGs, yeah. PNGs, and you compare them pixel by pixel? Yeah.
And that causes different problems, for example if you have a file with 100 pages in it. And these buckets: when I said 15,000 files, it's comprised of different buckets. 10,000 to 12,000 of these are real-life files that were downloaded from the internet, like real people's files. We've got another 2,000 files from the matrix, which are synthetic files. And we've got 2,000 more files that we collected from Bugzilla. So we've got different kinds of files. And if you think of it, if you have a file with 100 pages and you have one unnecessary paragraph at the top of the file after round-tripping, it will ruin the comparison for the whole file, because everything will be shifted down one line and you can't really compare the images. So we have these problems. And that's why I said that we use this, and I'm talking about the visual comparison tool, the one that round-trips and checks the images, less for a specific file and more to see the trend. So if there's an unwanted paragraph there, it will probably be there the next time we run it, so the result for that file will stay the same. But if we fix it, we'll see that the comparison suddenly gets better, because suddenly 100 pages will match. So we see it more as a trend for all the files, not for a specific file. But we do use it for specific files sometimes, because we have different criteria in the system, so we can tell the system: give me a list of all the files that had less than a 50% match. And then we narrow it down from 15,000 files to 200 problematic files. And we know that we need to focus on these first, because these had a 0% match or a 10% match. So we can use the system to find the worst ones, but to see the big picture we usually look at the entire result for all the files. Yeah. The matrix is extremely powerful and I'd love to use it. How soon will you publish it?
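The "give me a list of all the files that had less than a 50% match" query described above amounts to a filter plus a sort. A minimal sketch, assuming the per-file results are available as a mapping from file name to match percentage; the name `worst_files` and that data layout are hypothetical.

```python
from typing import Dict, List


def worst_files(results: Dict[str, float], threshold: float = 50.0) -> List[str]:
    """Return the files whose page-match percentage is below threshold, worst first."""
    below = [name for name, pct in results.items() if pct < threshold]
    return sorted(below, key=lambda name: results[name])
```

Because a single extra paragraph can shift every page of a long document, a low score here is a pointer to where to look first rather than a measure of how much content was actually lost.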
So basically, first of all, the matrix contains two things. It contains the mapping of the features, of course, and the gaps. And it also contains files that you can use. The first thing that we want to do is to share the files and see how we can use them in the regression system. As for the matrix itself, the problem that we have, also internally, is the way we manage it and update it. So it's also kind of an infrastructure and logistics question that we need to decide, and once we do that, I think we can share it. So hopefully it will help everybody. Anything else? Thanks.