So, let's get started. This talk is about a tool to find interoperability regressions; at its core, it's a comparison against a reference office suite.

First of all, about me: my name is Xisco Faulí, I'm from Spain (a lot of people here came from Spain), and I've been working in QA at The Document Foundation since September 2016, so for about five months now. This is my email if you want to write to me, and normally I work on QA. I've been working on this tool for the last couple of months.

So, basically, what it is: it's a set of scripts you use to generate PDF files from different office suites, and then you compare those PDF files to see how different they are. Basically, you rank those differences on a scale from zero to five. There are four kinds of comparisons: we compare the PDF files by vertical alignment, horizontal alignment, side-by-side comparison and page overlay. And you can use it with any office suite which supports the command line, so you can use it with Microsoft Office, LibreOffice, OpenOffice and others.

This tool was developed, and is still developed, by Miloš Šrámek (is that correct?); he's Slovak. He first presented it at the hackfest in 2011 in Berlin, later at the LibreOffice conference in 2013 in Milano, and this year he presented it at the LibreOffice conference again. I attended that conference, and it was the first time I heard about this tool. So I talked to him about it, and we thought about using it in LibreOffice to find regressions in documents. Here you can find the code, in Miloš's repository on GitHub, and here you can also see his presentation in Brno. Basically, when I talked to him in Brno, we thought it could be a good idea to use it on the master branch, so we could find interoperability regressions as soon as possible. The idea is then to use as many files as possible.
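Driving an office suite from the command line to render a document to PDF is the basic building block of these scripts. As a rough sketch of the LibreOffice side (the function names here are illustrative, not the actual scripts'):

```python
import subprocess
from pathlib import Path

def libreoffice_pdf_command(input_file, out_dir):
    """Build the headless LibreOffice command that renders a document to PDF.

    Uses the standard `soffice --convert-to` interface; any suite with a
    comparable command-line converter could be swapped in here.
    """
    return [
        "soffice", "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(input_file),
    ]

def convert_to_pdf(input_file, out_dir, timeout=120):
    """Run the conversion and return the path where the PDF should land.

    The timeout matters: the talk mentions that hung or slow conversions
    are themselves a useful signal (performance regressions).
    """
    subprocess.run(libreoffice_pdf_command(input_file, out_dir),
                   check=True, timeout=timeout)
    return Path(out_dir) / (Path(input_file).stem + ".pdf")
```

The same pattern applies to the Microsoft Office side, except that the converter runs under Wine.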
And since many of the bugs we have in Bugzilla are related to Microsoft Office formats and the filters they use, we thought the best option would be to compare against Microsoft Office, but we could use any other office suite for the comparison.

Some numbers from Bugzilla: I basically combined the 'regression' keyword with the filter keywords we have in Bugzilla, and right now we have 40 open regressions out of 69, 76 .docx regressions out of 155, and for RTF we have 18 regressions out of 176. So the numbers say there is plenty for this tool to find.

Then I started working on this tool. I found some bugs in it, so I just fixed them. I also added support for Impress documents. And I added the possibility of having different output formats: with Miloš's version you could use different input formats, but there was only one output format; now it's possible to have several. And I moved it onto a virtual machine; this is the virtual machine we are using. At first I was running it locally, so my disk space was a limit and it was complicated, but now we have a virtual machine with 100 gigabytes, so we have plenty of space to use this tool. And here you can find my code.

Nowadays these are the formats we are testing: .docx, .rtf, .ppt and .pptx. And basically, in order to get the files, I'm using a script written by, I think, Caolán, and probably Markus as well; this script. Basically we are just getting the attachments from all these bug trackers, so we have many documents available.

So, basically, how this tool works. We have this batch of documents here, and let's say these documents are in the input format, say .docx. Basically, what we do is convert them with LibreOffice to PDF, so we see how they are imported into LibreOffice. And we also define the output formats, say .docx and .rtf.
So then we convert them with LibreOffice to those formats as well, and then we open those files in Microsoft Office, because that's the suite we want to compare against, and finally we get a PDF for each format. And on the other side, we open the same file with Microsoft Office and generate the PDF there, and eventually we compare all these PDFs against that one. We use the horizontal alignment comparison, the vertical comparison, side by side, and the overlay page comparison, so in total we have four numbers for comparing each pair of PDFs.

So, basically, the things we are testing with this tool. The first one: we test the LibreOffice import, here. Then we test the LibreOffice export; all this part is testing the LibreOffice export. We are also testing the LibreOffice PDF export, because, let's say, we find a problem here in the comparison, but then when we open that file in LibreOffice we see it's correct; then we export it to PDF and we see that the problem is in the PDF export. So we are also testing that. Also, as we are using timeouts, while testing the tool I found some import and export time problems, performance regressions. I also found that sometimes, let's say, we generate a .doc file and then we try to open it in Microsoft Office, and we get a corruption problem; that's something we are also testing with this tool. And eventually we are testing interoperability regressions, which is the goal of this tool.

So, I'm going to show you an example of what this tool looks like. Basically, once we run the script, at the end we have this spreadsheet. In here we have the import part, and in here we have the export part. This is something I ran locally: right now I tested it with 64 documents and it found nine regressions. So, let's see an example here.
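The pipeline just described produces, for each input document, one reference PDF from Microsoft Office, one LibreOffice import PDF, and one round-trip PDF per output format. A minimal sketch of that plan (labels and structure are illustrative, not the actual scripts'):

```python
def comparison_plan(output_formats):
    """List the PDFs produced for one input document, as (label, steps) pairs.

    Mirrors the pipeline described above: one reference rendering from
    Microsoft Office, one LibreOffice import rendering, and one round-trip
    rendering per requested output format.
    """
    plan = [
        ("reference", ["msoffice->pdf"]),       # the PDF everything is compared to
        ("import", ["libreoffice->pdf"]),       # tests LibreOffice import
    ]
    for fmt in output_formats:
        # Round trip: LibreOffice saves to fmt, then Microsoft Office renders it.
        plan.append((f"roundtrip-{fmt}",
                     [f"libreoffice->{fmt}", "msoffice->pdf"]))
    return plan
```

For a .docx input with .docx and .rtf round trips, this yields four renderings, and each non-reference PDF is then scored against the reference with the four comparison methods.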
I used this build, which I think is the first commit of LibreOffice 5.2, and I also used a master build, a commit I built locally. So, for instance, let's take a look at this file. Here we have the comparison side by side: this is Microsoft Office and this is LibreOffice 5.2, which is correct. On the other side, if we look at this one, we see Microsoft Office and here we have master. So we see that somewhere between LibreOffice 5.2 and LibreOffice master a regression was introduced. This is the side-by-side comparison; we also have this one, the overlay comparison, and here is the horizontal comparison. Finally, this is the vertical comparison. You can see that this number is minus 2; it means that something was introduced here, some kind of regression. Sometimes it's not a regression: you check it and you see that they look quite similar, you don't see much difference.

But you can also use it to see improvements. Here you have this one: this was LibreOffice 5.2, and I think the bullets were incorrect, but if we check it with LibreOffice master, they are correct. So we have an improvement here. And the same goes for round-trip documents. So, let's say we find a regression like this one; this is the same one as before, but this one, for instance... yeah, we see that that's different. For instance, we see that here we have an empty page, and in LibreOffice... where was it? No, sorry, I can't find it now. Well, basically, when the number is negative, it says that a regression might have been introduced here, so you just need to check it. It's an easy way to find interoperability regressions.

So, yeah, that's all. Any questions?

Q: What tool do you use to generate PDFs from Microsoft Office?

A: Yeah, so we are using Linux; we are using Wine, with Microsoft Office running under Wine. And there is also another tool.
Well, you have to integrate it into the Office directory; it's called OfficeConvert.exe or something like that, and it allows you to convert files to PDF. And right now I'm using Microsoft Office 2010, but I think a new Wine was released a couple of days ago, and I think it has support for Microsoft Office 2013, so probably I'm going to test it and see if there are differences between one and the other.

Q: What is the speed of the testing?

A: The speed... well, it takes some time. Let's say I use 1,000 documents; the first time, it takes about a couple of days. Then the next time it's faster, because you already have part of the output. What we do here is: let's say I want to test master right now, so I use a previous version as the reference, and I have those comparisons. When, a week or some time later, I want to test it again, the reference is already there, so I just need to generate the PDFs and all the files for the new master. Then it's half the time. So the first time you run it, for 1,000 documents, it takes a couple of days; after that, it's one day. But the idea is to use many more documents than 1,000.

Q: The point is, in this case, if you are generating the documents and the PDF output files, you need three times the conversion time there. If you use the COM interface to Microsoft Office, you can keep Word running and load the next document into the same instance. So, could you use the COM interface to talk to Microsoft Office?

A: Yeah, we are using it.

Q: OK, then you can keep it running, so maybe you're already doing that.

A: Well, the thing that takes more time is the comparison of the PDFs. In fact, we are limiting the comparison to five pages, because, let's say, we have a document of a hundred pages; we limit it, because the tool is doing the horizontal comparison, the vertical one and so on and so forth, so it's quite heavy.

Q: Is the page-by-page comparison your own code, or...
A: It's Miloš's code.

Q: It's not a general library?

A: No, he wrote it, and in his repository you can find everything. Basically, what I did in my repository is adapt that code to what we need in LibreOffice.

Q: Could you distribute the comparison? If it's the heavy step, maybe you could spread it across several machines.

A: Well, right now, as we are limiting the comparison to five pages, it's not that heavy; the worst scenario may be three minutes for a comparison.

Q: You say 1,000 a day; we do the crash testing and we run maybe something like that a day, and each of these tests takes a lot of time, like 10 or 15 times what we normally do for the export crash testing, where we do all the imports and exports and run through them. So the comparison part is maybe five times that, based on the numbers you said. So we should maybe distribute the comparison to permit running it on multiple machines.

A: Yes.

Q: It looks like we could do that in a reasonable amount of time when we have the machines. Do you get false positives? You see a difference in the PDFs, but if you look in the editor, you don't see the difference?

A: Yes, I see the differences in the PDF but then I don't see them in LibreOffice, for instance. There are some false positives as well.

Q: How large is the portion of all the issues that comes from the PDF export?

A: I found just one, so that's one out of, let's say, 1,000. In LibreOffice, the PDF export code doesn't change as much as the .doc or .docx filters and things like that; there are more things going on in the .doc filter, the .docx importer and so on. So it's more common to find regressions in the .doc filter, in the importers and so on, than in the PDF export. We also plan to use the tool to find regressions in ODF, but for that we need a reference that we know is correct,
whereas here we are testing against Microsoft Office, and sometimes it's not correct either.
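Two of the speed measures discussed in the questions, capping each comparison at five pages and fanning the independent PDF comparisons out over workers (or, with a job queue, several machines), could look roughly like this. Only the five-page cap comes from the talk; the names and the scoring stub are illustrative:

```python
from concurrent.futures import ProcessPoolExecutor

MAX_PAGES = 5  # the tool compares at most the first five pages of a document

def pages_to_compare(page_count, limit=MAX_PAGES):
    """Only the first `limit` pages of a long document are compared,
    which bounds the cost of the four per-page comparison methods."""
    return list(range(min(page_count, limit)))

def compare_pair(pair):
    """Stub for one PDF-vs-PDF comparison.

    The real tool computes four numbers here (horizontal and vertical
    alignment, side by side, overlay); a constant stands in for them.
    """
    reference, candidate = pair
    return (reference, candidate, 5)

def compare_all(pairs, workers=4):
    """Each comparison is independent of the others, so a process pool can
    run them in parallel; the same idea extends to multiple machines."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compare_pair, pairs))
```

Since the conversions and the comparisons touch disjoint files per document, either stage could be distributed this way without changing the rest of the pipeline.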