I'm going to talk about this tool for interoperability regression testing.

So, first of all, about me: my name is Xisco Faulí and I'm from Spain. I've been a QA engineer since September 2016, so five months now, more or less. This is my email, if you want to contact me, and my username; normally I'm in the LibreOffice QA channel. I've been working on this tool for the last couple of months.

So basically what it is: it's a set of scripts you use to generate PDF files from different office suites, and then you compare those PDF files to see how different they are. You grade those differences on a scale from 0 to 5. There are four kinds of comparison: vertical alignment, horizontal alignment, side-by-side comparison and page overlay. You can use it with any office suite that supports the command line, so you can use it with Microsoft Office, LibreOffice, OpenOffice and so on.

This tool was developed by Miloš Šrámek, who is Slovak. He first presented it at the plugfest in 2011 in Berlin, later at the LibreOffice conference in 2013 in Milan, and this year he presented it at the LibreOffice conference. I attended that conference, and it was the first time I heard about this tool. I talked to him about it and we thought about using it in LibreOffice to find regressions in documents. You can find the code here on GitHub: this is Miloš's repository, and here you can also see his presentation from Brno.

So when I talked to him in Brno, we thought it could be a good idea to use it on the master branch, so we could find interoperability regressions as soon as possible. The idea is to use as many files as possible. And since many of the bugs we have in Bugzilla are related to Microsoft Office and the formats it uses, we thought the best would be to compare against Microsoft Office. But we could use any
other office suite to compare against.

So, some numbers from Bugzilla. We have some keywords in Bugzilla, so I used the regression keyword and the filter keywords. Right now we have 40 open regressions out of 69, 76 DOCX regressions out of 155, and for RTF we have 18 regressions out of 176. So the numbers say that we can use it.

Then I started working on this tool and, well, I found some bugs in it, so I just fixed them. I also added support for Impress documents. And then I added the possibility to have different output formats, because with what Miloš did you could use different input formats, but there was only one output format. So now it's possible to have several output formats.

And now I have moved it into a virtual machine; this is the virtual machine we are using. At first I was running it locally, so my memory was the limit and it was complicated. But now we have a virtual machine with 100 gigabytes, so we have plenty of space to use this tool. And here you can find my code. So this is...
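As a rough illustration of generating files from the command line with several output formats, here is a minimal Python sketch. It is not the tool's actual code. It assumes LibreOffice is available as `soffice` on the PATH and uses its `--headless --convert-to --outdir` options; the function names and the directory layout are hypothetical.

```python
from pathlib import Path

# Assumption: LibreOffice is installed and reachable as "soffice".
SOFFICE = "soffice"


def convert_command(source: Path, target_format: str, outdir: Path) -> list[str]:
    """Build one headless LibreOffice conversion command (not executed here)."""
    return [
        SOFFICE, "--headless",
        "--convert-to", target_format,
        "--outdir", str(outdir),
        str(source),
    ]


def plan_conversions(source: Path, output_formats: list[str], outdir: Path) -> list[list[str]]:
    """One direct PDF conversion plus one conversion per requested output format.

    The "pdf" and per-format subdirectories are a hypothetical layout.
    """
    commands = [convert_command(source, "pdf", outdir / "pdf")]
    for fmt in output_formats:
        commands.append(convert_command(source, fmt, outdir / fmt))
    return commands


# Print the planned commands for one input file and three output formats.
for command in plan_conversions(Path("sample.docx"), ["doc", "docx", "rtf"], Path("out")):
    print(" ".join(command))
```

Each command could then be passed to `subprocess.run`; building the list separately keeps the plan easy to inspect before anything is converted.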
Nowadays these are the formats we are testing: DOCX, DOC, RTF, and PPTX and PPT. In order to get the files I'm using a script, I think by Caolán and probably Markus. With this script we are basically just getting the files from all these bug trackers, so we have many documents available.

So, how this tool works. We have these original documents here, and let's say these documents are the input formats; say DOC, DOCX and RTF. What we do is convert them with LibreOffice to PDF, so we see how they are imported into LibreOffice. We also define the output formats; say DOC, DOCX and RTF. So we convert the documents with LibreOffice to those formats as well, and then we open those files in Microsoft Office, because it's the tool we want to compare against, and finally we get a PDF for each format. On the other side, at the same time, we open the original in Microsoft Office and generate a PDF. And eventually we compare all these PDFs against this one. We use the horizontal alignment comparison, the vertical comparison, side by side and the page overlay comparison, so in total we have four numbers for each of these PDFs.

So, the things we are testing with this tool. First, we test LibreOffice import here. Then we test LibreOffice export; all this part is testing LibreOffice export. We also test LibreOffice PDF export. Let's say we find a problem here in the comparison, but when we open that file in LibreOffice we see it's correct; then we export it and we see that the problem is in the PDF export. So we are also testing that. Also, as we are using timeouts, while testing the tool I found some import and export timeout problems. I also found that sometimes, let's say, we generate a DOC file and try to open it in Microsoft Office, but there is a corruption problem. So that's something we are also testing with this tool. And yeah,
eventually we are also testing for interoperability regressions, which is the goal of this tool.

So I'm going to show you an example of what the output of this tool looks like. Once we run the scripts, in the end we have this spreadsheet. Here we have the import part, and here we have the export part. This is something I did locally: I tested it with 64 documents and it found 9 regressions. I used this build, which I think is the first commit of LibreOffice 5.2, and I also used a master build from a commit I built locally.

So for instance, let's take a look at this file. Here we have the side-by-side comparison: this is Microsoft Office and this is LibreOffice 5.2, and we see it's correct. On the other side, if we look at this one, we see Microsoft Office and here we have master. So we see that between LibreOffice 5.2 and LibreOffice master a regression was introduced. That was the side-by-side comparison. We also have this one, the vertical comparison, here is the horizontal comparison, and finally this is the overlay comparison. You can see that this number is minus 2, which means that something was introduced here, some kind of regression. Sometimes it's not a regression: you check it and you see that they look similar and there isn't much difference.

But you can also use it to see improvements. Here you have this one: this was in LibreOffice 5.2, and I think the bullets were incorrect. But if we check it with LibreOffice master, they are correct. So we have an improvement here.

And it's the same for round-trip documents. Let's find a regression, like this one. This is the same one as before. This one, for instance: well, that's different. For instance, we see that here we have an empty page. And in LibreOffice... where was it?
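The numbers in the spreadsheet can be read as differences between the grades of two builds. The following Python sketch shows that idea only; it is not the tool's code. It assumes that a higher grade means a closer match to the Microsoft Office PDF and that the difference is computed as new build minus reference build, so that a negative value such as minus 2 points to a possible regression. Both the sign convention and all names are assumptions.

```python
from dataclasses import dataclass

# The four comparison kinds mentioned in the talk.
COMPARISONS = ("horizontal", "vertical", "side_by_side", "overlay")


@dataclass
class Grades:
    """Grades (0 to 5) of one PDF against the Microsoft Office PDF."""
    horizontal: int
    vertical: int
    side_by_side: int
    overlay: int


def grade_deltas(reference: Grades, candidate: Grades) -> dict[str, int]:
    """Candidate minus reference for every comparison kind.

    Assumption: a higher grade is better, so a negative delta is a loss.
    """
    return {
        name: getattr(candidate, name) - getattr(reference, name)
        for name in COMPARISONS
    }


def possible_regressions(deltas: dict[str, int]) -> list[str]:
    """Comparison kinds whose grade got worse; these still need a manual check."""
    return [name for name, delta in deltas.items() if delta < 0]


def possible_improvements(deltas: dict[str, int]) -> list[str]:
    """Comparison kinds whose grade got better."""
    return [name for name, delta in deltas.items() if delta > 0]


# Hypothetical grades for one document in an older build and in master.
older_build = Grades(horizontal=5, vertical=5, side_by_side=5, overlay=5)
master = Grades(horizontal=5, vertical=3, side_by_side=5, overlay=5)
print(possible_regressions(grade_deltas(older_build, master)))
```

A non-zero delta is only a hint: as said in the talk, the flagged files still have to be opened and checked by hand.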
No, it's right. I can't find it now. Well, basically, when the number is not zero, a regression might have been introduced. You just need to check it. It's an easy way to find interoperability regressions. So yeah, that's all. Do you have any questions?

What tool do you use to generate the PDFs from Microsoft Office?

We are using Linux, so we have Microsoft Office running under Wine. There is also another tool that you have to integrate into the Office directory; it's called Office Convert tool, something like that. It allows you to convert files to PDF. Right now I'm using Microsoft Office 2010, but I think Wine 2.0 was released a couple of days ago and I think it has support for Microsoft Office 2013. So probably I'm going to test it and see if there are differences between one and the other.

What is the speed of testing?

Well, it takes some time. Let's say I use 1000 documents: the first time it takes about a couple of days. What we do here is compare. Let's say I want to test master right now, so I use a previous version as the reference. Then in a week, or after some time, I want to test master again. I already have the reference results, so I just need to generate the PDFs and all the files for the new master, and that's half the time of the first run. So for 1000 documents, the first time it takes a couple of days; after that it's one day. But the idea is to have more than 1000 documents. In this case, for instance, if we are generating DOCX and PDF output files, then you need three times the comparison time.

If you use the COM interface to Microsoft Office, then you can keep Word running and then load and save documents, and you can probably get a significant speed-up. You could use the COM interface to talk to Microsoft Office.

Yeah, we are using it.

Okay, then you can keep it running. So maybe you are already doing that.

Well, the
thing that takes more time is the comparison of the PDFs. In fact we are limiting the comparison to five pages. Let's say we have a document of 100 pages; we limit it because it's doing the horizontal comparison, the vertical one and so on and so forth, so it's quite heavy.

Is it Miloš's code? It's not a general library?

No, he wrote it, and you can find it in his repository. You can find everything there. And basically what I did in my repository is adapt that code to what LibreOffice needs.

Can you distribute the comparison? If it's heavy, maybe we can spread it across several machines.

Well, right now, as we are limiting the comparison to five pages, it's not that heavy at all. In the worst scenario it may be three minutes for one comparison.

You say a thousand a day. We do the crash testing, and we run maybe something like a thousand a day. So it takes a lot of time to do each test; it's like 10 times, 15 times what we normally do for the export crash testing, which does all the export and round trip, on top of that. So the comparison part is also the time-consuming part. So one solution may be to distribute the comparison across machines, to be able to actually do that in a reasonable amount of time.

So it's not that heavy. It's not heavy at all. I should look at it.

In my experience, you're seeing the differences in the PDFs, and if you look in the WYSIWYG editor you don't have the difference.

I see the differences in the PDF, but then I don't see them in LibreOffice, for instance. There are some false positives as well.

How large is the proportion of all the issues which comes from the PDF export?

I found just one. In LibreOffice, the PDF export code is not as much as the DOC or DOCX filters and things like that. There are more things going on in the DOC filter and so on. So it's more
common to find regressions in the DOC filter, import and so on, than in PDF export.

We also plan to use it to find regressions in ODF, but we need a reference that we know is correct. Although here we are testing against Microsoft Office, and sometimes it's not correct either.
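The timing figures from the question about speed can be turned into a rough back-of-the-envelope estimate. The Python sketch below uses the numbers from the talk (about two days per 1000 documents on a first run, about one day on later runs) and assumes, as a simplification not stated by the speaker, that run time scales linearly with the number of documents and with a multiplier for the output formats. The function and parameter names are hypothetical.

```python
def estimate_days(
    documents: int,
    first_run: bool,
    format_multiplier: float = 1.0,
    days_per_1000_first: float = 2.0,
    days_per_1000_later: float = 1.0,
) -> float:
    """Rough run time in days.

    Assumptions: linear scaling in the number of documents, and one
    multiplier covering the extra output formats (the talk mentions
    "three times" in one example).
    """
    per_1000 = days_per_1000_first if first_run else days_per_1000_later
    return documents / 1000 * per_1000 * format_multiplier


# First run and a later run for 1000 documents.
print(estimate_days(1000, first_run=True))
print(estimate_days(1000, first_run=False))
# A later run over a larger corpus with the "three times" multiplier.
print(estimate_days(5000, first_run=False, format_multiplier=3.0))
```

Estimates like this mainly show why limiting the comparison to five pages, or distributing it across machines, matters once the corpus grows well beyond 1000 documents.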