So, good afternoon to those who are still awake. I want to congratulate you on your great luck: you are seeing the Document Liberation Project presented in public for the first time. We launched today at 13:00 Central European Daylight Saving Time, and we are going to wash your brains about what we are doing and who we are.

So, the agenda. The public agenda is to speak about the project and give you some boring details about what has happened since last year. For the hidden agenda you will have to be careful and listen, because it is between the lines and it will be revealed towards the end. Oh, this thing works.

OK, so the history of the project. It was launched officially today. How it came to be: a group working on file formats within LibreOffice formed in 2011 around a Google Summer of Code project that produced libvisio, a library to parse and render Visio files. From the beginning, when this library was still little, we started to feel that we were doing something bigger than LibreOffice, and that we were actually opening the file format for the free and open source world. We took it as our mission, as a service of the LibreOffice community to the wider FOSS world, because LibreOffice uses an enormous quantity of open source code that we didn't produce, so it was also a good thing to give something back. This feeling was confirmed, because many other projects now use the different libraries that we produced, like Inkscape, Scribus and Calligra, and some proprietary applications too, because we release under MPL 2, so they can use the code, but they have to give us back the code if they modify a library.
At a certain point we ran into proper scalability issues, because there is a hardware limit: an individual can't produce more than 24 man-hours per day. So we realized that there were two possibilities. Either someone pays us to do this as a day job, which is not extremely easy to achieve, or we attract more people, which is also not easy, but looks an order of magnitude easier. That is also why we split from LibreOffice. We are still part of The Document Foundation, but we split from LibreOffice proper because we don't want other people to feel that they are committing treason against their own project by contributing to LibreOffice. So we extracted these libraries from the very beginning, and now, if you need to do something on file formats that are proprietary and not open, you can always come to the project and nobody can ever tell you that you are a LibreOffice developer.

About our project and what we believe. We have beliefs about ownership of documents: we think that documents and their content belong to their creators and not to software vendors. As a second step, we believe that access to content that you own should not be hindered by the fact that the application that created it is out of life, or that the operating system you are running does not run that application. We believe that truly open standards are the only long-term solution for document interoperability and for preservation of digital content for future generations. But we also believe that in the intermediary period, which might be quite long because we are not the ones determining how long it will be, having free and open source implementations of parsers for file formats that are not documented is the only way we can ensure that we will be able to read them in the future. With open source libraries you can always look at the code, and from the code you can actually understand how the file format is composed.
So what, in that case, is our mission? Our mission is to try to understand file formats by looking at them in hex editors, trying to apply brain to them, trying to cut them and, yeah, all these nice long-winter-night activities, and to implement parser libraries for those that we understand well enough. That means extracting data from them, and by data we mean not only content stricto sensu but also formatting and positions; basically, converting them. We also want to be good citizens of the ODF ecosystem, although we actually have generators for file formats other than OpenDocument: for all the graphics we generate SVG. Why did we settle around OpenDocument? Because OpenDocument is a file format that covers many types of documents, and basically these are the types of documents that you can find in the wild: you can find spreadsheets, text documents, graphics, presentations and databases, although the database is not specified in OpenDocument. So now the interesting part is finished and we are going to tell you something about the boring details. Valek will tell you something about OLEToy.

Actually, I decided not to take too much time talking about all the small details that happened in OLEToy. I chose the three most important things, so I will tell you two things and show one picture. The first thing, which is very important for us: we now understand the Adobe PageMaker format, and a library was even started to implement an importer for it. It is for versions 3 to 7 of the format, so once it's done I believe the Scribus team will be able to use it for their application. The second most important thing is that we now have one more contributor to OLEToy, the tool which is used to reverse engineer files, and that's the tall, shy guy here, David Tardon. He implemented support for a few file formats, which are listed on the screen: Software 602 Text; Zoner Callisto, also known as Zoner Draw; Zoner Zebra, which is the predecessor of Callisto; and also the newest versions of Apple Keynote 6, Pages 5 and Numbers 3, which have
become binary suddenly.

And now the last thing, and it's especially for Simon. He had been asking for a few years whether it would make sense to implement a binary diff to do all these things, so it was implemented. It's actually on the screen at the moment; there are two dialogs in the middle. I don't know if I can use something like that. No, of course. OK, well, there is a small dialog which allows you to select which parts of a file you are going to compare, and when the result is generated it's a bigger thing. Here we have two things: orange marks the bytes which are different between the two parts of a file, and green is where you need to add something. Yesterday we successfully used it in some of our reverse engineering activity. It could also be blue, if you need to add something on the left side. That's all my boring details, and now I'm giving it back to Fridrich for the really boring details.

So, very quickly, new libraries. We have added some new libraries since last year. David produced a library that is called libetonyek. If you want to know what it supports, you can read the name from the other side: it supports Keynote documents, and we are extending support to Numbers and Pages, so free and open source software will be able to open those funny Apple documents. Another one is libe-book, which supports a host of e-book file formats; it parses different flavors of HTML and imports them as a normal document. libfreehand: we started the library, but we are not shouting so much about it because we don't have the fills yet. It is a library that parses the FreeHand file format, which is one of the funniest file formats that we have ever seen. It is not fully reverse engineered, but still it was quite fun to do. Around Christmas we asked ourselves how it is possible that we can't open AbiWord documents. It was getting dark early in the evening, so we wrote libabw, and now we can load the documents of our cousin. And there are more libraries coming, if we are still alive. We added new document types in the API. So
before, we were able to convert graphics and text documents, but we realized that we would want to do some more things, so we added APIs for presentation documents. We first told ourselves that drawings and presentations would have similar APIs, but presentations have some things that drawings don't have, so we basically had to make another document model. I don't see. So, and we extended it to spreadsheets as well, so now we have four document types. But that's not very interesting for Libre Graphics, because we were always supporting graphics.

We split these libraries. We have a library that is called libodfgen. It is a library that takes these document models. The document models are basically callbacks that the parser libraries call, and from these callbacks you can generate documents. We have these generators, a generator for ODF. And since Femke is showing me those numbers that I don't understand, I will try to go a little bit faster.

We made something that we called librevenge. It has nothing to do with revenge, although we say that librevenge is sweet; the name comes from reverse engineering. Basically, before, all these APIs were embedded in the first library that we produced for each type of document. The graphics APIs, for example, were in libwpg, which was the first library that we used for graphics. We are going quick. OK. I would be really surprised if someone was not hearing me. My wife has to put plugs in her ears so that she can have a little bit of quiet. So basically we extracted the APIs plus the common types into one library, so that, for example, a graphics application doesn't have to pull in libwpg, a library that parses WordPerfect documents, just to have some types. This basically created a nice structure. We have some stream interfaces in this library. Again, it is a C++ library, so you have pure virtual interfaces and different implementations. And you have generators; we generate several file formats from these
APIs, like SVG. There is an exception: the SVG generator for drawings is in the main library, because we have been producing SVG, for example for Inkscape, since libvisio started to exist, so we didn't want to split it out into an optional library when applications basically need it. And yeah, I will upload the slides to SlideShare so that you can go through them and internalize all this crap that I wrote here.

What is the advantage of this design? Each parser library is independent and self-contained. It's easier for filter writers, it's easier to find where the APIs are, and we avoid sucking in unrelated libraries. It's also a considerable reduction of code duplication, which is good because there is less risk that a bug fixed in one library still exists in another. And for someone who wants to write a library parsing a document it is faster, because the APIs exist and the types exist; you can just start to write your first function and go on from there.

OK, now, as I see in your faces, you are all excited and you want to be part of this. So how can you contribute? You can develop code. It's very easy: you can contribute to one of our existing libraries or start a new one, and we are there to advise you, to guide you, to pat your back, to Facebook about what you are doing and everything. Or, if you like to look into a hex editor until you are blind, you can help in understanding and documenting file formats. We use OLEToy, which Valek presented, because it is much better than counting offsets in GHex, or I don't know what it's called on the other side. Or you can prepare sample documents for us. We have many of these applications, I still don't see, we have many of these applications through our MSDN subscription and these kinds of things, but we don't have everything. I don't have many Adobe applications, so if we need to understand Adobe file formats, it's good if one of the graphics people has an Adobe application, maybe PageMaker or InDesign, and generates a host of documents with
atomic features, so that we can try to understand how the documents are built.

What is the future? We have some projects that might go through in Google Summer of Code within LibreOffice. The students will be extremely blessed by that, because they will be working with outstanding mentors like David and Valek. We have some file formats that are already reverse engineered and are now ready for straight engineering. They are Apple Numbers and Pages: there is a project for that in Google Summer of Code, and we actually have a student who wants to do it, so let's see whether there will be a place for him. Or Adobe PageMaker: we also have a student, and we are just seeing whether we manage to push our will inside the project to have this project.

So thank you. You can go to our Twitter handle; we tweeted a nice, OK, nice, we are not designers, we are just zeros in design, but there is a wallpaper that you can put on your computer. There is no virus in it and it's not calling home. Or you can check our website, documentliberation.org, and see everything that I said here with Valek, but in a more intelligent form. Thank you.

So, while the last presenter is setting up, we have time for one question.

My question is about WMF and EMF, the Windows Metafile formats. Would those also fit into this model, or are you planning to leave the support for those in OpenOffice as completely separate?

Open what?
In LibreOffice, sorry.

To the microphone. So, we know that there is a big demand for WMF and EMF. The problem is that it's not very easy to extract the code from LibreOffice, because the LibreOffice code for EMF and WMF is not converter code but renderer code, and its design is extremely intelligent: it takes the records, channels them into some file format, passes that to the renderer, and the renderer renders it. I have a skeleton of an EMF+ library, and there is a specification for the format, so provided that someone would be interested in helping with it, we could have mentoring bandwidth for it. But I see that there is some support already, so it's more fun for us to open something that nobody has been able to open or load for a while.

OK, more questions? Before we go into a second bonus question: we're looking for a small HDMI, mini-HDMI, mini-to-normal converter. Can anyone provide one?

OK, so your question is the same as mine. One more.

At some point I tried to reverse engineer a binary file format; it was for a lighting design program. I tried to get some information about how to reverse engineer and how to reason about binary file formats, and I failed to find proper documentation, or books and such. So can you guide me to something? I need that.

So the question is how to do it. OK, well, for me it started like this: take a binary file, either make one or take it from a source, because at that time I did not have access to Visio, so I asked my friends to send me files made to some specification. Make small changes, save the file, look for differences. That's why Nomis was asking for a diff, because it seems natural to look for differences between two files where you know what the difference should be, and then you can say that if the difference is here, it should mean this and that. I used Gnumeric as a tool for storing all this information. Surprisingly, the guy who used OLEToy to make the Yamaha YEP file converter was also using spreadsheets for that. And once
I found that the performance of this process was too low, I just made an application which looks like a hex editor on steroids. When you do that, you just open files; there is nothing that will let you open a file and tell you this one is this and that one is that. It just comes with experience. When you work with some file format, suddenly you will find that you open it, look at it and say: this one is an 8-byte IEEE 754 floating-point number. What you normally do: you know that the file starts at the beginning, so you look at 8-byte integers and check whether some of them, in whichever endianness, could point to something within the document. Then you go there and see whether it makes some sense, and you proceed by trial and error. That's why it's good for long winter evenings.
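
The two techniques from this answer, diffing two saves of a file and hunting for integers that could be offsets into it, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how OLEToy does it: the function names `diff_ranges`, `candidate_offsets` and `as_doubles` are made up for this example, and the sample bytes are invented rather than taken from any real format.

```python
import struct
from difflib import SequenceMatcher


def diff_ranges(old: bytes, new: bytes):
    """List the non-matching runs between two saves of the same document.

    Each entry is (tag, old_start, old_end, new_start, new_end), where tag is
    'replace', 'insert' or 'delete'.
    """
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


def candidate_offsets(data: bytes, width: int = 8):
    """Yield (position, byte_order, value) for integers that may be offsets.

    An integer is a candidate when it is non-zero and smaller than the file
    size, so it could point at something inside the document. Both byte
    orders are tried at every position.
    """
    code = {2: "H", 4: "I", 8: "Q"}[width]  # unsigned struct format codes
    for pos in range(len(data) - width + 1):
        for order, prefix in (("little", "<"), ("big", ">")):
            (value,) = struct.unpack_from(prefix + code, data, pos)
            if 0 < value < len(data):
                yield pos, order, value


def as_doubles(data: bytes, pos: int):
    """Read 8 bytes at pos as an IEEE 754 double in both byte orders."""
    (little,) = struct.unpack_from("<d", data, pos)
    (big,) = struct.unpack_from(">d", data, pos)
    return little, big


if __name__ == "__main__":
    # Two invented "saves" that differ in a single byte.
    before = bytes([0, 1, 2, 3])
    after = bytes([0, 1, 255, 3])
    print(diff_ranges(before, after))

    # An invented file: a 4-byte little-endian offset, padding, then a payload.
    sample = struct.pack("<I", 8) + bytes(4) + b"payload!"
    print(list(candidate_offsets(sample, width=4)))
```

`SequenceMatcher` is slow on large inputs, so this is only reasonable for small sample files or for selected parts of a file. The offset scan will also report false positives; as said above, you still have to go to each place and see whether it makes sense.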