My name is Simon Schumann. I'm here sponsored by Subsoftware. I'm a freelancer. I have worked on ODF, let's say, I worked for StarOffice since 1999, and later on for OpenOffice, for Sun, and for Oracle. After Oracle dismissed the team in Hamburg, I continued as a freelancer to carry on my work on the standardization, because I worked at that time on the browser office and I wanted to sell my knowledge to people who were interested in continuing what I had loved to do in the past.
So my theme is ODF and the enterprise. There are many ways to think about it, and you could even write a thesis on this, but I would like to focus on just three things here. I will neglect things like encryption, signatures and a lot of other stuff. What might also be part of a thesis is to analyze what enterprises with large sets of documents are actually doing. What I know from the past is that there are three main areas which are very typical for large enterprises like Amazon: generation, metadata and collaboration, which I know is an ongoing project for others.
So, just to get you all on the same level, I would like to start the generation part with a basic look at what ODF is. I know that Josh will tell you a little bit more about the standards and their differences, so let's skip this. But this much is certain: there are three parts. One is the XML schema that is used for the files that are being zipped. Here you see a typical document, and it is split into content, styles, meta and a few other things that are discussed a little later on. The next part is the formula language, which is basically used for spreadsheets. The last part is the package, the zip, which could be reused for other formats. And I would like to tell you more about something that is very important for enterprises: validation.
For the common user it is not so important to have a validator, but if you want to be certain about the documents that you're creating, most likely not only on a common client but also on a server, then you have to be certain that others can use them, that you have this interoperability with other applications. And you can be certain of this, or much more certain, if you validate them. And this is the validator. It's part of the Apache ODF Toolkit, an incubating project. And this is a frontend that I once wrote for the OpenDoc Society. It's hosted here in the Red Hat cloud. You can access it. And it's very easy to install it on your local machine. It's just a Java-based web archive, a so-called WAR. So if you have any developer at hand, just download it, build it, then download some web server like Tomcat and include it. Then you have the same frontend on your own machine. And it's also being used to validate on a large scale.
What it does is this: in the specification there are three levels. You may use something, you should use it, or you shall use it, meaning you have to. The should and the shall are being checked. So if a should is not followed, you get a warning. And if a shall is neglected, then it's an error. By this, we have a much better understanding of what the problem might be.
So let's get to another thing, which you might need, for instance, for server-based editing of a document. Just to give you a quick impression here, we have two basic layers. One is the package layer, which corresponds to the third part of the specification. So you can easily access the files in the package and put your own files into it, from Java, with a single line of code. The other is the XML layer: because not all developers know all of the ODF XML, which has something like 600 elements, and you want to be certain that everything is in the right place, we have some classes generated from the schema.
So every element is represented by a class, and there are methods that give you the ability to add the permitted child elements and attributes. So you have some guidance here. And the document API is a high-level view. It was started to give you the user's view, like: please insert a paragraph, insert some text, insert a table. It's for those people who don't want to get too deep into the XML details. So this was created, or started, by Simon Iyer, I can say, eight years ago, and was later donated to Apache, and it is still used by companies like Open-Xchange for the back end of their browser-based web office.
Another thing for generation, which was mentioned by Michael Meeks as well, is mail merge. The interesting thing is that mail merge is not a specific ODF feature. You won't find anything called mail merge in the ODF specification, but it is a very common feature that offices like Microsoft Office and LibreOffice provide. You might know it: you create just one template, one basic letter, and get slight variations based on the data you put into it. Say you want to invite people to a birthday party or you want to write to a customer: you only have to write the letter once and change the data, which saves a lot of time. As far as I know it works quite well when used for between 10 and 60,000 people. But CBE is working with customers, insurance companies and banks, that require far more, even billions of documents. For this reason they created something else, something called JS Merge. This is the only commercial slide here. There's a later talk about it from Tossel that gives you more detail, so I want to keep it simple. There's the ability not only to include, add and omit text and paragraphs, but also arbitrary content, even components that are inserted from a path.
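To make the mail merge idea concrete, here is a minimal sketch in Python: one template, many data records, one letter per record. It works on plain text only, not on ODF XML, and it is not how any office suite or the product mentioned above implements it. The template text, the field names and the records are invented for illustration.

```python
from string import Template

# Hypothetical letter template; the $-placeholders stand in for merge fields.
LETTER = Template(
    "Dear $name,\n"
    "your policy $policy_id is due for renewal on $due_date.\n"
)

# Hypothetical data records, e.g. rows exported from a customer database.
RECORDS = [
    {"name": "Ada Example", "policy_id": "P-001", "due_date": "2015-03-01"},
    {"name": "Bob Sample", "policy_id": "P-002", "due_date": "2015-04-15"},
]


def mail_merge(template, records):
    """Yield one filled-in letter per data record."""
    for record in records:
        # substitute() raises KeyError when a record lacks a field,
        # so incomplete data is noticed instead of silently printed.
        yield template.substitute(record)


letters = list(mail_merge(LETTER, RECORDS))
for letter in letters:
    print(letter)
```

The letter is written once and only the data changes, which is the whole point of the feature.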
And the reason I wanted to show this as well, besides the commercial side, is that there might be an idea to extend the ODF standard to have a much more common way of generating documents. When I ask a few friends who work in insurance companies and banks, there are a lot of people doing this for their own use: oh, this is easy, I just write a document generator on my own. But the problem is that the code they're using is all different. So if one bank buys the other, they have two different kinds of software and nothing is compatible. So there might be an interest, to be discussed, in having a common generation and templating mechanism. Especially a better include mechanism, that might be something for a later talk, to be able to insert other documents inside your document. So that's what I see.
And of course, I think the most important thing is the templates. A template is basically the same thing as a common document. It only has a different MIME type. The MIME type is an identification that is stored at the beginning of the zip. If you open one of your ODF documents in a text editor, you can see right at the start, without any decoding, in plain text what kind of document it is: a text document or a text template. And by this MIME type, and by the suffix, the application knows that this document has to be opened differently. In this case you don't edit the document you load, but you get a new document which is initialized with the document you just provided. So you don't overwrite the template; that's the whole simple mechanism.
And something that has nothing to do with ODF, but is best practice, is to take central control of the templates of the company, because there's a lot of chaos going on, so it's very good to know where your templates are and to control them. The city really had to learn this lesson, and it made big improvements.
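Coming back to the MIME type mentioned above, the following Python sketch shows why it is readable in a text editor and how an application could tell a template from a normal document. It builds a tiny in-memory zip that only mimics the package layout (the `content.xml` body is a placeholder, not valid ODF), and the helper names `make_package` and `is_template` are made up for this example.

```python
import io
import zipfile

TEXT_MIME = "application/vnd.oasis.opendocument.text"
TEMPLATE_MIME = "application/vnd.oasis.opendocument.text-template"


def make_package(mime):
    """Build a tiny in-memory zip that only mimics the ODF package layout."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry comes first and is stored uncompressed,
        # which is why it shows up as plain text at the start of the file.
        zf.writestr(zipfile.ZipInfo("mimetype"), mime,
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("content.xml", "<placeholder/>",
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def is_template(package_bytes):
    """Decide from the mimetype entry whether the package is a template."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        mime = zf.read("mimetype").decode("ascii")
    return mime.endswith("-template")


doc = make_package(TEXT_MIME)
tpl = make_package(TEMPLATE_MIME)
print(is_template(doc), is_template(tpl))
# The MIME type is visible in the raw bytes near the start of the zip.
print(TEXT_MIME.encode("ascii") in doc[:120])
```

An application that sees the template MIME type would create a new document initialized from the package instead of opening the package itself for editing.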
So if you consider this, there might be a lot to learn from the city. Two other things that are especially important for companies are the integration with other applications, as mentioned, and of course macros, because macros are not specified; they are not part of the ODF language. So it might be that Microsoft macros work with LibreOffice, but there's no guarantee and there's no testing. The same goes for LibreOffice macros. Another thing is accessibility: for the color blind and for the blind you have to meet a certain standard. At the end there are some references, and the UK government gives wonderful guidance on ODF and especially on accessibility. So if you work in this area, or if you plan to work in this area, this is a very good source of information.
But the next thing is metadata in ODF, which I will explain right now. Metadata has been in the document format since 1.0. It's defined as structured data about data. It's machine-readable data, and it helps you to identify, categorize or abstract your data. There's a predefined set of metadata in the meta.xml file that we have in the zip. It's very easy: if you're ever interested, just rename your ODF document to .zip and open it, and you can see the files, copy them out and put them back in. This is what I did here. jEdit and similar editor plugins can even edit and save directly in the zip, which is quite helpful to start testing. So here's an example with the page count and paragraph count of this document, the creation date, and sometimes an author, so it's very helpful.
But what if you have your own metadata? Your own kind of information, like: this is an invoice, or there's a certain owner of this document. You want to attach it to the document, and you send the document around, maybe by mail, maybe through other processes. What's the status of this? You want to analyze this by machine. With ODF 1.2 we added a new generic mechanism for metadata. This is the W3C RDF standard.
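The predefined metadata in meta.xml described above can be read with a few lines of Python. This is a sketch under assumptions: the sample meta.xml is hand-written for illustration, with invented values, and it is zipped in memory instead of being taken from a real document. Only the element and attribute names follow the ODF meta vocabulary.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample with made-up values, not taken from a real document.
SAMPLE_META = """<?xml version="1.0" encoding="UTF-8"?>
<office:document-meta
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <office:meta>
    <meta:creation-date>2015-01-31T10:00:00</meta:creation-date>
    <dc:creator>Example Author</dc:creator>
    <meta:document-statistic meta:page-count="3" meta:paragraph-count="42"/>
  </office:meta>
</office:document-meta>
"""


def read_meta(package_bytes):
    """Open the package as a plain zip and pull a few fields from meta.xml."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        root = ET.fromstring(zf.read("meta.xml"))
    meta = root.find("office:meta", NS)
    stat = meta.find("meta:document-statistic", NS)
    meta_ns = "{%s}" % NS["meta"]  # attributes need the expanded namespace
    return {
        "creator": meta.findtext("dc:creator", namespaces=NS),
        "created": meta.findtext("meta:creation-date", namespaces=NS),
        "pages": int(stat.get(meta_ns + "page-count")),
        "paragraphs": int(stat.get(meta_ns + "paragraph-count")),
    }


# Zip the sample in memory so the example needs no file on disk.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("meta.xml", SAMPLE_META.encode("utf-8"))

info = read_meta(buf.getvalue())
print(info)
```

With a real document, the bytes passed to `read_meta` would simply be the contents of the .odt file.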
So usually you will never ever see it or hear anything about it, but I will bother you with it now anyway, because it might be interesting to know a little bit about it. Michael Stahl, here at the back, did the implementation for Writer, by the way. So if there are any questions about Writer, he might be the right one to answer them.
So what is this RDF? It came from the artificial intelligence field and was pushed by Tim Berners-Lee; it's one of his pet projects. It's a very simple language of subject, predicate and object, where you can say: every dot here is something you want to talk about. You have the subject; the predicate is always a URI or URL; and the object is either a URL or text. And you make sentences out of these. I might need a pointer, but can you see this? Yes, wonderful. So this one: this is a person, and the person is attending this meeting, and this meeting is being chaired by this one. This meeting has a homepage and a policy, and this meeting has a location. This is the URI representing this meeting, this is a predicate saying, oh, this thing has a homepage, and that's the homepage, most likely directly a URL. The nice thing is that you can have different files, like the different colors here, and if you load them all, they match together. You can have an arbitrarily large graph and traverse it. There's a very nice primer to read about this interesting field.
So ODF now has a mechanism to put RDF files into the zip, the orange part, and to reference elements within it, or whole files, by relative URLs. And we map this through the manifest.rdf, so you can see at one point, at a single point, what metadata exists in this file.
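The subject, predicate, object idea and the way separate files join into one graph can be sketched in Python with plain tuples. This is not an RDF library and not the RDF/XML serialization stored in the package. All URIs, the two "files" and the helper names are invented for illustration.

```python
# Each statement is a (subject, predicate, object) triple.
MEETING = "http://example.org/meeting/1"
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"

ATTENDS = "http://example.org/vocab#attends"
CHAIRED_BY = "http://example.org/vocab#chairedBy"
HOMEPAGE = "http://example.org/vocab#homepage"
LOCATION = "http://example.org/vocab#location"

# Two hypothetical metadata files, like the different colors on the slide.
file_a = {
    (ALICE, ATTENDS, MEETING),
    (MEETING, CHAIRED_BY, BOB),
}
file_b = {
    (MEETING, HOMEPAGE, "http://example.org/meeting/1/home"),
    (MEETING, LOCATION, "Room 1"),
}

# Loading both files is a set union: shared URIs join the graphs.
graph = file_a | file_b


def objects(triples, subject, predicate):
    """Return all objects stated for one subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def reachable(triples, start):
    """Traverse the graph from one node along subject -> object edges."""
    seen, todo = set(), [start]
    while todo:
        node = todo.pop()
        if node in seen:
            continue
        seen.add(node)
        todo.extend(o for s, _, o in triples if s == node)
    return seen


print(objects(graph, MEETING, HOMEPAGE))
print(sorted(reachable(graph, ALICE)))
```

Starting from the person in the first file, the traversal reaches the homepage and the location that only the second file states, because both files use the same URI for the meeting.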
And you have more abilities: you can also have part of the text of the document being part of the triple. The reason for this is that you then have no data redundancy. Here is a very simple example: the name of the doctor is in the document, his name is Dr. J. Frank, and that's the object of the triple, while the subject and the predicate are then attributes of the element. And this text:meta element is something like a span; they are identical, within the paragraph, but it cannot be split, that's all. So the last thing that I have to tell you about this is that we have a field, a certain metadata field, which was the reason why all this was initiated in the first place. The field tells you that its content was created by third-party software, like a plugin. Think of a citation plugin which gives you citations whose content is created by the plugin. The reason to have something like this is that many magazines in the US have different citation styles. You can simply choose the magazine you are writing for, and you have different types of citation, and they all change automatically, yes, in all the places where you cite, through this field. So much about the metadata standard.
So the last thing I want to talk about is collaboration. This is my pet project, and I want to stay on a very high level; I will neglect a lot of XML here. This is basically the talk I gave the other day. So let's come up with the requirements we have for working together, and why we go for this sort of solution. Collaboration means we want to work with a lot of people on a document, and you can do this since the 80s: we have put the file on a floppy disk and handed it around. It still works if every user just hands over the floppy disk and has exclusive access. But if you have simultaneous access, then it's very difficult, because then
I receive the files with all their possible changes and I have to find out what the changes are. That's very, very difficult, and it has to work across different applications. That's the reason for the ODF standard: that you have interoperability between different applications. For this reason the file format exists, but we only have interoperability on a complete file. So the easy way to solve the problem now is to do the interoperability by exchanging files. I can mail them and send them around, but this becomes very difficult because of the merges: if you have a very large document and a lot of changes, then you have a lot of difficulty finding out what has been changed. Also, many applications save the document differently. We have the same with source code, as the developers know: some are on Linux, some are on Windows, and they have different line breaks. The same might occur here. Some applications remove all the whitespace and write only one line to save space, and some do insert line breaks; some have spans that are nested, and so on; or even the prefixes here might change. So you might need to normalize everything to make things easier to compare.
But the problem is also that, unlike the HTML browser, where there always exists a DOM, we don't have a common runtime model. Files are usually read into the runtime model. They are being mapped, and this filters, because every feature that the application does not know is neglected, and what it knows is put into its application model. So now you have difficulties exchanging these files during runtime.
The next thing is that it becomes easier when you are the one making the change, because then you know the change; that's what you wanted to do. Why save a document and find out what the change was afterwards? Why not just exchange the change that you're making in the first place? The problem is that it's not standardized. We only have the file format. And also, you would have to exchange changes to the XML file, and that's a
problem for applications that don't know anything about XML. So the solution was to look at what's in common. It's basically the same thing in every office application: you insert things that are quite similar, like a table, a paragraph and text. So these are like logical blocks, or logical content, that we are adding, deleting or modifying, with similar properties. And the fine thing about it is that you have an abstraction that also bridges between the file format and the application models, and you might even think about abstracting across file formats, since other formats also have similar features, so you can go to a higher level.
The next thing you need: if I want to tell you that I inserted something, I need to tell you where. So it's good to say you start at the first position of your document. We count everything through the document, and then we tell you where we edited, like: at the third position I inserted a paragraph with "Hello World". So you know what my change is. And I can also say, oh, by the way, I changed the one-millionth paragraph and made it red. So this is very easy. The merge of these things is trivial, and, that's the best thing of all, it scales: the complexity is not dependent on the size of the document. It might be a very huge file; as long as you merge just the changes, the changes do not have to touch the complete document. So this is the basic idea.
So what do we gain from this, if we have the ability to have logical blocks and changes like: hey, I inserted at the third position a text piece or a paragraph? Once the change is complete, we can test differently: today we load and save a document, but then we can load, apply a certain change and save it back. By this we get much better performance testing, and we might also get better feature testing, which is currently not possible. The second thing, the most important thing, is
efficiency of merge, because whenever we do collaboration, it's all about merges. That's why Git is such a success: because the merge is so easy. Subversion and CVS had a lot of problems with merging, and if you make this very easy, then you make collaboration very easy. And of course the abstraction helps you, like the ODF Toolkit model we saw earlier: if you don't have to deal with the XML but with things you know, the logical units, it's much easier.
There are also some new features, in my opinion, if we use this. For instance, if we have a contract or a read-only document, we can send a change, or put it beside the document in the package; the changes are just a queue of changes. Or we can have an XML file beside a signed document, which can then be seen like an annotation of the contract without breaking the signature. And we can sign all changes as well. Also, as I mentioned earlier, we currently have the filtering of documents: we load everything that we know, and what we don't know we neglect. But this is what the web office of Open-Xchange, for instance, does: it just creates changes from the loaded document and then merges the new changes in. By this you can even have a full-featured document in a mobile application which only knows text and paragraphs: you correct some typos but still have all features in the document, by allowing these merges.
And the last thing, there are many other things, but: currently change tracking is done by saving the previous state. You have a before and an after XML, and you save the before XML just to be able to undo. But if you have two regions that are overlapping and you want to reject the first one, the second one is still tainted with the formatting of the first one. So what happens, when you try this, like with ABC, make AB red and BC green, and then you want to undo the first change, is that it simply does not work in any of the current office applications. For formatting this is not a big deal, but think of the metadata example, which can be
overlapping: this would destroy the, okay. And the last thing is that it might be possible to have a simple design for change tracking, undo/redo and the history. The problem is that current applications are quite old and their design is already made; it's very hard to redesign everything in the way we did now.
All right, that was it, basically. I've got a lot of references for you. As I said, there's the ODF guidance from the UK government project; then some use case for a payment document; the website of the subcommittee that is currently working on advanced document collaboration and standardizing these changes I've been talking about; an ACM paper about these changes that I wrote with Patrick and another engineer, which is interesting as a PDF, especially for the new features; and the next thing is a comparison between ODF and Office 2013 on change tracking. You will see that the only things Microsoft can do in addition are tracking the change of template styles, like Heading 1, which is just a minor thing, and the insertion of rows, of course. And finally there is an Open-Xchange example of what the revisions of a document, the list of changes, might look like. It's a JSON file; just send a mail and it will be attached. So that's it. Are there any questions?
Looking forward to seeing this in action, actually.
Well, which part? The change tracking?
I mean this collaboration with the change stuff.
Yes. Well, I'm over my time, but let me give you an example with Open-Xchange. They do a mail front end with mail attachments, and they edit the documents, like text documents and so on, within the browser. What they do is use the toolkit to translate a document into a list of changes. They send it to the browser, and the browser knows all these changes and builds the document from them. Every user action creates another change, and after a while these are sent back to the server. So the only thing that's going to
change hands is not a file, only changes. So the browser doesn't know whether the other side is another browser or a file. And it works quite well, and the feature set is much higher than it would be otherwise. A lot of OpenOffice guys are working on this. But it's not standardized; it's just a proprietary thing. It's a good prototype, and it does not cover all the standard scenarios, but collaboration is not the priority. And I skipped what you can do with this queue, like moving changes around, but it's quite interesting. Basically this is like Wave, Apache Wave, from Google; in Google Docs it's called operational transformation, that's how they do it. Also the API from Google Docs works the same way, so they do it the same way as well.
I'm not sure if your email is correct.
Please tell me. Thank you. www.shubert.com. Oh, is there something wrong?
No, the email is correct.
Okay. Yes, my name is wrong in the title, I think.
This is a quick question. Talk to you later.
What time is it? I believe Chelsea would like to take a look. Okay, thank you.
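The list of changes described in the talk and in the answer above (a JSON list that the receiver applies one by one) could look roughly like the following Python sketch. The operation names, the field names and the paragraph model are assumptions made for this example. They are neither the Open-Xchange format nor anything standardized, and positions here are zero-based list indexes.

```python
import json


def apply_change(paragraphs, change):
    """Apply one change to a document modeled as a list of paragraphs."""
    op = change["op"]
    pos = change["pos"]  # zero-based position, counted through the document
    if op == "insertParagraph":
        paragraphs.insert(pos, {"text": change["text"], "attrs": {}})
    elif op == "deleteParagraph":
        del paragraphs[pos]
    elif op == "setAttributes":
        paragraphs[pos]["attrs"].update(change["attrs"])
    else:
        raise ValueError("unknown operation: %r" % op)
    return paragraphs


# Hypothetical starting document, already loaded on the receiving side.
document = [
    {"text": "First paragraph", "attrs": {}},
    {"text": "Second paragraph", "attrs": {}},
]

# Hypothetical change list as it might travel between browser and server.
changes_json = """
[
  {"op": "insertParagraph", "pos": 2, "text": "Hello World"},
  {"op": "setAttributes", "pos": 0, "attrs": {"color": "red"}}
]
"""

for change in json.loads(changes_json):
    apply_change(document, change)

print(json.dumps(document, indent=2))
```

Only the change list travels; the receiver never has to compare two complete files to find out what happened.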