Okay, thanks again, and we go from Markdown to XML. Exactly. So, I'm Stefan, I'm a technical writer at SUSE, and I'm going to be presenting what we do to keep our documents QA'd automatically, or at least half automatically at this point. I've been working at SUSE for the last two and a half years. Before that, I was a working student there, so I already have a bit of history with the company.

First of all, this is the outline of the talk. I'll give you some pointers on our workflow and our team, and what we actually do. Then I'll come to the client-based checks that each team member can run individually on their machine. Afterwards, I'll continue with the server-based checks that we do. Finally, I'll be speaking not about the documentation as such, but about the stylesheets that we use to output HTML and PDF from our XML sources.

So, first of all, the team. We're actually pretty small. I mean, we're not that small, but we kind of look like this. If any of you have seen a SUSE documentation presentation in the last two years, you may have seen this photo already. But that is just a few of us. And given the number of people that we have, we have quite a big task: we have to maintain around 15,000 A4 pages of PDF. Mind you, there are some pages in there that I counted twice. I will come back to that in a minute. The content that we produce is everything that is published on suse.com, and also the release notes that are published on suse.com or ship with our products. Our main formats for documentation are HTML, PDF, and EPUB. What we do not do is Linux man pages, and we also do not usually work on the output of the various --help commands.

Our workflow is based on DocBook. We're currently also looking into adopting AsciiDoc, at least for some tasks that are easier, but otherwise we're still quite happy with DocBook.
Even so, there's sometimes this kind of pushback: developers want something that is supposedly easy. We always put that in scare quotes when people say that Markdown is "easy", because it's actually not that easy. But AsciiDoc is a good compromise, I guess. You lose some semantics.

Our basic approach in what we do is the robustness principle: we try to be liberal in the input formats we accept, but we are relatively conservative in what we output, which is pretty much always DocBook at this point.

We also have single sourcing. That means we have multiple output documents that are produced from the same source. For example, we have SLES and SLED, so the server and the desktop product documentation, which are basically produced from the same XML source. The only differences are that we exchange the product names and that we switch off chapters that are not relevant for a given guide. The second aspect of single sourcing is usually taken for granted today, I guess: you can produce multiple output formats.

Our workflow is intentionally oriented toward what developers at SUSE do. So we use Git and GitHub, and we increasingly use pull requests and reviews on those pull requests. We also use the so-called GitFlow branching model, which I think helps us a lot in keeping an overview of our branches.

Then we have this thing called OBS, the Open Build Service, which also builds some of our documentation. Because this server basically operates only on open source software, we also have a fully open source toolchain. This toolchain is called DAPS, the DocBook Authoring and Publishing Suite. We need it because the upstream DocBook toolchain is sort of not a toolchain at all: it still needs all the glue that holds it together, and it also has some functionality gaps.
That's a bit different from DITA, where you actually have a full, prefabricated toolchain. DAPS solves this toolchain problem for us. It's also open source, so if you like, you can take a look, and it might solve the problem for you too. What it does is take pieces from various upstream projects and glue them together. For example, to enable single sourcing of documents for different products, we need something called profiling, and the profiling stylesheets need to be run before the actual output stylesheets. Keeping this flow together is basically what DAPS does.

It's a command-line tool. We're a Linux company, so we pretty much all work on the command line. It's also editor-agnostic in that it's not integrated into any specific editor. We all use different editors, which I guess is somewhat special among documentation teams.

There are some processes that are external to our team and are not really handled by DAPS, or only at the fringes. One is translation, which we have outsourced to another company or to freelance translators. The other is final publication, which another team at our parent company Micro Focus is doing.

With that, I come to the actual topic. The first main thing we do on the client side is of course validation. Validation is necessary for XML, and it's also one of the things that makes XML worthwhile, because you can say very strictly which kind of content you accept in your document and in which way. To validate, we have our own RELAX NG schema based on DocBook 5.1, and it's called GeekoDoc. Basically, what we do with it is restrict DocBook to a certain degree, because DocBook is quite permissive. For example, it allows lists within paragraphs, which is not really what we want, because that kind of thing is sometimes quite hard to lay out.
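To make the "lists within paragraphs" example concrete, here is a minimal sketch of the kind of construct such a restricted schema rejects. It is only an illustration: the real restriction lives in the RELAX NG schema and is enforced by the validator, whereas this stand-in uses plain Python with the standard library. The function name, the choice of list elements, and the sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

DB = "{http://docbook.org/ns/docbook}"  # DocBook 5 namespace in ElementTree notation
# Assumption for this sketch: these three list types are the ones of interest.
LIST_TAGS = {DB + name for name in ("itemizedlist", "orderedlist", "variablelist")}

def lists_inside_paragraphs(root):
    """Return the local names of list elements that are direct children of a para."""
    found = []
    for para in root.iter(DB + "para"):
        for child in para:
            if child.tag in LIST_TAGS:
                found.append(child.tag[len(DB):])
    return found

# Hypothetical sample: one list nested in a paragraph, one list outside of it.
SAMPLE = """<section xmlns="http://docbook.org/ns/docbook">
  <title>Example</title>
  <para>Allowed by stock DocBook, awkward to lay out:
    <itemizedlist><listitem><para>one</para></listitem></itemizedlist>
  </para>
  <orderedlist><listitem><para>a list outside a paragraph</para></listitem></orderedlist>
</section>"""

nested = lists_inside_paragraphs(ET.fromstring(SAMPLE))
print(nested)
```

Only the list nested directly inside a paragraph is reported. A schema-based restriction has the advantage that the same rule is applied by every validating tool, which a one-off script like this cannot offer.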
Validation is run using daps validate, which is just the command we use, and it is based on Jing, which is an upstream Java tool.

The next thing we have is something called the Style Checker. We have a SUSE style guide for our documentation, which basically lays down language and structure rules that authors should adhere to. Why do we do this? Because we want to avoid confusing our readers, and we also want to keep translation costs down. That's why the style guide also includes some terminology rules, so you get better translations and avoid confusing your readers with synonyms and so on.

The Style Checker is a custom tool, and it checks both language rules and some sort of soft syntax rules: rules that are usually helpful, but that we didn't want validation to fail on, or for which we didn't want to pull out the big hammer of validation. One example of a language rule: if you use the words "in order to", which is quite a wordy phrase, it suggests using just the word "to", which is a lot shorter. An example of a syntax rule is to avoid lonely sections. Lonely sections are basically subsections that don't have any peers in their parent section. This usually hints at a structuring issue.

One of the nice things about having this custom program is that we can actually integrate with XML. It's also a necessity for us, because our documentation features quite a few commands with very idiosyncratic names. It also enables some nice things. For example, we have a rule against using the word "hit" to mean pressing a key. Because we mark up our keys in DocBook with the keycap element, we can just look for the keycap element, check whether the word "hit" or "press" is in front of it, and output an error or warning message if the word is "hit". It's written in Python, with lots of XSLT and lots of regular expressions.

I'll give a short preview. This is what the output looks like. We have messages.
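The two rules just described, "hit" in front of a keycap element and lonely sections, can be sketched roughly as follows. This is not the Style Checker's actual code, which the talk describes only as Python with XSLT and regular expressions. The function names, message wording, and sample document are assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET

DB = "{http://docbook.org/ns/docbook}"  # DocBook 5 namespace in ElementTree notation
HIT_BEFORE = re.compile(r"\bhit\s*$", re.IGNORECASE)  # "hit" as the last word of a text run

def text_before_children(parent):
    """Yield (child, text immediately preceding it) for each child element."""
    preceding = parent.text or ""
    for child in parent:
        yield child, preceding
        preceding = child.tail or ""

def check_hit_before_keycap(root):
    """Flag every keycap element that is directly preceded by the word 'hit'."""
    messages = []
    for parent in root.iter():
        for child, preceding in text_before_children(parent):
            if child.tag == DB + "keycap" and HIT_BEFORE.search(preceding):
                key = "".join(child.itertext())
                messages.append(f'use "press" instead of "hit" before key {key}')
    return messages

def check_lonely_sections(root):
    """Flag subsections that are the only subsection of their parent section."""
    messages = []
    for parent in root.iter(DB + "section"):
        subsections = [c for c in parent if c.tag == DB + "section"]
        if len(subsections) == 1:
            title = subsections[0].findtext(DB + "title", default="(untitled)")
            messages.append(f'lonely section: "{title}"')
    return messages

# Hypothetical sample document that triggers each rule once.
SAMPLE = """<section xmlns="http://docbook.org/ns/docbook">
  <title>Installation</title>
  <para>To continue, hit <keycap>Enter</keycap>. To abort, press <keycap>Esc</keycap>.</para>
  <section>
    <title>Only Subsection</title>
    <para>Some text.</para>
  </section>
</section>"""

root = ET.fromstring(SAMPLE)
hit_messages = check_hit_before_keycap(root)
lonely_messages = check_lonely_sections(root)
for message in hit_messages + lonely_messages:
    print(message)
```

Working on the parsed tree instead of raw text is what makes the keycap rule possible: the check only fires where the markup says a key is meant, so an ordinary use of "hit" elsewhere in a sentence is left alone.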
As you can see, the output is actually an XML file styled with CSS, which works. But the checker tends to produce quite a bit of output, for example from the checks for sentence length. This is actually a relatively well-checked file, and it's just the Installation Quick Start guide, which is maybe 30 pages with a lot of screenshots. For longer documents, it produces a lot of output. Some of that is also false positives. That's not ideal.

What do we have in terms of future plans? One of the first things, which we already have a work in progress for, is adding a spell check. That might seem weird to you, because you might think this is the first thing we'd add, but we already had a pre-existing spell checker in DAPS, so I didn't reimplement it immediately. There's also an issue with source lines, which is a bit of a hard problem. And we have the issue of output formats, because this XML file is not scaling that well, I must say, and we'd also like some editor integration. So we probably need HTML output and also some kind of readable plain text output.

Next up are the server-based checks. Here we have Travis, which is relatively new in our toolchain. Travis is a Docker-based CI system that integrates with GitHub. It's free as in beer, unfortunately only. Why do we want this? I said we have this GitHub pull request workflow, and that workflow also means that we get pull requests from other teams at SUSE. Not everyone is actually running a validator, and some people validate only against DocBook 5 instead of our stricter GeekoDoc schema. Travis gives us quick feedback there. It also consistently checks all output documents, which is necessary to prevent a problem we've had before: sometimes people would only check some output documents, and other output documents that were not needed at the time would then languish, sometimes for months, and were basically not buildable.
Travis makes sure that we keep this stuff buildable all the time. To do that, we have a Docker container with openSUSE and DAPS in it, with which we validate, and we also make sure that all images are actually checked in. Sometimes people forget to git add an image when they add it to the documentation, and it's good to check that on the server. But we're not really at the end there. One of the other plans we have is to use Travis to actually build our documentation and publish it to GitHub Pages. Integrating the Style Checker would also be a nice idea, but we'd have to look at how to not inundate people with too many messages.

Since I just spoke about publishing: we also already have an internal publishing platform, basically for nightly builds, which is helpful for getting early developer feedback. Sometimes, unfortunately, valid XML does not mean that our PDFs build. Those are basically fringe cases, but they're relevant enough, so in those cases you actually get mail from our internal publishing server. This server builds documentation automatically every day or on demand from a web UI. It also has a search, and it has a quite interesting technology stack, if you want to call it that. I have a quick example here. This is not the actual live server, but this is what its overview page looks like, and basically you can just request a rebuild here. And this is what our documentation looks like then, with the draft logo and things like that.

Now on to our stylesheets, so I'm finished with the documentation itself. What are the stylesheets, actually? The DocBook upstream project has these XSLT stylesheets that format your document as HTML or PDF or man pages or EPUB, lots of different formats. But of course we want to add a dash of green to that, and our SUSE logo, and things like that. So we have our own custom SUSE stylesheets that import the DocBook stylesheets from upstream. And why do we need to check them? That's actually pretty simple.
XSLT is complicated. It's not always clear at first glance how the flow works, because it calls template rules, and depending on which other template rules are present, the template rule in use might change. Then we also have to keep DocBook 4 and DocBook 5 compatibility. We have to keep language compatibility. And since we produce quite a few output formats, we need to check whether those still work too. Since our HTML is also responsive, that's basically three different formats all at once. We need to check whether our bug fixes work, and as a bonus, we also get documents that we can use for manual tests.

So what does dapscompare do? It creates lots of little images from our documentation. You run it once before you make a change to the stylesheets and once after. It then automatically compares the reference images to the comparison images and shows you, in a nice little Qt-based viewer, a side-by-side view of those images with the changed sections highlighted. We made something custom for that because what we wanted was actually pretty simple: we just wanted to run DAPS as parallel as possible and then get those images. Unfortunately, I can't show you the program right now, not even the viewer, because it hung up on me this morning, so I'm sorry for that. We have some future plans: we need more example documents, especially in different languages, and another idea is to also run this on Travis, which we're not yet doing. And with that, I am at the end.

A practical question: we've used DocBook for our documents, a very small book, for 10 years. XSLT, as you say, is quirky and complex to use. First of all, it is not itself terribly well documented. There is a little documentation of what it is, but not the why or how, so you have to learn for yourself. Have you found anything that explains how to write XSLT stylesheets?
Unfortunately, you're not German. There's actually quite a good documentation site that is German only.

I'd be prepared to learn German.

There's a company called data2type. If you're German, they have actually produced quite good documentation for XSLT. Sometimes you can use the Mozilla Developer Network, that works, but Microsoft also has pretty good documentation, I must say. That also works for me quite a lot of the time. Sorry, the question was whether there's good documentation for XSLT. Okay, more questions? Yes.

So the question was whether we use Jira or Bugzilla or another bug tracker, and how we pull that together with our GitHub workflow. Yes, we have the SUSE Bugzilla. We also have an internal feature tracking system called FATE. It works basically the same, but it's for features, not for bugs, which is a somewhat arbitrary distinction. So yes, we do that, but we also notice that a lot of our developers are actually working on GitHub as well, so keeping people on GitHub is not a bad idea. Okay, yes?

You have your custom RELAX NG schema, which is a modified version of DocBook, and you have various other checks. Have you looked into using Schematron to implement any of those checks instead of modifying the stylesheets?

So the question was whether we are looking into Schematron. Yes, we currently are. On the other hand, Schematron is sort of like XSLT lite in a way. We're looking at Schematron also because Jing supports Schematron. But it is a bit buggy: the DocBook RNG actually includes some Schematron rules, but Jing can't execute them because they're embedded in the RNG. So we're currently looking at how we can actually solve this pickle. One more, maybe?

Cool. Okay, thank you very much. Right, there is now the seat shift, one up.