Welcome, Stephen. OK, afternoon. Can you hear me? OK? OK, good. Afternoon, everyone. Thanks for coming. My name's Stephen. I work for Red Hat, but today I'm here talking about something completely different from my day job: Sphinx readers, writers and other things you can use as an alternative to Pandoc. We did a quick show of hands a few minutes ago on who'd used Sphinx and who'd used Pandoc, so I'm guessing most of you are going to know a lot of this already, and I'm going to skip through it nice and quickly. But first, a quick overview of reStructuredText, docutils and Sphinx, and how they all build on each other.

For anyone that hasn't seen it before, this is reStructuredText. It looks kind of similar to Markdown, though slightly different and not quite as simple to write, even at the basic level. Where reStructuredText really differs from Markdown is that it adds a whole load of additional functionality through two things: roles and directives. Roles are inline constructs that you commonly use for things like cross-referencing other documents or cross-referencing terms. Directives are block-level constructs you can use to generate documentation from code, transform documents, whatever the hell you want to do. Very, very powerful functionality.

So reStructuredText is the syntax. docutils provides what most people would consider the reference implementation of reStructuredText. It's been around for years; GitHub's README renderer, for example, uses the docutils tooling under the hood. Sphinx builds on top of this by providing some nice additional functionality, such as the ability to build up books of documentation and cross-reference between different documents. We're going to go into this in more detail shortly. Under the hood, most people's experience of docutils is going to be something along the lines of this.
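To make the roles and directives just mentioned concrete, here is a minimal illustrative reStructuredText sketch (the label name is made up for the example):

```rst
A Section Title
===============

An inline role: see :ref:`some-label` for details.

.. note::

   This is a directive: a block-level construct that can
   generate or transform content.

.. code-block:: python

   print("directives can even hold code")
```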
They write a reStructuredText document, they pass it into one of the tools like the rst2html tool, and out the other side they get a nice, pretty HTML or PDF or ODF document or whatever they want. So for the most part you think of docutils as this black box that you feed stuff into and it chucks something out the other side. But docutils itself actually provides some tooling that can help you understand what it's doing under the hood. The rst2xml and rst2pseudoxml tools give you a pretty good illustration of this: you feed in your reStructuredText document and, for a very simple document, you get out something resembling this. The pseudo-XML is like an XML-YAML hybrid that makes it a little easier to read. What you're seeing is the doctree model, similar to HTML's DOM I guess, where everything is parsed into that tree and docutils then converts it into whatever else. There's also a flag that will skip applying transforms, and I'm going to touch on those very briefly shortly.

If you look at the source code, you'll see that docutils has a four-step process for converting something from reStructuredText or Markdown or something else into whatever you want out the other side: readers, parsers, transforms and writers. Readers aren't that interesting, because they're basically just pulling the text out of actual files. Transforms are interesting, but I don't have enough time to go into them today. So I'm going to be mostly focusing on the other two, which are parsers and writers.

Sphinx then builds again on top of docutils. The workflow is slightly different: you give it a reStructuredText document, but you also need to give Sphinx a configuration file, which at a minimum should say what your root document is. There's a whole load of other stuff you can do for configuring extensions and so on; I'm not going to go into that here.
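As a sketch, that minimal configuration file really just amounts to naming the root document (the project name and the conventional `index` root are illustrative choices here):

```python
# conf.py: a minimal, illustrative Sphinx configuration.
# 'master_doc' names the root document of the documentation tree.
project = 'example-project'   # hypothetical project name
master_doc = 'index'          # i.e. index.rst is the root document
```

Everything else, such as extensions, themes and parsers, is layered on top of this same file.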
Ultimately, though, instead of using something like the rst2html tool, you'll use the sphinx-build tool. You tell it which builder you want to use, where your source files are and where your output should go. The output for something as simple as this is pretty much the same, because it's a one-document tree. Sphinx, because it builds on docutils, also uses the readers, parsers, transforms and writers. It makes some modifications to them, but those modifications aren't that important. What is important is that it builds on top of those four things with another set of components. The most visible of those is the builder: your HTML builder, your PDF or LaTeX builder, whatever other builders you want to write. It also builds on top of things like nodes and transforms, but that isn't that important, so I'm not going to go into it.

So, Markdown is in the summary of the talk. The Markdown parser is one of the parsers currently available for docutils. There's another one for Jupyter notebooks; I've never used that one, but I have used the Markdown and reStructuredText ones. And obviously it's possible to write additional parsers for AsciiDoc or whatever you fancy converting. Using these from docutils is pretty much the same kind of workflow. The Markdown parser is provided by the recommonmark package, and that package provides a nice little cm2html tool, which works just like the rst2html tool, only you point it at a Markdown document. You run it and it churns out a load of HTML on the other side. And because that document was basically a Markdown version of the reStructuredText document, the output looks exactly the same. Again, you've also got the pseudo-XML tool, so you get to see the doctree model that it's building: again, almost identical. When you're using the Markdown parser you are naturally going to lose a lot of the power that reStructuredText gets you.
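To give a feel for the pseudo-XML view mentioned above, here is roughly what a trivial document comes out as (abridged and illustrative; the exact attributes vary between docutils versions). Note how indentation stands in for closing tags, which is what makes it easier to read than real XML:

```
$ rst2pseudoxml.py example.rst
<document ids="a-title" names="a title" source="example.rst">
    <title>
        A title
    <paragraph>
        Some text.
```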
But if all your documentation is written in Markdown anyway, or people refuse to use reStructuredText for whatever reason, it allows you to keep using the same tooling, in the form of docutils or, next up, Sphinx, that you've always been using. From the Sphinx side of things: for reStructuredText you had the rst2html tool, and for the Sphinx version you just use sphinx-build with a builder. You do pretty much the exact same thing for Markdown. The only thing that changes is that you need to tell Sphinx how to parse these particular Markdown files. This is all documented as part of the recommonmark package itself, and the Read the Docs website also has some really good documentation on this. In summary, you're just saying that Markdown files should be parsed with this CommonMark parser. We build HTML, and it gives us the exact same thing.

So that's the input side of things. On the output side of things you can do almost the exact same thing. docutils itself provides a whole load of writers; this is just a few of them. The most common ones would naturally be the LaTeX and HTML ones. There's also a PDF one; I think there's a talk in the Python devroom tomorrow on the rst2pdf package. There are those, and then a load more available on PyPI: for example, the rst2html5 package gives you HTML5 output as opposed to HTML4 or XHTML. You can also call all of these things programmatically if you want, which can be pretty useful if you have some external tooling that needs to call them and you don't want to be shelling out. And naturally you can do other things too: for example, the rst2txt tool will take a reStructuredText file, strip out a lot of the reStructuredText markup, and give you something a little bit simpler.
That can be useful for things like building an RPM package from a Python project, where you want to strip out all that reStructuredText markup because it's not readable in plain form; you'd use this tool. And as you saw, most of these writers provide a similar command-line program. Sphinx again builds on top of this, with variants of pretty much all of the writers provided with docutils. Again, HTML and LaTeX tend to be the ones most people use. It has the text one, and man pages as well, which I see used in a lot of projects; we used to use them quite extensively in OpenStack, not so much anymore. You call these via the builder argument, -b. You can also do it via Makefiles: sphinx-quickstart will generally include a Makefile that will let you call these that way. And if you want to try different output formats, there are builders available for pretty much any output format you can think of. I'll give an example here of the AsciiDoc one, but if you want to render straight to PDF and skip the whole LaTeX step, there are builders available to do that too.

In terms of writing your own parsers and writers, I'm going to give a very quick summary, but I find this stuff tends to be very hard to explain in the form of a presentation, so I've given a whole load of links at the end, and the slides will be shared afterwards. That's probably as good a resource as any, and I'd recommend reading into those if you're interested in this. From the parsing side of things, all a parser needs to do is take in a file that has already been loaded by one of the readers and convert it into one of these doctree models. You can think of the doctree model as something like the intermediate representation that LLVM uses: it's the canonical representation of your document, regardless of what format you're using for input and output. As for parsing itself, I'm not going to go into how you parse documents, because that is a two-hour class by itself.
What we're going to do is cheat: we're going to take the output of the rst2xml tool and simply parse that back in to regenerate the document model we wanted. It's kind of a nonsense example, but it does show you that parsing can be a very simple thing. You would use this approach if, for example, some of your documentation team insisted on using AsciiDoc, or, heaven forbid, if they wanted to use DocBook or something like that. The important thing is that the input format has to be semantic in some way. Trying to parse groff files, for example, would be next to impossible, because groff is more focused on what the output looks like than on the semantics. There are some semantics tied up in it, but not in the same way as reStructuredText or HTML. In this case, though, you can use the XML utilities provided as part of the Python standard library, parse your document, render that back in, and you get your nice doctree model out the other side, which you can then write to HTML or whatever you want.

Writing is significantly easier than parsing, which is probably why there are so many builders and writers available. All a writer does is provide a translation layer. Something like the pseudo-XML or XML writers just write the doctree straight out to a file or to the screen. Something a little more complicated, like the rst2txt writer I demonstrated earlier, uses translators, and all a translator is is a node visitor: it has visit and depart methods for each node type, and for each of those you decide what you want to do. If you're going to write one of these writers, it generally ends up being a state machine where you're just keeping track of where you are within the document and what the output should be. Building on top of this from the Sphinx side of things is pretty simple.
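Both ideas above, re-parsing an XML doctree dump with the standard library and then walking it with visit/depart methods, can be sketched in one toy, self-contained example. To be clear, this is not the real docutils API, just the shape of it: the element names mimic a doctree, and the translator is a tiny state machine dispatching on node type.

```python
# Toy sketch of a docutils-style translator (NOT real docutils code):
# parse an XML doctree dump, then walk it with visit_*/depart_* methods.
import xml.etree.ElementTree as ET

DOCTREE_XML = """\
<document>
  <section><title>Hello</title>
    <paragraph>Some text.</paragraph>
  </section>
</document>"""

class TextTranslator:
    """Tiny state machine: visit_* opens a construct, depart_* closes it."""
    def __init__(self):
        self.out = []

    def dispatch(self, node):
        # Call visit_<tag> (if defined), recurse, then depart_<tag>.
        getattr(self, 'visit_' + node.tag, lambda n: None)(node)
        for child in node:
            self.dispatch(child)
        getattr(self, 'depart_' + node.tag, lambda n: None)(node)

    def visit_title(self, node):
        self.out.append(node.text.upper())

    def depart_title(self, node):
        self.out.append('=' * len(node.text))   # underline the title

    def visit_paragraph(self, node):
        self.out.append(node.text)

def render(xml_text):
    tree = ET.fromstring(xml_text)
    translator = TextTranslator()
    translator.dispatch(tree)
    return '\n'.join(translator.out)

print(render(DOCTREE_XML))
```

A real docutils translator works the same way, except the node classes come from docutils.nodes and the walking is done by the doctree's walkabout method.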
The only thing that Sphinx does on the builder side that doesn't happen with a plain writer is that it keeps track of the cross-referencing between documents, and it resolves those cross-references at build time.

So, in summary: you've got Sphinx and docutils, and they're pretty much the same thing under the hood; Sphinx is a superset of docutils. They both use readers, parsers, transforms and writers. Sphinx modifies those ever so slightly, but for the most part they look almost exactly the same. Sphinx just tacks stuff on top of that in the form of the builders, the application and the environment, the latter two of which are more applicable if you're writing directives or roles for Sphinx. There are multiple writers and builders available for both docutils and Sphinx. You can use docutils just fine by itself if you're happy not to have the cross-referencing and translation or anything else that Sphinx gives you. Not every writer or builder that exists in one exists in the other, but between the two the coverage is pretty much everything. There's a load more available on PyPI, most of them writers and builders, but there are parsers too, in the form of the Markdown and Jupyter notebook ones. If you decide to go and write your own: the Sphinx documentation, funnily enough for a documentation tool, is pretty awful, but the docutils documentation is excellent, and it provides a rake of information on anything you could want to know about writing and parsing documents.

Outside of that, I think we've still got about five minutes left. If anyone has any questions, I'd be happy to take them. Go for it.

[Audience] The main reason I use Pandoc, unfortunately, is for converting Markdown and reStructuredText and stuff to Word. Is that possible?

There is a docx writer, though perhaps not a standard one; I don't know, I've only used the ODF one, which naturally you can open in Word as well, but they do exist. It does exist. As for parsers for docx, I don't know.
I spent many weekends working through and trying to convert the AsciiDoc Python implementation, as opposed to the Ruby Asciidoctor one, into something that could render a docutils document model, and it turns out there's a reason they switched away from the Python version of the AsciiDoc tooling: it's really bad, really crufty. So I'm basically halfway through rewriting that from scratch. But the two parsers that I've seen are Markdown and reStructuredText. The issue I see with the Word thing would be the lack of semantics around the document, so it's quite hard to know that this is a title and this is code and so forth.

[Audience question, partly inaudible, about guidelines for refactoring documentation.]

So, refactoring your documentation, essentially. Not so much guidelines, but the one I normally see enforced (I work on OpenStack, that's my day job; we use Sphinx for everything, having moved away from DocBook years ago) is about how you reference documents: you can use the doc role or you can use the ref role. Using the ref role means you can move the document wherever the hell you want and it'll just keep on cross-linking, whereas if you use an absolute or relative path to something, as soon as you move it you have to go and update all those references. So you stick an anchor at the top of your document and reference it that way. I don't know if that's exactly what you're getting at, but yes, purely for avoiding having to fix those cross-references, we use the ref role instead of the doc role.

The other thing that we do a lot in OpenStack is we have tooling that will generate the redirects, the Apache redirect file. You say "I want this file, which was living here, to now point here", it generates the .htaccess file, and those are deployed on the documentation servers. So if we had a document living at one path and someone goes to access it, they get redirected properly.
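A generated redirect of that kind boils down to an Apache configuration fragment along these lines (the paths here are made up for illustration):

```apache
# Illustrative .htaccess fragment: a moved document keeps resolving
Redirect 301 /docs/old/install.html /docs/latest/install.html
```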
I know Read the Docs will actually do that, but it has to be done manually. That's about the only kind of tooling we have around there; moves, splits and merged sections tend to be a pretty manual process.

[Audience] I just wanted to be sure: are you able to use anything within the CommonMark spec, in terms of the extensions?

I don't know how it actually handles those. It should be able to render the built-in ones, but I don't know how Sphinx will handle extensions. If they get converted to a doctree it's fine, but I'm not sure; I haven't experimented too much with the extensions. If I want extensibility I usually jump into reStructuredText.

[Audience] Sphinx with recommonmark and Markdown works, kind of, with CommonMark, but it's not always really stable. Quite often there's a new release of CommonMark and recommonmark is broken for some time, and stuff like that. They're not always 100% in line on features. It's running, kind of, but it's tricky. You have to know that if you really decide to use them together.

Interesting.

[Audience question about what the slides were made with.]

Google Docs. I wish I could have used Sphinx, but to be honest I'd still do presentations in PowerPoint if I could, just because I've been broken in that regard. I've never used Sphinx for documentation, er, for presentations. My colleagues use Beamer more, but there are writers available in docutils for Beamer and also S5 HTML. I've just never used them because I haven't gotten around to it yet. Any last questions? If not...

[Audience question about what's wrong with the AsciiDoc Python implementation.]

Very briefly, there are two aspects to it. The first is that the code itself is really old. The Python code, as opposed to the Ruby stuff, is all in one giant file, and what you'd consider modern Python practices just aren't there: I think they even use Makefiles to install stuff. Tox, unit testing and all of that; it's all very 90s, so it's hard to start working on it.
It's also been pretty much abandoned at this point. There is a Python 3 fork of it that I have running and passing all of the built-in tests, but I don't know if anyone's even using it. And lastly, there's how it's actually implemented: it uses a streaming parser, which means it has a couple of issues. For example, it can't build doctrees, and tables of contents are all JavaScript-based, because I think you need to pass over the document twice to build up a document model before you can insert the table of contents, which is what docutils does by way of a transform. The two just don't map very well together, so there are a lot of downsides. And given how crufty it is, I can see why someone decided to write a fairly good implementation in Ruby instead; Asciidoctor has basically taken off since. The Python one would need a lot of TLC.
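That two-pass point can be sketched with a toy example (nothing to do with the real AsciiDoc code; the "= Title" syntax is just an arbitrary stand-in): a streaming, single-pass renderer has already emitted the top of the document before it discovers later titles, so a table of contents needs either a first pass to collect titles, or a full document model held in memory, which is exactly what a doctree gives you.

```python
# Toy illustration of why a ToC needs two passes (or an in-memory tree).
lines = [
    "= Introduction",
    "some text",
    "= Usage",
    "more text",
]

def render_with_toc(lines):
    # Pass 1: collect section titles, i.e. build a minimal document model.
    titles = [line[2:] for line in lines if line.startswith("= ")]
    # Pass 2: emit the ToC first, then the body.
    toc = ["Contents:"] + ["  - " + title for title in titles]
    return "\n".join(toc + lines)

print(render_with_toc(lines))
```

docutils gets this for free because the parser always produces a complete doctree first, and a transform then inserts the contents node before the writer ever runs.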