First talk, on Sphinx, from Steven from Red Hat. OK. Thanks very much. Thank you. So, show of hands: has anyone in the room not heard of Sphinx? OK, good. Right, because I have a whole introduction section here, and I was just going to skip it if everyone had already heard of it. So, a very quick summary. You've got Sphinx, you've got Docutils, and you've got reStructuredText, and they all kind of build on each other.
To start off, reStructuredText is a plain-text, Markdown-like format that you use to mark up your text. It supports a couple of things: your basic formatting options, whether you want to mark something as bold or italic and so on. It supports linking. You can do substitutions. You can obviously mark up your titles. For anyone that's ever used GitHub or something, it looks very similar to Markdown, only a lot more powerful and a lot more extensible. If you feed that reStructuredText through something like rst2html, it'll give you HTML output. This is kind of the basic functionality, and what rst2html gives you is what a lot of people would be familiar with from the reStructuredText perspective.
But that's only a part of the power of reStructuredText. There's stuff that we can do that's kind of unique to reStructuredText. You've got things called roles, and you've got things called directives. I'm going to go into a lot more detail on these over the course of the presentation. But roles, essentially, this is your role. It's a name surrounded by colons, and then your backticks around whatever you want to pass into the role. You can use roles to do the formatting-style markup, for example to say something should be bold or italic, but there's a lot more functionality that can be embedded into roles. Directives, then, are indicated by these two dots, the name of the directive and two colons, some arguments, and optionally some body text.
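As a rough illustration of the role and directive syntax just described, the sketch below feeds a small reStructuredText string through Docutils and prints the HTML body. It assumes the docutils package is installed; the sample text is made up for illustration and is not taken from the slides.

```python
from docutils.core import publish_parts

# Hypothetical sample: two inline roles and one directive (an admonition).
SOURCE = """\
Some :emphasis:`emphasised` text and some :strong:`bold` text.

.. note::

   This body text ends up in a call-out box.
"""

# publish_parts returns a dict of document fragments; "body" is the
# rendered HTML without the surrounding <html>/<head> boilerplate.
body = publish_parts(SOURCE, writer_name="html")["body"]
print(body)
```

The roles come out as inline elements, while the note directive becomes a block-level element carrying a "note" class, which is what a stylesheet turns into the boxed call-out.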
This is an example of an admonition, which is like a call-out in the document. So when you render this to HTML or LaTeX, you'll usually see this as a big call-out with a box around it. All of these are provided as part of the standard reStructuredText parser, which is Docutils, which I'm going to go into next. And again, rendered. So reStructuredText is essentially your syntax. Docutils is the thing that converts that syntax into your output form. Docutils also supports other kinds of readers. Markdown, for example, was added recently as part of a separate extension. reStructuredText is the thing that most people would be familiar with.
Now this is where Sphinx comes in. Sphinx builds upon Docutils, and it adds things like the ability to cross-reference documents, to build multiple output documents as opposed to one input document equals one output document, and basically a whole load of functionality that means you can use it as a proper documentation tool as opposed to some way of storing your documentation in plain text. And just to reinforce that: Docutils has readers, parsers, transforms, and writers dealing with individual files. Sphinx extends that to work with multiple files, and it uses a concept called builders, which I'll touch on here. I won't go into too much detail though. And the reason that I'm here talking about Sphinx today is the last point on this list. It has fantastic extensibility interfaces. Pretty much anything that you want to do with your documentation, if you know how to use Sphinx, you can do it this way. And when I'm talking about building documentation, sphinx-build is the executable that most people would be familiar with.
So yeah, let's get to extending this stuff. Some warnings before I start. Sphinx is written in Python. It was initially developed for the Python API documentation. It supports multiple different languages through a thing called domains.
But obviously it's still all written in Python, so there's a little bit of Python knowledge expected here, just for when I'm showing code. I'll be keeping the code to a minimum though, so don't worry about that. I'm going to be referencing some OpenStack projects because that's what I work on in my day job. And I have a GitHub link here. The slides should be available on the FOSDEM website afterwards with these links, and I'll also give a reference section at the end. All the code that I've given here is available in the GitHub repo.
So from the perspective of extending Sphinx, everything starts from the application object in Sphinx. This is your main entry point. When you call sphinx-build, or when you call the setuptools command for building docs, all of these call into the Sphinx application object, which provides a load of different functions that you can use within your own extension to add various different capabilities. The things I'm going to focus on today are adding configuration values, directives and roles, both of which I've mentioned before, and another thing which is Sphinx-only, which is hooks.
Firstly, roles. Just as a reminder, roles are these things that come with colons and then backticks, and you pass in something. So it's an inline node. The normal ones that are provided, like I said, are used for formatting your docs and also for a little bit of additional functionality. So there's an RFC role. There's also, I think, a PEP role and a couple of other ones. But for the most part, it's quite simple. The way that you go about extending this is that you have to add this function, this setup function, and call add_role on your application object. This goes in somewhere and gets passed in through configuration. I'll go into that in a bit.
But this thing at the top, Docutils defines what this should look like, and you just have to basically copy and paste that and put all your functionality within the actual function there. So we'll give a real-world example of why you might want to use this. This is taken out of the changelog for Sphinx itself. It's an example. This is all done manually at the moment. For anyone that can't spot it, there are GitHub issue numbers on the left, and then there's a description of what issue was actually fixed in that release on the right. Now it would be really nice if, instead of having this in just plain text, we were actually able to cross-reference back to the GitHub issue, so that people could go and click on this in the output and say, oh, well, this is all the information I wanted about the issue.
So if we go and slightly rework this and add this GitHub issue role (I've moved stuff around just because it looks a little bit prettier done this way), all it takes as an argument is the exact same issue number that we had previously. Only now, instead of being in plain text, it's in the form of a role. As for how we'd actually go about implementing that, you just have your role function again. Remember, all these arguments have meaning and they're defined by Docutils. All that we're doing is this: we have our base URL for the repo that we want, we build a Docutils node, which is a reference node, which takes a URL or URI, and we pass in the URL that we want. We add the role, and when we actually go and render this in the output, it gives us nice, pretty HTML links. If you click on those, it'll bring you to the GitHub issues. And this is a PDF, so in the slides for this you can click on those and they will indeed bring you to the GitHub issues. Which is nice, right? So it's an inline thing. It's a quick way of referencing something without having to manually write your URLs for everything.
This is especially handy if you decide to change your issue tracker or something like that and you can manage to keep the IDs the same: you just go and change your base URL and job done. I don't know how you'd keep your IDs the same, but that's a different issue.
So if we wanted to go the next step beyond that and do something a little more powerful, this is where directives come in. Directives are also a Docutils thing. But again, because Sphinx has things like the ability to cross-reference, directives in Sphinx tend to be a hell of a lot more powerful than they would be using just plain old Docutils. So again, a reminder, this is what a directive looks like. And in terms of writing it, again it's Docutils; this time, instead of a function, it takes the form of a class. I'm using... I was expecting to highlight that. I'm building a tree of nodes. There's a nicer way of doing this, which I'm going to go into in a bit. But for now, I'm building a node: it's a section, it's got a title and it's got some text associated with it. This is just a hello world example, it doesn't really do anything. But that's an example of what a directive would look like if you were just using plain old Docutils. This will work with Docutils or with Sphinx.
So instead of just cross-referencing the issues, let's say that we wanted to dump that information from our issue tracker into our documentation. I'm not saying this is a realistic example. I'm just saying that if we wanted to do this, we could. A more realistic case might be that, instead of issues, you have, for example, a YAML file that has some structured data in it and you want to dump that into your documentation in a readable format. So we'll take an issue. This is issue number 2463 from the Sphinx issue tracker, which was fixed at some point. And I had it manually written into the documentation like this.
So I've got issues, it's the issues page, this is the name of the issue, a cross-reference to who opened it, and then a summary of the bug. This is all available via GitHub, and the great thing is that GitHub has an API and we can hook into that API. So instead of doing that, we can just create this directive, the GitHub issue directive, and pass in a single argument, which is the number of the issue. And we hopefully should be getting some kind of output.
What the API returns doesn't really matter here. GitHub have really good API documentation. You can go and check that out if you want. But if you make a request to this with a specific issue number, it'll give you a JSON blob with all the information about your particular issue: who opened the issue, what the title is, the body and so forth. So I use the requests library to make a request to the API, I do some formatting of the response that I get back, and I return only the stuff that I care about, which again is the title, the ID, who opened it and the body. That's basically the issue as far as I'm concerned.
I can then call that from my directive. Again, remember the hello world example: I'm building a section, a title and a body of text. The IDs are the way that I can cross-reference, so this is your anchor in HTML, the thing an href in the page points at. And I'm also using a literal block, which is like the three backticks in Markdown, in GitHub Markdown. It just means that it's going to be in a monospace font and it's not going to strip any of my newlines or anything. It's going to display it like code. And then again, this setup thing gets called by Sphinx. Sphinx goes and looks through every extension that you pass in for a setup function, which tells it what the directive is, and also what version of Sphinx is necessary and whether it's safe to read documentation with this extension enabled in parallel mode.
There's also a parallel write safe option, which defaults to true. So going back to the directive, this is what it looked like before. We go and we render it and we get nice, pretty HTML.
So, a handy little trick for this. Building Docutils nodes manually tends to get a bit complicated, and it also means that you can't use other Sphinx stuff, like the code-block directive, within your documentation. So there's a thing that Sphinx provides, which is nested_parse_with_titles. If you call nested_parse_with_titles on a string that contains reStructuredText-formatted text, it will actually go and generate all the suitable Docutils nodes for you. So you don't need to go and look at the Docutils reference documentation to figure out what you need to use if you want a cross-reference, or literal text, or syntax highlighting and so forth. And all I'm doing here is that, instead of returning nodes, I'm now formatting it so I'm just returning strings. It gives me the exact same output.
That's roles and directives. The last one doesn't exist in Docutils. It's a Sphinx-only thing, which is events. Through the build process, when you call sphinx-build, it will emit a series of events. You can add your own custom events, but there's a load of events that are added by default by Sphinx: things like when the builder is initialized, when your configuration is loaded, when a source document has been read in, and when the doctree for a document exists. So this is probably the most powerful of all of these, because this allows you to create brand new documentation from scratch, that is, to generate reStructuredText documents that didn't exist before.
So for example, if you have a million YAML files and they all have some structured information inside of them, you don't want to have to go and write a million reStructuredText files that have references or directives that reference each one of those YAML files. You want to do it automatically. Events are the way that you go about actually doing this. The way that it works, every event is slightly different and the handlers are all different as well. They all take different numbers of arguments depending on the information that's available at that point in the build process. The one that I tend to fall back on is the builder-initialized one, because that means that everything has been kind of set up and configured. We know what output format we're going to be writing. We've read in our configuration. We've read in our source files. The only thing we haven't actually done is started writing out our HTML or LaTeX or whatever we're actually writing out.
So for example, I gave an example where I was using a single issue, and that would mean that for every single issue that I wanted to reflect in the documentation, I'd have to go and insert a directive for it. How about if I wanted to pull down every issue that has been raised in the last 30 days and for some reason put this into my documentation? Again, I'm not saying this is a good idea, but I'm saying that it is possible and you can extend this any which way that you want. So again, we're using the GitHub API. We're going to pull in the information from the issues endpoint, and instead of retrieving just an individual issue, we're going to retrieve every single one of them, or at least as many as it'll give us in a page. We're going to do some formatting of that. In this case, again, we want the number, we want the title, we want the body, and we want who created the issue.
And then when we have all that information, we're just going to create a reStructuredText document for each issue within this issues sub-directory, which goes within the source directory for my documentation. And for each of those files, we're going to do what I did in the previous directive, and I'm just going to manually write a reStructuredText document. This isn't perfect because, for example, if your issues contain stuff that is magic for reStructuredText and isn't magic for Markdown, then your document might not render properly, but for the sake of a quick example, this will give you something approaching what you'd be looking for. You could do fancy stuff if you wanted to. Here, I'm indenting all of the body of the issue, so it'll just display like a literal block. If I wanted, I could instead parse it as Markdown and convert it into reStructuredText, which would be a lot nicer looking. But again, quick demo.
I hook that in by way of the connect function. So we had add_role, we had add_directive, now we have connect. And again, we return this information which says what version of Sphinx this works with and whether it's possible to use this in parallel mode, and it is indeed possible to use it in parallel. And then we just have this central issues index, which will glob everything that's within the issues sub-directory that we have, which might be nothing if we have no issues, but we always have issues, so there will be stuff inside that file. And when we render that, we get this nice, pretty format. We could do a hell of a lot more with this if we wanted to, but for the sake of talking here, this is good enough.
When it comes to actually enabling an extension, if you don't have your extension published on PyPI, then you can include it as part of the documentation itself.
If you include the extension in the documentation itself, you need to mess around with Python's path just to make sure that it actually realizes the extension exists and can use it. If you use the sphinx-quickstart tool, it includes all of the necessary boilerplate at the top of the file. It's all commented out, but you can go and use all of it. In this case, I had an issue role, an issue directive, and an issue event extension, and I register each of those just by putting it in there. And if I'm using my own extensions, that doesn't mean that I can't use other people's extensions. There's the sphinx-contrib organization on GitHub, and the legacy one that still hasn't been taken down on Bitbucket. If you go searching through there, you'll find hundreds of extensions for pretty much everything that you can imagine. From the OpenStack perspective, we have extensions for things like auto-generating documentation for all our configuration options. We have them for all our command-line tools. We auto-generate all the documentation for that. Basically, if we can auto-generate documentation, then we auto-generate it, because people aren't going to write it otherwise. So it's better if we can pull that stuff out of code.
So yeah, a quick summary of that. The stuff that we went through here: directives, roles, connecting and disconnecting events. You've also got a lot of other stuff. Builders are usually the thing that will generate particular files in a particular output format. Sphinx provides builders for HTML, for LaTeX, for PDFs via extensions, for gettext. Pretty much anything you can think of is there, and if it's not there, then there's an extension on PyPI that will let you do this. Domains are a way of kind of grouping roles and directives together. So there's a domain for Python available in Sphinx. There's a domain for C, possibly one for C++. I don't know, I've never used it. Again, on PyPI, there are other ones there for Java. There are ones for Ruby.
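Returning to the point about enabling extensions that live inside the documentation tree, a `conf.py` fragment might look roughly like this. The `_ext` directory and the three module names are hypothetical stand-ins for the issue role, issue directive and issue event extensions mentioned in the talk; `sphinx.ext.autodoc` is included only to show that your own extensions and other people's can be listed side by side.

```python
import os
import sys

# Make the local extension directory importable (hypothetical "_ext" folder
# sitting next to conf.py).
sys.path.insert(0, os.path.abspath("_ext"))

extensions = [
    "issue_role",        # hypothetical module providing the role
    "issue_directive",   # hypothetical module providing the directive
    "issue_event",       # hypothetical module providing the event handler
    "sphinx.ext.autodoc",  # an extension that ships with Sphinx
]
```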
Pretty much any mainstream language is going to have a domain available for it. You can add your own custom events if the events that Sphinx emits normally aren't enough for you. So for example, if you decide that your extension wants to emit a custom event and then other extensions could hook into that, you can do that. And then of course you can add your own nodes. I don't really know why you'd want to do that instead of a directive or something, but you can. And at the end of the day, that's all that matters. And there's a lot of other stuff as well that I'm not going to go into: HTML themes, parsers, search languages, that kind of stuff. To be honest, the only one of these that I've ever gone anywhere near is HTML themes. But they're there again if you want them. And yeah, that's a very quick summary of everything you can do in Sphinx. Thanks for listening.
Time for questions. Yep. No difficult ones.
I have a question. I have the impression that when you develop all these extensions, the content depends on the code. So if you want to get the content, you have to publish the code that understands the content.
Do you mean the code as in the code of the application or the code of the extension?
The code of the documentation. I mean, the code. Yeah. For instance, if I write Markdown and publish it, another person with a Markdown interpreter can generate the documentation. If I do the same thing with AsciiDoc, another person with an AsciiDoc engine will be able to generate the documentation the same way I do. But with reStructuredText and Sphinx together, I have to gather the code of all the extensions to be able to generate the original content.
So the question, for the recording, is whether it's a disadvantage that, unlike Markdown or AsciiDoc or something, if you start writing these extensions, your documentation becomes dependent on these extensions and it's not possible to build it without them. So that's a trade-off, essentially. There's nothing that says you can't go and write plain old Docutils reStructuredText, so without any of the stuff that Sphinx gives you, like cross-referencing and that. That will render on GitHub, because they use a variant of rst2html, and it will render on any of these reStructuredText renderers online. And it will look lovely. But if you want to build upon this, then you are trading off a certain amount, not so much portability, but you do need to keep those extensions: you need to publish them on PyPI or include them in the documentation.
And that's no different to, for example, if you wanted to write your documentation in LaTeX. You could write it with the plain old TeX primitives, or you could go and install a couple of packages from, is it CTAN? And this will give you all this functionality so you don't need to worry about it. Same thing with writing code. If you're not talking about Python, if you're talking about C, you could go and implement the functionality to do something yourself, or you could pull in a library. It means you now depend on that library, but you don't have to do all that work manually. So it's a question of trade-offs. And when we're talking about something like configuration documentation in OpenStack, the compute project must have a thousand options. There's no way that we want people to have to go and update all the documentation every time we change something, like putting a full stop in a couple of options. We want to make that automatic.
Is there any catalog of extensions that you can use?
So, is there a catalog of extensions? There is.
There's a, I can't think of the name of it now. There's a project on Read the Docs where they go through PyPI and they just search for, there are markers that you can use that will say this is a Sphinx extension, a Sphinx builder and so forth. I can't think of it off the top of my head, but I can put the link into the slides when I re-upload them later.
We have time for one more.
I want to add another language to Sphinx. Where do I start? Do I use a domain and parse my language and translate it to RST, or do I add a programming language?
You mean what kind of language?
Programming language.
Programming language. So, the question was, if I wanted to add an additional programming language, how would I go about doing that? Yeah. Probably the best thing to look at would be a domain. A domain is your way of grouping all of the stuff together. If the language has something like Doxygen, there are bridges that exist from Doxygen to Sphinx. You could use one of those. Otherwise, you need to implement some kind of parser that will pull in whatever you want from your documentation and then emit that. A good place to start would be to look at an existing domain, so something like the Ruby domain that's published outside of Sphinx, because you'll see how that's evolved separately.
Any other questions? We'll take them outside, but thank you very much, Steve.
Thanks, guys. Thank you very much. Appreciate it.