Thank you all for coming. So who am I? Why am I standing here? Why are there no questions at the end of this talk? About five years ago, I helped co-create Read the Docs, and that is probably why most of you know me. About three years ago, in an attempt to build a community around documentation, we created our own conference called Write the Docs, which I help organize. And there are no questions because that is how conference talks should be. Come up to me at the end and ask questions, if you would like to.
So what are we going to talk about today? The first thing I'm going to cover is why you should write documentation. Even if you think you already know, hopefully I'll be able to clarify some of the points in your mind as to why you should write it, and also give you some ammunition to talk to other people about why they should write it. Then we're going to cover the meat of the talk, which is reStructuredText, Sphinx, and Read the Docs.
So why should we all write documentation? My favorite is a selfish appeal to reason. I don't know how many people here have sat at a command line looking at a piece of code, and they're like, all right, git blame: not me, not me, not me, not me. Right? You a year ago is indistinguishable from someone else who wrote the code. And so when you actually have something loaded into your brain and you're in the act of writing code, writing documentation allows you to save that state in a way that it can be loaded into other people's brains, or your brain, later. This is hugely important because so much of what we do is reading code and loading state into our brain. Any kind of documentation that we write about the code, whether it's a comment, whether it's a tutorial, anything like this, is super valuable for people who are actually reading and using software. As open source developers, we want people to use the code that we write.
I view documentation as marketing from developers to other developers. Right? If I just put something on GitHub and there's no README, zero chance I use that library. Who does that? Who's like, yes, a GitHub repo with no README, I'm going to use that software project? It just doesn't happen, right? And so writing documentation, writing tutorials, writing a getting-started guide is really the only way for people to actually use the software that you create. And open source developers are just developers. Nobody wants to write code that nobody uses. It just doesn't feel good. You have your project canceled at work and nobody gets to use that code. It's a waste, right? So we all write code for other people to use, and writing documentation allows that to happen.
And it makes your code better. There's this whole movement towards README-driven development, right? There's this habit of just starting on projects and getting super deep into code. You're mired in the implementation details, and then you finish the project with almost no thought about the actual public-facing API for the code that you just wrote. If you sit down at the beginning of a project and write a simple README that asks, how will users of this software actually interact with the code I'm about to write, you'll end up with a much better designed piece of software, because that public API can actually influence the implementation details below it. So writing documentation before you write code, but also writing down the ideas of why software is written the way it is, makes the software better.
Writing a design document and actually justifying why you made certain design decisions in your software will make you think more about those design decisions and will force you to clarify the ideas in your mind.
And it makes you a better writer. I really loved Lacey's talk here earlier. We're open source developers. Literally 100% of the communication that we do with each other outside of this room is with the written word: emails, GitHub issues, commit messages, comments, documentation. All of this is technical writing. We just had a book author panel here, and it's not like writing is a completely separate skill from writing code. Writing documentation makes you a better writer, which makes you a better programmer, which makes you a better open source community member. So that's why you should really be writing documentation. Hopefully that gives you some ideas that you can share with other people.
So let's get into more of the technology. This is basically how everything stacks up. We're going to start with reStructuredText, which is one of the worst-named software projects in existence. I usually just say RST. It works fine. It's an example of a lightweight markup language. Who here has heard this term before? So about 30 or 40%. Lightweight markup languages are plain text formats that are used to generate other formats. Wiki markup is a classic example, along with Markdown, AsciiDoc, and reStructuredText. And they work really well with programmer tools. That's why we really care about them: all the tools we've built to work with source code, which is plain text files, work amazingly well with lightweight markup languages as well. GitHub, diffs, pull requests. All these tools work the same. So the real power of reStructuredText, and of HTML, honestly, is semantic meaning.
So there was a movement about 10 years ago in the HTML world towards semantic HTML. What semantics really is, is saying what something is, not what it should look like. You describe an object as what it is, and somebody else can come along and say, this is what it should look like. This is a classic separation of concerns. So an example of this is: don't make issues bold in your HTML. You should take your HTML and give it a span or a div or something and mark that as an issue. This allows you as the author to separate what something is from how it should be displayed. Someone later can come along and write CSS to say all issues should look a certain way. Another example: don't make something font color red, right? That means nothing. As someone reading this, I have no idea why something is red. But when I say something's a warning, I know exactly what it is. And so this bottom example is reStructuredText for, you know, this is a warning. The really cool part about this is that it is separated from its format. The bottom part that's reStructuredText can be generated into the HTML above. It can also be generated into a PDF. It can be generated into a man page. It is independent of the output format.
So another thing about semantics: reStructuredText gives you this power, right? Here on the bottom, you can just say, I want to link to PEP 8. Usually I'm linking to PEP 8. An example of something that doesn't have semantic meaning is Markdown. Here it's like, check out PEP 8, but it's just a link to a URL on the Internet. There's nothing about the object that is actually defined in this markup. So say they change the URL for where PEP 8 lives on the Python website. With this bottom one, we just go in and say, hey, change the pep function to generate a new URL.
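A minimal sketch of the two examples described above, not the actual slides. It assumes the standard `warning` directive and the `:pep:` role; the warning text, the sentence around the role, and the Markdown line in the comment are illustrative.

```rst
.. warning::

   Back up your database before you run the migration.

Check out :pep:`8` before you send a pull request.

.. The Markdown equivalent hard-codes a URL and says nothing about
   what the link is:
   Check out [PEP 8](https://peps.python.org/pep-0008/).
```

Both the directive and the role say what the thing is. Whichever writer produces the HTML, PDF, or man page decides how a warning or a PEP link should look.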
With the Markdown one, we have to go in and look for every single thing that might be a link and actually have a human look at it, or do some kind of transformation, right? It's super important for writing software documentation as well. So semantic markup is super, super important. It really shows the intent of your words. It works across output formats. And then you can use that to style things a little bit differently.
I just want to touch on this because a lot of people talk about Markdown as this amazing thing that has made the Internet better. Markdown is just shorthand for rendering HTML. It's really not a good tool for writing software documentation. reStructuredText is a little bit more complicated, and because of that it's a little bit harder to write. But that complication is there for a reason. It's a part of the design that's actually important. So if you care about the words that you're writing, you should write them in a way that preserves semantic meaning. If you're writing in a way that has no semantics, you're just losing information that's in your brain and isn't making it into the pages and the documentation that you're actually writing.
So RST. We're going to get into what this actually is and what it looks like. It is whitespace sensitive, just like Python. It's extensible, and it's powerful. Slightly awkward. I think it's really useful here to just show you what this looks like, because a lot of times it can be hard to wrap your head around exactly what I'm talking about. So this is a basic reStructuredText document here on the left side, and the right side is just the rendered HTML. In a lot of ways, it's very similar to Markdown, right? We can make things bold, we can make things italic. But the real power of reStructuredText is, say, we want to add a table of contents, right?
If we're doing this in something that doesn't understand what a document is, you have to go in, look at all the headings, and link them yourself, right? But with reStructuredText, we can just say, add a table of contents, and it will automatically create a table of contents knowing everything about what is in that document. That's an incredibly powerful thing, right? I see these handwritten Markdown tables of contents on the internet, and it just fills me with sorrow. This is what computers do, right? Why is a human doing this thing?
So, reStructuredText. It's this Markdown-y thing that's a little bit different, and the main way that it's different is that it's extensible. The big thing, as you just saw with that contents directive, is that it has the concept of page-level markup. That's a line that starts with two periods and a space and then some other markup, and it ends at the next unindented line, very similar to Python. Directives are the main example of this. So you have dot, dot, directive name, colon, colon, as with the contents example previously. And this is really the main thing, right? It's basically a function call in your documentation. That directive name can be anything. You can write your own. It's giving you the ability to do more programmatic tasks inside your documentation. And this is really where Sphinx builds on top of reStructuredText, adding programming-level concepts into the markup language.
So a directive example, right? Code block, Python. The second line there is just a line number option, so this output will actually have line numbers. And then just an indented code example. It's pretty simple. This turns into output that is a syntax-highlighted code example with line numbers. If we remove the line numbers option, the line numbers go away, right? It's pretty simple.
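A minimal sketch of the two directives described above, not the actual slides. `contents` is a standard reStructuredText directive; `code-block` with the `:linenos:` option is assumed to be the Sphinx directive the talk refers to. The section titles and the `greet` function are made up for illustration.

```rst
.. contents:: Table of Contents

Installation
============

Some prose about installing the project.

Usage
=====

.. code-block:: python
   :linenos:

   def greet(name):
       return "Hello, " + name
```

Each directive starts with two periods and a space, takes its options on the following indented lines, and ends at the next unindented line. Deleting the `:linenos:` line is all it takes to drop the line numbers from the output.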
But good luck doing that with Markdown, right? There's no way to even conceptually think about doing that in Markdown.
So the other option that reStructuredText gives you is inline markup. This is anything that's included within the paragraphs and the content of the text itself. It's mainly used for including things inside text: making things bold, making things italic, all that kind of stuff. This just looks like a set of colons with an arbitrary role and then backticks with an arbitrary target. So again, the PEP 8 example here is an example of something like this, right? We're saying, put in something that's called pep and give it an argument of 8. That's basically just another type of function call, here to generate a link to PEP 8, but really it can do anything. It's just Python code in the background.
One of the great examples of this is references. Here you can see on the top we're defining a label for this section, and down here we're actually referencing that section with the label that we defined. This is really powerful, right? It allows you to reference documents in a semantic way across your entire set of documentation. The traditional way of doing this in Markdown is to just put in a link to a URL, the page URL where it will eventually be rendered as HTML on a production server, or a relative link to an HTML page. But this works with PDF output, for example. You're just telling it what to do and it's doing it for you. And this is what it renders into: there's a heading, and the reference automatically takes the title of the heading.
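A minimal sketch of the label-and-reference pattern described above, not the actual slide. It assumes Sphinx's `:ref:` role; the label name `installation` and the section text are hypothetical.

```rst
.. _installation:

Installation
============

Install the package with pip.

Usage
=====

If nothing is installed yet, start with :ref:`installation`.
```

The label sits directly above the heading it names. When the reference is rendered, the link text is taken from that heading, so renaming the section or moving it to another file does not require touching the sentence that points at it.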
So I find that the simplest way to think about this is, for Python code there's something at a module level that's more similar to page-level markup, and there's something inside of a class, or a method or something like that, that's inline markup. These are just two different ways of extending prose in reStructuredText. Directives are the main one, and something called interpreted text roles are basically those little PEP 8 examples that I gave for inside paragraphs. And that little live preview that I made, you can actually go play with it online at rst.ninjs.org. So if you just want to play around with the syntax, that's something you can do.
So now you have RST. Sphinx takes RST and makes it into a really amazing documentation tool. The basic Sphinx layout is a conf.py, which is a Python configuration file, a Makefile that just makes local development a little bit easier, and then a bunch of reStructuredText files. To build them, you just run make html. So if you pull down a Python project, chances are it'll have Sphinx documentation, and if you want to build those docs locally you just go into the docs directory, run make html, and you have all of the HTML documents for the project. You can do this with Django. If you have a Django checkout... I've been on a plane and been like, I need the Django docs. I can just go into the Django checkout on my machine and generate the HTML documentation. It's really pretty cool.
So Sphinx is the best documentation tool I know of. I've been working on Read the Docs for about five years and I've looked at a lot of other documentation tools, and they're all pretty much awful. And Sphinx is pretty good. It's not amazing, but it's definitely the best tool out there that I know of. And it was actually created to document Python.
About 10 years ago, I think, they documented the Python language itself with a bunch of Perl scripts, and they were like, all right, we need a better way of doing this. So they built Sphinx to document Python itself, and then it turned into an open source project that could be used by members of the community to document other pieces of code as well. And I love Sphinx so much I built an entire website around it, which is what Read the Docs is, which we'll get to in a bit.
Sphinx takes that baseline of an extensible markup language that is reStructuredText and then adds some really cool stuff to it. The big thing is the toctree. It's a table of contents tree, and this is the way that Sphinx adds structure to a set of documents. If you just have a directory of files, there's no way to say this one should come first, this one should come second, this one should go below that other one. This is how, for example, Sphinx generates its sidebar navigation. It builds the structure and links all the documents together in a hierarchical way. And as a code example, that just looks like toctree with a list of pages in it.
Cross-referencing, as I showed earlier, is really, really cool. reStructuredText has cross-referencing within a single page built in. Sphinx gives you cross-referencing across an entire project. And then there's an extension called Intersphinx that lets you reference third-party projects in a semantic way as well. So if I want to reference the keyword part of the Python documentation in my docs, I just say, reference Python with the keyword. That will take me to the reference for keywords in the Python documentation and generate a link in my own documentation. And if Python moves where the keyword reference lives, I just rebuild my docs and it automatically relinks to where Python moved it.
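A minimal sketch of a toctree and an Intersphinx reference, not the actual slides. The page names `install`, `usage`, and `support` are hypothetical. The last line assumes that `sphinx.ext.intersphinx` is enabled in conf.py with a mapping named `python` pointing at the Python documentation, and that those docs define a label called `keywords`.

```rst
.. toctree::
   :maxdepth: 2

   install
   usage
   support

Avoid using any of the :ref:`Python keywords <python:keywords>` as a name.
```

The toctree lists documents by file name without the extension, in the order they should appear in the navigation. The `python:` prefix tells Sphinx to resolve the label against the other project's inventory on each build, which is why a moved page only needs a rebuild on your side.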
This is super, super powerful because it makes your documentation way less brittle. You're able to pull in all of the references, all the documents from any Sphinx project, and reference them in a semantic way, and not just be like, I'm going to link to this on the internet, it's some URL that's probably going to break. You can also reference documents explicitly, and not just defined references. So if you have an install doc and a support doc, you can just say doc, support, and it will generate the proper reference for that document.
And then Sphinx adds all of these concepts for software: environment variables, objects, classes, file names, man pages, RFCs, PEPs. All of these concepts that only make sense for documenting software are what Sphinx adds to the reStructuredText base language. That's why it's such an amazing tool for documenting software: they built this entire vocabulary, and this markup language, that is specifically used for documenting software.
The other big one is that it does syntax highlighting. Pygments, as I'm sure most people here are familiar, is just a syntax highlighting library. I think GitHub uses it. It's used pretty much all over the internet, and it gives you syntax highlighting, which makes the output nicer for your users. And the other big thing is that Sphinx itself is extensible. It has its own set of extensions that do all sorts of really wonderful things, and you can also write your own. You can hook into the build process of Sphinx and really make it do whatever you want it to do. You can test your documentation examples with the doctest runner. So if you have code snippets in your docs, you can verify that they work and execute properly with the doctest extension.
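A minimal sketch of a tested snippet, assuming the `sphinx.ext.doctest` extension is enabled in conf.py and the docs are built with the doctest builder. The two statements are arbitrary examples.

```rst
.. doctest::

   >>> sorted([3, 1, 2])
   [1, 2, 3]
   >>> "docs".upper()
   'DOCS'
```

The builder runs each prompt line and compares the result with the line below it, so an example that drifts out of date shows up as a failed build instead of as a confused reader.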
There's coverage, so you can see which of your API modules, or how much of your Python code, is actually covered by your documentation. There's Graphviz support, to-do lists, all sorts of other stuff in there that just makes your life a lot easier when you're writing documentation.
And Autodoc is the other really, really big one: you can pull docstrings from your code and put them into your documentation. The really interesting thing it does is that it allows you to mix prose content with auto-generated content. A lot of tools like Javadoc, or all of these other language doc tools, just give you a full reference in some relatively ugly HTML output that you have no control over. Autodoc allows you to combine written words with documentation generated from your source code. You'll see this with Django a lot, right? Django doesn't just have a huge list of functions and classes and stuff. They have descriptions and contextualization of those functions, then an example that's pulled from the source code, and then more prose mixed in. And this is just a much better user experience. Users can actually understand your documentation as they're reading it, rather than just having a huge reference that has no context for them.
So Sphinx is a documentation generator. Its main thing is that it takes reStructuredText files and turns them into all sorts of other outputs. And it adds a lot of really, really nice ways to document software specifically.
So Sphinx was existing in the world. It's a really, really amazing tool. And then Read the Docs came along, and that was something that I helped create. It builds and hosts Sphinx documentation. At this point, I think it's the de facto hosting provider for most Python documentation. We host Django's PDFs and EPUBs. We host Requests, Fabric, pip, all this kind of stuff. It was actually created in 48 hours in the Django Dash in 2010.
And it provides a lot of stuff on top of Sphinx, but it's mostly hosting documentation. I think the origin story is actually really interesting. The Django Dash is basically a 48-hour coding competition that used to be held every year. It hasn't happened in a few years, I think. Charles Leifer and myself and Bobby Grace, who's a designer, spent 48 hours and built the proof of concept for this project. We looked around and we were like, it's really hard to host documentation for Python. You have hosts where you upload a bunch of HTML and they just host it. You have GitHub Pages, which is basically the same thing, except instead of a zip file over HTTP it's HTML over Git, but it's just static files sitting on a web server. And I was running my own cron jobs. They were just pulling down my repos every five minutes and automatically building the documentation and hosting it, right? We're web developers. We have tools that solve this. GitHub, you know, we have all these tools.
And so that's really what we did. We were just like, all right, we should be able to commit something to GitHub. It should have a web hook that automatically pulls it down and builds our documentation every time we update it. It shouldn't run every five minutes, and it should automatically just pretty much work. And that's pretty much what we built. We just had this super, super simple proof of concept. It was, you know, GitHub to HTML, basically. And it was open source, and the code today is still open source, actually.
Fast forwarding to today, it's been an interesting experiment in open source and community web sites. I think we are almost at 6,000 commits. We have a bunch of other crazy random stats there. The one that blows my mind is that we do 15 million page views a month, which is basically one of the largest sites on the internet.
It's kind of insane that we're just hosting software documentation, but I guess there's a lot of software that gets written on the internet. It's this crazy thing that's become a real thing that exists on the internet. This is our Google Analytics, which I always put in this presentation just in the spirit of open source, open stats. I love how you can see Christmas in there every year. That's kind of cool. We are relatively US-centric, but I think China is the second biggest market that we have. Software really is a global phenomenon, which is really cool. We actually have users from every country in the world visiting the site every month, which is a cool random stat, at least according to the little map that Google Analytics gives you.
So why are people actually using this? What is the value of the thing that exists on top of Sphinx? The big one, which I think a lot of you have seen, is the theme. We used to have a really ugly default theme, and then we pushed a new one, and someone said it's like the Python world got a facelift overnight. We were auto-building all these docs for all these projects, and they just magically became pretty. You might have seen this theme somewhere. It's used by a lot of Python docs. So that's the theme that we built for Read the Docs.
This is versions. All your tags and branches from your version control can be hosted as documentation. You don't just have one version of your docs online. You have every version of the software that you've released. Your documentation is built and hosted so that if somebody is three versions behind, they're not totally screwed because there are no docs for their version on the internet. Again, post-commit hooks. You push your code, it lets us know, we automatically pull down the updates and rebuild, so your documentation is always up to date. We actually recently added Markdown support.
While I have railed on Markdown a lot in this talk, it is actually useful for some things. If you're not referencing your source code a lot, if you're just writing prose with links to other websites, it actually works really well. So you can do that now by just having an .rst or a .md file extension.
We do translations. Sphinx will generate gettext output that you can use with Transifex or anything else like it that allows you to translate your docs, which is cool. Localization as well. readthedocs.org is localized into ten different languages. So if you go to the site in China, it's actually presented in Chinese. It said eight there, so that must be the real number.
Search. We use Elasticsearch for everything. It's amazing. We index every document that we host, so you can search across all of them. We do CNAMEs, so lots of people use Read the Docs and you probably don't know, just because they're on a separate domain. Fabric, for example, uses us, but they have it on their own domain. We generate all these multiple formats automatically on every push, so you have PDFs and all that good stuff for your docs.
And we host everything in a reasonable way. I think we've never had substantial documentation hosting downtime, because pretty much everything is served directly from Nginx. We never actually have Python code running in the serving of documentation. We just have this crazy nest of symlinks that exists on the file system to support that. But yeah, it's pretty much always there, which is incredibly important for infrastructure services, because this room would be very upset with me if it went down right now. There are lots of little things, like build failure emails. We have Python 3 support. We install your requirements, and we do virtualenv, all that kind of stuff. So using Read the Docs, it's pretty easy.
You register for an account, you give us a URL that has Sphinx documentation in it, and you hit build. We pull it down, we build it, we host it, and it all pretty much just works, in theory. Most of the time it just works.
So hopefully today I've convinced you, or at least given you some more thoughts on, why you should be writing documentation for software. Hopefully you understand why reStructuredText is kind of wonky and looks a lot crazier than Markdown when you actually go to write it. But there's a lot of power there, and it's there for a reason. And hopefully you've seen why people are using Read the Docs.
Read the Docs is an open source project. It is still predominantly developed by a very small team of people. So if this is something you're interested in, please come help us triage GitHub issues, help write docs, really whatever you're interested in doing. If you have a large company that wants to give back to open source, we're always looking for sponsors, and we are trying to build a sustainable business model around private hosting as well. So if this is something you've always wanted, something that we host for you but that's actually private, we're now trying to do that as a business as well.
I'd like to finish this talk with what shows the value of documentation to me the most, and I think it's been referenced a couple of times here at the conference: I can't say I'm self-taught. I've been taught by the people who wrote the documentation. Thank you.