I'd like to introduce Mason Egger, who's here to talk about continuous deployment of documentation. Take it away.

Awesome. Thank you, Jeremy. Hey, did everybody have a good lunch? Yeah? It was pretty good. Let's give the PyTexas organizers a round of applause for that. Awesome. So this is my talk, Building Docs Like Code: Continuous Integration for Documentation. This is my first time giving this talk as a talk. This is usually a very angry rant whenever docs don't work the way I think they should, so I have to make it more professional, and more good, for a professional audience. So I'm Mason Egger, and here we go.

So quick, who am I? I'm a site reliability engineer on the cloud platform team at HomeAway. We're based out of Austin, where we have a downtown office and a north office in the Domain, which is the one I work at. I'm also a volunteer educator with a program called TEALS. It is a Microsoft philanthropy program that pairs industry professionals with teachers who may not have traditional computer science backgrounds but are being asked to teach these classes anyway, because people want their kids to learn this stuff and the schools don't otherwise have access to that knowledge. So if you're interested in that at all, feel free to come and talk to me at any time during the conference. I love talking about that. And as anyone on my team will tell you if you ask, I'm a documentation fanatic. I very much think that bad docs are the reason why we have bad code. So we're going to talk about how to have good docs.

So who is this talk for? This talk is for pretty much everybody, but it is a good talk for open source maintainers, junior and senior level developers, program, project and community managers, DevOps engineers, educators. The list goes on and on. The TL;DR is: if you write, maintain or manage a product that you intend to share with someone else, this talk is for you. So Act 1, the conflict arises.
How we manage our documentation. So here is a common approach to the way we manage our documentation, and it's a tale as old as time. There's a developer sitting at his desk at 2 a.m. banging out some code. The developer writes the code, the developer becomes happy with said code and commits the code to a version control system. And then once he is happy with it, he sends it off to his coworkers and they do reviews and tests. And then we come around to this decision: is it time to release this code? Well, if it's not, the developer rejoices and goes back to writing code, because that's what developers like to do. They like to write code. If it actually is time to release, they groan internally. They kick themselves, because now somebody has to write these bloody docs.

So now we're doing that. That someone could be the actual developer themselves. It could be a technical writer, if your company is lucky enough to have technical writers. I have worked with some. They're amazing. But that was my last job. My current job does not have technical writers on staff. It could even be a different developer, a new hire or an intern. I have seen this commonly before. It's like, oh, you're the new guy, go write the docs. Which is a terrible idea, and we'll get to that later.

Issues with this approach. The documentation in this approach is almost an afterthought. Your code is great and you need to tell people that your code is great. And if the docs are an afterthought, you're not optimizing for that. There are long release cycles where things can be forgotten. We all try to do agile methodology, most of us. Two-week sprints, iterative releases, blah, blah, blah. All that stuff that we say we do, but we don't actually do. And I've worked with long cycles. My first job was at a company where we shipped a physical hardware device every six to eight months. There was no iterative development on that.
I didn't get up every two weeks at the end of my sprint and go, OK, let's ship a thousand more blade servers. It didn't work that way. So I would get to the end of a development process, eight months down the road, and they would come and ask me about something I wrote on day one. And I'm like, oh, I don't remember how that works at all. Let me go look at the code. And then we're in this really vicious cycle.

The more layers of separation between the implementer of the code and the author of the docs, the more likely you are to get inaccurate docs. Technical writers are great, but they didn't write the code. They don't know all of the nuances of it, so they're likely to make some mistakes, especially if you're not working with them to help them get the docs right.

And one of the issues with this approach is that the developer dislikes documenting. That's actually a really big problem. As developers, why do we not like documentation? Developers enjoy writing code. Developers enjoy talking about their code. So why don't we like writing about the things that we're telling all of our buddies about? It doesn't really make sense. So the real issue here isn't that developers don't like writing documentation. What they dislike is the workflow that we have forced upon them to write their documentation. A developer has a very particular, very nuanced workflow that they like performing. They like writing their code in Vim or VS Code, committing it to Git, testing it on their laptop and all of that. But every time we ask them to document something, we make them break this workflow. We make them context switch out of their preferred environment to go use somebody's clunky UI, some amazing "what you see is what you get" editor that actually is not what you see is what you get.
Or one with a search feature so bad that I would be better off asking /dev/urandom for a result. So how can we integrate our documentation process into a workflow that developers will enjoy?

What if we treat our docs like code? The Occam's razor in the room would be: if developers like writing code, then why don't we just make the docs code? What if, instead of living externally to our code, our docs live right alongside it, whether that is in the exact same repository as the code or in the same GitHub organization for, say, long-form docs or other types of docs? And what if we used a markup language, like Markdown, that developers already know and have been using forever, that is way better than "what you see is what you get," and that doesn't add an extra level of complexity, so even new technical writers who may have never used Markdown before could use it, because it's relatively simple?

So let's treat the docs like code. What do we mean when we say treat docs like code?

Docs source files are stored inside of the version control system, which means if you are using Git or Mercurial or anything else, all of your docs live alongside the code: a git clone for my docs is a git clone for my source code.

We build the documentation artifacts automatically. Documentation is just as much of a deliverable as your wheel file or your executable or any of those things. Docs are artifacts and they should be treated as such.

Ensure a trusted set of reviewers meticulously reviews the docs. We spend hours doing code reviews to make sure that we don't accidentally break our code base, but we don't spend hours reviewing our docs to make sure that, when we hand our code off to somebody else, they don't break their entire system because we didn't document it properly. Why is that?
We also have to make sure that our docs are tested both for accuracy and for functionality. There is nothing I hate more in the world than taking an example from the documentation, typing it out, and then it doesn't work because it's wrong. The source code in the docs wasn't correct. So accuracy is: does the example actually work? And functionality is: do all of my web pages render correctly, and if I have hyperlinks in there, do they all resolve to an actual place? We should be testing our docs for all of these.

And we should be publishing our docs without much human intervention. We have CI/CD pipelines for build artifacts. We should be using them to also publish our docs to our web pages, to our wikis, or wherever you want to post your docs.

What do we gain from this? It promotes collaboration, which is a great thing, because not only are we writing our code and our docs for our customers, we are now writing them with our customers. Maybe we're not the greatest at English, or we don't have a degree in education and don't know how things need to flow to make the docs understandable, but maybe one of our users does. And they want to submit a PR and fix the docs for us. Hey, that's great. That's work I didn't have to do. I love work that I don't have to do. I never do work I don't have to do. Okay? You can ask my boss. I'm really good at it. I think there was the old Linus Torvalds quote that intelligence is the ability to appear to be doing nothing, but yet still getting everything done. And working on documentation is often the very first step for somebody to contribute to an open source project. So if we have our docs out there, we're more likely to start fostering more open source contributions to our projects.
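The "functionality" check described above (do the hyperlinks in the docs resolve to an actual place?) can be sketched in a few lines of Python. This is a minimal illustration and not the tooling used in the talk: the function name `find_broken_links` is hypothetical, it only understands inline Markdown links, and it only verifies relative links against files on disk, leaving external URLs unchecked.

```python
import re
from pathlib import Path

# Matches inline Markdown links such as [text](target); image links match too.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
EXTERNAL_PREFIXES = ("http://", "https://", "mailto:")


def find_broken_links(docs_dir):
    """Return (file, target) pairs for relative links whose target file is missing."""
    docs_root = Path(docs_dir)
    broken = []
    for md_file in sorted(docs_root.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_PATTERN.findall(text):
            # External URLs and same-page anchors are out of scope for this sketch.
            if target.startswith(EXTERNAL_PREFIXES) or target.startswith("#"):
                continue
            # Drop any "#section" suffix before checking the file on disk.
            path_part = target.split("#", 1)[0]
            if not (md_file.parent / path_part).exists():
                broken.append((str(md_file.relative_to(docs_root)), target))
    return broken
```

A CI job could call `find_broken_links("docs")` and fail the build whenever the returned list is non-empty, which puts a broken link on the same footing as a failing unit test.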
We can track documentation mistakes as bugs. In my collegiate years, I spent a little bit of time playing around in BSD land, and I actually have a defunct YouTube channel that is me doing nothing but BSD tutorials. Don't go look for it, it's awful. The OpenBSD community treats every bug in their documentation, or the lack of documentation, as nothing less than a critical or a P2 bug. And that's how we should be treating incorrect docs. If my code loses me all that money, like it did in one of our talks earlier, that's a pretty big bug. But if that was an on-off switch and we mislabeled the on-off switch in the docs and caused everybody else to lose it, that's the same level of bug.

We include docs in our code reviews. This is actually really nice, because you write a new feature and you write the docs, and when you're reviewing the code, you can review the docs, and you now have more sets of eyes on the documentation.

It allows us to make our docs more beautiful. We have all these wonderful static site generators like Sphinx and MkDocs and all of these other things that can make wonderful documentation. I know that most of y'all probably don't have art degrees and you don't like dealing with how things look, so let somebody else do that. We have tools that can do this for us.

It allows us to leverage the workflows that we already have. We have amazing workflows for building software, agile process and all of this. Now we can apply all of these same workflows to our docs.

And it empowers the developers to document. If the documentation is closer to the source code, I have found that the developer is like 10 times more likely to actually edit it. And I have a case study for this that we did at HomeAway. I guess it was a case study, and they were my guinea pigs. It's kind of six of one, half a dozen of the other.
There was a new team that was formed at HomeAway to build a brand new product. It was inside of my organization. We had a new GitHub org, new team members and everything. The first thing I added to that org, because I was on this team, was a base documentation repo that allowed us to do all of our long-form docs, all of our architecture decisions, all of our READMEs, all of our how-tos and getting-started guides, all in this repository. Throughout the entire time that this project was active, these docs were the most up-to-date and well-maintained in the entire department. They were doing retrospectives every week or every two weeks at the end of our agile process, and they had new docs every two weeks. Whereas another team that I was also working with at the time updated their docs every six months, every quarter, basically whenever the boss yelled about it because he couldn't hand the product to somebody else. That's when the docs got updated. I've seen it work: people will actually work on the docs more because they're already in there. The docs are already in their Git repository. They're already in their source code and their code editor (I use Vim personally). And it's not that difficult to just open up the file, change it real quick, and then commit it with your source.

So how does this change the workflow that we saw earlier? Now the developer writes code and docs. The developer commits the code and the new docs to the repository. Then code reviews, testing and docs reviews; we get all of that same stuff. Same process. Is it time for a release? No? Haha, hooray, I'm back to writing code. And if it's yes, the artifacts are published and the developer never has to go to that side of the screen. It's "that side" because I can't do things in reverse. So they never go over there anymore.
We don't ever have to worry about taking them out of the process of writing code, because all of that process of building and publishing the docs has become automated for us.

So Act 2, a hero emerges. Who can tell that I'm excited for Avengers? Anybody? Yeah. So, CI/CD for documentation. Quick definition for people in the room. Continuous integration means that code is continuously tested, integrated with other code changes, and merged. Continuous deployment means that code is continuously deployed with each patch to the code base. This is very similar. You do the same thing with your docs. Your docs would be continuously tested. Your docs would also be continuously deployed.

What does this mean for docs, though? It's a little bit different. It means that every time we do a patch, we are building a full version of our entire documentation. So if you have a whole giant web page that hosts all of your API docs, you would build this web page every time that you build the patch. You are continually testing the content with each patch. And there are some pretty interesting documentation testing tools that you can use for this. Or, if you have technical writers, they will be able to review this and actually read it for you. You are publishing automatically with every release.

You are versioning your docs. This is probably the most important thing you have to do. If there is a version 1.0 and a version 2.0 of your library, but you only publish one set of docs for it, I don't know which version those docs describe. I think that's my favorite part of Read the Docs, because it has the little version picker in the bottom. That is the best part of it. I mean, there are probably a lot of other amazing parts, but I like that part.

So, a quick introduction to the different types of documentation, because I saw a talk similar to this once and didn't know this.
There are two or three major forms of documentation you'll deal with. There is long-form documentation, which is user guides, getting started, FAQs, all of those things. These are the kind of docs that will live in a separate repository inside of your organization. They don't necessarily live in the exact same repo as your code base, because they really aren't tied to code. This is really more of an overview of the product, maybe an architecture decision, things like that. Then we have the functional documentation, which is the documentation that actually lives inside of your code base, inside the same repo. These are RESTful APIs, SDKs, man pages, things like that. Think pydoc: inside of your Python code, you see the documentation right there with it. That's the kind of documentation this is talking about.

So we have some really amazing documentation tools that can help us with this. There are three different types that I've roughly classified. There are static site generators. These are the ones that are good for your long-form documentation, your FAQs, your runbooks, all of those things. There are source code based documentation generators for docs that live inside the code: pydoc, Javadoc, basically code annotations for your documentation. Some even generate clients for testing. If you've ever used Swagger, the REST API documentation editor, you can actually generate tests on your docs based on this. It's really cool. And then I put in system documentation generators, because I found a really cool package called ronn, which is a Markdown-based man page generator. And I think the name is hilarious, because the man page format is roff, and this is ronn, and it's the opposite. I think it was with people I was talking with last night: every single name of every package in computer science is a developer laughing at his own inside joke. So that's all that is.
I thought it was hilarious the first time I saw it. And it gets less and less funny every time, and that's exactly how that joke should go.

The first tool I'm going to talk about real quick is MkDocs, which is the static site generator type of documentation tool. This is Markdown-based documentation with a YAML-based config file, relatively straightforward. The time to hello world on this is probably about 30 seconds. Getting a working, running implementation of this is insanely simple. We use MkDocs a lot at HomeAway for our long-form documentation, because the developers at HomeAway really like Markdown: all they do is write Markdown, and then it gets out of their way. It is easy to configure. There are many different extensions and many different themes that are supported. It is Python based, so if you don't like any of those extensions or themes, you can easily write your own. It just uses Jinja2 and some templating files. That's really all it does.

I think one of my favorite parts about MkDocs is UML, and this is just a personal pet peeve of mine: you can now support flow charts and sequence diagrams in Markdown with the right extension, and it will actually render the flow chart or sequence diagram inside of your documentation. Because the amount of times that I go to an old page and it's like, okay, who made this diagram? Was it in Gliffy or did we use Google Docs? No, we used the other one, but who had it? Well, that was the guy that was here three years ago, but he quit and now he herds sheep in Montana. We're never getting that file back. So now I have to rebuild all of these diagrams. And I'm tired of that. I spent an entire night one time rebuilding a diagram for a meeting that we just happened to have the next day. So now it's in the source code. Everybody can edit it. We never lose track of these diagrams again. That was a personal pet peeve and a rant of mine.
If I only get two or three rants in this talk, it's going to be a good talk.

So, documentation tools: Sphinx. This is another one of my personal favorites. I find it slightly more difficult than MkDocs, but that's because I think that Markdown and YAML are really easy. It's a reStructuredText-based tool that does support Markdown. It's the most common tool for creating SDK documentation and code documentation. I forgot to put "in Python" on the slide. So for most of you who've ever done a Python SDK or something, you very likely used Sphinx. It can output to literally any form of media that you ever ask it to, including LaTeX. People ask, which one's smarter, Sphinx or LaTeX? Well, Sphinx writes LaTeX, so it's obviously smarter than LaTeX. And it might be sentient. I'm currently uncertain. I'll get back to you and let you know.

My favorite part of Sphinx is the extension they have for doc testing, which is where I can write tests inside of my source code documentation, and I can actually run the tests on this documentation to ensure that the code I put in my documentation accurately runs. And if it doesn't, the build will fail and it will say, hey, your code examples are wrong and you're lying to your users. And I love that, because it has caught me on more than one occasion, so I'm not as good at this as I think I am. And it also catches my teammates, and then they get angry at me and I just giggle. So it's hilarious.

Just a quick thing, because I have to plug ronn since it's such a hilarious tool. Markdown-based man pages, I think, are awesome. As a systems engineer, I enjoy and love man pages, so I'm just going to go with that one.

So, Act 3, the final battle, where we have our demo. Here was the issue that I was trying to solve with the open-source project that I have started, called Unlocked EDU.
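As an aside on the doc-testing idea described for Sphinx above: the extension in question is most likely `sphinx.ext.doctest`, though the talk does not name it. The same principle can be sketched with Python's standard-library `doctest` module alone, without assuming anything about the speaker's setup. The function `slugify` and the helper `count_failures` below are hypothetical names made up for this illustration.

```python
import doctest


def slugify(title):
    """Turn a page title into a URL-friendly slug.

    The examples below are documentation and tests at the same time:

    >>> slugify("Getting Started")
    'getting-started'
    >>> slugify("  Docs   Like Code ")
    'docs-like-code'
    """
    return "-".join(title.lower().split())


def count_failures(obj):
    """Run the docstring examples attached to obj and return how many failed."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Make obj visible under its own name inside the examples.
    for test in finder.find(obj, globs={obj.__name__: obj}):
        runner.run(test)
    return runner.summarize(verbose=False).failed
```

In a docs pipeline, a non-zero result from `count_failures(slugify)` would fail the build, which is the "your code examples are wrong and you're lying to your users" signal described above.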
I need to be able to create many open-source texts, all with a similar format, that are production ready out of the box. I don't really want to worry about building the texts. I don't want to worry about what the format is or how the page looks. They should just appear. I don't want to spend all my time setting up this pipeline every single time. I want a workflow that jump-starts doc writers. And the secret one that I don't think I put on the slide is that I've implemented this at every job I've ever worked at. I'm tired of re-implementing it every time, because I implement it once and forget it, and then three years down the line I have to implement it again. So now I've just open-sourced it, and the next time I go and get a new job we're going to have it ready to go.

So, my solution. The author will generate the docs project. The author writes docs. The author pushes his docs, and then they're automatically published to a hosted solution. This is the workflow that I want to develop. The tools that I chose for this are: Cookiecutter, which, if you don't know it, is a Python-based project generator. It is more often than not used to generate Python projects, but it actually can generate anything you ask it to, because that's what it does. A text editor, whatever you want to use to edit the actual source files. Then you're going to do a git commit and push, and it's going to publish directly to GitHub Pages.

So now let's see if the recorded demo will play today. It is. The first thing we're going to do is generate a docs pipeline with our Cookiecutter. This is the source repository for the Cookiecutter template that I have built. Everybody is able to go get it, and there will be a link to it at the end of the slides. So the first thing we do is just run cookiecutter. It asks me if I want to redownload it, because I've done this before, and it took me like seven tries to get this demo recorded properly to where I didn't hate it.
I fill in a little bit of boilerplate information. I fill in the name of it, the site repository. This is the description; this is what it shows in GitHub. Apparently I decided to take my time there. The site author. The GitHub username; this is to make the build badges and things inside of the code work. You can choose whatever license you want to license this under. That was just me being bored and enjoying open source licenses. You select your documentation engine, GitHub Pages, and what CI you want to use.

So what did you get from this? Automatically you get a repo built inside of your directory, and you get a docs directory, a Pipfile that has all of your dependencies, a license file, a Dockerfile, a Makefile and a README. I use Docker and Make for almost everything. So if you don't end up having Make on your system, you can just use the raw Docker commands. The thing I missed was the configuration file, real quick. So this is just what the base docs look like. It's just a PyTexas demo boilerplate.

How would I use this locally? I'm a slow typer sometimes. So the first thing you have to do with this is lock your Pipfile, because as I was doing this, there was a version of Jinja that was out of date and it caused a security alert in GitHub, so I took it out. So now you have to generate your own lock file. And then I do a make build, and I have now built the Docker container that I need to run all of this. And then I do a make run, and I can go and view it on localhost, and my docs automatically appear, and now I have them ready to go. So if I want to edit this, I would just open the docs again and add some more text. Then I make run it again, and we go over and refresh, and now we have "PyTexas is awesome." It's right there. It works pretty simply. Another thing you can use: I use something called Byobu, which is basically just a fancy wrapper around tmux.
So I can actually leave this Docker container running and constantly edit my docs on another screen. I'll get live updates on the other side as much as I want. So I don't have to constantly rebuild the image, re-run, check my docs and do all that. So now I come over here. I've got my Docker container running. I add "woohoo" with an exclamation mark. I'm bad at Markdown, so it went on the same line. So now I'm going to go fix that, and I cut that part out of the demo, apparently, because I'm not good at editing.

Set up your Git repo and push. Just a quick little thing: make sure you give it the same name as you gave your project, otherwise it's going to make Travis really unhappy with you. So we've pushed it up, and Travis has likely detected our repo, so we go to the build badge icon that you get and open it up. Travis is waiting and thinking, and now Travis has detected it, and my builds run really fast, because that's the power of editing, instead of making you sit here for a minute-and-a-half build.

Oh no, my build failed. That's not good. What happened there? It failed to deploy. What? I trip over this every time. I always forget to put my GitHub token inside of Travis so it can actually access the repo, and I actually tripped over this three times before I decided I wanted to put it in the demo. I was like, if I'm failing on it this many times, maybe everybody else would enjoy watching me fail at it too. So I go to my personal access tokens and I create a token. For those of you that think you're going to grab this token, it's already deleted. I worked for a security company. I know you. Don't lie to me.
So then we just go in and add the GitHub token as an environment variable in Travis. It automatically says it won't display in the build log, which is nice, because the last thing you need is leaking credentials in build logs. Been there, done that. We restart the build. It goes really fast because, again, I'm an amazing editor. If everybody's builds ran that fast, everybody would love it. And then you go back to your repo, and now this is automatically published on my personal GitHub Pages.

It actually is completely continuous. So I add another statement to it, which would be, I'm going to put "one more time." I did this a couple of hours ago and I still don't remember what I did. We rebuild it, it goes, refresh, and it automatically works. Sometimes GitHub Pages is slow. Now we have a fully automated process. I cut out the times of me sitting there clicking refresh for like 20 seconds before it actually showed up. And nope. Oh, apparently if I hit the play button it starts over again. Let's go next. Can you try this yourself?
The open source project that I created is called Unlocked EDU. It's an open source project dedicated to creating free and open educational resources, such as textbooks, curriculums, worksheets and so on, for use in public schools. As a TEALS volunteer, I am finding out more and more that the availability of computer science curriculum that does not cost an arm and a leg for schools to purchase is actually really low. So this entire project is built around that. All the books that I have built inside of this are built on Markdown. My brother and I are writing an AP Computer Science A Java book for this course, and a couple of other things. The Cookiecutter is part of it. So all of the systems code and everything that I'm going to be building for this project will be open source, along with the books themselves, so you'll have access to them, and you can just visit it here. I'll put some stickers out on the table if you like that logo, because I think it's a cool logo. I've got a really cool graphic designer. If they run out or if you want more, I have plenty. I bought a lot.

So, sources that I used for this. Docs Like Code is an amazing book. If you are interested in this concept of actually being able to treat your docs like code, and want more examples, this is a great resource to use. I did footnote it a couple of times, because I just shamelessly stole from this book, but I cited it, so it's not stealing, it's research. That's how you do it. It's a great book. I highly recommend it.

Final thoughts. At every job where I've ever implemented this workflow, both the developer experience and the user docs have vastly improved. It's been amazing how much better the docs and the user experience can get in a short amount of time.

Stop making docs a punishment. Don't dump docs on your intern, because that's how you get terrible docs. They don't know what you did. They don't know the architecture of the system. But you want them to go out and write all these docs.
In reality, the person who writes the docs is the person who has to understand the software the most, because you're trying to explain it to your users in a way that they can understand, and you're not trying to make them drink from the fire hose and blow their face off.

This third bullet is probably my favorite bullet: if your docs suck, people will abandon your project. Projects with a lot of potential have died because they didn't do proper docs. Nobody knew how to use their code. And in the age of README engineering, if it takes me more than an entire README page to figure out how your stuff works, I'm not going to spend that much time going into it to figure out what you actually did.

Versioning docs is great. We should do a lot more of that. We should do tons of that.

My slides are already available on my website. It's really easy to find. I'm the easiest person on earth to find on the internet if you know my first and last name, and luckily they're both five letters apiece, so ten letters. It's like a phone number. And that's my Twitter account, if you want to see me tweet cats and useless crap. That's where I go to do it.

How much time do I have left? I have time for questions? I will take questions. Fantastic. I normally talk way too fast, and that went great. Any questions? I mean, man, everybody at my job said I wasn't going to hit it. I said I was going to go over by ten minutes.

My question is related to public versus private. The language doesn't define the difference between public and private methods, but you only really want to document the public methods in an API.

So Sphinx has a built-in thing, I think it's called undoc-members, and you can say, hey, I don't want to document these methods. So when you're laying out your reStructuredText file, you can say, hey, don't document these methods, or only document these methods. You can be either inclusive or exclusive, and that's the way you would do that.

All right, I have no ending for this, so I'll take a small bow.
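For readers who want to try the answer to that last question, here is a hedged sketch of a Sphinx `conf.py` fragment using the autodoc extension. It is an illustration under assumptions and not the speaker's configuration: the excluded names `internal_helper` and `legacy_endpoint` are hypothetical. By default autodoc leaves out members whose names start with an underscore and members without docstrings, so the "exclusive" style mostly means not switching on `private-members` or `undoc-members`.

```python
# conf.py (fragment): a sketch assuming the sphinx.ext.autodoc extension.
extensions = ["sphinx.ext.autodoc"]

autodoc_default_options = {
    # Document the members of each module or class passed to autodoc.
    "members": True,
    # Hypothetical names to leave out explicitly, as a comma-separated list.
    "exclude-members": "internal_helper, legacy_endpoint",
    # "private-members" and "undoc-members" are deliberately left out of this
    # dict: adding them would pull in underscore-prefixed and undocumented
    # members, which is the opposite of what the question asks for.
}
```

The inclusive style is also available per directive: in a reStructuredText file, `.. autoclass::` accepts a `:members:` option with an explicit list of names to document.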