All right. So I'm going to be talking about documentation in the Ruby community. First, I guess, who am I? I'm Loren Segal. I blog. I tweet. I use Ruby. I guess that's not surprising. And most relevantly, I wrote a tool called YARD, which hopefully some of you guys have used, or at least seen. You can get it at yardoc.org. It's a Ruby documentation tool.

So why am I here? The truth is I don't actually want to talk about YARD today. I'm not actually going to get into the technical details of how YARD works and stuff like that. So if you have questions about that, I'm around and you can feel free to ask me. What I do want to talk about, though, is documentation in the Ruby community in general and why we don't take it seriously. And what I want to get out of this is that hopefully some of you guys will start taking documentation more seriously. So it's really not just about YARD. I don't care if you use YARD or RDoc or whatever tool you use. The point is I want people to start treating documentation as part of the development process rather than just something they do after.

And so why? Documentation is important. The first reason it's important is obviously so that your users can figure out what the hell you're doing. The second is an interesting side benefit: when you try to explain to someone what your code does, it helps you realize that your code sucks. It helps you actually think about your API and maybe make it a lot easier to use. There are situations where people design classes or methods, and when they start explaining and documenting them, it takes them like five or six lines to explain. And the rule is if you can't explain it easily, then really you should start questioning the design, and you should start questioning the implementation as well.

The other thing I want to talk about is that good documentation is hard to find these days.
There's this little problem in the Ruby community that we like to avoid, in that our documentation is really not up to par, especially in the Ruby core and Ruby standard libraries. So this is the SignalException class, which is some exception class. And basically if you look at the actual documentation, it's really just cut and pasted from the superclass, Exception. We would know as Unix users what a signal exception is, but it's really not obvious to someone who's not a Unix user what this exception does. And so this kind of stuff makes me sad.

Here's another example, the DBM class, which I don't think many people even know about, but which is in fact packaged with the Ruby standard library. And as you see up there in that empty box, those are the methods for DBM. Yeah. So it's not documented. We need to get on that. We need to start documenting our standard library and our core much better.

The other thing I want to point out is that when you screw up documentation, you really screw up bad. This is an example of a security vulnerability in Rails last year, in about June, in the HTTP digest authentication. Basically they told you that if you return nil to the HTTP digest authentication, it won't let the user through. But in fact it turned out that when you return nil from that block, it lets the user through. So people had all this code, because not only was it documented that way, but there were also examples saying use this block of code in your code and it will work. And that code let people through. This really is bad when you have these apps making it into production.

And so they fixed the problem. But the real question is what is their risk avoidance strategy in this case? What is their plan to make sure that this doesn't happen again? Generally the consensus is okay, we'll pay better attention to our documentation. Hopefully we'll review it better. But that's not really good enough, is it?
And so YARD actually has some things that I'll talk about later that make this easier to detect and fix.

So a quick show of hands. Who here writes libraries or frameworks that other people use? Okay, that's a fair number. Who here puts emphasis on documentation when they write it? Okay, a smaller number of people. What I've found from my experience is that the number of people using YARD is relatively small. The number of people writing RDoc is a little bigger, but this doesn't say anything about the actual quality of the RDoc. And then there's Ruby. Wait, there's Ruby. And note, this is not to scale. This is purely from my own experience talking to people who write documentation, using code, reading documentation, and stuff like that.

But the question is why not? Why don't we do it? One thing I usually hear, not explicitly, of course, is that people get the feeling that they don't have enough time. And so my answer to that is make time. And by the way, that blue is the same color as Jessica Alba's eye shadow. But documentation should really not be something where you say, well, if I have time after implementing all the features, I can sort of get around to explaining it to the users, and in the meantime they can sort of sift through the source code and figure it out on their own, et cetera. This should be part of the development process. As you implement features, you should be documenting them. That's how it should work. Obviously, this is a best case scenario. But this is how we should really be serious about this.

And one of the reasons that it matters is, well, not that your users will like you, but that your users won't hate you. And not pissing your users off matters a lot more than making them sort of happy. The reason is that all those people who will get pissed off are the people who won't read your source code. And the truth is, they shouldn't have to.
Because if they have to go and read your source code, they might as well just develop it themselves.

And that brings me to my next point: a lot of people like to say my code is self-documenting. To that, I say no. The reason is self-documenting doesn't scale. When you refactor your code and make it look pretty and stuff like that, you're not reducing complexity. You're displacing it. You're moving that complexity down a layer. But the problem is your users don't understand any of those layers. So they still have to read, as the last presentation said, from the bottom up what that code does. And if you hide your ugly metaprogramming crap under a nice DSL, you still have this nice little shell that doesn't tell them anything about how it works.

So this is an example of what people usually do when they pass options hashes into methods. They'll usually have this parse-default-options method that parses out the default options for the method or for the class. And you usually have many of these methods that parse these default options. The problem is that this method is usually hidden somewhere in the private methods of the class. Sometimes it might be mixed in from another module. Sometimes it might be inherited. Sometimes maybe both. So expecting your users to dig up all this information just to find out what the options are really doesn't make any sense.

The last thing people usually say is that documentation is hard. To that I say, fair enough, it is hard. We'll get back to how we can make it easier in a second. But first, I want to talk about what makes documentation good. I generally have three major rules about how to make good documentation.

The first rule is consistency. Documentation is like code. You pick a style, you stick to it. Otherwise your users will get confused.
If, for instance, you start telling people this method will raise an exception, and then in the next method you don't say that, but it does, you're confusing your users, because they don't know whether you forgot to document it or whether it really doesn't raise an exception. That information is not clear. So pick a style, stick to it.

The second rule is correctness. Documentation, again, is like code. It can be wrong, and we won't even know it's wrong when it is. Just like with code, we generally have this acceptance in the community that the reason we test is because we sort of assume that our code will be wrong. But there's this weird disconnect in that we don't make this assumption about documentation. We sort of assume that, yeah, of course I documented it properly. Of course I explained it right. But in fact, documentation has to be reviewed, audited, and tested, just like our code does.

The third thing is coherency. Documentation, again, is like code. It always makes sense to the person who wrote it. It never makes sense to anyone else. And so the lesson here is that documentation has to be reviewed by other people. The worst person to write your documentation is the person who wrote the code, because they know all the bad things about that code and nobody will care about those details.

So, yeah, it's hard. And that's kind of why I came up with YARD, to try and make some of these things easier. YARD's goals are sort of in the same vein as that good documentation stuff. It tries to go for consistency, correctness, and coherency. The extensibility comes for free through the way YARD was designed.

Consistency. YARD basically adds metadata to your documentation in the form of Javadoc-style @param and @return tags. This is found in Objective-C and Java, Python has this, and some JavaScript documentation is starting to use this format. So this is not new.
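As a minimal sketch of the tag style described above, a documented Ruby method might look like the following. The method, its option names, and its defaults are hypothetical and are not taken from the talk's slides; only the tag names (`@param`, `@option`, `@raise`, `@return`) are YARD's own.

```ruby
# Builds the settings used to connect to a server.
# (Hypothetical method, for illustration only.)
#
# @param host [String] the hostname to connect to
# @param opts [Hash] optional connection settings
# @option opts [Integer] :port (80) the TCP port
# @option opts [Numeric] :timeout (5) seconds to wait before giving up
# @raise [ArgumentError] if host is empty
# @return [Hash] the resolved settings, with defaults filled in
def connection_settings(host, opts = {})
  raise ArgumentError, "host must not be empty" if host.to_s.empty?

  # Defaults live next to the documentation that describes them.
  { host: host, port: 80, timeout: 5 }.merge(opts)
end

puts connection_settings("example.com", port: 8080).inspect
```

Listing the keys with `@option` puts the accepted options and their defaults right beside the method, so a reader does not have to go looking for a private defaults parser, and the `@raise` tag states the exception behavior explicitly rather than leaving it to guesswork.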
And so this makes your documentation consistent, because YARD knows exactly where to find this information and it can pull it out consistently into your HTML. And that matters. You'll notice there's also type information there. That also helps with consistency and with correctness, which is this: YARD, through the extensibility part, lets you write tools to get at that information that we just added. And so we can actually test all of the stuff that we just wrote. When we were talking about that Rails vulnerability before, what they could have done was add a return nil tag, or return nil or false, and actually written a test for that and tested their documentation against it.

DataMapper is actually doing a really good job of this. Their API documentation is being really well written. And Dan Kubb, who maintains DataMapper, actually wrote a tool called Yardstick, which is sort of a lint-style thing for your documentation, making sure you have the right parameters, the right return tags, examples for your public API methods, et cetera. And you can extend that too to have your own rules.

Coherency is mostly up to you. It basically comes down to how well you describe your method, of course. But YARD does try to make this easier for you, by rendering the templates in a very easy-to-read format. And so here are some examples. Hopefully, yeah, you can see that. On the left, you see the class list, and it's hierarchical. You actually have the namespaces split up, so you can go inside the YARD namespace to find out where everything else is. There's also a search there at the top, so you can search through classes, methods, et cetera. On the right, at the top, you see how the parameters, the yield tags, return tags, and stuff like that are formatted in a consistent fashion, so it's a lot easier to read.
You can also have references to other classes, so there's a lot more information that you can throw into a method, and it'll be in a very easy-to-read fashion. And of course, at the bottom, you see the full inheritance tree for each class, unlike RDoc, which just shows the parent class, which is not always as useful.

And of course, on yardoc.org, there are live docs. Just like rdoc.info, if you've ever used it, yardoc.org has live docs. And in these live docs, you can actually have live inline comments. If you've ever used PHP.net, PHP.net has this functionality, arguably one of the best features of PHP. And yeah, there's no reason we shouldn't borrow at least their good ideas.

And so extensibility. You can extend YARD, and the way we did the correctness and the consistency stuff is through extensions and plugins. There are three plugins that I recommend looking at: YARD RSpec, YARD Sinatra, and YARD Pygments. The first one throws your RSpec specifications alongside your method docs. We'll see that in a sec. YARD Sinatra throws your routes into your docs, so you can actually document DSLs with YARD. It has a parser API. And YARD Pygments, if you've ever used Pygments, allows you to do syntax highlighting for other languages.

And so this is what YARD RSpec looks like. Under the Returns tag, you can see the specifications for that method, and you can actually go into the code for each specification alongside your docs, which is useful if you're doing proper RSpec. You can actually have self-documenting code in some sense. The next one is YARD Sinatra, which shows your GET routes. As you see, this is actually the yardoc.org site, and so you can document each of those routes, show them properly, and figure out what your API does.
And if you have a REST API for some other service, you can actually publish your docs like this, maybe change up the template to make it custom, but you can show your users how to use your service like this.

So what's the conclusion on all this? The conclusion is we need to work at better documentation. Like I said before, it has to be part of the development process, and we need to take it more seriously. I guess the reason people don't do it is because it's hard, but with YARD and with other tools and with plugins, it can be easier. And so I guess what I'm getting at is, if you write or maintain a library or framework and you didn't raise your hand when I asked before, hopefully you'll change your mind. Maybe you got some ideas from this and will start thinking about documenting your code in a more serious manner. If you don't write a library or framework and you have some time to give back to the community, the community needs better documentation for the standard library and core. So if you're interested, I highly recommend you come up and talk to me, because I'm interested too in writing better documentation for the standard library.

And this is the YARD information: you can install it with gem install yard, the site is yardoc.org, and you can host your docs, just like on rdoc.info, at yardoc.org/docs. I guess I'm a little short here because I thought I was gonna go longer, but that's about it. Any questions?

Yeah, you on the left there. Yeah. Okay, we'll talk later.

All right, next, you on the other side of the aisle there. You need to write a parser for it, but theoretically you can. Oh yeah, yeah, it's parsing Ruby. There's actually a Ruby parser. There's also a C parser to handle the C extensions, but that's not really a parser, it's just regex stuff actually. A lot of it's borrowed from RDoc. I wanna improve it to make a better C parser so that people can extend that too.

Yes, you?
So right now there's no explicit way to do that. However, it's very easy to write code for it, because all this information is really just stored inside, like, a Marshal dump. You can actually just load up that Marshal dump and add that information in. It would be a little hacky, but you can do that and save out the database, and then when you generate the docs, it'll actually pick that up. So you can write a nice wrapper around that kind of thing if you wanted to.

Yeah, you? Okay, so there's actually a way to do that. In RDoc you can do this with a .document file inside the directory. YARD supports the .document file, but it also has a .yardopts file, so you can specify options to pass into YARD when it runs on code inside that directory. So yeah, it has to be implemented. Yeah, that's kind of unfortunate. It has to be done like that. Maybe there could be a way to run YARD on a website while passing in parameters that you would control. Yeah, that's true, yeah, yeah.

Yes. So I actually gave a couple of talks on YARD back in Montreal, where I'm from, and there's a little example I wrote in like 15 lines. If you go to my GitHub, github.com/lsegal, same as my Twitter username, lsegal, there's a repository called YARD Examples, I think, and it actually has a way to run specs on your doc examples. So you can grab all the example tags, or even just any embedded code block, and run specs on that, assuming you follow, like, the code, hash sign, comment on a line kind of syntax. You can actually do it the same way Python does.

Any other, yeah? Probably. Okay, well, you can write a custom template. It wouldn't be too complicated. Basically, if you just wanted to get all the methods of that class, you would basically just do, so actually maybe I have some time for a demo real quick. So that would be the docstring.
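The idea of running specs on doc examples, mentioned in the answer above, can be sketched roughly as follows. This is a stand-alone illustration using only Ruby's standard library; it is not YARD's API and not the code from the speaker's repository. The `# =>` convention, the `DOCSTRING` text, the `add` method, and the helper names `doc_examples` and `failing_examples` are all assumptions made for the example.

```ruby
# A docstring whose example lines follow an assumed "code # => result" style.
DOCSTRING = <<~DOC
  Adds two numbers.

  @example
    add(1, 2) # => 3
    add(-1, 1) # => 0
DOC

# The (hypothetical) method that the docstring above describes.
def add(a, b)
  a + b
end

# Returns [code, expected] pairs for every line containing "# =>".
def doc_examples(docstring)
  pairs = docstring.each_line.map do |line|
    code, expected = line.split("# =>", 2)
    [code.strip, expected.strip] if expected
  end
  pairs.compact
end

# Evaluates each example and returns the pairs whose actual result differs
# from the documented one. Uses eval, so only run it on docs you trust.
def failing_examples(docstring, context = TOPLEVEL_BINDING)
  doc_examples(docstring).reject do |code, expected|
    context.eval(code) == context.eval(expected)
  end
end

failures = failing_examples(DOCSTRING)
puts failures.empty? ? "all doc examples pass" : "failing: #{failures.inspect}"
```

With a real tool, the docstrings would come from the parsed documentation rather than from a constant, but the check is the same: if the documented result and the actual result drift apart, the example shows up in the failure list.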
And so you can actually get at all the methods that are included and inherited. And so you could write your own template to basically, hold on. So you could basically do that, and you could customize your template to have that information for all your classes, and then that would be that method plus all the modules that are mixed in. So yeah, you could write the template to do that, or if you wanted to, you could have your own hosted live docs do that as well. If you wanted to build out a special Rails doc site, maybe the Rails guys would be interested in that.

Any other questions? Yeah, it's a basic Marshal format. Right now it's actually sort of a set of Marshal files stored inside a directory. Each class has its own file, for optimization purposes. But yeah, it's basically just a Marshal dump. I was looking at ways you could theoretically write adapters for, like, SQL, so you could port all that stuff out to SQL. If you didn't wanna do that, you could also just do it after the fact: grab the Marshal dump, read out all the stuff, and put it into SQL afterwards. But I was toying around with the idea of having different back ends like SQL and stuff like that. It makes the design of the models a little bit more complex.

Any other questions? Nope? Oh, there's one over there. So actually I did play around with auto-linking inside the code in view source. I guess the real answer here would be that maybe the code should be documented better, but I won't get into that. But you can in some cases auto-link some methods. You can definitely auto-link constants, which I actually have a branch for. Methods are a little harder because of dynamic dispatch and the dynamic nature of Ruby. But in some cases you could sort of get that.
The only reason it's not in YARD right now is because it makes syntax highlighting a lot slower, and that's something people generally don't want. But yeah.

Yeah, okay, so thanks all.