Welcome to Cascadia Ruby, and welcome to my talk, where I will tell you in six parts why I love boring technology. First things first: hi, I'm John. Like you said, I work at New Relic, I'm a senior software engineer, and a crusty old Unix curmudgeon. And back when I used to be a sysadmin, we had a saying that standard is better than better. There's a lot of reasons for that. It's stable. Everybody already knows how it works. It's really easy to hire for the standard technologies. In fact, you probably already have. If you hire a bunch of Ruby programmers, you kind of get a bunch of Rails programmers for free. It's easy to look up problems when you run into them, but you're not going to run into as many, because it's stable. It's also going to be the default assumption of every other tool you ever use. If you get a new spam filter, and you're using Postfix as your mail server, you can count on that spam filter working with Postfix, because it's the standard. And that's really the bottom line: it just works. You go to the standard technology, you can pretty much drop it in, and it's going to sit there being boring, being stable, being productive, and being awesome forever. On the other hand, it's easy to assume that a new technology is going to solve all of your problems, but usually that stuff is so alluring because you don't know what the downsides are yet. You're going to find them out, but there's a few you can count on. First off, nobody knows how it works yet, and I don't just mean under the hood. I'm talking about, like, basic APIs and usage. Speaking of those APIs, they can change right out from underneath you. We've already run into that problem at New Relic. We're using Docker, and we've built a big infrastructure around it, and right in the middle of doing that, the APIs just changed, and we had to rebuild a bunch of stuff. There's not going to be standard usage patterns, and I'm talking about things like: what do you name stuff?
What directories do they go in? What are the methods called? You're going to have to get everybody to agree on this, and that can be really challenging. You're not going to have tooling for the new stuff. For example, if you go with HTTP, you get curl, you get a million client libs, you can point your browser at the endpoints. If you go with something like Thrift, you don't get any of that. You've got to roll it yourself. From a security perspective, it's much more of a wild card. All the low-hanging fruit is still there in terms of vulnerabilities, and it's probably had a lot fewer security-savvy eyeballs looking at the code than a more established technology would, which is really just a subset of the fact that it's going to be buggier. It's going to be weirder. And then, maybe worst of all, there's MongoDB. Now, I don't want to be unfair. I think Mongo actually is good for a very specific set of problems, but those problems were not well understood when it first came out, and a lot of people just got swept up in the excitement of this fancy new thing, and they just jumped on that bandwagon. Down the road, they found out: oh, gosh, our data actually is fundamentally relational. I have to do these joins across these different types of data, and they don't nest well at all sometimes. Having to make that transition just killed a lot of projects. They just couldn't do it. Even the ones that did successfully make the switch to just a standard relational database paid an enormous cost to do so. That said, I think there are some times where it makes sense to use something newer, and the most important consideration there is understanding that it depends on the context. There's a whole spectrum of acceptable risk that you might be willing to take on in different situations. On one end, you've got personal side projects. Go crazy. Use everything. Use all the fancy new stuff you can possibly find. You're going to have a lot of fun.
You're going to learn all kinds of new ideas, approaches, things you can take back to your day job. When you're doing this, I think you should push it farther than you think you're going to really need to. Make sure it actually solves the problems you think it does, and think about how you'd test it if you were using it in the real world. Bear in mind, as you're doing this, that you're not just going to be an advocate for this technology. You're going to be a teacher to the rest of your organization. So you want to prepare yourself for that. Closer to the middle of the spectrum, you've got things like internal tooling. Something like an employee directory, which, by the way, if you don't have one, they're awesome. If you have more than 10 employees, I highly recommend an internal directory of those employees. I love the one we have at New Relic. And these are great for just trying something out: see if it deploys easily onto the servers you're using at work, whatever else you're doing with it. I would explicitly make it a trial. These sorts of tools are usually pretty simple. Consider just rewriting them in whatever new thing you want to try out every time it comes along. That's not always a great idea, because sometimes they do get kind of big. And also bear in mind, you are inflicting this, whatever it is, on all of your co-workers, so don't go too nuts. And then of course, at the far conservative end, you have customer-facing production code. And in my opinion, if somebody is paying you to build a system that's going to be stable and maintainable by other engineers, maybe after you're gone, you need to have a very good reason to deviate from a standard technology choice. You have to have an actual specific problem you have to solve. And "being a better fit"? That's never a good enough excuse. If somebody says this to me, I will challenge them on it.
If you are tempted to say it, I encourage you to think a little bit harder about what it is, precisely, that you mean by that. What is the actual problem you're trying to solve? If you do, you're going to come up with a much more compelling argument for why you need this new technology. And if you can't, maybe you shouldn't be using it after all. So what are some situations where I think it is warranted to use something new and exciting in production? The basic one is when there's just no boring solution to whatever it is that you've got to do. At my last job, we were analyzing network data for patterns that looked malicious. And one of our clients was Comcast. As you might imagine, that's a lot of network data. So we built a Postgres-based solution first, and we just could not keep up. It was just so much data coming in. So we tried out Hadoop. And we paid the price for that. It was a lot of work. We spent man-months of time building infrastructure, figuring out how Hadoop works, getting it all wired up. We ended up building our own language on top of this thing. But at the end of the day, it let us do something that otherwise we just would not have been able to. So for us, it was worth it that time. Other times, the new thing is just so great that maybe it's worth using, even though there will be downsides. Best example of this for me is Rails. I know it's your grandpa's framework today. It's kind of old and boring. But there was a time it was the hot new thing. And I tried it out on some side projects. I was one of the earlier adopters, not as early as some. But still, I found that I could build out new products so much faster that it was worth the costs. So, say we've decided that you're going to use something new. What are some ways to limit the damage? The biggest one, I would say, is don't rewrite the whole world at once. You want to roll it out slowly. This is really good advice whether it's new or not.
Try to do it incrementally, and have a rollback plan in place for each step along the way. See if you can get everybody using this new technology in the same way. You don't want to explore the entire possible space of problems that might arise. Provide some guidance to funnel them all into the same patterns of usage. Try to follow any emerging community standards, if there are any for whatever this is. And consider writing some tools, maybe some client libraries, to help them out, just to make it easier. One more thing: don't use the new thing in weird ways. I have made this mistake. I once replaced some roll-up models in a database I was using with some Solr indices. It seemed like a good idea at the time. It went really badly. I can show you the scars. Bottom line: if you are using a new technology in a way that the creators didn't expect and nobody else is doing, you are going to have a bad time. You are multiplying all the downsides by 10. And so, just tread lightly on this, or better yet, just don't do it. You don't want to roll out more than one exciting new technology at a time. If you already know the quirks and problems of your other technology, it's going to be a lot easier to narrow down the problems in your new one. Also, exciting tech tends to do better in a boring context than an exciting one, because all the assumptions that it makes about how everything else works around it are more likely to hold true. So, let's talk about some examples. We're going to walk through some specific technology decisions for an imaginary application, which I will describe to you in a moment. Some of these we're going to zip through pretty fast. Others we'll take a little more time on, exploring in more detail. But before we get started, a quick warning: these are my opinions. I didn't just make them up out of thin air. They're based on long and, in some cases, very hard experience, but reasonable and intelligent people could disagree, and in many cases they have. So, that said, let's talk about our new startup.
We're building the next-generation multi-homed cloud-based potato storage solution, taterbase.com. I didn't actually come up with that pun, and it's also not a real website, but it is serious business. This is your new job, this is your startup, so we're going to treat this as customer-facing production code, important stuff, and we'll evaluate the decisions accordingly. Let's get started. Before we can do anything else, of course, we have to decide what language we're going to use. Now, I'm going to go through this part pretty fast. I think we all have a good sense for what the mature, correct language for a web application is. No, it's Ruby. And also, Ruby is plenty boring at this point. It's been around for almost 20 years. It's a very mature technology. There are other good options that you could go with. They're all basically fine, but if you don't want to write a web app in Java, I'm not going to blame you for it. There are situations where you want to use something else. The most obvious one: if it's running in the browser, you only get one choice. CoffeeScript is kind of an alternative, but it's almost the same thing. There's other stuff that you could use. These are kind of the new, exciting ones. Me, I don't think I would yet. Go is probably the closest of these. I know there are some people using it for real stuff. I would wait another year or two. So, on to databases. We've got our language, we can write code, but we've got to store our potato data. And there's an old saying about databases: just use Postgres. You could also use MySQL, or, if you enjoy lighting large piles of money on fire, you could go Oracle. The main point, though, is that you want a standard relational database 99% of the time. Every once in a while you do need something different. We already discussed one of those scenarios. You've got lots of data. You can't keep up with it.
It fits well into MapReduce, so maybe it's some sort of aggregation function that can be parallelized easily. Statistical analysis of some sort. Maybe you're looking for malicious behavior on networks. Hadoop is a good choice there. On the other hand, a different scenario would be that your read operations are too slow. And a brief aside here: by too slow, you need an actual reason why they're too slow. An actual problem you're solving. Though, to be fair, "the pages could load faster" on a customer-facing commercial website? That's a totally good reason. Amazon has done a bunch of studies on exactly how many thousands of dollars it costs them for every tenth of a second their web pages load slower. And it's a high number. So that's totally legit. If it's your internal employee directory, maybe not so important. But, okay. For some value of too slow, your read operations are too slow. You don't have an enormous amount of data. Probably not much more than your RAM. And you're not doing any sort of fancy joins or anything. It's maybe a few hashes, arrays, whatever. Nothing that's going to span tables. This is a good option for, like, caching. By which I've probably given away that I'm talking about Redis. By the way, if you're using Resque for your background jobs, you already have Redis, so you don't have to worry about the extra overhead of rolling out a whole different component to your servers or anything like that. All right. Last database scenario: maybe your write operations are too slow. Again, for some value of too slow that means an actual problem is occurring. You've just got this giant fire hose of data pointed at your data store. And I mean, like, big, big amounts of data. Like, multiple data centers of data. And if you have multiple data centers, you can obviously afford a full-time database hacker. Cassandra pretty much was custom-built for that scenario by Facebook. It's something you could try out.
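Rewinding to that Redis caching scenario for a second: the read path is only a few lines. This is a sketch, not taterbase.com's actual code; the PotatoCache name and record shape are made up, and the in-memory FakeRedis stub just mimics the redis gem's get/setex interface so you can see the pattern without a server running.

```ruby
require "json"

# Cache-aside reads: check the cache first, fall back to the database,
# and cache the result with a TTL so hot potatoes stay in memory.
class PotatoCache
  TTL_SECONDS = 300

  def initialize(store, db)
    @store = store # a Redis-like object responding to get/setex
    @db = db       # anything responding to find(id)
  end

  def fetch(id)
    key = "potato:#{id}"
    cached = @store.get(key)
    return JSON.parse(cached) if cached # cache hit: skip the database

    record = @db.find(id)               # cache miss: hit the database...
    @store.setex(key, TTL_SECONDS, JSON.generate(record))
    record                              # ...and remember it for next time
  end
end

# Tiny in-memory stand-in for Redis (ignores the TTL; illustration only).
class FakeRedis
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def setex(key, _ttl, value)
    @data[key] = value
  end
end
```

In production you'd hand it Redis.new instead of the stub; the pattern is the same. Anyway, back to Cassandra.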
I might have been exaggerating a little bit by saying you must have a full-time Cassandra hacker to use it. But to be fair, most of the companies I know that are successfully using it do have at least one. But back to our startup. We have a language. We have a database. That's enough to start writing some code. But where are we going to put it once we've got it? In other words, what platform are we going to deploy it to? And I have to admit that there are a lot of good reasons not to go with the way that we used to do it back when dinosaurs roamed the earth, because we used to just build our own servers. But you are going to need a sysadmin for that. So it's true that it is more versatile, and a heck of a lot cheaper in the long run than any other option, unless you count the cost of a good sysadmin. Those folks are not cheap. But you probably aren't all going to want to build your own servers. So when are you going to want to use something else? Probably most of the time. If you don't have a sysadmin, you don't have enormous performance needs yet. We know, you're building the next Twitter, it's going to be big. But it's not yet. And maybe you like awesome stuff. I do. I would go with Heroku in that case. It's easy to use. It's easy to migrate off of later. It's basically just totally awesome for a small shop that's just getting started. In fact, a lot of people would say that Heroku is the boring solution these days, and I would be hard-pressed to argue with them. There's other ways you could go, several of which are displayed behind me on this large screen, but they're all going to take a sysadmin. So just ask your sysadmin. They will have opinions about which one you should use. All right. taterbase.com. It's up. It's running. We've written it. We've deployed it. It's looking awesome. Adding a lot of new features. This sucker is growing really fast. In fact, so fast, it's starting to get a little hairy. And we don't want hairy potatoes.
So let's extract some services. Before we can do that, we have to decide how our services are going to talk to each other. And I'm speaking specifically of, like, service-to-service communication here. And I've got to warn you, I have a definite opinion on how you should do this. The boring option, the best option, the right option, dare I say, is Rails-style RESTful JSON over HTTP. For so many reasons. First off, there is so much stuff out there that you get for free by going with this option. Like I said, you get curl, you get a million client libs, you can point your browser at endpoints, play with them, see how they respond. On the server side, Rails pretty much gives it to you for free. And even if you're not using Rails, you can use something like Grape or some other API-focused library that makes all this really super easy. HTTP itself has a lot of good stuff that you just get for free. Caching? Built right into the protocol. There's a ton of load balancers available. You get a solid encryption framework in SSL. If you're exposing your APIs to third parties, and I think you should, they're all going to know how to speak this already. You're not going to need to write the client libraries. Now, you might want to anyway. It's not a bad idea, but you can defer that. You can get to it down the road if you want, because they'll know how to consume JSON. Speaking of which, if you have any kind of fancy front-end JavaScript framework at all, you're pretty much going to have to have JSON HTTP backends. Side note concerning those fancy front-end JavaScript frameworks: if you're using Angular, I would recommend that you modify Angular to use the Rails-style REST conventions, because Angular's are pretty wonky, and it's not hard to do. On the other hand, if you're using Ember, I don't know why this doesn't work out of the box. I wish it did, but it doesn't quite. I would tweak the Rails side to match Ember's.
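Just to show how little machinery those Rails-style conventions actually need: here's a client sketch using nothing but the Ruby standard library. The JsonApi name is made up, and taterbase.com's potatoes resource is hypothetical; the point is the conventional paths you get for free.

```ruby
require "json"
require "net/http"
require "uri"

# Builds requests following the Rails resource conventions:
#   GET  /potatoes     -> index
#   GET  /potatoes/42  -> show
#   POST /potatoes     -> create (JSON body)
class JsonApi
  def initialize(base)
    @base = URI(base)
  end

  def index(resource)
    Net::HTTP::Get.new(path_for(resource))
  end

  def show(resource, id)
    Net::HTTP::Get.new(path_for(resource, id))
  end

  def create(resource, attrs)
    req = Net::HTTP::Post.new(path_for(resource))
    req["Content-Type"] = "application/json"
    req.body = JSON.generate(attrs)
    req
  end

  private

  # "potatoes" -> /potatoes ; ("potatoes", 42) -> /potatoes/42
  def path_for(resource, id = nil)
    URI.join(@base, "/#{resource}#{id ? "/#{id}" : ""}")
  end
end
```

To actually fire one off, it's just Net::HTTP.start("taterbase.com", 443, use_ssl: true) { |http| http.request(api.index("potatoes")) }, and every HTTP tool in existence (curl, your browser, any load balancer) already understands the result.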
Ember's conventions are basically pretty reasonable, and it's just a lot easier to go that way than the other. So when would we want to use something other than JSON HTTP? There are very few cases, in my opinion, but the one that is sometimes legitimate is that you just have got to have something going faster. Before you do this, though, make sure that that is actually your bottleneck. Like, get rid of Active Record completely first. Maybe not as much of an issue as it used to be, but just consider pulling data straight out of SQL queries and generating your JSON from that. You can even trick your database into emitting JSON, but that does get a little hairy. You're basically going to shave a few milliseconds off of this by switching protocols, and a not inconsiderable amount of bandwidth, but it's still one of the very last things I'd optimize. Like, maybe you've already got it down to 8, 9, 10 milliseconds and you just got to have 5, 6, 7. So in that situation, maybe protocol buffers would be a reasonable way to go. They're used by Google. They've been around for a long time. There's a lot of great documentation, and libraries for basically anything you would want to use them with. When you do this, though, I wouldn't replace all of your endpoints with protocol buffers. I would leave most of them JSON HTTP. Think of it this way. On the one end, you've got your JavaScript endpoints that have to be JSON HTTP. On the other hand, you've got these high-performance endpoints, whatever they are, that have to be protocol buffers or something. And then in the middle, there's all of this gray area that could go either way. I strongly recommend you take all that gray area and go the JSON route with it, for all the reasons that I just explained, and only switch to protocol buffers for the endpoints that absolutely need it. So what's another scenario where you might need something else? Maybe five milliseconds is too much milliseconds.
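For what it's worth, here's roughly what that tradeoff looks like on paper: a protocol buffers schema for a hypothetical potato payload. The message names and fields are made up for illustration, not any real taterbase.com API.

```proto
syntax = "proto3";

// Hypothetical wire format for the high-performance potato endpoints.
message Potato {
  uint64 id = 1;
  string variety = 2;
  uint32 mass_grams = 3;
}

message PotatoBatch {
  repeated Potato potatoes = 1;
}
```

You run protoc over that to generate encoders and decoders for every language involved, which is compact and fast on the wire, but it's also exactly the extra tooling step that JSON never asks of you.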
And also, you've removed the portion of your brain responsible for fear. All right: ZeroMQ. It's an extremely lightweight message queuing system with basically no features at all, which is what lets it go incredibly, insanely fast. Again, I would only use this for the specific endpoints that have to go that fast. You hook this end of the fire hose to that end of the fire hose. You leave everything else alone. And keep that fire hose really simple, because you're only shaving a couple milliseconds off here; if you do much of anything at all, you're blowing your performance edge anyway. So keep it really, really simple. Or just use JSON HTTP. I would just do that. Okay. There's other stuff that you're likely to run into, but honestly, I don't think I'd really recommend any of them. Like Thrift. It's getting a little more popular these days, came out of Facebook. As of today, I would say it is not mature enough for prime time. The documentation is really inadequate, and the tooling basically just doesn't exist. It's got exceptions built into the protocol, which is kind of nice, but in my opinion, they don't make up for the shortcomings. Another technology that you are likely to encounter at some point in your career is SOAP, which I include mainly as a counterpoint to my overall theme: even though it is incredibly mature, it's been around for a long time, and it's used in a million different contexts, it's terrible. It's so painful to use. I really cannot recommend enthusiastically enough that you avoid it. So maturity isn't always everything. There are other factors you need to consider as well. But okay, we know we want JSON. We know HTTP requests are what we're making, but where do we send those requests? In other words, what's our service discovery mechanism? And my friends, I have good news for you, because the boring option for service discovery is awesome. It's DNS. That's right, DNS. Simplest possible case: you have some servers that have some services.
You point your servers at your services: DNS records, IP address, done. But that's not all. It gets better. Maybe you have more complicated requirements than this. Let's talk about SRV records. The SRV, by the way, stands for service. There's a DNS record type specifically for services. This is what they look like, and I'm not going to go into all the details here, but the interesting parts for us are that you can specify the type of service, which is just any old string you want to stick in there, and you can say what port it's served from. And you can have multiple of these, by the way. If you've got a bunch of servers running a bunch of different things, or you've just got, like, 12 different Unicorn instances on the same server running on different ports, it totally handles that, no problem. And if you've got those 12 different things, maybe you want to do some simple load balancing. No problem. You can specify a priority and a weight for each one of those services. And this is enough to take you into the dozens, maybe even hundreds of services range. It'll take you a long way. And if you talk to me for any length of time, you will know that I really like DNS. It's great. DNS is basically a distributed, high-performance key-value database optimized for read operations. You have TXT records. That's text, and by text, I mean anything you want. You can just put it in there. You get this incredible distributed database. You want key-value pairs? Redis has nothing on this, except for write operations. DNS is terrible for write operations. But it's fantastic for lots of other things. Also, the protocol itself can do a lot of very simple query-response services, and they will just drift right on through most firewalls. This is kind of good, kind of bad. Network administrators have some legitimate reasons for not wanting you to do that, so that they can actually see what kind of traffic is on their network.
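Back to those SRV records for one second, because the priority-and-weight selection is simple enough to sketch in a few lines of Ruby. The _potato-api service name is made up; the selection rule (lowest priority wins, ties within a priority broken by weighted random choice) is the one RFC 2782 describes.

```ruby
# Two SRV records as they might appear in a zone file:
#   _potato-api._tcp.taterbase.com. 300 IN SRV 10 60 8080 app1.taterbase.com.
#   _potato-api._tcp.taterbase.com. 300 IN SRV 10 40 8080 app2.taterbase.com.
SrvRecord = Struct.new(:priority, :weight, :port, :target)

# Take the lowest-priority group, then pick within it by weight.
def pick_srv(records, rng: Random.new)
  best = records.group_by(&:priority).min_by(&:first).last
  total = best.sum(&:weight)
  return best.sample(random: rng) if total.zero? # all weights zero: uniform
  roll = rng.rand(total)
  best.find { |r| (roll -= r.weight) < 0 }
end
```

In real code the records would come from the resolver, e.g. Resolv::DNS.new.getresources("_potato-api._tcp.taterbase.com", Resolv::DNS::Resource::IN::SRV). Anyway, like I said: DNS drifting through firewalls is kind of good and kind of bad.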
But dang it, if it isn't really convenient for developers sometimes. So that's an option, especially if you don't completely control the environment you're putting this client into: it's going into some other company's network, or maybe you're building an appliance that goes in people's homes, whatever. It can be great for that sort of stuff. So when do you not want to use DNS? There are some times, and the main one is just that you outgrow it. You just get huge. You are Facebook. You are Twitter. Something like that. And you just have hundreds of servers. And when you have that many servers, sometimes they go down. A hard drive fails, whatever. And you don't want to route any requests to that server until you can get someone over there to fix it. And also, with virtualization these days, it's really easy to just spin up servers, spin them down, change them to some completely different type of server, or whatever. And while DNS is totally awesome, I do have to admit that when you have a lot of records and they're changing frequently, it does get really cumbersome to keep up with all of that. So ZooKeeper will handle that for you. That's the Apache project for service discovery. It's a big, clunky beast of a service discovery mechanism. Again, you're probably going to have a ZooKeeper guy or gal, and it's going to be a lot of work. But again, sometimes you just got to have something that runs at that scale. So, taterbase.com is basically working, it's basically going, but I have to admit that I have glossed over kind of a lot of decisions that you probably had to make before you get to this point. Like, maybe, some gems. You probably have some of those. So for all the myriad decisions involving gems, I'm not actually going to discuss a particular gem, because I think we all know what gems are. We've got the basic idea. Instead, I'm going to discuss how you would evaluate the maturity of a particular one.
And the questions I would ask are things like: how long has it been around? Usually the longer it's been around, the better. The more chances it's had to mature. How widely used is it? This is not always going to be obvious, but you can have a basic idea. OmniAuth is very widely used. That gem I wrote last week? Probably not so much yet. You can look at GitHub and see how long issues stay open. That'll give you a good idea of how responsive the developers are, and if there are any really bad issues in there, that's not a good sign. When was the last commit? That gives you an idea of how active the development is. But I want to be fair here: some things don't really need active development. Again, OmniAuth is this multi-purpose auth library. There's a lot of new stuff coming out all the time that they need to keep up with. If you're writing a gem to build IRC bots, that's a pretty stable technology. It may not matter so much if it hasn't been updated in a while. At the end of the day, it's a judgment call. I mean, everything I'm talking about right now is a judgment call, but I think gems are in an especially gray area. You may be wondering, how do I find out all of this stuff? To which I respond: Ruby Toolbox. It is the way. It is awesome. It'll give you basically all that data. Not the GitHub stuff, but you have GitHub for that. So, to wrap this up: I would love to actually continue this conversation with anybody, anytime. Just come up and talk to me about boring technology. I really do love it. But there's one thing I really want you to take away from this talk, and that is that non-standard technology choices are enormously expensive. They have a huge upfront expenditure of time and effort, and then they put a multiplier on everything else that you ever do with that system. Experience and familiarity, some homegrown infrastructure, those things can all help, but they are not free either.
You are literally going to pay for this technology decision for the lifetime of your system. I think some people hear this talk and feel like maybe I'm being a little bit of a wet blanket. But I don't see it that way. I don't want to hold you back. I want to set you free. I want your crazy, awesome, incredibly brilliant but insane ideas to actually get out there into the world, and I want them to work when they get there. I want them to be making the world a better place, changing people's lives. That's why I get up in the morning. I want to make stuff. I want to build things. I want the code to just flow out of my fingers out into the world and start making a difference. And it can only do that if it is built on the most stable, well-understood, solid foundation possible. Your crazy ideas will work better if you build them on things that are boring. So if you're going to make a non-standard technology choice, make damn sure it's worth it. Or better yet, just go with the standard. Embrace boring and be awesome. Thank you.