All right, here we go. This is my talk on going from development to DevOps. Hi, my name is Justin Burris. I'm typically a Ruby developer, and I recently had the opportunity to do about three and a half months of DevOps work. I work for Neo; we're a product development company here in Singapore. The synthesis of this is that I'd never done DevOps work before, and I was curious about this new, exciting area to work in. By DevOps I basically mean taking your application and deploying it to production, and all the things associated with that. So this talk is all the things I wish I had known when I first started doing it, and maybe some things that can help you now if you're doing DevOps work.

But first, why? While starting on this DevOps work, I encountered a lot of tools, a lot of methodologies, a lot of different thoughts about how to do this kind of work. It's really important to keep in mind what your goal is, and throughout the talk I hope to communicate that.

I want to start with automation. Automation in this context means making it so that the whole pipeline that takes your code from your Git repository or your local machines and deploys it to production can be fully automated. This is really important because the less human interaction you have in deploying your code to production, the fewer chances you have to make a typo or fat-finger something.

So my first piece of advice is to automate early. Some people might take issue with this, because you're going to have to change your code as the situation develops; assumptions you make early on will not hold up later in the process. But if you automate early and your whole process is built in an automated manner, all you have to do is tweak the code that does your deployments.

In the vein of automating early, you want to make sure you have easy-to-use checkpoints. What's nice about checkpoints is that if something does change in your deployment process, you can just change that single checkpoint; you don't have to change your whole automation flow. Checkpoints also make it a lot easier to debug. Whenever you have a problem, you can think, okay, this belongs to this section and this piece, so I'm just going to flip a flag and debug that section. And believe me, you're going to want to debug.

In that vein, you want to make your whole system modular if you can: you want pluggable pieces. Our particular setup was pretty complex. We have at least two data centers that our code operates in. One of them is behind a firewall and we don't have direct access to it; we have to do a pretty interesting FTP transfer of the files to it. As a result, we have a lot of code that needs to run in multiple places and do similar things, and making it modular made that really easy.

Also, you want to make sure any long-running jobs you have are very visible. This is mostly a development suggestion.
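As a rough illustration of that visibility point, here is a minimal Ruby sketch, not the actual tooling from the talk, assuming Open3 from the standard library and a made-up playbook name: a small wrapper that streams output as the job runs and makes a noise the moment it fails.

```ruby
#!/usr/bin/env ruby
# Hypothetical wrapper for long-running provisioning commands: stream output
# as it happens, and make failures loud so you notice them right away.
require "open3"

def run_loudly(*cmd)
  status = nil
  Open3.popen2e(*cmd) do |_stdin, out_err, wait_thr|
    out_err.each_line { |line| print line }  # show progress live
    status = wait_thr.value
  end
  return if status.success?

  3.times { print "\a" }                     # terminal bell
  warn "!!! FAILED (exit #{status.exitstatus}): #{cmd.join(' ')}"
  exit status.exitstatus
end

# Example: any slow command works here; the playbook name is made up.
run_loudly("ansible-playbook", "-i", "inventory/staging", "site.yml")
```

You could swap the terminal bell for a desktop notification or a chat ping; the point is only that failure is loud and immediate.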
So whatever you're doing in this kind of DevOps-y work, whether you're building up a server from scratch or running a new Ansible script or a Chef deploy or something like that, it can sometimes take a long time. And if you have an error while that thing is running, you want to know as soon as you can, so that you can either fix it and start it again or dive in and debug. I suggest pings on your computer, anything you can do to make it visible: flash the tab, make a noise, whatever.

This is part of the automation flow: you want to make it as painless as possible to build from scratch. And by scratch, I mean from an ISO that you download from a website all the way up to your production box. How you can do this I'll describe later, in the process section. But being able to go from scratch to production makes it really easy to change any part of your system at any time.

Finally, create an assembly line. Whenever you build your automation flow, you want distinct pieces and parts as your system builds. For example, you'll have the phase that takes the ISO and converts it into some sort of image. Then you take that image and deploy it to a box. Then you take that box and install all the core system utilities you need on it. When you think about it as an assembly-line process, it becomes very easy to parallelize. (I'll come back to this with a rough sketch in a minute.)

So, process. It's really important that you give your developers what they need. This was an interesting concept for me, because I'd always thought of DevOps as just building the servers that get the code to run in production. But you're actually working both to get code into production and to satisfy the needs of all the people who are building the app. It's really important to think about them while you're building out your various automation flows and so on.

To that end, you want to make sure any diagrams you have of your ecosystem exist at multiple levels of fidelity. What I mean is that you should have one diagram that has your whole data center, or data centers, on it, with all the connections and all the information. But when you have that much information on one graph, it can be really hard to narrow down and see just one small piece of it. So you want to take that diagram and build a pyramid out of it, where the bottom level has the most fidelity and the top level shows individual small sections. That makes it really easy to see at a glance what's going on in your system.

Write down everything. And I really do mean everything. One of the things I started to notice is that with these kinds of complex systems and interactions, a lot of this stuff isn't documented anywhere; it can live inside someone's head. They might have a very strong vision for the way all these servers will be built and deployed, but if it's not written down, it's not that useful, because you have to constantly go back to them. If you have everything written down somewhere accessible, it's really easy for the whole dev team to get into it and see what's going on.

So, Bjorn talked a little bit about pairing, and we did some pairing with our DevOps work. I would suggest that you pair often on anything that's a configuration or a discovery task.
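Coming back to the assembly-line idea for a second, here is a rough sketch of what those distinct phases might look like as Rake tasks. This is just an illustration under my own assumptions; the template, inventory, and script names are made up, not the actual setup we used.

```ruby
# Hypothetical Rakefile: each phase of the line is its own task, so you can
# rerun or debug any single stage without rebuilding everything before it.
desc "Turn a downloaded ISO into a reusable machine image (via Packer)"
task :image do
  sh "packer build base-image.json"
end

desc "Boot a fresh VM from that image"
task :vm => :image do
  sh "ruby provision/spin_up_vm.rb --image output/base-image.ova"
end

desc "Lay down the core system utilities on the new box (via Ansible)"
task :base => :vm do
  sh "ansible-playbook -i inventory/new_box base.yml"
end

desc "Run the whole assembly line from scratch"
task :from_scratch => :base
```

Because each stage only depends on the one before it, separate branches of the line, say images for two different data centers, can run in parallel.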
So if you're trying to learn the best way to deploy a particular software package, or to build an RPM from scratch, things like that, I'd really suggest you pair with another developer. That said, you should pair sparingly on things that are just tasks you automate. If you're building out your Ansible playbooks or your Chef recipes or whatever other crazy things you do, you don't really need to pair on that stuff, because once you figure it out, you just do it.

Borrowing from Python: explicit is way better than implicit with this stuff. Whenever you have implicit assumptions hidden somewhere in some piece of your code, it can be really hard to understand the whole flow at that point in time, because there are so many complex systems. For example, in our setup we had about 36 distinct servers that all ran different pieces of software in different data centers. For me to get my head into one of those servers, if things weren't explicit, would mean holding a whole bunch of extra context in addition to the context of just that one server.

Also, validate what you build. This is really important. What I mean is that when you're building out these servers and all the code that lives on them, just due to the nature of it, you're going to be making assumptions about what the dev team needs in terms of software packages, and what your business owners need in terms of the connections these servers make with each other. Any time you have an assumption, you should definitely validate it. An example: we thought that our packages wouldn't have access to some specific software utilities, so we did a lot of work we didn't need to do to work around that restriction. If we had just communicated with the business owner about that assumption, we would have been able to knock it out and not even worry about it.

So, upkeep. Given the nature of all this automation and process and all the software pieces you're gluing together, there's going to be inevitable upkeep. I really recommend that you have a utility belt, pulling straight from Batman on this. A utility belt is basically just the set of software tools you use to build out your environment. What's great about thinking in terms of a utility belt is that you try to use those tools to the best of their ability, and you only start to branch out and write code when you have to. This is a really important point, because the more code you write, the more chances you have to mess things up.

Our utility belt consisted of our VMware setup and a tool called rbvmomi, which lets you script VMware environments with Ruby. I really love this tool; it's awesome. It's a little obtuse in the way you configure your boxes, but essentially what it enabled us to do is pass a few flags and spin up our whole environment with a single command at the command line. Ansible is the tool we used to actually install and configure software on the boxes. Packer is the tool we used to take an ISO and turn it into an image that we could deploy with VMware. And for a little while we were using Trello to manage all the tasks we needed to do. We've since moved to another tracking system, but just have a tracking system. What do you use now?
We use Mingle now, which is a ThoughtWorks product.

So, you want to make the pain-versus-development trade-off every day. What I mean is that when you're working on this stuff, you're going to have a lot to do, just tons of work to get everything spun up. So you only want to start developing code when the manual way becomes painful. But you need to make that decision every single day. If you find that you're doing something more than, say, once a week, or whatever threshold works for you, you want to automate it.

This may sound really obvious to people who have done DevOps before, but do not assume your app is going to copy and paste into production. It won't. There are going to be all sorts of strange differences between the Mac boxes your developers work on and the Red Hat Enterprise Linux box you're deploying to. And even between your different servers, if you run different operating systems, there are all sorts of crazy things you can run into. Even differences as subtle as Red Hat 6.1 versus Red Hat 6.4: they bundle different packages, sometimes at different patch levels, and there's a lot you need to consider. That's why you want to get things onto a real box as soon as you can.

Whenever you do this, you're inevitably going to face problems in your debugging cycle. My suggestion is that whenever you have a problem, you profile backwards. What I mean is that you don't want to immediately dive into the deepest level of your stack; you don't want to just start debugging down on the actual box where the commands are running. You want to take a higher-level look, in my opinion, and make sure everything is working from the top down. Typically the actual system utilities work just fine, and you probably have some assumption somewhere in your code that's not actually true in reality.

Inevitably, when you write all this automation code and glue these scripts together, and your assumptions get invalidated, there are going to be pieces of your code that are just no longer relevant. Much like a garden, you need to weed these things out. If you don't, you're going to end up with a lot of extra baggage that you have to maintain and think about whenever you dive in.

So, TL;DR. If you take only one thing from each of these points: if you're going to automate, create an assembly line; if you're going to keep a process, validate what you build; and finally, always keep tending that garden. Thanks. Any questions?

Ah, good question. I should have included a slide about that. This is maybe a controversial opinion, but when you're doing Ansible deploys, I kind of believe that Ansible does your testing for you. If you think about Ansible as maybe Cucumber-ish, you say, okay, box, I want you to have this piece of software with this configuration and this setting, and then Ansible goes and does it for you. There are tools that let you test things like Ansible, but we found them really redundant and they didn't help very much. The Ruby code that we used is glue. You could definitely test that, but once again, we didn't find it very useful, because if it didn't work, it would just blow up and you would know immediately, since the whole automation flow runs through it.
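To make that last point concrete, here is the general shape that kind of glue code takes; a hypothetical sketch, not our actual scripts, with made-up playbook and template names. Each step shells out and raises on failure, so the flow stops exactly where things broke.

```ruby
# Hypothetical "fail loudly" glue: no test suite around it, but any step that
# exits non-zero raises immediately and halts the whole automation flow.
def step(description, command)
  puts "==> #{description}"
  system(command) or raise "Step failed: #{description} (#{command})"
end

step "Build the base image",   "packer build base-image.json"
step "Provision the base box", "ansible-playbook -i inventory/staging base.yml"
step "Deploy the application", "ansible-playbook -i inventory/staging app.yml"
```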
So once you actually have all of your servers up and running, there are a lot of really good tools that let you monitor them, such as Nagios, Splunk, and so on. If you're not getting information in your Nagios or Splunk logs, or whatever you decide to use, Graphite as well, then you probably have a problem somewhere in your chain. It's not so much going out and testing; it's more something that continually checks that everything is still working. Because the thing is, it's not just your code that lives on these boxes; it's also the app that's running. Essentially, this whole system is set up to make sure an app runs, so as long as you have something that checks that the app is running, you can be reasonably assured that things are okay, from what I've seen.

What about making your code more agnostic, in a sense, with a framework like Hadoop or Mesos to manage all these clustered servers' requirements and make that transparent for your developers? Yeah, so for the most part, our developers don't have to care about what hardware or software stack they run on. They live pretty much on OS X all the time, and they don't have to know or even care that we're on Red Hat 6.1 or 6.4 or anything like that. Wouldn't such a framework ease you through the whole deployment cycle, where you just don't care, and all you need to think about is resource allocation, since that's already scripted in a framework like that? I don't know. Look it up, look it up. Yeah, I'd have to look that up.

A lot of what I'm hearing, and a lot of what I've run into in my own situation, is parity between developer systems, test, deployment, and production, keeping everything the same, the whole pipeline if you will. You've got some tools that you keep on top of VMware, which we actually just finished getting rid of a couple of weeks ago, finally. We're now on Docker: dev runs Docker, tests run against Docker containers, because a container is a container is a container. Have you guys taken a look at that for your stuff? Yeah, so when this project first started out there was a lot of discussion about whether or not we should use something like Docker. There were two main reasons why we decided against it for this project. The first was that, at the time, Docker wasn't in a stable state and wasn't really production ready. Additionally, due to the nature of this particular project, software like Docker wouldn't have been possible: we don't really control one of the data centers in terms of the physical machines or the VM system that runs on it. A lot of this setup was done with that constraint in mind. If you're deploying to something like AWS, you have a lot more freedom in what you can do. The other data center we do control, but we still wouldn't really be able to use a system like that, unfortunately. That said, I do feel like that kind of development style is something I would have loved to have had, because a lot of the issues I ran into, where the system wouldn't work in production due to some assumption a developer had made about the underlying utilities built into the system, Docker would have solved. Thank you, Justin. Oh, thank you. Okay, so let's do the...