This talk is going to be more of an experience report of our time in the cloud. Introduction: this is me, if you've seen my picture on the board, hello. My name is Mike Moore, and the handle I go by is blowmage. I work at a company called Bloomfire, and we have a really great team and we're doing lots of fun things. We really enjoy our time at Bloomfire. And also we're hiring, so if you're looking for a job, come talk to me. One of the things I'm kind of known for is I run Mountain West Ruby Conference. So by show of hands, how many people are planning on going to Mountain West in two weeks? Excellent, thank you. I'm super excited for Mountain West this year. But if you look at our logo — every time I look at it, it's just so literal, and I just don't like it. So hopefully we're going to change that, and hopefully we'll have the new one by the time the conference comes around. But I'm not here to talk about Mountain West, I'm here to talk about the cloud.

So let's talk about the cloud. Everybody loves the cloud. I like this cloud because it's got a butt in the lower right-hand side. The cloud offers an awful lot. It makes it super easy to stand up a new application. It makes it easy to provision new hardware and to grow and scale. But I know the majority of you are like, yeah, it's fine for work, but the best thing about the cloud is that's where Rainbow Dash lives, right? And of course, Rainbow Dash is the coolest out of all the ponies, am I right? And how much cooler? 20% cooler. All right, so show of hands, are there any bronies in the audience? One. Everybody else, you might want to go get a beer, because it's not going to get any better for you.

All right, so this is a really simple diagram showing a systems architecture. On top we have the web server, on the bottom we have the database server.
I'm going to color-code those so they're easier to pick out: blue will be web, red will be database. One of the things we end up doing a lot is putting up multiple web servers in case one falls over, and when we do that we have to put a load balancer in front of them — that will be green. This is the very typical approach to building an application and putting it out there for people to hit. And then when you actually want to scale, you'll have maybe a master/slave database, and eventually multiple web application servers to accommodate the load. And we put them all on the cloud.

When I started at Bloomfire, this is actually where we were. We were on the cloud and it worked really, really well for us. We had a user out there on the internet; he'd request something from our load balancer, the load balancer would hand it off to a web server, the web server would query the database, get the data, package it up, and send it back to the user. And that worked really well and we had no problems with it. Almost.

Well, I'm actually a big believer in PDD. Is anybody else here a believer in PDD? You guys know what I'm talking about? It's pain-driven development, right? I don't like solving problems until I actually have a pain that needs solving. So the first pain I had with this was dealing with customer domains. We sell a service, and our customers want to put that service on their own domain. And that works really easily, right? You have a user, they ask for something on bloomfire.com and we respond to it. But then we have a corporate user and they ask for something on their customer's domain name. And that should work in most cases. But you guys know when this doesn't work? It doesn't work when you have to use HTTPS, right?
The reason that doesn't work is because everything happens over SSL, and you have to send the certificate before the client ever says what it's actually requesting. Before you get the domain name being asked for, you have to present the certificate signed for that domain name. Which means we needed a separate IP per certificate. And that was our problem. Users coming in for bloomfire go to one load balancer; users coming in on a customer domain go to another load balancer. Unfortunately, our cloud provider did not have a way to put multiple IPs on a single instance, and the cheapest dedicated-IP product they offered was like $80 a month. Which is pretty steep when we're only charging $100 a month for the service. You don't want that much of your incoming revenue going just to the load balancer. So that was a pain, and we needed to solve it, because we weren't going to stay in business very long if we kept doing that.

The second pain we had was customer white lists. Again, corporate customers want us to hit services behind their firewall, usually for authentication, but also for other types of integration. So we have a corporate user, and in the upper right-hand corner you can see the corporate network with the evil smokestacks. The user makes a request into our cloud, our product offering. It goes to the web server, and we have a database server, but then we also have to go out and talk to some external resource that's behind the corporate firewall. And we can do that because they white-listed our IP and they let us in. But as we get more and more popular, all of a sudden we need to add capacity, so we add a new web server. And because all of our servers are in the cloud, it gets its own public IP, and when a new request comes in and gets handed off to that new web server, we hit the database, then we try to hit their corporate resource.
We hit their white list, and it just fails spectacularly — we can't get in until that white list is updated. So every time we add a new server, we have to update every customer's white list. And as we keep adding more and more customers, and keep scaling and adding more and more servers, that was going to become a real problem. So that was our second pain.

And this third one isn't really a pain, it's mostly a concern — just something that I don't like. So yeah, it's a mental pain more than anything. We're all out there on the cloud and we can have requests come into our load balancer, and that's just fine. But you could actually hit our web server directly, if you knew the IP and the port the load balancer was using. And even worse, you could hit our database. There's just something about that that I'm not very comfortable with.

So we wanted to make a change. We wanted to grow beyond what we were currently doing on the cloud, but to do that we really had to master five things. The first was networking. And I'm not going to belabor this point, because I think this is a self-selecting group and we probably all know this stuff — you guys are probably smarter about networks than I am, because I'm totally faking it. But the first thing we had to understand was subnets. What we really want is small, logical subnets that nothing can access unless it goes through a router we've specifically given permissions to. We also had to make sure we understood how DNS works if we were going to change our cloud setup. So if you're looking to make this type of change, make sure you understand those two things. The next thing is security. We had been able to just let our cloud provider take care of access control for us.
But if we were going to make a change, then we were going to have to control that access ourselves, and it was going to be our problem. So we needed to really make sure we covered all of our bases. The third thing we needed to understand was availability. To me that means you need to understand how to monitor your application, where your backups are, disaster recovery, things like that. Again, you can rely on cloud providers to give you that, but if you want to make this type of change, you need to understand these things yourself. The fourth is automation. To me that means you're using Chef or Puppet, hopefully deploying with Capistrano or Vlad. Another option is to really understand what platform you're on. If you're on a Debian-based platform, look at Debian packages, right? I've worked at an organization where we packaged up all of our Rails apps as Debian packages and used that to deploy instead of Capistrano. That's a really interesting approach and there are a lot of advantages to it. So you need to own your own automation. And the fifth thing we'll talk about a little later.

So my first thought was: we have these two pain points of customer domains and white lists, and what I want to do is just go to a data center, because data centers are safe. Data centers are safe because you have a private network that you can configure the way you want, and what I really wanted was a DMZ. I wanted one subnet that was accessible from the outside, and that subnet could talk to all of the other subnets, but you couldn't go all the way through and talk directly to the application or database servers. But in order to have your own data center you usually have to buy some sort of load-balancing router, and those are expensive. And not just kind of expensive — they're tens or hundreds of thousands of dollars expensive. It's crazy, it's a lot of money.
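The DMZ rule just described — outside traffic can reach the public subnet, the public subnet can reach the private ones, but nothing passes all the way through — can be modeled as a toy reachability table. This is only an illustration of the routing policy, not real network code, and the subnet names are made up:

```ruby
# Toy model of the DMZ policy: the internet can reach the public subnet,
# the public subnet can reach the private subnets, but the internet can
# never reach a private subnet directly. Subnet names are hypothetical.
REACHABLE = {
  "internet"    => ["public"],
  "public"      => ["private-web", "private-db"],
  "private-web" => ["private-db"],
}.freeze

# Returns true only if the router policy allows traffic from one
# subnet directly into another.
def allowed?(from, to)
  REACHABLE.fetch(from, []).include?(to)
end

allowed?("internet", "public")      # the load balancer is reachable
allowed?("internet", "private-db")  # the database is not
```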
And so when we looked at that option, we looked at it from our position on the cloud and thought: we're just a startup, we don't have that kind of money. What we wanted to do was take all of the strengths of the data center and all of the strengths of the cloud, and together we would rule. So this is what we did. We decided that we were going to stay in the cloud, but we moved all of our infrastructure onto a Virtual Private Cloud on Amazon's AWS service. And I like to think of a VPC kind of like this: a little piece of cloud that you cut out for yourself, and it's safe and it's warm and it embraces you.

So this is what it looks like. We're on the cloud, but in the cloud we've got our own private network, right? And within it we've got our different subnets, so we can control access to each subnet with the router, and that's provided by Amazon. Within our public subnet we have our load balancer — this is actually an Elastic Load Balancer, a service that's managed by Amazon. And this was really great for us, because that's one less thing that we have to manage, so again we gain some of the benefits of being on the cloud: having services you can rely on without actually having to run them yourselves. Everything else is just a typical EC2 server, right? If you've done anything on Amazon or Engine Yard, this is the same as everything else, only it's within a private network, a private cloud. So here's our network: when a request comes in, it goes to the load balancer, the load balancer hands it off to the web server, the web server goes back to the database server — it works just like everything did before. But we solved our domain problem by having a second load balancer, which is on a customer domain. It's a new ELB instance, it's managed by Amazon, and the great thing about this is that the customer's certificate lives on the load balancer.
So we don't have to manage any of the certificates on our own web servers, and we don't have to worry about deploying new certificates to all of our web servers. We put the certificate directly on the load balancer and move on. And that's a terrific solution. A request comes in on the customer domain, hits our web server, then the database server, just like everything else, just like before. So that was our first problem solved — multiple domains just meant multiple load balancers, and these ELB instances are like 18 bucks a month, so they're much more cost effective.

The second problem we had was adding new servers to our environment. And again, the Virtual Private Cloud comes to our aid here: another server that runs within the cloud is a NAT server. All of our external connections coming from our back-end subnets go through the NAT, and the NAT has a single public IP. So all outbound connections come from that one IP. If we add capacity and route traffic across new servers, they go through that same NAT and show up as the same IP. So we can scale internally within our private cloud as much as we want, and we look exactly the same on the outside to all of our customers.

So again, the five things. The first was networking, and we solved that by having our own subnets — which also solved that concern-slash-pain we had. The thing I really want to highlight is that all of the IPs on each of these subnets are private IPs; we have a range of private IPs within each subnet. And that really helps for security, because requests coming in from the outside can't access your servers — it's a completely private network, so those requests just bounce off. You can't port-scan our database server anymore. That kind of opens up the question of what we do when we have to access these servers to maintain them — to see what's going on, access log files.
The only thing we have in our public subnet that isn't managed for us is a bastion server. The bastion allows SSH connections to come in, and from there we SSH on, with forwarded keys, to the backend servers. So a request comes into the bastion, and from the bastion we can access our web servers as well as our database servers.

The third thing we needed to make sure we understood was availability. To us that means Upstart and Monit — we're monitoring all the processes to make sure they stay up, and when they don't, we get notified. The other thing it means is backups, so we take EBS snapshots of our database server and some of our other servers, but we also do dumps to S3, so if something really, really goes bad we can always go back to a backup and recover. And then we also replicate outside of S3, in case Amazon ever gets completely knocked out. The other things that have really helped for availability are New Relic and Pingdom, and New Relic has saved our bacon more than anything. If there was one thing I would recommend upgrading to, it would be New Relic, because it's been indispensable for us. And the last availability thing I want to highlight is a really powerful combination of tools: cron and rake. We've had so much success just using cron and rake. A simple rake task goes and checks whether, say, the job queue is getting out of whack; we automate that check on a five-minute window, and if something goes bad we send an email. That's a really powerful way to solve the immediate problem without standing up additional infrastructure to do really heavy monitoring.

So the fourth thing is automation, and this is where we spent a lot of our time. We decided on Chef, and we love Chef. Chef's fantastic. And as an aside, this is the guy I work with — his name is Jason Roelofs. His Twitter handle is jameskilton.
I don't know why his name and his Twitter handle are so completely different, but he's my hero, and he has done a really tremendous amount of work on automation on our new infrastructure. So much so that he actually wrote his own AWS client called SimpleAWS. You can get it on GitHub, and it rocks my socks off. A lot of the other libraries didn't support absolutely everything that Amazon supports, and quite literally we were rolling features to production that Amazon had released a week beforehand. The only way we could do that was by having our own library. So that's what he did, and you should really check it out. It's not a lot of code, but it's a lot of functionality.

We've automated almost as much as we seriously can. To connect to any of our servers, we have a simple script that takes care of all of that bastion SSH key forwarding. We have multiple private clouds — one for production, one for staging, and a couple of others — so we have a script where we tell it which environment we want to go to, and then which type of server: utility server, web server, database server. If we need to add new capacity, it's as simple as a rake task: servers:new:web, servers:new:db, however you want to do it. That hooks the server completely into not just the virtual private cloud, but also makes all of the load balancers aware of it. And creating a new load balancer is automated as well. This is something I think is super cool, and I haven't seen any other library do it: we run a rake task to create the new cert files, we replace those cert files with the actual certificate we get from our customer, we upload that certificate to Amazon so it's aware of it, and then we create a new load balancer for that domain and certificate. And it's as simple as that. And it's pretty freaking awesome.
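A sketch of what that cert-and-load-balancer rake task might look like — the task name, file names, and flow here are hypothetical, and the actual AWS calls (which in our case went through SimpleAWS) are left as comments:

```ruby
require "rake"

# Builds the openssl command that creates a fresh private key and CSR
# for a customer domain. Paths and domain are supplied by the caller.
def csr_command(domain, key_path, csr_path)
  "openssl req -new -newkey rsa:2048 -nodes " \
    "-keyout #{key_path} -out #{csr_path} -subj '/CN=#{domain}'"
end

desc "Create cert files and a dedicated ELB for a customer domain"
task :new_customer_lb, [:domain] do |_t, args|
  domain = args[:domain]
  # Step 1: generate the key and CSR the customer will get signed.
  sh csr_command(domain, "#{domain}.key", "#{domain}.csr")
  # From here the real task would:
  #   2. swap in the signed certificate the customer sends back,
  #   3. upload it to AWS so ELB can reference it,
  #   4. create a new ELB listening on 443 that serves that certificate.
  # (Those steps go through the AWS API, e.g. via SimpleAWS or aws-sdk.)
end
```

Invoked as something like `rake new_customer_lb[customer-domain.com]`, which matches the "one rake task per customer load balancer" workflow described above.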
The fifth thing here is really courage. You have to have individual courage to try something new, but you also need institutional courage — a company that's going to allow you to try something like this. And I'm really grateful that I work at a company like Bloomfire that would take a chance on a two-man team making a radical change to our infrastructure. And we solved all of these pain points without actually making any changes to our application; we solved them completely on the infrastructure side. And with that, hopefully you can take control of your own cloud and feel like Rainbow Dash, who seriously kicks ass.

So this is an experience report, so we have lots of time for questions, and I'd love to take any.

Why did I pick Rainbow Dash? I was expecting you guys to be drunk, you know — it would have been a lot funnier.

You're doing this with IPv4; how do we plan on expanding to IPv6, given the NAT? Yeah — we have no plans for IPv6 right now. That's one of the things I would love for Amazon to support. We also don't have multicast within our private network, and if we had that, it would solve a lot of other issues with some pieces of our infrastructure, both ones we have now and ones we know are coming later. So it's not all roses, but we're much further ahead than we were before, and we're also spending less money than we were spending before. Anyone? Another question?

Are our automation tools just out-of-the-box chef-solo? Yeah — chef-solo and rake. That's really all of it. So there's not a whole lot to open source, other than giving you the keys to the kingdom, and we're not going to do that. So no, we're not using the paid Chef product. It's all chef-solo.
So we have a git repository that has all of our Chef recipes in it, and the first thing we do when we stand up a new server is install an SSH key so it can talk to GitHub. We pull the repo down, and from then on we just git pull and re-apply all those recipes. So it's all chef-solo, and we share configuration between machines. When we set up a new server, we tag it within Amazon's infrastructure, so we know this server is a web server, this server is a database server, and so on. We use those tags to label each server, and we have all of the configuration mapping within chef-solo.

In our chef-solo recipes, are we doing something to simulate the data bags in hosted Chef? No. We've got a very, very simple approach there. We didn't want to pay for Chef, we just wanted to use solo, so we haven't really looked at what paid Chef would offer us — but so far we don't really have any pain using chef-solo.

A concrete question: how do we list the users that we create — meaning the user accounts on the box? We've only got one user account on each box, and it's the one we stand up with. In Amazon, when you stand up a new server, there's a script you can run, and we use that script to create the running user that runs nginx and everything else. When we SSH in, we use that same user account. It's not a full root user, but we can sudo from it. Thanks.

What are we using New Relic for?
Yeah, so New Relic has server monitoring, so you can point New Relic at any server running in your environment, even if it's not running your application. You can monitor all the processes on your database server, or the CPU load on any server in your environment, with New Relic. We're using it because it's a little bit easier than stitching together a bunch of other tools at the moment, and I think New Relic has more than paid for itself — we really rely on it, and we've been very happy with it so far. The only gap we have with New Relic is error reporting, so we're still using Airbrake for that, and we haven't felt any pain with it.

The cost difference between running it ourselves and running it in the cloud? Well, we were on Engine Yard before, and even with upgrading New Relic and running the cloud and all that, it's actually less money — something like 10% less — doing it ourselves, even paying for the additional services. Thank you, everybody.