I have about 10 slides, and the reason I have slides at all is so that we can talk about why I'm demonstrating what I'm going to do. This is a talk called The Next Evolution of Infrastructure Management. Hands up, who manages their infrastructure via code? A few people. Excellent. Excellent. The rest of you — I'm really hoping that by the end of the session you'll actually want to start managing infrastructure via code. I have to say I'm delighted with the last talk. The keynote by Timothy was fantastic, because everything that Timothy said about continuous delivery for product, I can reiterate for continuous delivery for infrastructure. So it was perfect timing. We're actually going to talk here about how we can make our infrastructure better and more dependable. Remember that phrase Timothy used? Staging environments are bad. The reason they're bad is because environments drift. A quick word about me: I'm an infrastructure engineer. I'm a reformed ASP.NET developer. I do not like product development anymore. I fell out of love with it because I find it boring, because of those code reviews where you get people telling you that your libraries aren't very good. I'm a DevOps extremist. I love everything to do with DevOps, and I go to many, many conferences to speak about these types of practices: DevOps, infrastructure management. So, where we started: in the beginning we had what are now known as snowflake servers. We had these long-running infrastructure servers. Usually Windows — I have to admit it's usually a Windows problem — where the server was created once in 1999 and we continually patch that server, we continually make manual changes to that server, and we can probably never decommission the application. I'm not saying that's a bad thing. It happens, and it's very, very common. But unfortunately it makes our lives a little bit more difficult than they need to be. Has anyone in this room still got Windows 2003 in production? No hands.
Some of us actually do, because we have these long-running servers that we've never been able to decommission. By the way, "snowflake server" is a term coined by Martin Fowler in 2012. The idea is that each server is individually unique: so many manual changes have been made to that server that you cannot guarantee it's exactly the same as any other one. So "snowflake" is a fantastic way of putting it. Usually, snowflake servers follow a run book. I've been a member of a company that had a Word document as a run book. It was 56 pages of a Word document. The screenshots were from 2003; the servers were 2008. And after that 56-page document there were another 26 pages for security. This was painful — and I mean painful — because when we were configuring our servers we would nod and say, oh well, we can skip step 17 and we can come back and do that later. Seriously. And instead of going and updating the run book we would go, it's fine, we know that. But the next person to come in and configure a server doesn't do that, and they don't know that step 17 actually has a dependency on step 42. You know how your mum always had a nice set of china plates? They were her pride and joy, and the china plates only came out when someone important came around for dinner. And if we chipped a china plate or dropped a china plate, what would happen is that we would try to superglue it back together. This is exactly what we do with our long-running servers. We manually handcraft them, and we really try to glue everything back together and get it working every time we need to. That's how precious they are to us; that's how we operate. So that was in the beginning. This is what we did. And then we moved to an era of configuration management. Hands up, who uses a configuration management tool? Chef, Puppet, Ansible, Salt, CFEngine, Windows SCCM — there are loads and loads of configuration management tools. A big step forward for our industry. A huge step forward for the industry. And if you are doing it, nice work.
Really good work — but keep driving it forward. For those who don't use configuration management tools: these are tools that allow you to describe the state that a server needs to be in, and then the configuration management tool will keep it at that level. So for those naughty sysadmins with prying hands and sticky fingers, who SSH and RDP into the boxes and change things around, the configuration management tool will report that drift and change it back. Big win. This allowed us to develop what are called phoenix servers. Because we know the state the machine should be in, and it's automated, we can destroy a machine and bring it back up and know that machine will be in exactly the same state. This is another term that was coined in 2012. Martin Fowler works for ThoughtWorks — he's the Chief Scientist of ThoughtWorks. This one was actually coined by someone called Kornelis Sietsma, again within ThoughtWorks, but it was made public in Martin Fowler's blog post about phoenix servers in 2012. It helps us move away from the situation where, when we test applications, they work fine, but when we push them to production, they're broken. Timothy talked about this in the keynote. He talked about the fact that our environments also rot, that we usually have built broken dependencies or busted packages. So when we push code there, we can never really guarantee that it's going to work. But what do we do? We shrug our shoulders and go: we know production is fine, we'll just push it to production anyway. This is a really vicious, vicious cycle, and parts of the industry have moved past it. By the way, this is not a Docker talk. If you're expecting me to include Docker today: I'm not against Docker, I think Docker is very good, but I'm not using it for what I'm going to do. So we moved on to an era of what's called immutable infrastructure. This was a thought process written up in 2013 by Chad Fowler. Chad Fowler worked for 6Wunderkinder, which was recently acquired by Microsoft.
They actually built Wunderlist. And what Chad said is that we should be able to trash our servers and burn our code — immutable infrastructure. There's a related thought experiment you may have read about: it would be a great test of a company's ability to automate its recovery if someone could walk into its data center with a chainsaw and a baseball bat, start beating and cutting everything off, and everything would come back online. Because if you're not in a situation where things are automated and come back up, and you have a very big disaster, you're going to be in some trouble. EC2 nodes are a good way to think about immutable infrastructure: we spin them up, supposedly for a very short amount of time, to do a job, and then we throw them away. The cloud in general works that way. This whole piece, for me, came from the fact that in September 2015 I wrote a blog post about the realization that I was treating my cloud infrastructure the same way I used to treat my physical machines. I used to be really old-school: servers were given friendly names, and I referred to servers by name. Then I realized it doesn't matter, because those servers are born, on purpose, with a very short lifespan. So the thought process of immutable infrastructure was a much better way. Now — any functional programmers in here? Some functional programmers said, well, that's not immutable infrastructure; immutability doesn't work that way. And a lot of people went: actually, they've got a really good point. So last year it was decided it would be called disposable infrastructure instead, because that makes much better sense.
A great way to think about disposable infrastructure: if your long-running servers are nice china plates, disposable infrastructure is a big stack of paper plates. We know they're built in the same way, we know they're roughly identical, and we know we can use them once. And because they're so flimsy, we don't want to use them more than once — we just throw them away. So, demo time. That's pretty much all the slides — less than 10 minutes. I am going to use the HashiCorp tools. I am not here to sell you on their products; there are lots of products out there that do exactly the same job. CloudFormation is one. SparkleFormation is another. I am going to be using a tool today called Terraform. So how are we going to use Terraform? Terraform is a tool that allows us to specify our infrastructure as code. Ah — the screen has dropped out again. If this happens again, you're just going to have to pretend you can see the screen. And because there's a camera there, clap if you want to make it feel good. That's good. Right, demo. Why use infrastructure-as-code tools? Firstly — as was said in the main keynote this morning — everything should be an artifact that we keep in source control. Our infrastructure is as important an artifact as our builds, our packages, our actual application code. If you cannot see the change log against your production infrastructure, how can you understand when a breaking change was made, and in which environment? Because what we usually do as system administrators in the cloud is log on to the AWS console and manually configure everything. And then what happens is that one person builds up a knowledge base of how you do something. And that person is probably offered a lot of money by Apple or Google or one of the big companies, and they leave. And that just leaves your company stuck, going: what do I do now? That person is gone, and therefore I don't know how to do this anymore.
So, with Terraform, we declare a provider. There are providers for many things — I'll show you later. The providers I'll use today to demonstrate things are AWS and DigitalOcean, and, if I get some time, Azure, just because I'm feeling a bit adventurous. But we've only got about an hour, and we may not even be able to log into the portal in that time. Apologies if anyone works for Microsoft — that's just a dig at the portal. You give it a region — you tell Terraform what region you want to build your infrastructure in — and you give it an access key and a secret key. Those are your IAM credentials. You go to AWS IAM and you can give people access to do specific jobs within an environment, or they can be an administrator type of account. I have these exported in my environment right now so that we can actually see what it's going to do. From there, we're going to create an internet gateway, because your boxes need to be able to talk to the internet. And we're going to declare a VPC, because I like all of my systems to sit inside a VPC so that they can communicate with each other, and it adds that layer of security. I'm going to give the VPC a specific CIDR block, and I'm going to give it some tags just so that you can see it clearly in the AWS console. I'm going to prove to you that there is nothing in the AWS console apart from the default VPC. If that refreshes, you'll see that there is only one VPC in place. Let's layer this demo up. What we can see right now is that there are some files in here: one is the provider, two is the AWS VPC. For the purposes of the demo, I'm giving them numbers to show the order in which I'm going to layer them. In real life — and I'll show you a proper example of how this code should look, and how we use modules in order to reuse it across environments — it's going to look vastly different.
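As a rough sketch — an editor's illustration, not the speaker's exact files; resource names, region, and CIDR are placeholders — the provider and VPC files being described might look something like this:

```hcl
# provider.tf -- credentials are best supplied via the
# AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables,
# as the speaker does, rather than written into the file
provider "aws" {
  region = "us-west-2"
}

# 2_vpc.tf -- a VPC with a CIDR block, tagged so it is
# easy to spot in the AWS console
resource "aws_vpc" "default" {
  cidr_block = "10.0.0.0/16"

  tags {
    Name = "demo-vpc"
  }
}

# the internet gateway references the VPC's generated ID,
# which is how Terraform learns the dependency order
resource "aws_internet_gateway" "default" {
  vpc_id = "${aws_vpc.default.id}"
}
```

Running `terraform plan` against files like these shows the two resources to be created; `terraform apply` then creates them.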
So if I just say terraform plan, it takes the local .tf files, forms a graph of their state, works out that it needs an internet gateway and a VPC, and compares those values against what the state is in AWS. In this case, because there's nothing currently in AWS, Terraform is telling us: I am going to create an internet gateway and I am going to create a VPC. So let's apply it. Terraform apply. What that does is take the plan and actually go off and create it. This is interacting with AWS EC2. If I run a proxy on my system, you can see the calls going through — and if you had the right certificates, you could actually inspect my interactions. And we can see that the VPC has been created and the internet gateway has been created. Now, that's interesting. Because what it's done is create the VPC first and then the internet gateway. But look — in my file, I actually specified the internet gateway first and then the VPC. If you notice, on line two I'm referencing the VPC ID that gets created as part of that resource. So Terraform is building that graph of the interactions between the parts of our system. And we can actually have a look at that. Terraform has a way of doing that: you can graph your infrastructure, and you can get the output of what the graph looks like as it builds it. I wrap everything in Makefiles, just because I like to think of myself as a sysadmin when I'm not really a sysadmin, and sysadmins make Makefiles. So I can just run make graph, and it's going to run the command terraform graph, pipe it through a package on my machine called dot, and save the output for the VPC. And if I open the graph, you can actually see that the internet gateway depends on the VPC, which depends on the provider. As you keep layering your infrastructure in, you can keep graphing it.
You can start to build up the picture of what your infrastructure looks like. This is extremely useful, because if you have a long-running network infrastructure, what's the first thing you usually ask when you join a company? Does anyone have a network topology diagram? And the answer is usually: yeah, but it hasn't been updated in a while. Because we don't like updating things. We don't like documentation. Code is our documentation right here. It's generated work — it means I don't have to do something, and my life is easier. So let's actually add in something else. Now, if I say terraform plan, it will compare what's on my local machine against AWS and it will say: no changes. Everything described is currently built, as it should be. It goes off and refreshes the state, and then it tells us what's changing. So let's pretend that we're a nasty piece of work, and somebody has gone: do you know what, I just need to make a quick fix on something in our infrastructure. So they go onto the AWS console — and you can see, actually, the VPC is right here, which I forgot to show you, sorry about that. And what they do is — let's pretend this is a CIDR block or something they need to change, rather than just a tag in AWS — they go into the console and change it. If I refresh, we should see what it says. Now, the next time I run Terraform, it's actually going to tell me that someone has gone and manually made a change that is not correct. I told you at the start about configuration management tools keeping everything at a known level. Terraform is doing exactly the same. Now, I'm not saying that this supersedes the need for configuration management tools. No, not in the slightest. They can work very much in combination. This is for infrastructure; they can do package management and deployment.
But this specific example is actually about maintaining the state of the infrastructure. And if I say terraform apply again... Any questions? Anyone want to guess? I don't know what day it is — I think I've only slept for about five hours since I left home on Wednesday. The nice thing is, obviously we're at a conference with a lot of people on the same Wi-Fi as I'm on, but it's actually still proving to be a decent speed. So what we can do is say terraform destroy, and it's going to destroy the infrastructure. We're getting into the era now of fast, ephemeral environments. No need for a long-lived staging environment anymore, right? We can just spin up an environment really quickly and test our systems. This has multiple, multiple benefits for your company. One: cost management. You do not need to continually keep long-running infrastructure in place. If you're in the cloud, that is — if you're in physical data centers, that's a different scenario, and we can talk about that a little bit. But if you're in the cloud, you don't need a standing staging environment — call it X, call it whatever you want, it doesn't even have to be the word staging. You can stand it up quickly, test your system, and tear it down. This is extremely useful for QAs. Any QAs in here? This is really good for QAs, because the one problem QAs continually have is that, A, operations are a blocker for their environments — everyone in the company is asking operations for things, and the poor QAs are always the ones left behind — and B, now they can spin up their own environments. Anyway, let's bring in some of those. Number three: public subnets. So, we're inside our VPC. A typical VPC topology looks like three private subnets and three public subnets — or as many as you want; it's entirely up to you.
For me, I like three private subnets. I like to keep all of my instances, all of my databases, all of my infrastructure in private subnets. I only ever put load balancers and things like that in the public subnets. Here I'm using /22 subnets. That's not really a huge amount of IP addresses — a /16 is 65,536 addresses, and a /22, I believe, is somewhere in the region of 1,024. And we have three of them. Then, from there, we're going to create a route table, because each of the subnets needs to be able to route via the route table to the internet gateway. And then we create the route table associations, to say that the subnets are actually connected to the gateway. Really easy. So what's that going to look like? Oh — it's got a little bit more complex. We can straight away see that the provider is obviously the center of our universe, and the VPC is very close behind it. But we can now start to see the interdependencies: the primary, secondary, and tertiary public subnets and the internet gateway are all pretty much at the same level, and they all depend on the VPC. But the route table is required by the route table associations, and the route table associations are created after the subnets. So you can actually see how that works. Oh — and everything at this level right here, see this, all this level? That can all be run in parallel. Terraform is very clever at working out: because these are not interdependent, I can run them in parallel. You can pass in parallelism of 20 or 30, whatever you want. By default, I think it's four. Four things happening at the same time is okay; it means it works at a decent speed. Now we can actually run it. Let's plan it first. And you can see: Plan: 9 to add, 0 to change, 0 to destroy.
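The subnet and routing layer being described might be sketched like this — again an illustration, with placeholder names, CIDRs, and availability zones:

```hcl
# one of the three public subnets; the other two would use
# different CIDRs and availability zones
resource "aws_subnet" "public_primary" {
  vpc_id            = "${aws_vpc.default.id}"
  cidr_block        = "10.0.0.0/22"
  availability_zone = "us-west-2a"

  tags {
    Name = "public-primary"
  }
}

# a route table that sends all outbound traffic (0.0.0.0/0)
# out via the internet gateway
resource "aws_route_table" "public" {
  vpc_id = "${aws_vpc.default.id}"

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = "${aws_internet_gateway.default.id}"
  }
}

# the association connects the subnet to that route table
resource "aws_route_table_association" "public_primary" {
  subnet_id      = "${aws_subnet.public_primary.id}"
  route_table_id = "${aws_route_table.public.id}"
}
```

Because the association references both the subnet and the route table, Terraform creates it last — which is exactly the ordering visible in the graph.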
We have the VPC, subnet, subnet, subnet, a route table association, a route table association, a route table association, a route table, and an internet gateway. Now we can terraform apply them, and out they roll. That will take a few seconds, so let's go and have a look at the next part of the demo. Now, we're going to need an SSH key. Because we do — you are going to need, at some point, to SSH into one of the boxes. Let's say you have a bastion box that you use to jump around your internal network; you can associate an SSH key with that box. Sorry, Windows developers. There are ways of doing this with Windows, and hopefully the new Windows Server with OpenSSH is going to be brilliant, so then everyone can play along. But for right now, we're not rolling out Windows boxes. Now, you'll see that it has rolled out the main resources from before. We can go in and see the VPC is there — here is the VPC — and if we go into the subnets tab, you can see them in there: public subnet, public subnet, public subnet. You can change the tags to whatever you want. Anyway. Now: this code is saying create a key pair with the name agile_india, and for the content of the file that I want to upload, go and do a lookup in the directory ssh/ for agile_india.pub. Here is the ssh directory. There is nothing in it right now; it's an empty directory. Let's see what Terraform does with that. Oh — it warns us straight away, before we even try to change the infrastructure, that something is missing. It is validating what we are trying to do. You can come up here and see it at the back: it is actually validating what we are trying to do before it uploads anything to production. It is telling us that that file does not exist. So let's make it: ssh-keygen, type RSA, called agile_india. And then if we have a look in ssh...
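The key pair resource being described is small; a sketch of it (the resource label is an assumption, the key name and path follow the talk) might be:

```hcl
# upload the local public key as an EC2 key pair;
# terraform validates that ssh/agile_india.pub exists
# before it will plan or apply this resource
resource "aws_key_pair" "agile_india" {
  key_name   = "agile_india"
  public_key = "${file("ssh/agile_india.pub")}"
}
```

The `file()` interpolation is what triggers the up-front validation error when the directory is empty.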
...you will see that it now has the agile_india public key. Let's go and see what Terraform does now. It tells us everything else is good, except now we have a public key to upload. It gives you the exact contents of the public key that is going up to AWS, so you can see it and check it before you actually apply. So let's have a look: terraform apply. That's just the proof that there are no keys — well, no keys in here apart from one that was created last night by someone on the same Wi-Fi as me. And you'll see, when I refresh in a second, that my key arrives. There's a lot of latency — I'm just profiling the speed here, and I'm getting about 1.4 seconds, so it's a little laggy. You'll see that it adds one resource. The apply is complete. If I go and refresh the page, we'll see the key. No problem. This is brilliant. We're building up our infrastructure a layer at a time. Let's go and add the next piece in: security groups. People laugh at my accent when I say "security" — I've already had it confused between Irish and American. But security groups are very useful, in that you can restrict access to a piece of your infrastructure to certain locations and certain CIDR ranges. Locations being the internet, or within your network, or your home computer, or your office. And then other security groups can be layered on to say: inside our network, only these machines can talk to these machines. Because sometimes you don't want to grant every application inside your infrastructure the ability to talk to your database. If somebody runs a destructive action and they don't realize they're running a destructive action, they can cause a problem. So security groups let you lock that down, so that only specific things can apply a specific action to the database.
So I have four security groups. The first one is "node". I apply that security group to every node in my infrastructure, just for this demo. In production I would never do this. Ever. Ever. I'd apply the correct, specific security group. Then I have a NAT box — I'm going to roll out a box that we can SSH into — and I'm actually saying: allow SSH access to that box from anywhere, as long as you have the key. That's the ingress on port 22, CIDR block 0.0.0.0/0. And then allow port 80 and port 443, because the boxes that we roll out later in the private subnets need to be able to talk out over 80 and 443 to download things. Then I have a web box, which we can talk to on port 8080, and lastly we have an ELB, which we're going to use later, again on port 80. No problem; it will work. So if I do a plan, it's going to show me four security groups, and it's going to show me all the ingress and egress rules. You don't have to worry about the numbers in the output — it builds a hash so that it remembers the number and the order of the rules. But we'll just let this apply and then we'll go on and have a look at the next part. Do I want to apply? Yes. I'll just let that apply in the background. So, number six — actually, let's have a look now; we'll come back in a second. Number six: a NAT server. Until December 2015, we had to create our own NAT instances — AWS didn't have a managed way of doing it for us. And what is a NAT server? It's a box that every other system talks through to get access to the outside world. Because sometimes you want to stop malicious traffic coming into your system, and also leaving your system, so you control it with a NAT server. NAT Gateways are a new managed thing from AWS; we'll look at those a little later, in a different part of the demo. But this is the bog-standard, old way of doing NAT boxes. If you're still doing it this way, that's fine.
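The NAT box's security group, as described, might be sketched like this (an illustration — the group name and internal CIDR are assumptions):

```hcl
# NAT box security group: SSH in from anywhere (key still required),
# plus 80/443 from inside the VPC so private boxes can download
# packages through this box
resource "aws_security_group" "nat" {
  name   = "nat"
  vpc_id = "${aws_vpc.default.id}"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/16"]
  }

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/16"]
  }
}
```

The web and ELB groups follow the same pattern, with their ports (8080 and 80) swapped in.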
I'm not saying you have to go and change. This is the way some of my production infrastructure works right now — I haven't needed to migrate, because migrating just introduces a risk that I don't need. You can see the config here. It's going to roll out an instance based off an AMI — you can go and look up the specific AMIs that you want; this one is an EBS-backed, SSD, 64-bit AMI. There is a site from AWS, which I'll show you if you really want, that just lists all the AMIs. It's really boring, but it allows you to go and pick the AMIs you want. We'll talk later about building your own AMIs, because I think that's a key point of how we change our infrastructure going forward. Next, we specify an instance type. AWS has instance types; we're going to go for t2.micro, because I'm really cheap and I get them free as part of the free tier. I don't like paying for things if I don't have to. We're going to give it a subnet ID, because we want to say: take this box and put it in this subnet — I'm giving it the primary subnet, so us-west-2a for me. We're going to give it some security groups so that I can SSH in. We're going to give it an availability zone — again, us-west-2a. We're going to give it that key name, the agile_india key that I created earlier. And we're going to say: associate a public IP address. Then Terraform has this ability to create a connection to that box. So I'll just describe this part right here: Terraform can create a connection to the box, and we say: connect to the box with this user, using the key file that's in the ssh folder — the one we created as part of our setup. And then, when you've created that box and connected to it, run these scripts on it. These are just really simple scripts that set up NAT.
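A sketch of that NAT instance resource — editor's illustration; the AMI ID is a placeholder you'd look up for your region, the SSH user depends on the AMI, and `key_file` is the older Terraform connection attribute (newer versions use `private_key`):

```hcl
# the NAT instance, placed in a subnet with its security group,
# key pair, and a public IP so Terraform can reach it
resource "aws_instance" "nat" {
  ami                         = "ami-XXXXXXXX"  # placeholder: look up an EBS-backed 64-bit AMI
  instance_type               = "t2.micro"
  subnet_id                   = "${aws_subnet.public_primary.id}"
  availability_zone           = "us-west-2a"
  key_name                    = "${aws_key_pair.agile_india.key_name}"
  vpc_security_group_ids      = ["${aws_security_group.nat.id}"]
  associate_public_ip_address = true

  # how Terraform connects to the box once it is up
  connection {
    user     = "ec2-user"        # assumption: depends on the AMI
    key_file = "ssh/agile_india" # the private half of the key pair
  }

  # run the NAT setup scripts over that connection after boot
  provisioner "remote-exec" {
    scripts = ["scripts/setup_nat.sh"]  # illustrative path
  }
}
```

Terraform retries the SSH connection until the box answers, then streams the script output back to the console — which is what the demo shows next.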
It's iptables changes to do IP masquerading, and it enables IP forwarding. So it's going to go through that. And if we go back, we can see that it references the security groups from before. Let's have a look at what that does to our graph. You can see it's got even a little bit more complex now. You can start to see all the interactions between the systems — there's the provider — and you should actually be able to see a little instance right here. Again, terraform plan. All it's going to do at this point — all it should do at this point — is create an instance. And again, while this is rolling out, I'll come back and show you the next part, where we layer in more pieces of the infrastructure. It's just going to add one thing; we'll see it right at the bottom: 1 to add, 0 to change. It gives us that update every time. So let's apply that. Let's hope that works. I'll come back and check it in a second. And then let's go and look at number seven. Number seven: private subnets. Private subnets are slightly different. There are two things different about them. Firstly, we say map_public_ip_on_launch = false. These do not get public IPs. There's no need for them to have public IPs, so turn that off. And secondly, this is the part of the graph that allows us to introduce something called depends_on. These subnets depend on the NAT box being created. The reason they depend on the NAT box being created is that in their route table we say that for the route 0.0.0.0/0 — all inbound and outbound traffic — we should pass it through the ID that gets created for that NAT instance. We want all of those requests to go through the NAT box. So that's the only way they're different. And then, again, we set up the route table associations in the same way. So here we go — now it's trying to connect to the box.
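The two differences being described could be sketched as follows (illustrative names and CIDRs; note the older string-list `depends_on` syntax):

```hcl
# a private subnet: no public IPs, and explicitly created
# after the NAT box exists
resource "aws_subnet" "private_primary" {
  vpc_id                  = "${aws_vpc.default.id}"
  cidr_block              = "10.0.16.0/22"
  availability_zone       = "us-west-2a"
  map_public_ip_on_launch = false

  depends_on = ["aws_instance.nat"]
}

# the private route table sends 0.0.0.0/0 through the NAT
# instance's ID rather than the internet gateway
resource "aws_route_table" "private" {
  vpc_id = "${aws_vpc.default.id}"

  route {
    cidr_block  = "0.0.0.0/0"
    instance_id = "${aws_instance.nat.id}"
  }
}
```

Routing via `instance_id` instead of `gateway_id` is the whole trick: private boxes can reach out to the internet, but only through the NAT box.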
So it's actually brought the box up. It knows the box is up, and it's going to try a few times — Terraform works with retries. I think it'll connect on the fourth try; I don't actually know. I've not tried this from India to Oregon before, so it might be a little bit slow on the network. But it's going through the first, second, third attempt; it's still attempting to connect... there you go. Connected. Right at the bottom you see: connected. And now you're going to see it run the scripts over SSH. Hopefully. It might just be running in the background, actually. So it's connected and run the scripts as we'd expect, and we now have a full NAT instance. And we can go back to the AWS console: we've gone from zero instances to one instance. No problem. Great. We can see that it's running, it's initializing, and it has the correct security group, etc. This is brilliant. So we'll come back and run the public subnets in a second, but let's also introduce three more instances. I told you before that Terraform can be used to do deployment, if you really want it to. We have written a very rudimentary package of a small application. All it does, basically, is render in the browser the hostname and the ID of the machine, echoed back from the web server. And what we're going to do is create the first one. We're going to connect to that box — but look: we've introduced a bastion host. We're going to say connect to that box through the bastion, because it's a box that gets created in a private subnet, so Terraform won't be able to get to it unless it tunnels through the bastion to get there. And then, when you get there: install Ruby — because I like to be a bit of a sadist, and it's going to take like 20 minutes to install Ruby. And then go off to Dropbox — because that's really... I'm using that as my package repository, because I can get to it — and download the deb.
The deb is a very small package for Ruby 2. Install the deb, install a gem, and basically start the server. There are going to be three of those Ruby boxes. And then we're going to output the IPs. We can output pieces of the infrastructure — the artifacts that Terraform created — back to the console. So Terraform is going to output the IPs of those systems back to us, so we can actually see what they are and use them elsewhere. And then, lastly — I'm going to roll this out all at the same time — we're going to create an ELB, because I have to load-balance my applications. You never really run one instance of an application in production; you run multiple instances of an application in production, and you load-balance them. So we're going to roll out an ELB, and we're going to say the instances in that ELB will be private one, private two, private three. Every instance will listen on 8080, and I'll have a health check to make sure that the application is responding on 8080. And then we're going to output the DNS name of that ELB. Notice that I'm actually creating this in a public subnet. And because we like to confuse things, instead of saying internal = true for private and public = true for public, we actually have to say internal = false. That's how it's documented for ELB. And if we make the graph again, we can start to see how this is growing. This is the infrastructure that we've actually built up as part of today. You can see that it can get complex very quickly. We have an ELB which sits right at the top, and then we have the instances, the route table associations, and so on and so forth. We couldn't draw this by hand — there's no way. So let this go; let's go and have a look. What it's doing right now is creating the public subnets — we can see that they're creating — and it's finished creating the route table associations after the subnets.
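The ELB and outputs described here might be sketched like so — an illustration with assumed resource labels; a real config would list all three public subnets:

```hcl
# an internet-facing ELB in the public subnet(s), balancing the
# three private web instances on port 8080
resource "aws_elb" "web" {
  name     = "web-elb"
  internal = false  # counter-intuitively, this is how you say "public"
  subnets  = ["${aws_subnet.public_primary.id}"]

  instances = [
    "${aws_instance.private_1.id}",
    "${aws_instance.private_2.id}",
    "${aws_instance.private_3.id}",
  ]

  listener {
    lb_port           = 80
    lb_protocol       = "http"
    instance_port     = 8080
    instance_protocol = "http"
  }

  health_check {
    target              = "HTTP:8080/"
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 3
    interval            = 30
  }
}

# surface generated values back to the console after apply
output "elb_dns_name" {
  value = "${aws_elb.web.dns_name}"
}
```

After `terraform apply`, the output block prints the ELB's DNS name, which is what you'd hit in the browser to see the load balancing in action.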
And now it should actually start to bring up the instances themselves. You can see that it's instantiated: it says create that instance, and that goes off and creates it in the background. And then we can see that it's trying to do one and two. I don't know if you can see that right here: aws_instance.private1 and aws_instance.private2. It's doing those in parallel, because they have no need to be run one after the other. Traditional configuration management tools do things serially, one step at a time. This will do as many as it needs to at the same time, which is great. And it'll connect to the box at some point, hopefully. Again, it might be a little bit slower. But we can go to the AWS console and see that there are now boxes in there. You can see private instance one, private instance two, private instance three. Now, the smaller the box, unfortunately, the longer it takes to initialize, just because the box doesn't have as much power to do its initial checks. So it's usually a little bit slower to do these types of private instance checks. So it's still trying to connect. It will connect at some point; you'll see connected, connected on three, two, one. Then you'll just see it install onto the box. There you go. It's installing all sorts of rubbish in there. So while that's happening, has anyone got any questions? This is going to take a few minutes to install, so we'll come back in a moment. So the question is: is the retry mechanism only available through Terraform, or via the AWS API? The AWS SDKs themselves have actually got retries built into them, if you use that part of the API. Terraform has its own wrapper code, written in Go, around the APIs, and it takes care of the retries and the status checks for you.
So I didn't have to go into my Terraform code and say retry this until the state of the box is available, or running, or whatever the status is; Terraform knows, and it will continually poll for it. The question is: does it equally work for Azure? Yes. Tentatively. No, it doesn't. I actually wrote the Azure provider, so if it doesn't work, you can blame me. Yes? In your Terraform file, I saw that you defined three different instances. The metadata for all of them is the same, right? So why not just say count = 3? You can do that. We can. What I wanted to do right now is show you the simplest — literally the most simplistic — setup that you can possibly see. For some reason some of the boxes have failed; I'm going to go and look at those. But what we want to do is just show you that if you're a beginner with this infrastructure, you can lay it out this way. What we're going to do is refactor this and actually move it into what are called modules, and use count. So stick with me. I promise you that we'll get there. We will get there. Now, let's have a look at why that didn't work. Failed to fetch. Awesome. Ruby. Amazing. Okay, let's pretend that one worked, and we'll skip on, because I'm rapidly running out of time. So again, the question would usually be: okay, but I have just created all of that infrastructure in us-west-2. What if I wanted to do it in us-east-1? I don't have to copy and paste all that code, do I? The answer is no. You refactor into what are called modules. Let's close that and look at some Terraform modules. So let's start with what a module is. I'll show you some sample Azure stuff too, hopefully, if I get time. Modules are packages of Terraform code. We've got mostly developers in here; I think of a module like a helper method: a way of being able to take some data, manipulate it, and push it back out the other side.
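The count = 3 refactor the questioner suggested might look something like this — a sketch, with the AMI variable and subnet names being my assumptions rather than the demo's code:

```hcl
# Instead of three near-identical resource blocks, one block with count.
resource "aws_instance" "private" {
  count         = 3
  ami           = "${var.ami}"
  instance_type = "t2.micro"
  subnet_id     = "${element(aws_subnet.private.*.id, count.index)}"

  tags {
    Name = "private-${count.index + 1}"
  }
}
```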
So this is actually a module. If you look at the setup of our VPC module, you'll see that it takes a name, a CIDR block, public subnets, private subnets, and availability zones. Those are the variables that we're going to pass into our module. And then you can see interpolation the whole way down through the system. But there is the use of count. We no longer have to say public subnet, public subnet, public subnet, private subnet, private subnet, private subnet. We're saying create subnets using a count of the subnets that you pass in. So if you pass in two, it knows it's going to have to create two. If you pass in three, it knows it's going to have to create three. And so on and so forth. And then for each CIDR block, we can say: take the index into the list of public and private subnet ranges that you passed in, iterate over that list, and when you get zero, one, and two — those being the count — pull out the element at that index. So the interpolation is a little bit complex, as you can see, but it means that it's all packaged away. Lastly, in here we've done things slightly differently. No longer have we created a NAT instance; we've actually now created new resources called aws_nat_gateway. NAT Gateway is the managed service that you can go and sign up to, if your account has access to it. And we can say: for every one of our subnets — that's this syntax right here, aws_subnet.public.*.id, it's called the splat syntax — go and get all the IDs of our public subnets, and allocate each one of those subnets one of the elastic IPs that we've created above. Because you need to give a NAT gateway a public IP, in the same way as I gave my NAT instance from before a public IP. So you can create the associations that way. And then at the end, we output the private subnets, the public subnets, the availability zones, and the VPC ID.
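The shape of that VPC module might be sketched like this — written loosely in slightly later Terraform 0.x syntax than the 0.6 era on stage, with variable and resource names that are my assumptions:

```hcl
# Hypothetical sketch of the VPC module's inputs, count/element use,
# and the splat-syntax NAT gateway fan-out.
variable "name" {}
variable "cidr" {}
variable "public_subnets" {}
variable "private_subnets" {}
variable "azs" {}

resource "aws_vpc" "main" {
  cidr_block = "${var.cidr}"
  tags { Name = "${var.name}" }
}

resource "aws_subnet" "public" {
  count             = "${length(var.public_subnets)}"
  vpc_id            = "${aws_vpc.main.id}"
  cidr_block        = "${element(var.public_subnets, count.index)}"
  availability_zone = "${element(var.azs, count.index)}"
}

resource "aws_eip" "nat" {
  count = "${length(var.public_subnets)}"
  vpc   = true
}

# One managed NAT gateway per public subnet, allocating each one an EIP.
resource "aws_nat_gateway" "nat" {
  count         = "${length(var.public_subnets)}"
  subnet_id     = "${element(aws_subnet.public.*.id, count.index)}"
  allocation_id = "${element(aws_eip.nat.*.id, count.index)}"
}

output "vpc_id"         { value = "${aws_vpc.main.id}" }
output "public_subnets" { value = ["${aws_subnet.public.*.id}"] }
```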
So we can reuse all that information that comes back from our module. How do we use it? I declare a provider, again, and I give it a region. I'm going to create an SSH key, which I ran the generation for just before I came in here, just to show you. And here is some code I prepared. We declare a module. The important point is that we actually say the source of the module. Right now I'm keeping my modules here and my Terraform code here, but this module path could be a Git repo; it will go off and download it. So if you have a central registry of your modules, it will go off and download those. The name of the VPC is agile-india. The CIDR block is 10.0.0.0/16, so you have all 65,536 IP addresses available within that block. The public subnets and the private subnets are these, and let's use availability zones 2a, 2b, 2c. Now let's do it. So here we go. There are currently no VPCs in the account apart from our default VPC, because we've torn everything down. Everything's gone. terraform plan. It's actually going to tell us... Because right now, if we look at the code: let's say you're a developer and I'm an infrastructure engineer. That is the way it works, right? Sometimes you, as a developer, don't care what the implementation of what's inside the VPC is. You don't care. All you want to do as a developer is spin up a system that has a VPC. So I would give you a module path, and when you run terraform plan, it will actually show you everything it's going to do. And it's quite a complex layer: it's going to add 28 things. But as a developer, you've just had to declare eight lines of code. Eight lines. It's pretty easy. So let's roll this out. Let's make the graph first.
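The module call he's describing — roughly the developer's eight lines — might look like this; the subnet ranges and module path here are illustrative assumptions, while the VPC name, CIDR, and zones come from the talk:

```hcl
# Sketch: consuming the VPC module from the root configuration.
provider "aws" {
  region = "us-west-2"
}

module "vpc" {
  source          = "./modules/vpc"
  name            = "agile-india"
  cidr            = "10.0.0.0/16"
  public_subnets  = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  private_subnets = ["10.0.11.0/24", "10.0.12.0/24", "10.0.13.0/24"]
  azs             = ["us-west-2a", "us-west-2b", "us-west-2c"]
}
```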
You know, as a developer, if you're given that small package of code, it's nice, if you're interested, to try to understand what it is; the graph command will draw it for you. So it's going off and it's doing its thing. It will finish pretty shortly, and it will give us everything we want. But if we look at the VPC right now, we should actually see it there. There's the VPC. It should have six subnets at this point: public one, public two, public three; private one, two, three. They're all there. They're all associated. They have the correct CIDR blocks, and you can see that they have the correct IP ranges: 2,043 available here, 1,085 there — because AWS keeps some addresses in reserve in each subnet. Now, before we go on: I'm not a fan of a NAT gateway or a bastion box, where you SSH into it and then SSH onwards from inside. I'm not a massive fan of that. It feels to me like a single point of failure; if it goes down, you then have to roll out a new one and move everything around. So what I'm actually going to do is roll out an OpenVPN server through which we can connect to the boxes. We can connect to our VPN and then talk to any boxes inside, without having to tunnel through one specific box. We've written some module code for this; it's actually available. OpenVPN. You can go and have a look at it. It has a security group that allows ingress rules on different ports and different protocols; it has an instance; it gives you an IP. And again, I have written some code that does this already. So we're going to declare the AMI that we want. OpenVPN have already created AMIs that are available in the AWS Marketplace, and we can grab one of the specific AMIs for the specific region. We're going to, again, give the source of the module.
We're going to give it the VPC ID. Look: we're actually saying get the VPC ID from the module — module.vpc.vpc_id. So we're using the state that comes back from one module inside another one. We give it the public subnet, we give it that AMI — we're actually passing this AMI into it — and we pass the key name in. And then we're going to output a VPN IP. And then we're going to apply it. You can see that it's actually created the first one. Let's run the graph again; I'd like to see how it's going. So the graph has got a little bit more complex, and it has actually added in the VPN stuff. It would be really helpful if I saved the file. And it would be really helpful if I applied it. So this is just going to come back and say that there are no changes required... You can see that it's, again, a little bit more complex: we've got our OpenVPN security groups hanging off the VPC, which then talk to our actual AWS provider. And it's going to go and create that. Question? The statement was — which I agree with — that the reason he likes Terraform is because it's entirely idempotent: you can run it as many times as you like, it doesn't matter, and if it fails, you just run it again. Yes and no. Some actions, yes. Let's say, for example, you declare a security group, and inside that security group you give it the rules inline. But then you also add an external rule — a standalone security group rule resource — through Terraform. Then you get into a cycle of Terraform thinking that it has to apply the changes, remove that external rule, and then reapply it. So there you get into a little bit of a cycle; that is something that's actively being worked on. But yes, it is idempotent.
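The cross-module wiring described above — feeding one module's outputs into another — might be sketched like this; the module path, variable names, and output names are my assumptions:

```hcl
# Sketch: the OpenVPN module consuming outputs from the VPC module.
module "openvpn" {
  source        = "./modules/openvpn"
  vpc_id        = "${module.vpc.vpc_id}"
  public_subnet = "${element(module.vpc.public_subnets, 0)}"
  ami           = "${var.openvpn_ami}"
  key_name      = "${aws_key_pair.main.key_name}"
}

output "vpn_ip" {
  value = "${module.openvpn.public_ip}"
}
```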
And you can see that if I run it again, it doesn't need to make any changes to my infrastructure, which is good. Very good. Just one point here: of course, with instances, when you apply it, it will create those instances in the cloud. But then making them part of a specific domain — making sure other configuration is done on the system — will that happen with the same tool, or does it need to be done with some different tool? Let me see if I can understand the question. The question is: once the instances are up, and they're part of our domain, or part of our system — say you've brought up instances that you're actually going to run tests on — would Terraform roll that code out as well? That would be other systems, like the CI/CD part of it, which will come into the picture later; this question is about joining the domain. As of now, Terraform is doing all the configuration related to security and network addresses. Gotcha, gotcha. So the question is: if we're in something like a Windows environment, when the boxes come up, would Terraform take care of making them part of the domain? The answer is yes, I believe so. You would have to write some extra code in the Terraform config for it, but yes, it can do that. It's just that because we're not using Windows boxes right now, it's very simplistic: the Linux boxes just come up, and the rest of the setup works, right? But yes, I know exactly what you mean, and you can do that. There are methods in Terraform that allow you to manipulate vSphere, for example, and you can run external scripts that can join the domain. Okay, so we've added our VPN, and now we're going to go and see if we can connect to it. So ssh -i, because we're going to pass the identity file that we want.
agile-india — it's the test key for agile-india. The user is openvpnas. Yes: at the IP address that was created, I should now be able to SSH in, verify the identity, and go configure our VPN. So there we go. You're in OpenVPN Access Server. They've built this AMI so that all the configuration has been done for you, which is excellent, and there's a licence that has to be agreed to. You can read that. Who reads licences? You should read licences. And then it asks some initial questions. I have the answers here; I don't really want to go through each one every time. And it creates a user called openvpn. I'm just going to change the password, for simplicity's sake: sudo passwd openvpn. Don't ever do this in production, please. And what we're going to do is connect to that box in our browser. We have no TLS certificate, so I'm just going to trust the self-signed certificate. This is a very rudimentary demo. Let's go to it. It gives us an OpenVPN client. So we're going to log in. We've downloaded the profile, so we now have a profile with which we can connect. There's a small VPN client called Viscosity for Mac — you can see it at the top. I just clicked the profile and it opened in Viscosity, so I can connect with that client. So let's come out of this box. I'm now actually back inside our VPC; we have a working OpenVPN. We brought up some private instances before, and we deployed an application to them. But our way of deploying the application was going off to Dropbox, downloading a file, and doing a dpkg -i install of that deb file. Nasty, nasty, nasty. What we're going to do instead is build an AMI for our application. I'm going to use something that has a UI, called Consul. We have been doing this for a while; we have pre-built AMIs and pre-built scripts that do this for us. So we use a tool called Packer.
Packer is a way that you can declare how an AMI looks — not just an AMI, but also Vagrant boxes, or Docker images, or any of these formats: how it's configured, where it's built, and then lastly, after it's built, you can export it to different places. For example, Packer can take an image and build it in us-west-2. When it's finished with the AMI, it can copy the AMI across to us-east-1. So from one build you can copy that exact same artifact — which comes back to the keynote from this morning, letting everything be an artifact — across to different locations. Excuse me? Yes, that's the question: yes, you can build Windows with this as well. It's an awesome thing. It's really, really good. If you Google "Packer Windows", you'll actually see a huge amount of configuration already around Windows Server, and you'll be able to use a lot of the available open source templates. And inside this build, what it's going to do is... This uses Amazon Linux. It's just going to install the EC2 tools, it's going to go off to a pre-configured box and it's going to install... I actually created this this morning, because it takes about five or six minutes to build an AMI, depending on how complex it is and where it's copied to. So I have an AMI ID already. I'm going to take this AMI, which ends in EV4, and declare that Consul AMI — the one which ends in EV4. It's going to go off — it's called consul — and it's going to use the Consul module, which spins up three boxes, load balances those three boxes, gives them an elastic IP, gives us the DNS of the ELB back, and then it will configure those instances. Okay, let me rephrase that. What it'll do is go off and spin up the instances, but the instances are already based off the Consul AMI. So all the work in the instance has already been done.
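Packer templates of that era were JSON; a minimal hypothetical sketch of a build that copies the finished AMI to a second region might look like this — the source AMI is a placeholder and the provisioner is illustrative, not the demo's template:

```json
{
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-west-2",
    "ami_regions": ["us-east-1"],
    "source_ami": "ami-xxxxxxxx",
    "instance_type": "t2.micro",
    "ssh_username": "ec2-user",
    "ami_name": "consul-{{timestamp}}"
  }],
  "provisioners": [{
    "type": "shell",
    "inline": ["sudo yum -y install unzip"]
  }]
}
```

The `ami_regions` key is what does the artifact copy: one build, one artifact, replicated to the other regions when the AMI is finished.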
Because it's already been done, the time to start the application is just the time for the box to boot, which is much faster. This is a lot like the way Netflix used to do their deployments in the cloud, with Spinnaker and those types of systems: they pre-baked their AMIs with their applications, and then just launched the AMIs, and that way they could use them straight away. It's got a launch configuration, and that actually gives them self-healing. So let's terraform apply and make sure that it starts to work. I'll show you the graph now. It's got a lot more complex, and it'll get even more complex as we start to add more systems in place. It's actually pretty nice. Now, as part of this, we want IAM roles and policies, because we don't want everything to be wide open inside our system. So the Consul module, which is open source Terraform, has actually got the correct policies inside it. When I say policies, I mean IAM policies. We want the server to be able to describe the auto-scaling groups, describe the instances in that group, describe the availability zones, instance data, instance tags, and so on. Because there are scripts built into this Consul AMI that look up the other servers in the cluster so that it can form a cluster — excuse me, the other servers in AWS, to form a Consul cluster. So it looks up the auto-scaling group, grabs the instances, and then for each instance it goes off and gets the IP, so that it can form the cluster correctly. And if we go to AWS, we should now see... we have Consul servers. We can actually see agile-india-consul-server, agile-india-consul-server, agile-india-consul-server. We see the IPs. And at the end we actually get an ELB address. So everything has happened right now inside the private subnet.
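The kind of IAM policy those discovery scripts need might be sketched like this in Terraform — the resource and role names are my assumptions; the actions follow what the talk lists:

```hcl
# Sketch: IAM policy letting Consul servers discover each other
# via the auto-scaling group and instance metadata.
resource "aws_iam_role_policy" "consul_discovery" {
  name = "consul-discovery"
  role = "${aws_iam_role.consul.id}"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "autoscaling:DescribeAutoScalingGroups",
      "autoscaling:DescribeAutoScalingInstances",
      "ec2:DescribeInstances",
      "ec2:DescribeAvailabilityZones",
      "ec2:DescribeTags"
    ],
    "Resource": "*"
  }]
}
EOF
}
```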
Let's try and connect to that ELB — or, excuse me, that instance. And it's on :8500. And it doesn't resolve, because it's an internal address. What I would usually have had to do without the VPN client is tunnel through my bastion box and then curl, to make sure that I could get to it. Instead, let's just connect to our VPN. This is why VPN solutions are a little bit nicer. You can see it's actually loading in the background, and it's actually forming a cluster at this point. It's a little bit slow, just because I'm VPNed into Oregon. But at this point we can actually see that we've just built a three-node Consul cluster. Which is great. Now, I said that this is backed by an auto-scaling group. An auto-scaling group is a way, in AWS, of saying I need this group of servers to have a specific number of servers. We've told Consul we need three. Let's go into AWS, and let's actually delete one of the instances. The equivalent in Azure is an availability set: you give it a specific number of machines, and it will bring a box back up like magic. So let's terminate one of the boxes. I'm going to go back to Consul and refresh. It'll just take a second, because the box hasn't finished going down yet. And look — one of the boxes has actually gone unresponsive: the agent is not live, or it's unreachable. We'll come back to that in a second, and it will actually heal itself. Now, Consul is a very specific application that does a specific job. Anyone use Elasticsearch? Let's bring up an Elasticsearch cluster, because that's a little bit more visual, and it has a bit more configuration to it than most things. So what we can do is go back to our Packer scripts and build an Elasticsearch cluster.
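Before moving on: the self-healing just demonstrated with the terminated Consul node comes from a launch configuration plus an auto-scaling group. A hypothetical sketch, with names, sizes, and variables being my assumptions:

```hcl
# Sketch: an auto-scaling group pinned at three Consul servers,
# launched from the pre-baked AMI.
resource "aws_launch_configuration" "consul" {
  image_id      = "${var.consul_ami}"
  instance_type = "t2.micro"
  key_name      = "${var.key_name}"
}

resource "aws_autoscaling_group" "consul" {
  name                 = "agile-india-consul"
  launch_configuration = "${aws_launch_configuration.consul.name}"
  vpc_zone_identifier  = ["${module.vpc.private_subnets}"]

  # AWS replaces any terminated instance to hold the count at three.
  min_size = 3
  max_size = 3
}
```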
So the template JSON says: build an Elasticsearch image, give it the name of the image, and give it the instance type that you want it to be based on. Now, we have the provisioners at this point. Packer is made of builders and provisioners, and there's a third concept, post-processors. We don't need a post-processor, because we build in AWS and we can just take the AMI off the end. The first provisioner installs some dependencies: git, and — because we're going to use Ansible, to show that you can use Packer and a configuration management tool together — Python dev tools. We install pip, and then pip install Ansible, Jinja2, netifaces, boto, the AWS CLI, et cetera. When that's finished, run the Ansible playbook called elasticsearch, which just installs the whole of Elasticsearch. And the Elasticsearch role looks like: install Java, accept the Java 8 licence — this is a really good hack around having to accept it interactively; it's a little bit slow, but it's fine — add the Elasticsearch repo, install the Elasticsearch package itself, copy some configuration — Jinja configuration — install some plugins (you can actually give it a list of plugins to install), and then lastly, the most important thing: when the box boots, set Elasticsearch to be enabled. You can go and create that Packer AMI. I've already done that, because it takes a little bit longer — it's using Ansible, so not only does it have to install Python, it has to install Ansible too. So we have a pre-made AMI that does this right here. If I go back, I can grab the Elasticsearch configuration root, and it's declared in the same way.
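The Elasticsearch role described above might be sketched, very loosely, as an Ansible task list like this — the package names, repo URL, and variable names are assumptions, not the demo's actual role:

```yaml
# Hypothetical sketch of the elasticsearch role's tasks.
- name: Accept the Java 8 licence non-interactively
  debconf:
    name: oracle-java8-installer
    question: shared/accepted-oracle-license-v1-1
    value: "true"
    vtype: select

- name: Install Java
  apt: name=oracle-java8-installer state=present

- name: Add the Elasticsearch apt repository
  apt_repository:
    repo: "deb https://packages.elastic.co/elasticsearch/2.x/debian stable main"

- name: Install Elasticsearch
  apt: name=elasticsearch state=present

- name: Template the configuration
  template:
    src: elasticsearch.yml.j2
    dest: /etc/elasticsearch/elasticsearch.yml

- name: Install plugins from a list
  command: /usr/share/elasticsearch/bin/plugin install {{ item }}
  with_items: "{{ es_plugins }}"

- name: Enable Elasticsearch on boot
  service: name=elasticsearch enabled=yes
```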
I'm going to run the module: use that module, pass in specific variables, give it an instance type — we'll give it an m3, because the Elasticsearch instances are going to be quite chunky — and then at the end, output the ELB address. I haven't run anything just yet; it has just replaced that Consul root. I don't know if you saw, actually: Consul currently has a cluster of three nodes, and one of them is down. If I refresh — very shortly it should get a fourth — there's the fourth node. So it has actually used auto-scaling to bring the cluster back to a healthy state. Really useful, auto-scaling groups: if you're in AWS and you're not using them, they will save you a lot of time. So lastly, let me just make sure I save the file — that's done. Let's have a look at the graph. The graph has got even more complex; you can start to see how your systems interact with each other because of security groups. It's going to take a few minutes for the Elasticsearch stuff to roll out, so let's go back and answer a question I asked before: why would we use a module?
We didn't want to have to copy and paste the code when we went from us-west-2 to us-east-1. All we actually have to do is reuse it. So if I create a new folder here called us-east-1, and inside main.tf I just take those three declarations right there and place them — that's the configuration. Exactly the same module, just changing four or five lines of code, in order to roll out the same system. I worked at a startup until very recently, and at that startup it took me six minutes to deploy our entire infrastructure to a new AWS region. The reason it was six minutes is because it took six minutes for the Amazon RDS instance to come up — because it's a database, it's a bit bigger and slower, and it has to go through all its checks. That was our disaster recovery: our six-minute DR. So it's building the Elasticsearch instances, which is great. If I refresh, I should see three new instances — Elasticsearch — on the way; you can see the first one coming in. Now, there is a gem called terraforming. This gem will give you your exact AWS infrastructure as Terraform-like files. terraforming is a gem I already have installed, and you can actually use it to go and get your existing infrastructure. If you don't manage your infrastructure as code right now — you just manage it manually — you can import it using terraforming, and it will actually build the files for you. If I say terraforming ec2, it's going to go off and get all the instances currently in my account. I might need to pass in a region. No?
It's just a little bit slow right now, but you can actually see all the instances that are there, written as Terraform. You can take it and manage it. The Elasticsearch cluster should be running right now — it might be a little bit slower because it's a Java cluster, it's true — and the most important part is that it has three nodes. It's that easy to just launch an Elasticsearch cluster, in like 50 seconds, maybe a bit more. Very quick to do. And again, we can go into the AWS console and destroy one, and Amazon would replace it, because of the auto-scaling group. And because of the Ansible code that we wrote, I can actually go to /_plugin/marvel — Marvel is one of the plugins; I've actually installed a lot of plugins as part of this, just so you can see that it really does go off and install them — and this is the Marvel plugin, which gives you an insight into your Elasticsearch cluster. Which is brilliant. Absolutely amazing. So, what else can we do with it? We've done AWS. We're probably not going to have time to show some demos of the Azure stuff — it's a little bit slower — but the code looks exactly the same way. You declare a provider; you have some authentication you need to do with Azure to get some credentials back; you declare a resource group, because the new way of doing things with the new Azure Resource Manager is that everything gets grouped inside a resource group; we can create a virtual network; we can create some subnets in the same way as we did before. And if you go and have a look at the docs for the azurerm provider, you can manage CDN, DNS, network resources, SQL resources, storage resources... Just last night we actually merged virtual machines, so virtual machines will be in Terraform 0.6.14, which is really good. It means it's a complete enough set for being able to manage your Azure infrastructure. Some of the other providers I actually use are DigitalOcean. I love DigitalOcean, because it's a really simple provider that can spin up boxes very fast.
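A droplet in Terraform is only a few lines; a hypothetical sketch, with the droplet name, image, region, and size being my assumptions:

```hcl
# Sketch: a single DigitalOcean droplet plus its IP as an output.
provider "digitalocean" {
  token = "${var.do_token}"
}

resource "digitalocean_droplet" "web" {
  name   = "agile-india-demo"
  image  = "ubuntu-14-04-x64"
  region = "sgp1"
  size   = "512mb"
}

output "droplet_ip" {
  value = "${digitalocean_droplet.web.ipv4_address}"
}
```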
Really quick — it's super quick as well. They say that you can spin up a DigitalOcean droplet in 50 seconds, so let's put that to the test. The only thing you need for DigitalOcean is a DigitalOcean authentication token. And I'm going to go into DigitalOcean — there shouldn't be any droplets here right now; there are no droplets. Oh, and just to show you that Marvel works: it's there, and you can actually see that Marvel is in place and it's coming up. Let's go back to droplets. This is a lot faster, because droplets take much less effort to bring up. So that'll just go on in the background, and in a moment we should actually be able to see that the droplet has been created — it's almost finished. So where do we go with this? What is the use case? I have four minutes; I promise I won't keep you long. In development circles, we're usually asked a question — the question was coined by Mary Poppendieck, and it's fantastic: how long would it take your organization to deploy a single line of code? One line of code. Developers constantly think about that; that's why, going back to the previous keynote, he talked about deployment pipelines and about automating all your systems. Brilliant. Because we can take advantage of these tools — Terraform, CloudFormation, et cetera — we can ask ourselves, as infrastructure people, the same question. There's no reason why we shouldn't. What we've actually built right now is a disposable environment. Not only have we built the infrastructure itself — the nodes and the ELBs — we've built an entire disposable environment. Your QA teams will love you if you can help them implement this type of change in your company. They will absolutely love you. And what about disaster recovery? Can I talk to anyone who works in a company that requires big documents for disaster recovery? Doesn't that code look like your disaster recovery? We've practiced it — that's what we've actually done. Anyone ever heard of the Netflix Simian Army?
Anyone ever heard of Chaos Monkey? It's probably the best known of them. What it does is simulate what happens in an environment when a node goes down. What did we just do? We destroyed one of our Consul nodes — we actually practiced being our own Chaos Monkey. There's another tool that's part of the group, called Chaos Gorilla. What Chaos Gorilla does is simulate the loss of all instances in an availability zone, to make sure that the system still acts as normal. And then you've got Chaos Kong, which is what we've looked at, a little bit rudimentarily: what would happen if an entire region was down? What would happen if the entire eu-west-1 data center for AWS disappeared? What we can do is just make some small config changes, spin up our entire cluster in us-west-2, and point our DNS at it. So this is really useful; this is stuff that's going to help. I'm not saying that you must go back and start managing infrastructure in this way — there are many ways to manage infrastructure — but this has been very successful for me for quite a while. I've been able to manage hundreds — hundreds and thousands, not hundreds of thousands, I'm not at that scale — but hundreds and thousands of boxes in this manner, very simply. When you can declare such simple things as auto-scaling groups that are baked off AMIs at the top, life is a lot easier. Any last questions? Thirty seconds. I thank you all very much for your time. If you have any questions at all — I'm awful at email; you can try to email me, but you may then have to tweet me — please do get in contact. I'm happy to share: all this code will be made available so everyone can see it, and you can go and try it and let me know. Thank you all very much. Thank you for the extensive and exhaustive workshop.