Pets and cattle servers. So in this terminology, pets are mainly those services that you nurture, that you give names to, that you shed a tear over when it's time to retire them. But cattle are those services that are basically dispensable and easily replaced. And this is only possible when they are provisioned and deployed using automated tools. So this is where infrastructure as code comes in. It allows you to do reproducible builds. It allows you to do peer review, as though it's just any other part of application code, but in this case concerning the infrastructure. So OK, just a quick demo context of what I'm going to be demonstrating. I would like to provision and deploy Jenkins on Google Compute Engine. And it will contain one Jenkins job to build a single container from a remote repo. So just to explain what it's going to look like, this is the desired outcome once it's deployed. And it's very simple. This is not production ready at all; it's just a demonstration of what this process can do. So I can sign in. And as you notice, this is also reproducible on my local machine. And I find that it's very important to have a very quick feedback loop, especially when it comes to developing the tools that our people are using. Because if there are any bugs in the tools that I'm building, it's going to disrupt the workflow of everyone else. So there's going to be a very negative, larger-scale effect if, let's say, I redeploy something that is wrong. So as you can see, this is the one which is deployed. If I run this job, it should be able to build a single container. So Jenkins is just an example of a service that I wanted to demonstrate, because traditionally it is one of those services that people look after. People configure it manually through the UI. They add in the different plugins. And basically it causes a lot of havoc if, let's say, you need to recreate the server or redeploy the entire thing.
So in this case, it's running Docker, and it's building a container. So this is pretty much the desired outcome of the provisioning exercise. So actually, the first thing that I would like to emphasize in this talk is that it's not so much about the tools that you use. It's about understanding the problem, and after that, finding the right tools that solve the problem in a way that is specific to your use case. So the first thing that I want to do when it comes to converting maybe a legacy service that we've been manually configuring or manually managing for many years is to understand what the service requires. So firstly, is the service stateful? If it is, can this state be stored remotely, maybe by moving it to an external DB, or maybe through some kind of snapshotting? And the bigger question is: is retaining the state even necessary? If, let's say, I redeploy a brand new Jenkins server, do I need the historical records of all the jobs that have run? If we don't need to retain this state, losing it just becomes a bit of an inconvenience, but it's not a deal breaker. So I can demonstrate that by basically tearing down everything. So if I destroy this on my local machine and bring it up again, while this is loading, I can show what will happen when we recreate the server. So basically, in this use case, we will not be storing the state, but we need it to be in a usable place when we recreate the server. So the next thing I would normally ask is: what kind of manual configuration is traditionally required? So in the case of Jenkins, for example, I firstly need to install Jenkins. I need to upgrade the plugins. I need to be able to do some configuration on startup. These things are normally done through the UI. And the next question I'll ask is whether they can be automated using any tools or any of the inbuilt systems that the service has.
And the third thing I'll ask is: what kind of internal or external dependencies are required for the service to function? So for example, if, let's say, it runs on Java, can it run on Java 8 only? Or can it run on Java 11 or anything else? And whether or not the minor versions will affect it. So this determines how critical version locking of the internal and external dependencies is. And if anyone has run into issues with, for example, plugin versions shifting when they get automatically updated: do I need to implement version locking? Because this finally determines how reliably I can reproduce this server. If, let's say, I provision it today and provision it again 3 months down the road, by not locking certain dependencies, what kind of impact would that have? Preferably, it would not have a huge impact. But in some kinds of services and some kinds of projects it might have an effect. And finally, there's something that I feel is quite underrated: being able to replicate the provisioning process locally. Because needing to deploy these things just to find out whether my changes have taken effect is way too slow a feedback loop. And it might introduce other unrelated issues if, let's say, I'm deploying it and expecting it to work on the cloud. So OK, some of the tools we're using in Wego right now: for the provisioning step, it's Packer together with Ansible. And locally, we are using Vagrant to reproduce the provisioning steps that we normally run on Compute Engine images. So let me just jump into some code. So this repo is open source. I've put it on my GitHub. You can refer to it after this talk. So the first thing that we were trying to solve is that we wanted to have a level of abstraction when it comes to running automated builds. So regardless of the target that I'm trying to deploy to, I would still like to reuse some of the components.
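On the version-locking point, Ansible Galaxy makes this concrete: role versions can be pinned in a requirements file so a build today and a build three months from now pull the same role code. A minimal sketch (the role name is the real geerlingguy.jenkins role; the version number is illustrative, not necessarily the one used here):

```yaml
# requirements.yml -- pin the Galaxy roles this build depends on,
# so reprovisioning months apart pulls identical role code
- src: geerlingguy.jenkins
  version: "3.7.0"   # example version; without this field, each
                     # install fetches whatever is latest
```

Installed with `ansible-galaxy install -r requirements.yml` during the provisioning step.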
We are in somewhat of a special situation where we are on both GCP as well as a different cloud, and we would like to reuse some of the modules and a lot of the processes when it comes to provisioning and deploying these machines. So what was your motivation to choose those tools and not something else? Why not Chef? Why not Puppet? Yeah, so at least when it comes to the tools, these were the ones that we evaluated. So I guess there's a whole bunch of different ways that we can do it. And the good thing is that, for example, Packer, it's a bit of a secret weapon. I'm not sure how many of you have heard of Packer, but it's probably one of my favorite HashiCorp tools, mainly because it allows integration with Chef, Ansible, and basically a whole bunch of other stuff. And the whole point of it is that we can use the same, OK, so this is what I used to provision. It uses the same provisioners, which in this case are a file provisioner that copies something over, a shell script, and finally Ansible. But as we can see, it can build two different types of images, one Vagrant and one Google Compute. So this is great, because very rarely can we talk about cloud-agnostic tools. But in this case, this is basically cloud agnostic. It doesn't care what you're deploying to. You can deploy to AWS, you can deploy to Google Cloud, you can deploy to Vagrant, or build images for those platforms. And the only thing that you need to do is to configure the builders, which determine where these images are being built and stored. So in this case, let me just run through what this is like. So the very first thing it's doing is, OK, let me just give an introduction on what Ansible is as well. So Ansible covers quite a lot of the same ground as, say, Salt or Chef.
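A stripped-down sketch of what such a multi-builder Packer template can look like; all project IDs, image names, and file paths here are hypothetical stand-ins, not the ones from the talk's repo:

```json
{
  "builders": [
    {
      "type": "googlecompute",
      "project_id": "my-gcp-project",
      "source_image_family": "ubuntu-1604-lts",
      "zone": "asia-southeast1-a",
      "image_name": "jenkins-{{timestamp}}",
      "ssh_username": "packer"
    },
    {
      "type": "vagrant",
      "source_path": "ubuntu/xenial64",
      "provider": "virtualbox"
    }
  ],
  "provisioners": [
    { "type": "file",  "source": "ansible/", "destination": "/tmp/ansible" },
    { "type": "shell", "script": "scripts/bootstrap.sh" },
    { "type": "ansible-local", "playbook_file": "ansible/jenkins.yml" }
  ]
}
```

The provisioners run identically against every builder, so `packer build -only=vagrant template.json` exercises locally the same steps that `-only=googlecompute` runs when baking the Compute Engine image.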
Ansible is basically a way that you can automate the different steps when it comes to installing things, configuring things, and templating things. And the reason why we chose Ansible is mainly because it's a bit more declarative than the other scripting frameworks. We used to use Chef, and it got slightly difficult to maintain. Ansible was a little bit more straightforward because it was less expressive. So it's a bit counterintuitive, but we found that it made things simpler to understand and maintain because it had quite a strict syntax. And one of the things that we are able to leverage by using something like this is that we can use other dependencies, which, in this case, come from Ansible Galaxy. So if I want to install Jenkins, I just need to use an open source Galaxy role. And the great thing is that this is a fairly popular, fairly well-maintained role. It's quite battle-tested, and there are enough tests being run on it, so we don't have to take on the responsibility of testing out these scripts. So to see it in action: this is building a Vagrant machine, and this is using the same steps to build one for Compute Engine. The steps that the provisioner will go through are exactly the same, except they're targeting different builders. So this kind of ensures that you are able to test it quite reliably locally. So within this shell script, the first thing it's doing is installing Ansible, because the Ubuntu machines do not come with Ansible. And it's a very simple shell script. It's just running an update, installing Python, and installing a specific version of Ansible. And beyond that, Ansible takes over the rest. It also gives the opportunity to... So Ansible comes with some inbuilt features, such as the Vault, where you can encrypt whatever variables you need. So in this case, I am encrypting the username and password for this.
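As a sketch of how these pieces fit together, here is a minimal playbook using the Galaxy role, with credentials pulled from a Vault-encrypted vars file. The variable names follow the geerlingguy.jenkins role's documented ones; the version and plugin list are illustrative assumptions:

```yaml
# jenkins.yml -- provision a pinned Jenkins version with a known plugin set
- hosts: all
  become: true
  vars_files:
    - vault.yml                # encrypted with ansible-vault; holds
                               # jenkins_admin_username / _password
  vars:
    jenkins_version: "2.107"   # lock the Jenkins package version
    jenkins_plugins:
      - git
      - job-dsl                # lets a seed job generate the other jobs
  roles:
    - geerlingguy.jenkins
```

At deploy time the vault key is copied across and passed via `--vault-password-file`, so the play can decrypt `vault.yml` without the secrets ever sitting in the repo in plain text.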
And when I'm deploying, I can copy over the vault key to be able to decrypt it. And the actual job of installing everything is fairly simple, because I'm using this open source role from Ansible Galaxy. And all I need to do is to specify the Jenkins version and a few other things, such as the plugins that I need. And after that, since I need to manage the jobs as well, I'm able to template things and basically have a seed job to be able to create any other jobs that I need. And within this, like, 60 lines of code, this is all that is needed to provision Jenkins, which is notoriously fairly heavy and difficult to configure. All right. So as you can see, these are the things that are being done. So it's installing Ansible. And after that, it's just running through the open source script, which helps to install Java and any other dependencies that are required. So actually, the idea of this is mainly to highlight the importance of being able to reproduce the builds when it comes to VMs. In the world of containers and everything, it's kind of taken for granted that you are able to package things up, that you have at least some kind of reliability when it comes to reproducing the builds. And I think the same concepts can also be brought over to VMs using tools like this, which can basically spin up the machines that you need, install everything you need, and save the result in the cloud platforms themselves. So as we take a look here, when it comes to deployment, there are a few other things that we need to look after. But this only involves creating the machine images. So now the next bit. I'm also going to build it from scratch. So let me just tear down everything that I have. So when it comes to deploying this machine, it does not live in isolation. There have to be some other things that we build around it. So there are a few requirements when it comes to deploying this. I would like it to be self-healing.
If the instance goes down, I need it to come back up in a state that is usable pretty much immediately. And I do not want to do any kind of manual configuration, whether through the UI itself or maybe through the console. And when it comes to security, I would like to reduce the attack surface. So definitely no direct SSH access, because there are port scanners everywhere. And the web UI is only to be exposed through a load balancer; the rest of the ports are not. So just a quick infrastructure overview of how we would like the rest of the infrastructure around Jenkins to look. It will live within a VPC with a public and a private subnet. In the private subnet, there would be no public IP address assigned to the instance itself, so you wouldn't have to worry about firewall rules when it comes to blocking it off from the internet. And the only way that we are able to access it is through the load balancer from the internet. And the way that Jenkins is able to access the internet is through the NAT gateway. And within this, and this is from Google's docs, there are a few different components as well. This is just a bit of background, because I'll be running through the Terraform code. So what I found about Terraform was that it's more important to understand what you are trying to achieve rather than the syntax, because you can get lost in the syntax quite easily. So it's a lot easier to plan out the components that you need, even down to the smaller levels, such as this, because all of these are considered resources in Terraform. But Terraform is great, because you are able to maintain a state of your infrastructure, and it can do incremental changes. It's able to detect whenever you've changed something. And it pretty much knows whether to destroy a resource or just update it in place. So I've destroyed this, and let me just destroy everything else. So if you're starting from scratch, it's a brand new project.
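The "web UI only through the load balancer" requirement maps to a firewall rule like the following sketch. The resource names, tag, and port are assumptions; the two source ranges are GCP's documented load balancer and health check ranges:

```hcl
# Allow ingress to Jenkins only from GCP's load balancer / health check
# ranges; everything else, including direct SSH from the internet, stays
# blocked since no other ingress rule opens it.
resource "google_compute_firewall" "lb_ingress_only" {
  name    = "jenkins-allow-lb-only"
  network = google_compute_network.vpc.self_link

  allow {
    protocol = "tcp"
    ports    = ["8080"]   # Jenkins' default HTTP port
  }

  # Google's published LB and health check source ranges
  source_ranges = ["130.211.0.0/22", "35.191.0.0/16"]
  target_tags   = ["jenkins"]
}
```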
So while this is running, let me just explain the different components. So it can kind of be broken up into two main things. Inside this dev environment, I would like to set up the network and Jenkins separately. Because if I lump everything together, then if, let's say, I would like to tear something down, or introduce something new into the network, I would have to mutate quite a lot of other resources. But in this case, by separating the network, which surrounds Jenkins, from Jenkins itself, which requires certain things to function, I'm able to operate on these two things separately. Sorry, perils of doing a live demo. OK, but basically, the different components that we need can be found here. So just now, when I mentioned the different services that we need surrounding it: we have to create a VPC, a virtual private cloud, to isolate the services. And within that, I would have the two different subnets, the public and the private subnet. As you can see, it's fairly simple. And we are able to assign different CIDR ranges to them, so they do not overlap and we can keep them separate. And finally, for the instance to be able to talk to the outside world, we're using the fairly new Google-managed NAT service. So the great thing about having our infrastructure as code like this is that when someone opens a brand new PR changing something in the code, we are able to trace at what point the code changed. And if, let's say, a bug was introduced because of that, we are able to apply all the regular software development practices, git bisect, and find out who changed what and what kind of effect it had. So without destroying the rest of the stuff, we can see that if, let's say, we want to create something brand new for Jenkins, there are all the different components that we had just now, such as the backend service.
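Those network components come out to roughly the following Terraform sketch; the resource names, region, and CIDR ranges are illustrative assumptions:

```hcl
resource "google_compute_network" "vpc" {
  name                    = "jenkins-vpc"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "public" {
  name          = "public"
  network       = google_compute_network.vpc.self_link
  region        = "asia-southeast1"
  ip_cidr_range = "10.0.1.0/24"
}

resource "google_compute_subnetwork" "private" {
  name          = "private"
  network       = google_compute_network.vpc.self_link
  region        = "asia-southeast1"
  ip_cidr_range = "10.0.2.0/24"   # non-overlapping with the public range
}

# The Google-managed NAT: a Cloud Router plus a NAT config, so instances
# with no external IP can still reach out for packages and plugins.
resource "google_compute_router" "router" {
  name    = "jenkins-router"
  region  = "asia-southeast1"
  network = google_compute_network.vpc.self_link
}

resource "google_compute_router_nat" "nat" {
  name                               = "jenkins-nat"
  router                             = google_compute_router.router.name
  region                             = "asia-southeast1"
  nat_ip_allocate_option             = "AUTO_ONLY"
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
}
```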
And in Jenkins itself, when it comes to the load balancer, and all the firewall rules as well to allow the load balancer ingress: it can all be represented as code. And we are able to reuse these functions and the different modules, and even parameterize them, such that if, let's say, we have a brand new environment (so this is the dev environment; if we have one for staging), all we have to do is just make a copy of it and change the variables. If, let's say, we need to change the instance type, that's traceable here as well. And everything is managed through version control. We can just apply it, and basically it's able to detect the dependencies and build out whatever it needs. So this was just a quick demonstration. I didn't want to dive too deep into the tools. I just wanted to show a bit of an approach that can be taken when it comes to managing infrastructure with code. Of course, there are a lot of comparable ways of doing it and comparable tools. But in this case, it was fairly opinionated, and it was just to show an approach you can take when it comes to solving the problem of having reproducible ways of provisioning the VMs that you need. So these are the code bases. If you would like to take a look, there isn't a README on them yet; I'll add one later. But basically, everything that you need to be able to reproduce all of this can be found there. So, all right, thank you. Good example, but you have a preset case. You have an application, and the application requires an environment. In terms of infrastructure as code, imagine a situation where you're running a project where in one month you need to provision 200 virtual machines, and the second month you need to provision 20 virtual machines, everything in different configurations. So it's not a static case. What would be your approach to coding this kind of thing?
Do you define everything in code in Terraform, or do you have some way to simplify your own stuff, so that instead of changing the configuration settings every single time, you can say "for this, I need X and Y"? What's the right way to reduce your own workload? So at least this example was for services that do not need to scale very dynamically. For what we actually run in production, we use Terraform together with Packer to define the auto-scaling group launch configs. So they are variable when it comes to the minimum, maximum, and desired counts. But basically, what happens is that when the code is pushed from CI, it builds an immutable image, containing all the code and all the configuration, into a machine image, and that gets put into the template of the auto-scaling group. So at least this cuts down the requirement of needing to provision everything by hand. We only provision the different services that we have. If, let's say, we have 12 microservices, it's just 12 different configurations for this, and a lot of them can actually share code. So there isn't a huge need: if, let's say, there are 200 servers but only 12 services, we only need to do the configuration for those 12 services. Any other questions? You did mention that you did not assign IPs to the VMs, but you were still able to target those. So how did you connect to those VMs during deployment? So for those VMs, at least for... so something like this. In the private subnet, the only time that we would need to target the VM directly is maybe if you need an SSH session to it, in which case you normally have bastion hosts, or jump boxes, on the public subnet, which are reachable, and you're able to jump from there into instances in the private subnet. But at least for Jenkins, the only thing that it needs to receive from the outside world is the web request, when I go to the UI and access it. So that's through the load balancer. And on the load balancer, you're able to target it.
And over here, it's called a backend service in Google Cloud. So there's an instance template as well as the instance group manager. This instance group manager kind of ensures that at least one instance is running at all times, and you're able to target it. So it doesn't need to know the IP, because we are not accessing the instance directly; it's managed through the mapping from the load balancer. So I think, yeah, there are a few more components in there, which you can kind of see from here, on how it's able to find it. So it goes through, you know, the URL map, then the backend service, and finally to the instance itself. So all these things are kind of managed by Google. Thanks guys, any other questions? Thank you.
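That chain, from load balancer down to the instance, can be sketched in Terraform roughly as follows. All names and the image reference are assumptions, and the URL map, proxy, and forwarding rule resources are omitted for brevity:

```hcl
resource "google_compute_health_check" "jenkins" {
  name = "jenkins-health"
  http_health_check {
    port = 8080
  }
}

resource "google_compute_instance_template" "jenkins" {
  name_prefix  = "jenkins-"
  machine_type = "n1-standard-2"

  disk {
    source_image = "jenkins-1523000000"   # the immutable image baked by Packer
  }

  network_interface {
    subnetwork = google_compute_subnetwork.private.self_link
    # no access_config block: the instance gets no external IP
  }

  lifecycle {
    create_before_destroy = true
  }
}

# The group manager keeps one healthy instance running at all times,
# recreating it from the template if it goes down (the self-healing part).
resource "google_compute_instance_group_manager" "jenkins" {
  name               = "jenkins-igm"
  zone               = "asia-southeast1-a"
  base_instance_name = "jenkins"
  target_size        = 1

  version {
    instance_template = google_compute_instance_template.jenkins.self_link
  }
}

# The load balancer's URL map points at this backend service, which
# targets the instance group rather than any fixed IP.
resource "google_compute_backend_service" "jenkins" {
  name          = "jenkins-backend"
  protocol      = "HTTP"
  health_checks = [google_compute_health_check.jenkins.self_link]

  backend {
    group = google_compute_instance_group_manager.jenkins.instance_group
  }
}
```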