I come from many backgrounds, but one of my backgrounds is Puppet. I did a lot of Puppet, and then in the last six months to a year, I've been doing a lot of Terraform. And then it dawned on me that there are actually a ton of similarities between Puppet and Terraform, and between how Terraform is now facing problems that Puppet had in the early days and how those have been solved. So I figured I would present my ideas about this and see what you guys think. Is there anybody here who has no idea what Terraform is? So Terraform is a tool. Actually, I can show you the next slide. So this is actually the first commit to Terraform, which is surprisingly only four years ago. And here at the bottom it says Terraform is a tool for building and changing infrastructure safely and efficiently. And that's basically what it is. So Terraform is basically a configuration management style tool where you write code to represent your infrastructure. You see a little picture here. That is Mitchell, who now runs HashiCorp together with a bunch of other people, and he's actually in the room, over there somewhere, hiding. So, yeah, I'll go through a bunch of different things that I thought about when thinking about this talk. A bunch of good things. The domain-specific language, the DSL. Puppet uses the Puppet DSL. HashiCorp has HCL, which is used for a bunch of their products and a bunch of their configuration files and also for Terraform. And that's quite nice because they thought about it well and it has a whole bunch of things, except for a few small things that we'll get to later. But having a DSL makes it very easy to write Terraform. Once you get your head around the first two resource types, everything else becomes easy. Everything is a resource. So Terraform is used to bring up infrastructure. And infrastructure, mostly we're talking about infrastructure that can be brought up dynamically. So we're talking mostly about public clouds.
There are many different providers for Terraform to talk to a whole bunch of different things. It's getting more exotic. There's now also a GitLab provider, for instance, so you can bring up GitLab repositories. But I would say the lion's share is currently doing public cloud, either Amazon, Azure, or Google Cloud. And in there, things are a resource. So I'm mostly familiar with Amazon, so I'll talk about Amazon, and you can replace that with whichever other public cloud you're fond of. In Amazon, for instance, you have an EC2 instance, which is basically a virtual machine. And that is a resource. So in Terraform, you talk about a resource for an EC2 instance. An RDS instance is also a resource. So every single individual unit of things, for instance, in Amazon, is a resource in Terraform. And that makes things easy. And Puppet has exactly the same. In Puppet, you talk about a file, a service, a package, a user, a group, et cetera, et cetera. So if your mind is adjusted to this thinking in resources, then that becomes easy. Modules are a bit newer in the Terraform world, or getting a bit more popular over time. In Puppet, they were there fairly early on. I remember talking to one of the Puppet employees right after they made the Puppet Forge, which is the public Puppet module repository, which was basically made on a plane overnight on the way to a customer engagement. And then from there, things started growing. And I just checked, there are like 5,500 modules on there now. The modules in Terraform are slowly maturing as well. And the language is ready for using modules, except that we need a few more small things. And I'll get to those in a minute. Yes. So personally, I run a consulting company. So for me, I think it's a very good thing. Talking to enterprises, if you want to get them to adopt your open-source software, having a commercial entity backing your product is a much easier conversation than not having that.
And I'm very happy with the business model that both Puppet and HashiCorp chose, where there is an open-source version that is fully usable. And if you're a business and you require some kind of support agreement or the enterprise UIs, then there is a business that can deliver that for you. So in my opinion, that's not a bad thing. It's actually a really good thing. Not so good. I talked about modules. So at the moment, both products are on opposite ends of the problem. So the Puppet Forge currently has 448 modules matching the word MySQL. They're not all actual MySQL modules, but they are all doing something related to MySQL, and that is about 447 too many. But that's the way it is. The Terraform module repository, on the other hand, has zero. Now, that's not entirely fair because of course there is an RDS module, but if I wanted to build, for instance, a MySQL instance on EC2, that's not a thing in the module repository now. And I would presume there are enough people out there that have reasons to do that. And so these are opposite ends of the spectrum. The best module ecosystem I've seen, regardless of the quality of the modules and the quality of the product, is actually the Drupal ecosystem, because there, if you want to start a new module, you'll have to answer questions about why you think this new module needs to exist. So in Puppet, it's always been the Wild West. If you have a module, you register an account on the Puppet Forge and you just throw your module up there. And with Drupal, there's a central group of people. I don't know exactly how it works, but you have to actually make a case for, I have this module here, and I think it's better than module XYZ, or I think this needs to be there. And only then will it get accepted. And that actually makes for the fact that the Drupal module ecosystem is quite good quality. You don't have to wonder which one of the 448 MySQL modules you need to be using.
If you're using Terraform in somewhat larger setups, you will fairly quickly run into the problem of dealing with someone else's code and wondering, I'm looking for this IAM policy or this EC2 instance. Where the hell is it actually defined? You might in the beginning think, okay, this is actually a naming problem because there are no real naming best practices or standards. However, I thought about it a little bit more and I reached back to Puppet, trying to think of, okay, how did Puppet solve this problem? And that's actually organizing code. One thing that Puppet has that Terraform does not have is classes, and they seem less relevant because a class maps one-to-one to a file in Puppet. In the beginning there were classes and you could define a class foo in a file called config management. That was totally fine. However, after a while they realized, okay, now nobody can find this class anymore because if I see the name of the class foo somewhere, how do I know where it is? Exactly the same problem as in Terraform. However, in Puppet this was solved by using what they call the auto-loading mechanism, and that means that if I see a class name I can tell from the class name in which file the class is defined. And that's really useful. Where a class is referenced, the full class name is referenced, and from that class name I can tell, 100% guaranteed, where the class is located. Unless you use multiple module paths. That's it. Let's not go there. In Terraform there is no such thing, because an EC2 instance resource is an EC2 instance resource and there is no way of telling where that is being defined. If, let's say, in an EC2 instance resource you're referencing a security group, there's no way for you to tell where that is actually defined, because someone put that in a file somewhere.
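The autoloading rule mentioned above can be sketched in a few lines. This is only an illustration of Puppet's documented naming convention (a class `module::a::b` lives in `<modulepath>/module/manifests/a/b.pp`, and a class named after the module itself lives in `init.pp`). The function name `manifest_path` and the default module path `modules` are made up for this example, and it assumes a single module path.

```python
from pathlib import PurePosixPath


def manifest_path(class_name, modulepath="modules"):
    """Return the manifest file in which Puppet's autoloader expects a class.

    Hypothetical helper for illustration; assumes a single module path.
    """
    # A leading "::" only marks the top scope; it does not change the path.
    parts = class_name.lstrip(":").split("::")
    module, rest = parts[0], parts[1:]
    # A class named after the module itself lives in init.pp.
    if not rest:
        rest = ["init"]
    return str(PurePosixPath(modulepath, module, "manifests", *rest)) + ".pp"


print(manifest_path("mysql"))                  # modules/mysql/manifests/init.pp
print(manifest_path("mysql::server::config"))  # modules/mysql/manifests/server/config.pp
```

The point is that the lookup needs nothing but the class name. A Terraform resource address such as `aws_security_group.web` carries no equivalent hint about which file in the directory defines it.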
You can guarantee that it's not in a module, because then it would be referenced by the module name, but as long as it's not in a module you cannot tell which file that's going to be in. And in small repositories it's not really a problem. In larger repositories you're hoping you have a good editor that can search through a project fast. So in the end that all boils down to the fact that in Terraform the only way to logically group things is currently a one-to-one relation to files. However, you cannot reference a file name anywhere, and so you have a problem where you cannot actually reference the logical place where a resource is defined. Is that clear? It's clear in my head. One of the other issues: if you've used Terraform quite a bit then you'll run into the problem that, even though HCL is quite a nice language, it's missing an if statement. And there is sort of a workaround with a count, where you can set a count to 0 or to 1, but that's not really a full solution for this problem. It's hard to explain, and I tried coming up with code examples that show it, but then I massively ran over my 25 minutes. So the issue is that because you don't have an if statement, standardized modules get really difficult. So let's say I have a module called GitLab that brings up a GitLab instance. I put all my code in that GitLab module. It brings up an EC2 instance, an RDS instance, Memcached, Redis, whatever, all the things that are needed for a proper GitLab setup. But the EC2 instance, do I want to determine for the consumer of the module which AMI they're going to be using, namely this Ubuntu 16.04 LTS AMI? Or do I want someone else to be able to push in an AMI ID, and I will just trust that that is okay? As a module maintainer, I have to choose one of these two, and as a consumer I want to be able to determine it for myself.
So what you end up doing now very often is downloading a module, copying the whole thing, changing it so that it can use my AMI ID, and now we're duplicating code. So this was a problem early on in the Puppet days as well, where you would download a module either straight from GitHub or later on from the Puppet Forge and you would start modifying it slightly because it didn't do exactly what you wanted it to do. Over time, modules matured and so that problem mostly went away, and now if you're writing Puppet code and you need to bring up MySQL, you would need to be very convincing to convince me that you need to write your own MySQL module. And I'm hoping that we get there with Terraform eventually as well, but an if statement is a very big thing that's missing in that respect. And there are some other things as well that I didn't want to dive into because there's not really a counterpart in Puppet, so I'll leave those for now. One of the things that made the Puppet world much easier, especially in larger environments where larger teams are working on things, and sometimes disjointed teams, is a solid testing framework. In Puppet there is rspec-puppet for unit tests, and Beaker and Serverspec for acceptance tests or for integration tests, and they make it much easier to rely on your code and to make sure that what you've written works now and it works next week and it works next year. In Terraform, I searched around and there are some efforts going on.
There's kitchen-terraform, but it doesn't seem to be widely used at the moment, and having a solid testing framework is obviously much more difficult in a Terraform world, because you're looking at testing whether this GitLab module successfully brings up a whole GitLab instance, and that will only work in specific conditions. So it's a bit more difficult, but it's not an unsolvable problem, and there's definitely a big space there for someone to come up with a solid way to test both on a unit test level as well as on a more integration level. Now slowly, as the modules are becoming a bit more mature and the way of working becomes a bit more mature, things are splitting off into modules. So the recommendations are also appearing that if you're writing a Terraform repository and it has GitLab in it, for instance, then you should split that GitLab off into its own GitLab module and develop that independently and test it independently, so that your main infrastructure, your production infrastructure, your staging infrastructure, only consumes that module and only has to test whether the module works correctly, and does not have to worry about the internals of the module bringing up a proper GitLab instance. Click twice to go to the next slide. Interesting. One of the last... Did you walk all the way to the other side to show me that? Sorry, I'm almost ready. So, module management. Almost every other tool has some way to manage modules. So Puppet has a Puppetfile, RubyGems has a Gemfile, npm has a package.json, and what it boils down to is having a single place that lists which versions of which modules you want to use. Right now if you're using a module in Terraform, where you're calling the module you can specify, I want it to come from this location and I want this version, but that means a bunch of things.
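To make the "single place that lists which versions of which modules" idea concrete, here is a rough sketch of what collecting that overview from scattered `module` blocks could look like. It is not an existing tool: the function name `collect_modules`, the example file contents, and the module source `example/gitlab/aws` are all invented for illustration, and the regular expressions are a simplification that only handles a `module` block whose closing brace sits at the start of a line, not a real HCL parser.

```python
import re

# Simplified patterns: a module block ends at the first "}" in column 0.
MODULE_BLOCK = re.compile(r'module\s+"(?P<name>[^"]+)"\s*\{(?P<body>.*?)\n\}', re.DOTALL)
ATTRIBUTE = re.compile(r'^\s*(source|version)\s*=\s*"([^"]*)"', re.MULTILINE)


def collect_modules(files):
    """Build one overview of every module call found in the given files.

    `files` maps a file name to its text content (hypothetical input shape).
    """
    overview = []
    for filename, text in sorted(files.items()):
        for block in MODULE_BLOCK.finditer(text):
            attrs = dict(ATTRIBUTE.findall(block.group("body")))
            overview.append({
                "file": filename,
                "name": block.group("name"),
                "source": attrs.get("source", ""),
                "version": attrs.get("version", ""),  # empty when not pinned
            })
    return overview


# Invented example input: two files, each calling one module.
EXAMPLE_FILES = {
    "main.tf": 'module "gitlab" {\n  source  = "example/gitlab/aws"\n  version = "1.2.0"\n}\n',
    "network.tf": 'module "vpc" {\n  source = "./modules/vpc"\n}\n',
}

for entry in collect_modules(EXAMPLE_FILES):
    print(entry["file"], entry["name"], entry["source"], entry["version"] or "(unpinned)")
```

With a Puppetfile-style manifest this scan would be unnecessary, because the list would already exist in one file instead of having to be reconstructed from every place a module is called.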
It means that everywhere you have modules, you need to keep track of where these modules are defined and go and see if there are new versions, so it makes tooling a little bit more difficult. So for instance Puppet has r10k, which makes it very easy to deal with a Puppetfile, and it means that you don't have to distribute your modules with your Puppet code. The other thing is that it makes it more difficult to have an overview of which modules you're using in your infrastructure, to maybe add one or update one. And it's a fairly simple thing. I mean, r10k for instance, on the Puppet side, is not rocket science, but it's very convenient to have such tooling available and to have a single place that lists out, these are the modules that are being used and these are the versions that are being used. So yeah, that would be really convenient to have. I think that's my list. I intentionally kept this a little bit shorter because I'm also managing the dev room, and those are two different things. I didn't vote for my own session, by the way, so I'm not standing here because I like myself so much. Other people thought they wanted to hear this as well. But yeah, so I kept it a little bit shorter, but this is my list and now we have some time for questions. The question is, can Terraform provision into different cloud providers for the same application? Yes and no. So the way Terraform works, you define a provider and you use that provider to bring up infrastructure. However, in Terraform you specifically say I want an EC2 instance, and obviously Azure doesn't have an EC2 instance, so you'll have to basically write the same thing with the Azure resources in order to be able to bring up the same application in different places. So this is currently still a bit of a thing.
When I first started with Terraform I was actually expecting that I could just say I want an instance or a virtual machine and I could specify which cloud I would like it to be on, but the reality is that if you spend about three minutes more thinking, maybe even one, the nitty-gritty differences between all of these cloud providers are so huge that that's not really an actual possibility. So yes and no. You can totally talk to different clouds but you have to basically duplicate your code. So you have to talk to the guys up there because I'm not writing this stuff. I'm just a user. So I don't know. It's a long-standing issue so I presume there's work being done on that but that's not my... So there are conditionals but they really stop being useful very quickly. They solve a few problems but, for instance, un-setting a variable is impossible. So if you're setting an attribute and you're using a conditional to set the value of that attribute, you can only set it to a value or another value. You cannot set it to a value or nothing. So within no time those things are still not enough. Not a MySQL provider. I would say, just correcting the terminology, in my opinion it needs a MySQL module, and these things are also specific, so a MySQL module for AWS. I come from a MySQL consulting background, that was what I did years ago, so there are people that are doing things that are so specific that they cannot or don't want to use RDS, so they want to actually use EC2 instances with specific EBS volumes that are set up towards their specific requirements, and so this is a thing, right?
So you would want to do this and you would want a MySQL module, an AWS MySQL module, where you could say, okay, go and install me Percona Server 5.6 or go and install me MySQL Enterprise 5.5, whatever, and bring up these things. But it's not a possibility because you can only make a super opinionated MySQL module, and that's why there is no module up there at the moment. There was one more question down here somewhere. You had a question? Yeah, so the question is if there are any plans for integrating an if statement. The same question was over there, so that's not a question I can answer because I don't do that. Question over there, all the way up there. So the question is, why would you want something like testing, because in Terraform either something succeeds or it fails? This is true, but there are a few problems there. One of the things is that it will only really fail... so there's terraform plan, which will look at whether your plan is mostly correct, but then during the apply things can still go sideways very easily, and I don't want that to happen during my apply run. I want to know beforehand that it doesn't work. So if I'm having a GitLab module, for instance... take a Puppet module. I take the puppetlabs MySQL module, it comes with a bunch of tests, so I download the module, I run the tests, and if all the tests pass and there seem to be a decent number of tests, I have relatively good assurance that this module is doing a reasonable job. In Terraform there is no such thing right now, so if you have a GitLab module you have no idea if it's even working, and if it's working for your situation, if you don't have specific problems that prevent it from working. And you'd really want to test the smallest units possible individually so that you can guarantee that each unit individually is working and then... Sorry?
So take the example, for instance... I just spent a bunch of time making a GitLab module, so that's why I'm talking about GitLab all the time. But let's say that you're making a GitLab module and you want it to bring up an EC2 instance with an EBS volume and then use the EBS volume as the...