Thanks, Marisa. Thanks, everyone, for attending today. As Marisa mentioned, today's presentation is meant to be very interactive. We have several poll questions that we'll ask throughout; you'll see those pop up as polls for you to click on, and we'll collect responses that way. There will be several other chances for you to participate, and if you could use the chat panel to do that, that would be fantastic. One request there: the default in the chat panel is to send your responses just to us, just to the panelists. If you could hit that little drop-down and choose "all panelists and attendees," we can all see and participate, and everybody can see what everybody else is saying, which I think will be best for interactivity. So, the reason we're here today: we've been working on this book that's almost, but not quite, done. We will release it very soon. Ben and I here at Snyk engaged with EngineerBetter to help write this book about infrastructure as code, and about treating infrastructure as code really as code, the way that other application code is treated.
We've had conversations with a bunch of customers. EngineerBetter goes out and does implementations and works with people's engineering teams on cloud native transformations, and at Snyk we work on helping people create secure applications and making security part of developers' day-to-day work. We wanted to combine those two things. We've talked to a lot of customers and potential customers who know that infrastructure as code has been around for a while; as a technology it's not new, but a lot of companies are still formalizing their practices around it, starting to centralize on one or two flavors of infrastructure as code, and figuring out how to really treat it as if it were an application. So we wanted to write this book as a bit of a guide for that. We're going to cover speed and security as the two guiding principles for infrastructure as code, and there's a whole host of practices that go into those things. We're really excited about this book, and today's presentation will cover a handful of those practices in a bit more depth as we go. We'd love to get your feedback; that's where the interactivity comes in. And we'll send out the link to the book when it's published; if you're attending today, you'll get that link, assuming you've opted in for follow-ups. So, just as a warm-up, to get people going in the chat panel: if you could quickly type in the biggest obstacle you're facing right now with implementing infrastructure as code. This is wide open; it could be your own obstacle, or an obstacle your company as a whole is facing. Just let us know there.
Yeah, technical problems, political problems, strategic problems, knowledge problems, we're open to hearing everything. Experience and confidence problems are ones I find a lot as well. Yeah, seeing a lot about sharing and testing knowledge. "One-man team, not entirely sure I want it yet" — that's definitely an obstacle, and a good consideration. I'm not laughing because it's a bad thing; it's something I think everybody should consider. Lacking experience and time, global adoption (that sort of centralizing), lack of consistency, so we're seeing several of those things. There's a bit of a theme there as well, Jim, around testing. Yeah, testing best practices, lack of knowledge, secrets, those are all things listed there, so not completely surprising. Before we did the book, around the middle to the end of last year, Snyk actually ran a survey externally, not just of Snyk customers, through a third party, asking folks where they are in their implementation journeys with IaC and what they're running into, and their answers were very similar to these, so these continue to be problems. I see a couple of people saying it's a one-person team, so how do you do all this when it's just one person? Great, we're going to talk about several of these things as we go through the presentation today. So, on to an actual poll question. I've got it on the screen, but I'm going to use the built-in polling function so we can collect answers in a more formal way: what source code management or version control system, Git or some Git-like thing,
are you using today with IaC, if you're using anything? And we've seen everything: people who have nothing, it's just on their system and that's it; people who have shared folders; people with various Git-like systems they're using. Not a surprising set of responses coming in here, either. Most people are leaning toward some sort of Git repo, which is good; I think that's a great first step, and in the book we talk about that. Ben or DJ, any thoughts there? How many people do you see that are still using shared folders or some old-school sharing method? I see a worrying number of people who aren't using anything at all. There are a number of lone wolves out there who are responsible for one corner of a system somewhere, who have stood it up on their laptop and are storing the Terraform state locally. They're the single point of failure. We used to talk about the bus factor, but we prefer to talk about the lottery factor: if that person wins the lottery, all of their knowledge and their state goes away. That's something I see quite a lot of. One encouraging thing is that if people are into things like infrastructure as code, they're likely to be in a forward-thinking sort of organization where they're more likely to be using something like Git rather than Subversion, for example, or the older VCS solutions. And you make a good point there: it's not just version control of the Terraform files themselves; it's also keeping hold of that state file. Those are slightly separate but very important concerns, and you need the right mechanisms in place to handle both.
Yeah, another interesting thing that came up in the book: a couple of people here said they're a one-person shop. Even then, having Git, having that history, knowing what's changed, and being able to go backwards and look at what you did in the past is still a reason to have it in a system like this, even if you're not sharing it with anybody. Absolutely, and, hopefully I'm pronouncing this right, Robert Strand makes the point of getting your IaC config into source control even if you are working on your own. Can't recommend that enough. My other half is an interior architect and she does lots of stuff with AutoCAD and Photoshop, and there's no version control. I'm like, how do you do this? How do you work on big, complicated projects without being able to make commits, have branches, and undo your changes? Similarly, with her I'm working on a barn conversion, having to do DIY for the first time, and it scares the hell out of me that I can't undo things. I'm cutting a piece of wood and I want git revert on that. So, for any folks who are maybe more on the operator side, less in the development space, and haven't gotten down and dirty with Git: there is a barrier to entry, it's quite a learning curve because there are 15 ways of doing any one thing in Git, but I would highly, highly recommend it, because it's fundamental to this approach and to some of the other practices we'll talk about later. And this goes into applying this stuff continuously, which really requires having your IaC config somewhere that's remotely accessible and shared.
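That "even solo, keep it in version control" point can be sketched in a handful of commands. This is a throwaway demo, not anyone's real workflow; the file name and values are placeholders.

```shell
#!/bin/sh
# Sketch: a one-person IaC repo still gives you history you can inspect
# and changes you can undo. Everything here is a disposable example.
set -e
REPO="$(mktemp -d)"
cd "$REPO"
git init -q .
git config user.email "you@example.com"
git config user.name "You"

echo 'instance_type = "t3.micro"' > vars.tf
git add vars.tf && git commit -qm "initial config"

echo 'instance_type = "t3.large"' > vars.tf
git add vars.tf && git commit -qm "scale up"

git log --oneline    # full record of what changed and when
git revert -n HEAD   # undo the last change in the working tree
cat vars.tf          # back to t3.micro
```

That `git revert` is exactly the "undo on a piece of wood" capability: the scale-up is reversed without losing the record that it ever happened.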
So when we start talking about security and compliance: a lot of policies — policies, I should say, rather than regulations — were written before anyone could assume that everything would be stored in a durable, shared, tamper-proof format. If you go to a more enterprisey sort of organization where they're expecting ops to happen manually, and you tell them, actually, I can show you every single config change that's ever happened to this system, it's only applied by a CI server, everything else is locked down, so I can prove who did it, when they did it, and what they were aiming to achieve, and we can undo it — all of a sudden that becomes a kind of revolutionary capability for them. And often those policies that folks point to as if they're etched in stone and can't possibly be changed or challenged become mutable: okay, now we can do things differently, because technology has moved on since you wrote that document. So how do we build on that? Let's break it down: how do we feel about monorepos for all of your organization's infrastructure, for example? Because that by definition meets everything you've just articulated: it's in version control, you can reverse it. But there might be 10 teams who live in there. Oh, you dive straight into the controversial ones; that's an excellent question. To be honest: EngineerBetter does consultancy work with customers where there's development dysfunction — it takes too long to get things into production, or developer flow is low — and in a lot of cases, the majority of the cases we get called into, a monorepo is part of the problem. It incentivizes bad behavior: depending on things where maybe you really shouldn't, and tight coupling of things.
That's not to say that you have to do those things when you use a monorepo, by which we mean one massive version control repository with everything in it. But monorepos were pioneered by folks like Google, who had a whole team dedicated to writing tooling to make that work, and most enterprises do not have that team. So I'd question whether you're really getting the benefits from it, and look at what it's doing to things like your lead time to production (how quickly you can get changes out) and your flow efficiency (how much time you're spending waiting on things and dealing with merge requests and internal communications, rather than just writing code and throwing it down the pipeline). Yeah, I asked in the chat panel too. We've got a few people going the monorepo route, and in some places it varies a little, so it's definitely out there; we see it in our customers a bunch as well. I think it might vary by format a little, too. Terraform is one where we see people using a monorepo more often. Kubernetes — we consider that infrastructure as code — tends to be a little more divided, and I guess it could depend on which aspect of Kubernetes you're talking about. For other formats, I wonder if CloudFormation is a little more splittable versus Terraform having everything together; I'm not sure. One thing to fall back on here is the package cohesion principles. I spent the early part of my career as a Java developer, for my sins. Go have a look on Wikipedia at the package cohesion principles: rules like "things that change together stay together." If you're looking at the boundaries of what's kept in the same location, you shouldn't have two different reasons for changing one set of files.
So, to the monorepo point. Something that I do see working well — and we're quite big into platforms and building platforms — is when there's one thing that represents, say, your SaaS offering: having a repository per product seems to work well in my experience. So I suppose it depends on how many things your organization does. If you've got things that are completely unrelated all living in the same repository, then I would certainly invite folks to check the metrics to see whether that's working well for them or whether it's creating problems that might otherwise be hidden. Yeah, and one other thing here, before we move on, that's interesting to think about with IaC: you have different things you might want to store. There's the code, and we asked about Git repos, but there are also IaC specifics. With something like Terraform, you have the state, which has to be stored somewhere, and a Git repo may not be the best place for that; maybe you use Terraform's own offering, or maybe you have something else to store it. So there are other considerations that might split things up, even if they are related. Moving on. In the poll results, how many people weren't using version control at all? It was very small: out of the hundred-and-something people that voted, only nine responses were not using some sort of actual version control system. Cool, that's encouraging. Three of those were using shared folders. I don't know who, so I'll keep the anonymity there for the people using shared folders. All right, let's go on to question number two.
So, if you do have these things in some sort of version control, in Git — or, I guess, you could do this other ways too — what types of automated tests do you run as engineers update the code? As people make changes to infrastructure as code and commit them, what types of testing are you running? That was one of the things that came up as an obstacle to adoption of IaC, and really to broadening its use, so I'm curious what types of tests people are running, if you're running tests today. There are several options here. Simple things like linting: just looking for style errors and syntax errors, getting to a point where you have a common way of writing it and enforcing that with some automated testing. There are loads of ways you can validate your IaC; each format tends to have either some sort of built-in testing, or there are open source and other tools you might use to validate that you're actually writing IaC that is runnable and deployable. Then security tests — this is our shameless plug, I think the only shameless plug we have in the polls — for something like Snyk IaC, which will do security and misconfiguration tests; obviously, in our opinion, that's an extremely important thing to include somewhere in your IaC pipelines. There are loads of other tests you could be using; if you run tests that we don't have listed here, pop those into the chat panel so we can see what people are doing. And if you're not doing testing, that's an option too. I've got to say, if you occasionally hear a very loud click coming from my microphone, it's my ankle, of all things; when I shift my weight, my ankle makes a very loud clicking noise, so I'll try not to do that. That doesn't sound pleasant. Thankfully it's not painful.
But yeah, with testing, it would be interesting to hear from the folks on the webinar how many people have come from a software engineering background and how many have come from a more ops and sysadmin background. One of the things we see more in teams that have been application developers in the past is a propensity to shift left, by which we mean moving quality checks earlier and trying to make them fast. As soon as you've written a line of code, or made a config change, how quickly can you find out that it was the wrong thing — rather than leaving it all the way to making a deployment and then running some big integration tests, or, even worse, making changes in production and then finding out that your users have discovered your misconfiguration? Hopefully everybody can see the poll results here: linting came up as number one. Not surprising; I think that's a pretty common thing that can be built in in a number of places. Lots of IaC validation, and lots of people who are not doing much testing yet, either. I think that echoes what we saw in that very first warm-up question about obstacles to adopting IaC at a bigger level. In terms of percentages, at least, not a lot of people are doing the security and misconfiguration tests just yet, either. Yeah, looking through the chat: some people are just starting, so they're not doing a whole lot yet. Lots of mentions of unit tests and CI/CD tests; we've got more questions about that coming, there's more testing built in here. One person is writing Golang unit tests. I think there was a tool, DJ, that you had pointed out that uses Go. Whether I can remember it or not — I can't. Maybe the person writing the Golang tests can save me from my poor memory and throw what they're using into the chat.
It's encouraging that there is this level of testing, and for the folks that aren't doing anything yet, I would encourage them to start. It can be something as simple as: if you've stood up a Kubernetes cluster or whatever, just throw a curl at it. You don't have to be an application developer to start writing tests that prove, or give you greater confidence, that the systems you're deploying really work. Particularly when you're looking at things later in the cycle, like integration tests and system tests, where there's a thing that actually exists, you can use — if you're an operator — your shell scripting skills, using things like curl, or write a bit of Bash, or break out into a programming language if that's something you're familiar with. Starting would be highly encouraged. And is there a preferred ordering here as well? I can see using IaC validations like Terratest alongside something like Snyk IaC security, but I can see pros and cons of which way around they go in a pipeline. Do you have a view on that, Dan: should I be running Terratest before I run the security tools, or the other way around? That's a good question. With that particular example, generally the rule of thumb is that you want to learn as quickly and as cheaply as possible. So you've got to think about how long those tests take to run; if they take a long time, you probably want them further down the pipeline, not shifted quite as far left as the quick and cheap things. If you can get a rough indication of what's wrong very quickly, you should do that. And in the book — I'm not sure whether we'll talk about it today — we talk about using pre-commit hooks; somebody mentioned in the chat that they had multiple monorepos, which allowed them to consolidate their pre-commit hooks.
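That "just throw a curl at it" smoke test can be as small as a few lines of shell. This is a minimal sketch; the URL is a placeholder for whatever endpoint your deployment actually exposes.

```shell
#!/bin/sh
# Minimal post-deploy smoke test: hit an endpoint and fail loudly if it
# is not healthy. SMOKE_URL is a placeholder -- point it at your service.
URL="${SMOKE_URL:-https://example.com/healthz}"

smoke_test() {
  # -f: exit non-zero on HTTP errors; -sS: quiet, but still show errors
  if curl -fsS --max-time 10 "$URL" > /dev/null; then
    echo "PASS: $URL responded successfully"
  else
    echo "FAIL: $URL did not respond with success" >&2
    return 1
  fi
}
```

A CI stage would simply call `smoke_test` after the deploy step and let the non-zero exit code fail the build.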
These are a feature of Git: when you do a git commit, a snapshot in time saying this is a version you want to record, Git will first execute another process, another executable, on your machine, and if that fails, it won't proceed with the commit. So you can use that to run a quick shell script that maybe does the linting — a quick test you can do before anything is recorded. The Snyk IaC testing tool: how quickly does that turn around results? The benchmark is something like 500 files in under 10 seconds; that was with a native AWS example, running over 150 tests against each of those files. So pretty quick, and a lot of customers run it around the IDE and local development stages as part of that feedback loop. And with that kind of turnaround time, it depends on how small your commits are. If you're committing every two or three minutes, then a 10-second hit is probably not that great, but I would have thought that with infrastructure, where there's maybe less refactoring and you're making a commit every half hour, 10 seconds is nothing. And if you can find a problem out before you commit, and before you push it and share it with somebody else, you're also saving any downstream time-wasting: whether that's polluting the main branch if you're doing continuous delivery and trunk-based development, or raising a PR that's going to fail because somebody else, or the CI server, will spot the thing. If you can spot that you've missed something sooner, you absolutely should. Yeah, and we see the other side of it too: it depends on what you're testing. If you run a test over the whole repo even though you've only changed one file, that obviously takes a little longer; if you just test the one file you changed, that's a little faster.
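The pre-commit hook mechanism described above can be sketched in a throwaway repo. The hook body here deliberately fails to show the blocking behavior; a real hook would run an actual check such as `terraform fmt -check` or `snyk iac test` instead of `exit 1`.

```shell
#!/bin/sh
# Sketch: a git pre-commit hook that blocks the commit when a check fails.
# The repo is disposable and the hook body is a stand-in for a real linter.
set -e
DEMO="$(mktemp -d)"
cd "$DEMO"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Hooks are plain executables; a non-zero exit aborts `git commit`.
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
echo "running lint..."
exit 1   # simulate a failing check; swap in a real lint command here
EOF
chmod +x .git/hooks/pre-commit

echo 'resource "aws_s3_bucket" "b" {}' > main.tf
git add main.tf
git commit -m "add bucket" || echo "commit blocked by failing pre-commit hook"
```

Because the hook exits non-zero, no commit is ever created: the bad change never even reaches the local history, let alone the shared repo.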
But most people just set up a test and test everything either way. A few names did come up in the chat: Terratest was mentioned several times. There was TaskCat for CloudFormation, which I hadn't heard of, so I'll have to look that one up later, and somebody mentioned Cucumber as well. Yeah, that sort of thing is very interesting; it was one of the curiosities I had — and there's my ankle again, I don't know if you heard that. One of the ways that we like to test systems, and encourage our customers to test them, is this: do you know what feature it is that you're delivering? When the work came to your team, was it "transmogrify the widget," "articulate the splines," without an explanation of intent — why you're doing it, and what someone can do with the system now that they couldn't previously? If you're fortunate enough to work in a world where your work is expressed as user value, that makes testing a hell of a lot easier and gives you a bit of protection later down the line. If the story is that customers can now search for cats on the front page, not only do you prove that you've enabled that feature the first time you build it, but you can then run those tests against your running system constantly, to find out whether there have been any regressions. And whilst that particular example is very user-facing, customer-facing, you could also be doing the same with things like security: making sure that, say, the production database is not accessible from these subnets.
And once you start building up that suite, not only do you have protection against change when you're knowingly making it, you also have protection when somebody else is making it and maybe you don't want them to be. I think we touch on that a little later with configuration drift; we certainly touch on it in the book. I'll touch briefly on the policies question in the chat as well, around whether you should be doing security evaluation as a pre-check or as a post-check. There's always a bit of an "it depends" answer there; for me, I always strive to do both. You want that fast feedback from the pre-check, get it early, but equally you want the confidence afterwards, against drift, that you're still in the desired state. Let's go to our next question. This one's always interesting, and I'm a little bit surprised at how often this happens. For the folks attending today: if you're using infrastructure as code, in theory at least, all of your changes should be done in code and then deployed through your pipelines, or whatever deployment method you have. But how often are changes still being done outside of those pipelines? Or, thinking about it this way: how many of those changes are still being done manually? You've deployed something via IaC, and then somebody logs into the AWS console and manually changes something, usually quote-unquote "temporarily." I think it'd be interesting to see the split between zero and any other answer. It's an interesting world when you can say to compliance and security folks that no one can make changes to production systems unless there's a break-glass procedure: none of us have any access; the only thing that has access is the CI server.
Unless we go through this particular process where we get short-lived credentials, and it's logged and audited, and we have to put in a reason why. That opens up a world where things like continuous delivery, and much more rapid delivery of change, are much more palatable to the folks historically painted as the "ministry of no." Exactly. As we're seeing the responses come in here: there's a fair percentage of folks who only allow changes through the pipeline, still around 25%, which implies that 75% of people still have changes coming in outside of those pipelines. Now, the answers are still settling as they come in, but based on what's showing up here, that's still a lot: over 75% of folks are still seeing changes made outside the pipeline, so this is significant. I guess that begs the question: there's a need for some sort of auditing of those changes, of knowing what's going on, because it's happening outside of version control, right? So I wonder how people do that. Yeah, I'm working with a customer at the moment who has grown very, very quickly, and their platform team is massively overloaded with work, so app developers are making changes to Terraform code and applying it themselves. That's leading to some confusion, but then, when it doesn't work, they're going in and manually changing things through the AWS console, at which point all bets are off. Some poor first-line or second-line support person has to log into the system with no idea what state it's in; some change could have been made manually by a human, and that makes the debugging process so much more painful. So that kind of thing is definitely to be avoided.
I think Catherine raised a good point about having a rule that anything done manually is then back-ported, so it's captured by the IaC configuration. I think we mention this in the book — Perry, we've got a fictional character called Perry, a sysadmin, DevOps-type person who's going on this journey, and I think Perry had to make some manual fixes when something went wrong. Sometimes there are situations where you've got to dive in and make a fix to save the day, but then it needs back-porting. I can think of one customer we worked with a good few years ago where something happened on the weekend. People were up all night; they fixed it about four in the morning, and of course they said, right, I'll put that in version control in the morning — and forgot about it. Then the next change rolled through the pipeline, undid what they'd previously done, and the production issue happened again. So that kind of discipline and rigor is really important. And if you have a CI pipeline that does all the quality checks on your infrastructure as code, does the deployments, does promotion, does all of those good things, and does it in a reasonable amount of time — ideally without any manual checks, because those are the death of continuous delivery and getting things done quickly — then that pipeline is a great enabler of fixing things the right way, rather than manually changing a live, running production system once you've figured out the fix. Don't do it through the UI, don't do it through the CLI: commit it to Git, wait the 20 minutes if you have to, and see it go out permanently with all the tests passing, which is another thing you don't necessarily get when you're diving in and fixing things manually.
Yeah, Carlos had a good comment in the chat panel too: a large percentage of the manual changes are down to the existing mechanism or flow of change in companies. I think it relates back to "this is how we've done it in the past": this is how we did infrastructure and made changes before, and now that everything is code — well, some people don't know that it's code, and some people don't know what to do if it is code, or where to go to make changes. That's one of the reasons we really wanted to write this book. If we're talking about Terraform, HashiCorp has great tutorials on how to write Terraform and learn Terraform, but these practices around treating it as code, and getting everybody in the company to think of it as code the way you would other things — it's kind of a new class of developers, in a way. A lot of infrastructure folks come from a background where maybe they developed scripts, things they wrote just to make their jobs easier, and this is a big leap forward, or can be. Absolutely; when done well, it's transformative, and, to totally echo your sentiment there, Jim, that's one of the reasons we were so keen to work together on the book. If you're one of the folks listening to this and you think the people higher up the food chain are never going to let you do this, that you're never going to be continuously deploying into production: I would really hope that if there's one thing we manage to achieve with the book, it's opening up some minds. There are actually some sections in there that are not just about the practices; there are a few small pieces about making the case for safety at speed, and making the case for all of these automated processes, because — interlinking my fingers synergistically —
all of this stuff working together has way more impact than just one or two practices. If you can do one or two things and get incrementally better, absolutely do it; don't think it's all or nothing. But when you do it all together, it opens up all of these possibilities of different ways of working, and then you get into the fun world of digital transformation and realizing that the compliance and governance people can't move as quickly as your infrastructure can — and what a great position to be in. That's a good point, too. Let's go to the next question. This one's interesting because I imagine it's a challenge for any kind of code, but for IaC: if you are running these things through pipelines, can you predict the outcome? Will you get the same result every time you run the same code through it? Or, to use the big fancy term that DJ put in the book: is your infrastructure as code idempotent? I didn't write "can you predict what's going to happen reliably." You put a couple of examples in the book, DJ, of things that break that. One was a simple toggle: with a simple toggle you won't know the outcome — if you started out with yes, toggling gives you no; if you started with no, toggling gives you yes — and therefore it's not predictable. Absolutely. I've just corrected a typo on this slide, so if you want to hit refresh, I think we can do that. Yes, so idempotence — I've got a colleague who pronounces it "i-DEM-potence," which I find slightly mind-bending. But yes, the knowledge that you can run the same scripts, the same stages in your CI pipeline, several times, and they do exactly the same thing: that's absolutely key.
I mean, Terraform itself should be idempotent and convergent: you declare your state, Terraform figures out "what do I need to do to get to that state", and if there are no changes it shouldn't do anything. Occasionally you come across a buggy provider, and you'll know about that because something bad will happen, but often it's the supporting scripts that people write. For example, we're big fans of having a pipeline that does everything to stand up a new environment, from absolutely nothing to the full thing running and fully tested. If you're going to do that, where do you store your Terraform state? If you're going to store it in an S3 bucket, then that bucket will need to exist before you run your Terraform; otherwise it won't have anywhere to put its state or to check whether state exists. That kind of task often ends up at the very beginning of a pipeline and is often shell-scripted, so you need to take some care when writing those things: does the script fail if the bucket already exists? I'm pretty sure the AWS CLI will return an error if you try creating a bucket that's already there. So then you need to start wrapping that in checks. Either you disregard error messages, which is a bit dangerous, because how do you know you disregarded the right error message? Maybe it's an authentication error rather than the bucket already existing. Or you defensively check: if the bucket doesn't exist, then create it, and if it does exist, do nothing and assume everything's fine. We sometimes see the same thing with credential creation as well: scripts that create SSH keys, key pairs, certificates, those sorts of things, and generate a new one every time they run, which is not necessarily what you want.
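The defensive "check, then create" pattern DJ describes can be sketched like this. The storage class here is a hypothetical in-memory stand-in for a real cloud API (in practice you'd use something like the AWS CLI's `head-bucket`/`create-bucket`); the point is only the shape of the pattern:

```python
# Illustrative sketch, not real AWS code: FakeStorage stands in for
# the cloud provider so the idempotent pattern is visible.

class BucketExists(Exception):
    """Mirrors the error a real API returns for a duplicate bucket."""

class FakeStorage:
    def __init__(self):
        self.buckets = set()
        self.create_calls = 0       # count side effects for illustration

    def exists(self, name):
        return name in self.buckets

    def create(self, name):
        if name in self.buckets:
            raise BucketExists(name)
        self.create_calls += 1
        self.buckets.add(name)

def ensure_bucket(storage, name):
    """Idempotent: create the state bucket only if it is missing,
    instead of blindly creating and swallowing arbitrary errors."""
    if not storage.exists(name):
        storage.create(name)

s = FakeStorage()
ensure_bucket(s, "tf-state")
ensure_bucket(s, "tf-state")        # second run is a safe no-op
```

Running `ensure_bucket` any number of times creates the bucket exactly once, and a genuine error (like an authentication failure) would still surface rather than being silently discarded.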
You start getting an awareness of that when you script something, especially if you're new to running it continuously through a pipeline, where the pipeline will run at times you maybe don't expect because your colleagues have committed a change. Making sure that you've got that idempotence is really key to having a stable system and a happy life where unexpected things don't happen. Nobody likes unexpected things happening to their production systems. How big an assumption is there in here, as well, that your infrastructure is stable and reliable, in order to underpin idempotence? Yeah, it's helpful if your infrastructure is stable and reliable, but if it's not, that's not quite the same thing as idempotence. I think that's technically re-entrancy: the idea that if you start a task and it fails halfway through, then when you run it again it will correct itself or resume where the last run left off. That's also a desirable property of a system. I'm just looking through some of the things in chat. I've got Carlos: "If you use GitOps to trigger changes from repos to infra with Terraform, theoretically it should not deploy anything, right?" I think that's right, and correct me if I'm wrong, Carlos, but that's the no-op idea, the no-operation idea: if you continually reapply your Terraform, Terraform should be smart enough to say there's nothing to do here. It will be safe and it will have the same result. Generally you can rely on that contract with Terraform; if you can't, that means there's a bug in one of the providers you're using, which we've experienced, and it wasn't much fun. But I just remembered another really big point here: don't ever depend on latest. Maybe we'll talk about this when we get to promotion, which is in the book, about promoting change through environments, but definitely don't depend on the latest tag of container images.
That is an example of a non-idempotent system: you run it one day and get one result, you run it the next day and the latest version of that container image has changed, and all of a sudden your system is behaving differently. There's a new version of a CLI in the image that your script relied on, and bad times are had by everybody. So you've got idempotence in declarative, convergent tools like Terraform, where you shouldn't have to worry about it. You've got idempotence that you need to design into your scripts, into anything procedural that you execute, to make sure it has the same effect when run several times. But then there's also your dependencies: can you reproduce this? Will it do the same thing from one day to the next? Ideally you want to keep the same versions of everything until you've decided there's a new change you want to consume, or you have a development pipeline that pulls in all the changes and then promotes them through. So, the next question. I'm going to make a note that before we publish I need to add idempotence to the glossary. Moving on, this next question applies to some of those scenarios: the idea of reapplying your infrastructure even if there are no changes. I think that's what Carlos's comment was about. If there are no changes then nothing should be applied, right? But that's the last step; it doesn't necessarily mean you don't kick off your pipeline anyway, just to force that check. I guess the question here is: do we consider rotating credentials a change? Is that a deployment, or is triggering credential rotation by making a clean deployment a separate thing? Does that make sense? Okay, remind me to come back to the credentials point.
We want to minimize configuration drift by continually applying these things, so that you don't get a surprise when you haven't made any code changes for a long time and then you run something and it all breaks. The longer it is between runs of your CI pipeline and applications of your IaC, the more chance there is of something having changed. That could be something on the inside of the pipeline, like a new version of a provider that behaves slightly differently when you didn't have a lock file, or it could be a change in the environment, where someone has gone in and made one of those manual changes that then causes your deployment to fail. That's one set of things we want to protect against. And then, by continually reapplying our IaC, if we trust it to be idempotent and convergent, we also get protection against attackers. It's protection against people going in and trying to change things without you realizing. If there's a maximum amount of time a VM can live before it gets repaved, if there's a maximum amount of time a VPC can stay configured before it gets totally reset, then any attack has a limited shelf life; it's only able to exploit things for a certain amount of time. I can't remember which security person I was talking to, but they were saying the nightmare scenario is not that you have a vulnerability and an exploit; it's that you've had a vulnerability and an exploit that has sat unnoticed for three years, slowly harvesting data, and after three years it decides to splurge it all out at once, and you haven't been detecting it. So that continual reapplication is important there.
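Convergence is what makes continual reapplication worthwhile: each run computes the difference between declared and actual state and corrects it. This toy `apply` function is a stand-in for what a tool like Terraform does on every run; the resource attributes are invented for the example:

```python
# Sketch of convergence: reapplying desired state erases drift,
# whether the drift came from a manual change or an attacker.

def apply(desired, actual):
    """Compute the changes needed to make `actual` match `desired`,
    mutate `actual` accordingly, and return the change set.
    An empty change set means the run was a no-op."""
    changes = {k: v for k, v in desired.items() if actual.get(k) != v}
    actual.update(changes)
    for k in list(actual):          # remove anything not declared
        if k not in desired:
            changes[k] = None
            del actual[k]
    return changes

desired = {"instance_type": "t3.small", "port": 443}
actual = {}

apply(desired, actual)                         # first run creates it all
assert apply(desired, actual) == {}            # second run: no-op

actual["port"] = 8080                          # out-of-band drift
assert apply(desired, actual) == {"port": 443} # next run reverts it
```

Scheduling that reapply on a timer, rather than only on code changes, is what puts the upper bound on how long any unauthorized change can survive.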
The angle I was coming from is mean time to recovery: if there's a security incident and the fix is to rotate your credentials, how easily and cleanly can you do that? Do you have to make another change in order to do it, or can you roll forwards and just trust that on every deploy your creds change anyway, and that's how you resolve the incident? Sure, sure. And yes, ideally you want a situation where you can rotate creds without necessarily having to make a code change, and the CI system picks that up. This was one of the points in the book where it really does start to depend on the things around you: which secrets manager are you using, are you even using a secrets manager? Secrets was one of the things I saw listed as a barrier when we asked earlier. If you're using something like Vault, your life will be much easier. Tying this into a comment I saw scrolling past in the chat: Antonin said that versions have to be configurable so you can decide when to upgrade and test it through your environments. Going to the last point, some CI servers make it very difficult to notice when the outside world changes and to bring new versions of things in; related to secrets changing, your CI server should notice "oh, there's a new version of the secret in Vault, I should probably apply it." Other CI servers make this much easier. We like Concourse a lot, and it makes these things very much easier. We've been looking at Tekton, we've been looking at Argo and Argo Workflows, and none of them have an inbuilt polling mechanism and version tracking of new things that come through. They can do webhooks, but if you're in an enterprise environment, webhooks from the outside world may never reach your CI server.
So I would suggest polling if you need to react to change in the outside world, for example Helm charts or container image registries like Docker Hub. Having a CI/CD system that can track changes, including changes in credentials, and then take an action as a result of things you depend on changing: that's a good place to be, and at that point you've got a living, breathing system that doesn't need manual updating by a human. Looking at the poll results: roughly two thirds of the folks are doing it, at least ad hoc, they're reapplying it. Actually, I'm reading that wrong, it's flipped around, but my math is still right: about two thirds of people are doing some sort of reapplication. A lot of it is manual, though; only a few are doing this often and automatically, which is interesting and, I think, ties back to what we talked about before. There's a handful of people who have everything in pipelines, don't allow any changes outside the pipelines, and are retesting everything automatically. So, interesting results there. There was a question on the last poll, too: if you're not using something like latest, how do you automate security updates? If everything is idempotent and you know exactly what you're going to get at the end, then how do you know when there's a security update, and how do you automatically make that change? Something is going to be different, right? That's another really good question.
So, again, this is where we had to make design choices in the book. We decided to use Jenkins as the CI/CD system of choice because it's the lingua franca of CI/CD, and Jenkins makes this reasonably difficult, to be honest. It's got no inbuilt mechanism for tracking things in the outside world, so you're into the realm of writing scripts that replace bits of YAML. You've got solutions like that where you have to hand-crank something, and if you've got no other choice you should absolutely do it. What you want is a series of pipelines where your dev pipeline not only listens for the changes you make, but also listens for changes in the outside world. Every time there's a new version of an image, you want a CI/CD system that can pick that up, say "there's a new version of this dependency", and run it through the pipeline, so that if it passes all the tests it automatically goes through and you're always on the latest version of everything. You might have different views about that, about maybe not being on the latest version of things; remember the old advice about not using a new version of Windows for a year and a half? That was in the old days. But certainly, have a CI/CD system that can detect those changes and update things for you. In a decentralized GitOps model I think this becomes harder: you've got lots of different things deployed in a cluster and lots of different dependencies being watched, and that's not fed through a central mechanism tracking all change, which can be challenging to reason about. In the book we talk about creating a bill of materials, taking the point of view that you're working on a system, a product.
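The "listen for changes in the outside world" behavior boils down to polling plus version tracking, which some CI systems do natively and others need hand-cranked. A minimal sketch of the idea, with a hypothetical version source and trigger callback:

```python
# Toy polling step: trigger a pipeline run only when the observed
# version of a dependency changes. get_latest_version and trigger
# are stand-ins for a registry query and a pipeline API call.

def poll(get_latest_version, last_seen, trigger):
    """Check the dependency; kick off a build only on a new version.
    Returns the version to remember for the next poll."""
    current = get_latest_version()
    if current != last_seen:
        trigger(current)
        return current
    return last_seen

runs = []           # record of triggered pipeline runs
seen = None
for observed in ["1.0", "1.0", "1.1"]:   # three polling intervals
    seen = poll(lambda: observed, seen, runs.append)

assert runs == ["1.0", "1.1"]   # two builds, not three: no duplicate work
```

In practice the remembered version has to live somewhere durable (the CI system's own state, or a small file in object storage) so restarts don't retrigger old builds.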
What were the versions of all of the things that went into this run of the pipeline and passed the tests? Then you can provide that to your next pipeline, your staging environment or your UAT, and deploy all of those versions. There's tension there in who is deploying what. If you're creating an internal developer platform, that point of view makes perfect sense, because you want to control all the things going into your platform. But if you're responsible for deploying all of the app developers' apps and the various microservices you've got going on at your organization, then that starts to break apart. That backs into a similar sort of discussion about monorepos versus separate repositories: where do the boundaries lie, who should be updating what, when, and on what cadence? Do we want to know that the versions of absolutely everything on this platform work in unison, or would that be excessive coupling that slows your organization down? If you're Netflix or Amazon, you've got no chance of knowing that all of these services work together all at once, and that's not the design intention behind those organizations either. But if you've got 20 or 30 microservices, and it takes 20 minutes to test all of them working together, and the changes don't come through that quickly, and you don't need to make changes more than once every half hour in production, then why wouldn't you want the confidence that everything works all at the same time? Yep, and that gets us into the next question. I'll put in one more shameless plug as the Snyk marketing person on this call.
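Recording a bill of materials after a passing run, then feeding it to the next environment's pipeline, can be as simple as writing and reading one file. The file name, format, and example versions here are illustrative, not from the book:

```python
# Sketch: capture the versions that passed testing together, so the
# staging/UAT pipeline deploys exactly that set and nothing newer.
import json
import os
import tempfile

def record_bom(versions, path):
    """Write the bill of materials for this pipeline run."""
    with open(path, "w") as f:
        json.dump(versions, f, indent=2, sort_keys=True)

def load_bom(path):
    """The next environment's pipeline reads and deploys these."""
    with open(path) as f:
        return json.load(f)

tested_together = {
    "terraform":   "1.5.7",
    "app-image":   "registry.example.com/app@sha256:aaa",
    "kafka-chart": "26.4.2",
}
bom_path = os.path.join(tempfile.gettempdir(), "bom-dev.json")
record_bom(tested_together, bom_path)
assert load_bom(bom_path) == tested_together
```

In a real setup the BOM file would be committed or published as a pipeline artifact, so every environment downstream can prove exactly which versions it is running.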
If you had a security tool that told you what changes to make to be more secure, you could take that another step further, and it just so happens that we have security tools that do exactly that. There's potential to do something like that, but it would require whatever security tools you use, hopefully Snyk, but whatever tools you use, to tell you "here's a security problem, and here's the thing you do to fix it". Then, if you really wanted to wave the magic wand, you could say, okay, we're going to automatically jump to this next version. I know very few customers who are at that level of automation with security fixes, but I think it's inevitable when you look at things like Dependabot, which will raise PRs against your repos whenever there's a vulnerable or outdated dependency. That's bound to spread to infrastructure at some point, and having a CI/CD pipeline all the way to prod means you're going to be getting security fixes faster than your competitors. Okay, you mentioned the bill of materials; we have a question about that too. This will probably be the last poll question we get to. Do you track and record which versions of your IaC and all the other components worked together? Are you creating that IaC bill of materials? You might be doing this manually: you run it through the tests and maybe it's a script or something you've written. Or you might be using tools that can dynamically generate all of this for you. To your point, DJ, having a list, a record, of things that have been tested together and are known to work together is good for a lot of reasons, right?
There are legitimately different good answers to this, depending on your use case. I'm thinking of a customer of ours, who we've written about on our blog, doing some really cool, really demanding infrastructure as code work in a regulated environment, behind an air gap; that was lots of fun. So, thinking about their setup: in the chapter, their SVP of technology talks about fleet management. It's software as a service, but they need to have some single-tenant instances. It's about 30-ish microservices that provide their product, and about 10 to 15 data services, you name it: MongoDB, Elasticsearch, Kafka, Redis, Postgres, Apache Storm, ZooKeeper, the whole kit and caboodle. That whole thing provides their product suite. They've got some instances that are public cloud and multi-tenant, some that are on premises and single-tenant, and they need to be able to roll out changes to the apps and the microservices and have those go out to all of the different places. They need to be able to say, right, we need a new version of Apache Storm here. Doing all of that manually, or with a non-automated process, was absolutely killing them in terms of productivity. But then they moved to a nice promotion process: we're going to keep track of everything that goes into the first environment, and when it comes out, we're going to write that to a YAML file, which we'll then use to parameterize values in other YAML files, because everything's YAML now. So you can see that set of versions go all the way through the promotion pipeline, and now you can see how the progressive rollout has worked across multiple different production environments for different customers. That allowed them to reason about the system in a way they couldn't before, when it was heterogeneous.
Things deployed in all different places, different versions of this and different versions of that, which is a problem. I'm helping another customer at the moment tackle exactly that: not having a centralized process for change management and for recording which versions work together. It makes debugging easier. When you've got a live production issue, you know exactly where you stand, and you can say: this bunch of stuff works in 27 other environments, why isn't it working in this one? It must be something to do with this environment. For security and compliance, you can know exactly what you've got in there, and you can prove it, because it's in Git somewhere. I think it makes for a much easier life. You have to think about where those boundaries are, what you want to couple and what you don't, but there will be natural boundaries: this bunch of stuff changes together, therefore we should track all of these versions at once, so we can know for sure that this bunch of stuff works together. Yeah, there's a question here from Carlos; maybe I don't understand the question: "what use case?" He said, I understand that you should track all these versions for things like recovery and DR, those sorts of things, or if somebody just wants to fully deploy from scratch, I guess. But what use case do you have for tracking IaC and app versions together? Oh, I get it, and I love the question, I think it's a really good one, and it touches a really good point. Depending on your product and your use case, there may not be a clear abstraction between infrastructure and apps. In some places there is: there's a really good platform boundary where it's like, we're the platform team, the infrastructure team; you can self-service and push your app code here; and if your app doesn't work, well, good luck to you.
That's yours to figure out; we give you the self-service tools to figure that out. If the platform doesn't work, then that's our problem, and you come to us and raise a ticket. That's one model of working, which, to be honest, we do not see very often; the level of sophistication in platform teams is surprisingly behind the curve on that. Instead we end up with this mishmash of infrastructure and apps. If you have a thousand app teams and one infrastructure team, then it doesn't make sense to check that all of those apps work with this version of the infrastructure. But in the example I was talking about earlier, where you've got 30 or 40 microservices: if the product that end users get value from doesn't work unless all of those things work together, then you should be testing them in lockstep and promoting them in lockstep. If people can get value without all of that, then maybe there's less of a case for it. So: do people get value from the whole, or just part of it? Are you providing a platform, or are you providing infrastructure to enable product features? That's probably when you want to track that these things work together. The other thing I was going to say: platform updates underneath apps. This has got worse in the era of Kubernetes. The number of customers I go to who say, well, there was an outage because we did a cluster upgrade. Great. Why isn't anyone testing that? There are various different reasons, but even if you do have a platform boundary and you're a platform team and there's an abstraction, you need to know that this bunch of apps works on this version of the platform, and that needs to be tested separately before you go making that change in production. I'm rabbiting on.
If you want to continue the conversation on Twitter with me, it's @DanielJonesEB, and I can continue rabbiting on there. Yeah, we're right at the top of the hour, so I do want to thank everybody for your participation today; this was great. We love it when we get on here and people actually participate, because it gives us more to talk about, and we learn a lot from that as well. Somebody asked in the chat what the name of the book is. We're going to release the book, I'm hoping this week, if not next week, and we'll send it out. It's currently called Continuous Delivery for Infrastructure as Code, but I'm in the middle of a title change to Infrastructure as Code for Security and Speed; it's the same content. We'll be releasing that ASAP, and if you signed up for this webinar and opted in for follow-ups, we'll send it out. If you haven't opted in for follow-ups, you can check out the Snyk page; these links here will take you to the Snyk infrastructure as code page. EngineerBetter did a lot of the work here, and I want to thank DJ and his team for really the bulk of the work that went into this book. If you have questions about implementations and you want some help with implementations, the EngineerBetter team is fantastic, so please do reach out to them as well and get them involved in your projects. But thanks again, everybody. Thanks DJ, thanks Ben, thanks Sarah, thanks Marisa, thanks Linux Foundation, and we'll chat with everybody again soon. Have a great one. See you. Thanks so much. Thank you, everyone. As a quick reminder, this recording will be posted on the Linux Foundation's YouTube page later today. Thank you.