Thanks, everybody, for joining us today. We're really excited to have this crew together and to interact with you. We're going to ask a few questions as we go, and we want your input, so we encourage you to use the chat window. You can use the chat or the Q&A window; the chat is a little more interactive, so we prefer that one, but wherever you feel comfortable typing questions, comments, or feedback, please do so as we go through this.

As Megan mentioned in the introductions, we did a live panel like this a few months back, maybe four or five months ago, with DJ and another product manager here at Snyk. It was really good; we had a lot of great interactivity with the audience. We talked about adopting IaC, adding security practices, and how you do that while still maintaining the speed and velocity you want from IaC. Snyk and EngineerBetter had just written an e-book together, and it had only barely been published at the time of that last presentation. It's still available, it's free, and it's a fantastic resource that lays out a bit of a roadmap for adopting IaC for speed and security, which are the things we'll focus on.

What we didn't have last time, which we do now, are code samples. So if you're interested in actually trying out the practices outlined in the book, you can do that with these code samples. There are different stages of commits throughout them that line up with the practices in the book, so we encourage you to go through them and learn even more. In the book we have what you might call a roadmap, or a journey, of adoption.
This draws on a lot of feedback from the customers that Snyk and EngineerBetter have worked with over our histories, and some practices along the way that help with achieving speed and security with IaC. Last time we focused on the early stages of adoption and the first few things you want to do. Today we're going to focus on the later stages, closer to production, under the category of safety: some key outcomes that are desirable, like minimizing drift after you've started to deploy things with IaC, making sure things are functionally correct in production, and of course making sure things are auditable as well. I see that somebody asked for the links; Stefan and others are posting those in the chat, so we'll get those to you in just a second.

So that's what we're going to focus on today. And again, we want this to be very interactive, and it's great that people are starting in the chat already. We'll start with a question just to get some more feedback going. As you're thinking about your own personal journey and your organization's work with IaC today, and where you are, what kind of obstacles are you facing in your adoption of IaC? A few of you may be lucky and say, we're doing great, we've got this thing licked, in which case we may turn you into a panelist here to join us in the chat. But a lot of people are facing other challenges. Think about people, process, and technology, and pop your responses into the chat panel so we can see and discuss them. We see things like not enough skills, not enough people who know the format of choice, whether that's Terraform or Kubernetes; that might be the technology side.
Or we just haven't made a technology choice, or we don't feel like we have enough tools to automate, implement, and secure. It could be process as well: just not knowing the right process to bring all these different teams together. I see a few things coming into the chat here. DJ or Stefan, anything that you see, actually, let me ask a different question first: of the customers you've talked to, how many would you say really do have this down pat? I'm assuming it's probably low, because if people are calling EngineerBetter, they probably want help, so they probably identify as not doing this so great.

Yeah, unfortunately, very few customers call up saying, hey, we're having a great time, can we hang out and pay you to do so? So yes, we do have a selection bias. We tend to work with a lot of large enterprises who have a lot of legacy heritage, so they're not starting from a clean slate. And sometimes the join points between what's new and what's existing are the sticking points. That's where things become slow: where you're dependent on a central global networking team who need to make changes to a global load balancer manually and aren't quite on the IaC track yet. Those kinds of things often hold back further adoption. I don't know about Stefan, your thoughts?

Yeah, another common pattern I've met talking to users of the open source tool that we built is a little bit different. Sometimes they are very skilled at one cloud provider. Let's say they've automated 99% of their AWS deployment, and they're very happy about it. Then they have this different project on Azure or GCP, and they're less skilled at this new provider, and they don't have the time to get up to speed and gain, I don't know, four years of knowledge in three months.
And there they take a lot of shortcuts, and then the multi-cloud automation is definitely not as good on this new cloud provider as it was with AWS. That's a common pattern that I saw talking to users.

Yeah, good. One thing I want to point out real quick, and we should have done this in the introductions: Stefan, I neglected to mention that you came from a company called CloudSkiff, which we at Snyk recently acquired, and you have an open source tool called driftctl, which is still open source and still available for everybody here to try. It's a drift management tool, so there's a lot of experience around that. We'll get into it more as we go, but it's something I'd encourage everybody to check out as well.

I saw a couple of things come into the chat panel. One person said: struggling to pull out of an inheritance model and realize a composition model. I'm not sure what that means. Christopher, if you have more details, or DJ or Stefan, maybe you have more understanding of it.

I'm not sure either. I would imagine something around legacy: getting started creating new resources and building on what's already there. It's quite common. You start automating new stuff, but you still have the legacy, the existing stuff that you have to import into Terraform or any other tool, and that can take time. It can be an awfully long process, and the way to automate it is often quite ugly; the code is not maintainable, or really verbose. So it might be something like that, and it's definitely a major blocker when you have thousands of resources to import. Maybe, Christopher, it would be great to get your additional input on this, and we can come back to it throughout the session.
If you're thinking of a programming model of composition, sort of Go style, versus inheritance and something more traditionally object-oriented, that would be interesting to hear more about, actually. It came up a little in the last session, talking about Terraform modules and being able to compose those together to create a system. In truth, it's something we don't tend to do a lot of, in that when we're working with customers it's often deploying one platform, so there's less need for reuse between things. So it would be interesting to hear more about that.

Yeah, he followed up and said Stefan pretty much nailed it: it's simple to import a resource, but then making the abstractions, building up composable configuration from those individual modules, is a tough mindset to keep. So I think you nailed it there. Another person here mentioned inconsistency between the deployment and the state, so the intended, desired state versus the true state, which is what we're calling drift. We'll definitely be drilling into that a little more today. And then incomplete providers for AWS, Azure, and OpenStack, and probably other things as well.

Yeah, sorry, I was just going to say that incomplete providers are definitely a pain. This is where, certainly in my experience, we work a lot with teams who are going from operations and traditional sysadmin work into infrastructure as code, treating it like a software development project. I have found time and again that once people click into that change and feel empowered, it's a really satisfying one for them, motivation-wise: building something instead of doing daily toil. But having some Golang skills, to be able to contribute to Terraform providers and dig yourself out of the corner (that's a totally mixed metaphor, but you know what I mean), is definitely useful.
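To make the import pain point above concrete, here's a minimal sketch of bringing one pre-existing resource under Terraform management. The bucket name and resource address are invented for illustration; at the scale of thousands of legacy resources, this per-resource ceremony is exactly the verbosity being described.

```hcl
# Hypothetical example: adopting an existing, hand-created S3 bucket
# into Terraform. The bucket name "acme-legacy-assets" is made up.

resource "aws_s3_bucket" "legacy_assets" {
  bucket = "acme-legacy-assets"
}

# Terraform 1.5+ supports declarative import blocks, so the adoption
# happens as part of a normal plan/apply instead of a one-off
# `terraform import` CLI invocation per resource:
import {
  to = aws_s3_bucket.legacy_assets
  id = "acme-legacy-assets"
}
```

On older Terraform versions the equivalent is running `terraform import aws_s3_bucket.legacy_assets acme-legacy-assets` once per resource, which is where the "awfully long process" comes from.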
I also saw that Robert said earlier it's resource provisioning time to the cloud platforms. I'm going to make no comment about which ones you may be talking about, but I have my suspicions. Yeah, and as was mentioned, missing operations, like real programming operations such as if statements. In the last session we talked about idempotency, right? Being able to run your pipelines, run your plan and apply for Terraform, and always get the same output, which is a challenge when you can't do an if statement or a case statement or a switch or things like that. So you have to be creative, and it's a complex problem, I think.

Isn't there a new addition to Terraform that has more functions? Yeah, they have a CDK now. And I think, I don't know, Stefan, you probably know more than I do about it, but I think they've been debating whether to add more programming-like control.

Yeah, you have more control in the latest versions of Terraform, but you're right about the CDK: with CDK for Terraform you can basically write your Terraform using JavaScript or something, and it converts the code back to standard HCL so it can be run. So you can have more complex logic in a traditional programming language, with iterations, if statements, and everything you're used to, and generate HCL at the end. It's definitely something new. And beyond Terraform, there are also technologies like Pulumi. I remember speaking to those folks at KubeCon; it's more of an imperative programming model, the opposite of declarative, which would definitely be more flexible.
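As an aside on the control-flow point: modern HCL does offer limited "if" and "loop" equivalents natively, via conditional expressions and `count`/`for_each` meta-arguments, without dropping down to CDK for Terraform or Pulumi. A sketch, with all names and defaults invented for illustration:

```hcl
variable "environments" {
  type    = set(string)
  default = ["dev", "staging"]
}

variable "enable_monitoring" {
  type    = bool
  default = false
}

# "Iteration": one bucket per environment via for_each.
resource "aws_s3_bucket" "logs" {
  for_each = var.environments
  bucket   = "acme-logs-${each.key}"
}

# "If statement": a count of 0 or 1 driven by a ternary expression,
# so the log group only exists when the flag is set.
resource "aws_cloudwatch_log_group" "monitoring" {
  count = var.enable_monitoring ? 1 : 0
  name  = "/acme/monitoring"
}
```

This covers many common cases, though as discussed it's still a long way from general-purpose flow control.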
I would temper this conversation, though, by saying that there's great power in declarative simplicity. I can totally imagine edge cases where you'd need to do something that maybe isn't straightforward, but generally, if I found myself needing more flow-control statements, I would want to take a step back and double check: is this really the best way of solving this problem, or the simplest way? Sometimes it absolutely will be, but other times it might be a smell that you're working against the tooling rather than with it.

Yeah, I see another comment here from Karen about the challenge of migrating between existing versions. As folks like Terraform produce a new version of their format, it's maybe not compatible with previous versions, which is a challenge all around. It's a challenge for somebody like Snyk too, where we have this whole set of rules that work well with one version, and then there's a new version, some things change, and the rules have to be adjusted and policy checks adapted.

Yeah, it's definitely a problem. We've been bitten by it in the past, and this ties into one of the themes in the e-book about reproducibility: making sure that everything that goes into your pipeline is versioned and you can pin it to a particular version. It's all very well saying, well, we've got this Git repository which holds some Terraform config and we're going to be really controlled about which versions of it make their way through different environments. But if, say, you're using Jenkins and a Terraform plugin that auto-updates, that could totally invalidate all of your testing, all of your promotion. So yes, definitely be restrictive: a new version of Terraform has got to be tested before we allow it any further through our procession of environments, because I've definitely felt that pain.
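The pinning idea above can be sketched in HCL itself. This is a minimal illustration, with version numbers and the module URL invented for the example:

```hcl
terraform {
  # Pin the Terraform CLI itself, so an auto-updating CI plugin
  # can't silently move the pipeline onto an untested release.
  required_version = "= 1.5.7"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      # Exact pin; loosen to a range like "~> 5.31" only deliberately,
      # once you trust your tests to catch provider regressions.
      version = "= 5.31.0"
    }
  }
}

# Module sources should be pinned too, e.g. to a tag or commit,
# so "which version ran in staging" has one unambiguous answer.
module "network" {
  source = "git::https://example.com/acme/network.git?ref=v1.4.2"
}
```

Together with a committed `.terraform.lock.hcl` file, this makes what flows through each environment reproducible rather than whatever happened to be latest that day.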
Yeah, for sure. All right, I'm going to go to the next question just so we can continue on. I'm curious, in your organizations, who's responsible for IaC? We've seen organizations that say, we're full DevOps and the developers can write whatever they want; it's their responsibility to own it, maintain it, and keep it healthy over time. Other organizations have said, we're going with more of a centralized model; our app teams can come in and request things, and maybe make some one-off changes, but for the most part we're going to centrally control things. And we've seen people who have said, everything comes through our infrastructure team; they're the only ones who can do anything, and we're not going to let anybody else touch it. I'm curious where people are in that thinking, and also where you think it will go in the future. If you have any input on that, we'd love to see it in the chat. We see a mix of those things. Any thoughts on what sort of mixes you see, DJ?

As a consultancy, we see quite a few different models. And as a consultancy that gets called in when people are facing challenges, one of the things we see is that different organizations work in different ways, and there are many different ways of being productive, so there's not necessarily one universal right answer. But where we do see people have problems is where those responsibilities are not clearly defined. Where you've got a platform team who are doing all the back-end infrastructure stuff, but they're not really running the platform as a product, with developers as customers in mind; they're just the ops people with a new name. But then you've also got some developers who are managing bits of the infrastructure themselves. That gets to be a mess really quickly.
So I think having clearly defined responsibilities, whatever those responsibilities are, is better than having a mishmash where gaps can occur, especially when we're talking about things like safety and security. If it's not clear what's been tested and what hasn't, that's definitely dangerous. And if it's not clear who's responsible for something, then whose responsibility is the security of it, and making sure that it's auditable and properly configured? Yeah, a clear definition of roles is a thing to strive for.

Yeah, for sure. You see all the headlines about breaches, and a lot of them start from some sort of misconfiguration. There are a couple of different ways that happens: it falls to controls to some degree, but also to people simply not knowing who's actually responsible for making sure this thing still has the right controls. We're going to talk about drift a little later, but things can drift from what we said they were going to be to what the actual state is. So that clear definition of who's responsible, not just defining it but maintaining it over time, is critical.

I see Robert Newman has contributed that their whole Scrum development team is responsible. Those kinds of cross-functional teams can be so productive. Having transitioned from being mostly a Java developer to doing full-stack infrastructure all the way through, I know it can be a challenge for folks. I'm a big fan of platform-as-a-product setups, where you've got a team who are not just managing infrastructure but providing self-service platforms to developers, so the developers don't necessarily have to worry about those lower levels of abstraction. And then Christopher, on the other hand, is saying there's nothing centralized in his organization; the developers are relying on the SRE team to handle all the infrastructure.
So we're already seeing two quite different ways of doing it there. Yep, for sure. Okay, let's go to our next one. This one's a poll question, so you don't have to type your answers, although we'd love to see details in the chat. This is about the types of tests to use. We've talked about who runs this and some of the challenges you have, but I'm also curious about testing these things before you deploy, and even sometimes after you deploy your infrastructure. I'm curious what types of tests everybody here uses. You should see a poll pop up on your screen, and you should be able to pick a choice. I think you can only pick one; I know several people are probably running more than one, so pick the biggest or most common one you're using. DJ, Stefan, what kinds of things do you see, starting with early adopters and the first things people do, through to more mature organizations?

Yeah, probably the first thing people use is the very simple built-in validation, like terraform validate. It's the most basic thing you can do. Then you can very easily run a linter; there are different linters on the market, from TFLint to our own Snyk IaC analyzer. Those are the kinds of things that can bring an initial high-quality feedback loop. It's easy to start with that, and it's already great because you get quality feedback instantaneously. And then you can move on, step by step, to much more advanced unit testing and more advanced testing as well. Up to, I don't know, you can even finish with one product I love from Chef, InSpec, where you write everything in Ruby.
You can test all your compliance using pretty simple Ruby checks, and you can run that at the very end of your pipeline to ensure that everything is definitely the way you intended it to be. That's just a few examples; a ton of different things can be done. Terratest, too. Terratest, I think, Daniel, you probably have a lot to say on this one. It's wonderful, but it's quite complicated for early users: you basically write all your tests in Golang, so it requires a bit of time and skills as well. But Terratest is great.

Yeah, and again, we're not being paid by anybody offering Golang training, honest, but it's a useful skill to have for people who are treating infrastructure as code. Regardless of the tool, I'd recommend, if there is anyone on the session or watching this who is not from a software development background, and maybe from more of an infrastructure admin or ops side, to really try and embrace testing, and embrace early testing as much as you can. Getting that fast feedback, and knowing quickly when you've done something wrong, is so valuable. A customer we're working with had some unit tests for some of their infrastructure projects, and they just weren't running them as part of the CI build. Our consultants asked, why do you have these but you're not running them? They went about making that change, and one of the engineers on that team, one of the customer engineers, said, oh, this is really good, we find out much faster when things are wrong. And I said, yes, that's exactly the point. So if you can test, do, and do it as early as you can.

Yeah, the results here are pretty interesting too. There are a few answers for everything, but basically it's a split between two. One is: we're not doing much automated testing yet.
And the other is linters and those early feedback-type checks. For the others, the more full-featured security checks and policy tools, there's relatively low usage, which, I mean, I don't think we'd get a different answer if we had 5,000 people here; it's probably fairly similar in terms of percentages.

Okay, let's talk about actually implementing these changes. We talked about automated testing, and some people are doing that and some haven't really started yet. But if you are doing some automated testing, do you promote these changes through multiple environments? In other words, do you have several stages things go through to make sure everything is working as expected before reaching production? Really having that thorough pipeline for testing your IaC. This is pretty common, or becoming more and more common, in software development, and it really gets to treating your IaC as what it is: software, a product. I'm curious whether people are doing that, so when we make the question live, you can pop an answer in there too.

Again, I'd say this is so important. There are folks who test the hell out of their software, and then for infrastructure they're just applying it straight to prod without it going through any earlier environments. That kind of undermines all of the testing that goes on in the software process, and nobody wants to be the person who makes a prod change and causes a massive outage. Ask the people at Facebook who made that networking config change. So it's super important, and there's another benefit you get from having promotion through different environments.
One of the things that we talk about in the e-book and in the code examples is that we at EngineerBetter are quite big fans of having one instance of a pipeline representing one environment. Once you do that, and if you start from the assumption that everything you need for an environment should be creatable through a pipeline automatically, you can just spin up a new pipeline and it should run and give you everything you need for an environment by itself. That sets you up to chain those together, so you can test environments one after the other, but it also starts giving you so much more flexibility in providing development environments for the software developers that are using your infrastructure.

On the subject of the type of testing: we've talked about more of the unit testing, what you do close to the commit. For the SREs here, I think there are some really interesting intersections between things like acceptance testing and service level indicators. You want to be testing your infrastructure, and the things running on it, as if you were an end user. Can you log into the system, not just is the login page returning healthy when I hit the health-check endpoint? You can write tests that step through things like a user would. That gives you your SLIs, your service level indicators, and it also gives you real confidence that stuff's working. Not just metrics, not just response rates and those sorts of things, but proper confidence that something stepped through each of these stages, and if our test can do it, a user can do it. And if a user is telling me that they can't do it, and our test just ran and said that they can, maybe it's a problem-exists-between-keyboard-and-chair type of problem, which as an operator is a useful way of triaging things.

Yeah. First, let me just say this.
Looking at the poll results, it's interesting: zero people said they're fully automated with multiple stages all the way through production. A few said they're close, but not quite all the way through. Again, that's not something I think is uncommon. Before we did the book, we had commissioned a survey to go out and talk to IaC users and see where they were in their journeys, how well they felt they were doing, and what sorts of things they were doing. And a very small number, I think 6% or 7% of the roughly 500 respondents, and these weren't just Snyk customers, this was a wide audience, said they thought they were doing best-in-class IaC automation, testing, security, and management all the way through. So if you're in that group who said zero, which is everybody joining us today, don't feel like you're way behind; it's a fairly common thing, and people are still working toward it.

The other thing that was interesting as we worked on this book, DJ, was the notion of the pipelines. If you go through the code examples, they're set up for Jenkins, I think, right? We talked about what tool we should use. Should we use Jenkins? Should we use something else? There were challenges we had with Jenkins, because it's good for some things, but for IaC there are specific outcomes you want to achieve that Jenkins doesn't necessarily inherently understand. So I'm curious: I've seen some other tools that are maybe not CI tools or pipeline tools per se, but are interesting IaC-specific testing tools, where they do try to replicate this idea of having multiple environments, either with simulated deployments or not.
Have you seen much of that in your practice? I don't know, DJ or Stefan, or anybody here, if you're using something like that, we'd definitely be interested in hearing what you have to say.

We actually did a series of blog posts on this, inspired by the work on the e-book. I don't think I can get the link handily whilst also focusing on the webinar, but we compared Jenkins, Argo Workflows, Tekton, and a somewhat niche tool called Concourse that we're quite fond of. Tekton and Argo definitely support pipelines in a better fashion. They tend to be quite GitOps-driven, in that there's got to be one thing that triggers the whole pipeline. Maybe it's a Git repo with a file containing all of the versions, and then you've got to write something to update those versions when a new thing becomes available, which Concourse does more elegantly. But anything that is CI-as-code, or declarative CI config, and allows a whole pipeline workflow rather than discrete individual jobs. Stefan, what about you? What have you seen work here?

Well, I've definitely seen things not work, very badly. And it comes back to the question, I think it was Christopher's, maybe ten minutes ago, about modules and versions: the dependencies, the versions of the providers, and probably even the Terraform versions as well. I've seen development or staging environments working very well, because people were pushing and testing every day or every hour or so, but production much less often. And basically, for the small area of focus you have, your module, where you pin the versions and upgrade it, everything's okay.
And then at the end, on the big day, you deploy something, and because it's not tested and automated all the way up to production, that's when things break. There's too much of a gap between the version that was actually deployed in production and the new version, and the dependencies from the Terraform providers or versions of Terraform itself. Sometimes it's even just the pipeline not being compatible with what was expected for that upgrade to production. So those are definitely the kinds of issues you can have here, and that's why testing from staging into production is so important. And you have tools as well like Kitchen-Terraform; it comes from Test Kitchen, from the Chef days, and today it still works very well to iterate through a set of configuration variables. You can inject data from your test cases to test and simulate all the way up to production if you want to, and you can even, as was mentioned, test for real whether the result is actually the one that's expected.

Yeah. There's a comment, or question I guess, in here from Don about just getting started with IaC, and wanting some advice on a simple way to get going. I'll shamelessly plug the e-book again. A lot of the stuff early in the book is very simple, simple-sounding at least, and somewhat simple to do as you're getting started. Things like: don't just store your IaC on your laptop; put it into a shared code repository and start collaborating on it there, following software development practice. So there's some good advice for even the early-stage things, to set you up for better success as you start to bring this into production. But DJ, Stefan, any feedback you would have?

Definitely agree with you on the book, because we used our experience taking teams through this journey.
And there's a fictional account of Perry, I believe, who is an operator who turns into an IaC developer. So hopefully, if we've done our job well, some of that should be relatable; skip to the section that's about that journey. And if it's not relatable, let us know and we can improve it. The things that Jim mentioned: getting things under source control, making sure that you standardize formatting. I know that sounds really trivial, but if you're going to work with other people, it will make your life a lot easier. Find some way of testing things. Once you've found a way of testing on demand through your own CLI, look at integrating into a CI server, something that will run those tests every time you commit, so you start to get certainty. Then you can start thinking about continuous deployment: okay, once we've run those tests, shall we do a terraform apply automatically on every commit to an environment? I expect that, depending on the sector and the kind of organization you're in, you may find some resistance: security folks objecting to things being automatically applied, even in dev environments. Hopefully there are some arguments in the book, the ones we use when we take organizations on a transformational journey, about why it's safer and better to do things this way. So hopefully those will be of help. But whatever small incremental improvements you can make, I encourage you to make them. Every little helps; every little thing that's automatically and continually checking things is better than nothing at all.

Yeah, and I would simply add: install those little plugins in VS Code or whatever, so you can have the linting and all the quick feedback that you need. As a developer, you want that feedback as soon as possible, and it will help you get started.
You can even have small snippets with sane code defaults that can be tested automatically in VS Code or other IDEs. So definitely do this; it's highly helpful, even once you're a little more experienced in the area. Yeah, I'll just say too, on the responses to this question in particular, there's a good number, around 28%, who basically said they're running a Terraform plan and then it's straight on into production. So this whole idea of having multiple environments, promoting through them, and automating all of this: it's not something that everybody's doing. If you feel like you're way behind the curve, don't worry; everybody's learning this at the same time. And there are good communities for this too. Terraform's got a huge open source community, driftctl has a huge open source community once you start to get into tools like that, and some of the other tools we mentioned, Terratest and others, have open communities too. So you can definitely jump in there and start learning and asking questions. I do want to move forward, because we said we were going to talk about drift and we're getting to that point now. So we'll start with this: how frequently are you reapplying your IaC? In particular, as a practice, reapplying your IaC even if nothing's changed. When we talked about this in the e-book, the idea was to help converge things back to the expected state, and to make sure we're always capturing those things. But I'm curious what other people are doing. DJ, Stefan, any thoughts? I'm going to say that Stefan has probably got way deeper insight on this, given his career history.
From the e-book's perspective, we were talking about the three R's, which I believe come from Justin Smith, who's ex-Google and ex-Pivotal: rotate, repair, repave. Make your credentials easy to rotate; repair, meaning apply patches; all of these things that having IaC and a CI pipeline will help with. But also repave: if you're continually reapplying things, then any drift, whether it's detected or not, will be blatted. You'll be overriding it with your desired state, or Terraform will throw a funny turn and tell you, hey, things have changed, I don't like this, I'm not going to apply, and you get a big red failed build in your CI pipeline. So from that point of view, that's valuable. There may be something that just reapplying won't do on the repaving front, and this is a bit of a tangent, so I won't talk about it for very long. But recreating systems: if something does get into your production infrastructure, and you're tearing it down and recreating it, slowly doing a rolling update of pods or whatever, at least then you've got a short time-to-live on anything that's got into your system. The worst kind of malicious code in your system is malicious code that's been there for six months, harvesting data with no visible signs, not trying to make any network egress, and then all of a sudden it sends the whole lot out to nefarious hackers somewhere. So if you can minimize your exposure by recreating things, that'd be good. But Stefan, I'd definitely love to hear your thoughts on this. Yeah, I definitely agree.
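As an aside, the repave-by-reapplying idea DJ describes is easy to automate as a scheduled job. The sketch below is one possible shape, again as a GitHub Actions workflow; the daily cron schedule and the single state are illustrative assumptions.

```yaml
# .github/workflows/reapply.yml -- illustrative scheduled converge/repave job
name: reapply
on:
  schedule:
    - cron: "0 6 * * *"   # assumed example: reconverge once a day

jobs:
  converge:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init

      # -detailed-exitcode makes plan exit 2 when changes are pending,
      # so a drifted environment shows up as a visible (non-zero) step
      - run: terraform plan -detailed-exitcode
        continue-on-error: true

      # Reapply the desired state, converging any drift back
      - run: terraform apply -auto-approve
```

Even when nothing has changed in the code, this is the "big red failed build" signal DJ mentions: drift surfaces in the pipeline instead of lurking silently.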
And I would even add that in this process of applying your infrastructure as code as soon as possible and as often as possible, before applying it I would use a tool like driftctl, the open source drift-detection tool that we created. Because what you want to ensure is parity between the state that you currently have (in Terraform terms, the Terraform state, and you probably have many of them, because you probably have a state for your IAM users, one for your S3 buckets, one for your Kubernetes, and so on) and the reality on the APIs of AWS or whatever cloud, or between regions as well, before doing anything else. You want to start from a clean state. And beware: terraform apply does not actually enforce that parity. The state is not just a conversion of the HCL; Terraform is also interacting with the API of AWS or your cloud provider, and it may make changes to that state. It's meant that way; the state is not the source of truth. And that's why, in reapplying stuff, you can sometimes end up clobbering manually added resources, which is hugely disappointing. A lot of people can have bad surprises; I was the first one to have terrible surprises with manually added rules, or settings changing on S3, et cetera. So that's the reason why I would do two things regularly. First, check for parity between your states and the reality. Then, depending on the result of that check, and starting from a clean state, apply again, because you're sure that what you wanted did not change from the last time you applied it and it was working. The worst thing you can do is think that you are secure because, well, terraform apply succeeded. Well, no, you just applied, and your security group is now open to the world. Good job, it's in your Terraform state. So yeah, I would do the check before the apply. Yeah, responses here.
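As a concrete aside, the check-then-apply sequence Stefan describes might look like the sketch below. The bucket name and state paths are made-up examples; the point is that driftctl can aggregate several state sources and exits non-zero when drift is found, so the apply only runs against a clean baseline.

```shell
# Illustrative check-then-apply sequence; state locations are assumed examples.
# driftctl exits with a non-zero code when drift is detected,
# so the && short-circuits and the apply is skipped on drift.
driftctl scan \
  --from tfstate+s3://my-bucket/network/terraform.tfstate \
  --from tfstate://iam/terraform.tfstate \
&& terraform apply -auto-approve
```

In a pipeline, the failed scan becomes the signal to go investigate the manual change before anyone blindly reapplies over it.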
So we had around 14% who said they do this often and automatically, which is great. But we also had 32% who said they never do this, as in, why bother if nothing changed? That was a bit of a flippant addition to the response options on my part, but still, 32% are never doing this, and I do think it's a great practice. The bulk of people, about 45%, said they do this occasionally, but it's ad hoc, not automated. So there's definitely room for improvement. For time purposes, I'm going to move ahead to the next question, because it digs a little deeper: how often are changes made to the infrastructure you're managing with IaC outside of your software practice, outside of that IaC code and those pipelines? I think this is probably a fairly common problem. Stefan, this is one you've been spending a lot of time on in the last couple of years; what do you see out there? Well, the reality is always fun, because very often I meet a lot of people who say, yeah, we have all the good practices in place, nobody can do this or that, et cetera. But then you dig a little deeper and say, okay, you don't have customers, you're not a consultancy, you don't hand accounts over to anyone. Are you sure it's completely locked down? Are you sure you don't have even a single service that's authenticated, so that basically nothing runs on your infrastructure, nothing is live? Did you unplug the whole infrastructure and fire the whole team, or what? So the reality sometimes is a bit different. You always have that team on call during the weekend; they have to do something quickly, and maybe they don't even have the time to change it in Terraform. And the idea is, they'll report it and do it properly on Monday morning.
And guess what? They forgot. And that's when all the manual changes happen. And you have all those authenticated scripts as well, like backup scripts for hard disks or things like that. Sometimes they're authenticated and actually make changes through the APIs, and sometimes you have bugs in those scripts too, and for some reason you end up with 10,000 EBS volumes duplicated overnight. I've seen this. So it's not even drift, it's just a pure bug, but in the end it amounts to manual change, because it's not under control by any code or any real intention, and it's definitely not tested. I think, yeah, this goes to show the importance of being able to detect drift and visualize the state of systems. Because exactly as Stefan says, there will be times when it's entirely legitimate for people to make manual changes: it's an emergency, we don't have time to push it through the pipelines, and then they forget to backport it. I think we talk about that in the book as an example, which is based on a real-life experience. So, yeah, being able to see what the state of things is, and to detect automatically when things have diverged from what you want, is really critical. Yeah. The responses here are basically spread across the four choices. The first was basically zero, no changes can happen except in pipelines, which, as Stefan said, is probably not quite right. The next choices were 25%, 50%, and 75%. It's maybe kind of scary to see that, for a fairly significant number of people, 27%, 75% or more of changes are happening outside of these pipelines and outside of the software practice. So that's still a bunch.
I think, too, given what we were chatting about around the first question: it's one thing if you're starting from scratch, you're a brand new company, everything's done in IaC from day one, and you have some chance of everything being in the pipeline and captured in code. Most companies aren't that way, and so a lot of companies are trying to bring existing things into IaC. When you have that mixed model, I would imagine some people don't know which resources are manually managed and which are managed via IaC, and they just make changes without knowing that they've affected somebody else's work. I see that being an issue too. And drift detection is a great way of identifying that and finding out, hey, Rachel on the other team didn't realize that this is being applied automatically, and we found that it's diverged. Yeah, for sure. All right, let me pop up this next question, because we're down to our last couple of minutes. For the folks who are here: do you actually have something that you're using today to do any kind of drift detection? We've seen people who do just manual audits, which can be harder: manually comparing the deployed state to what's in IaC. And there are loads of tools; we've talked about driftctl as an open source tool, and there are others. I was curious how many people have something they're doing there. DJ, Stefan, what's the spread of people you're seeing? Well, I've seen, and I've used and created, a lot of internal tools based around things like the AWS CLI commands, back during the Chef years. You could do a lot of things to list and compare based on inventories; the Chef server was a great resource a few years ago for this.
So you could do things like this, but it involved a lot of manual steps. Yeah, that's one of the reasons we created driftctl: it's easy to use, you can aggregate all the states that you want, it works for the three main cloud providers today, and you can do the things that we actually do in real life, like having multiple Terraform states, some stored on S3, some stored on Terraform Cloud, some stored somewhere else, and you can aggregate them all and compare. So that's one of the reasons driftctl exists. I think it's a great point too, because, as a simple marketing-minded person, the extent of the Terraform I've used is: there's a state file, and that's it. It's very simple, right? Because I deploy three things and then I destroy them. But in real life you're going to have multiple state files, maybe in different places, and you have to have a converged view of the world to really know what's happening. Yeah, and you definitely do not want to have everything in a single state; the day it gets corrupted, you're not happy about it at all. For sure. We'll skip over this last question, because I know we're going to be out of time here in just a second. I do want to thank, first of all, DJ and Stefan for joining us, and thank the Linux Foundation for hosting us today, and especially thank everybody for your participation and for taking time out of your day to be here with us. If you're interested in learning more, we have loads of resources for you: the ebook and the code samples we posted early in the chat. Scroll way back up to the top and you'll find those links, and we'll make this presentation available, and of course the recording will be available as well.
You can go to Snyk if you want to learn more about Snyk IaC; we do have a security tool for your IaC, and we'd love to have you and to talk more about that, obviously. driftctl, as we mentioned: we acquired a company called CloudSkiff, who had an open source tool called driftctl. It's still an open source tool, it's still called driftctl, and we would love to have you participate in that community as well and use that tool. And then EngineerBetter, of course, a fantastic consultancy: if you're looking for help and you need some guidance with these things, these practices, and implementing them, I encourage you to reach out to EngineerBetter, bring them in, and see what they can do for you. But really, we appreciate everybody joining us, and I guess we will end there. Wonderful. Well, thank you so much, Jim. And thank you, Daniel and Stefan. It was a very engaging presentation today. As Jim mentioned, this will be available on the Linux Foundation YouTube page, as well as the webinars page where you registered. And we hope to see you at the next one. Thanks, everyone. Bye-bye.