Okay, hi everybody. My name is Nader, and I'm going to talk to you about testing strategies for pipelines. First, a little bit about myself. My name is Nader Zieda. I work as a staff software engineer at VMware. I've worked on a bunch of CI/CD projects previously: Tekton, when it was first starting, and Concourse CI, which was probably not as famous; most of you probably don't know it by now. More recently I've been working on a lot of Kubernetes stuff, Cluster API, Knative, and currently I'm working on supply chains at VMware. Today we're going to talk about the importance of testing your pipeline: how to approach it, how to test it effectively, and the key areas to test to make sure it's a successful pipeline. Okay, so we have this new project. We're building a new app, like a Twitter or Bluesky, some kind of app where you're posting stuff. It has a timeline, it has a database, a normal, standard app, a new project. As a good project team, we build our cool new pipeline to make sure everything is working, all automated: we get our code, we build it, we run our tests, everything is good, and then we deploy it to our production environment. The app is running, users are happy, everything is good. Then as we go along with the project we start, of course, to add new features, and we're updating our pipeline. So we add this configuration step for our deployments. We have new environments, testing and staging and a bunch of other environments that we need to update, so we make some changes to our pipeline, and now our app is updated. Something went wrong with the app, but we don't even notice.
Then later we realize something was configured wrong in the configuration step of our pipeline, and part of the deployment went to the wrong environment. Now some of the users' timelines are messed up and the app is not working, and we didn't notice until it was actually in production, because the pipeline had a mistake in it and we didn't know. That's one example of things that can go wrong in your pipeline. Another example actually happened to me in a previous project. We had this really good upgrade-testing pipeline, testing all the different combinations. We were releasing a new product, and after the release we heard back from customers that the upgrade was broken. After investigating, we realized we had missed one of the upgrade combinations; it was never even tested. Then we had to create a patch, and it was a big mess for no reason, just because we missed it in the pipeline. So you see: we don't pay much attention to testing that the pipeline does what it's supposed to do. We just assume we'll test our code, our code is good, and the pipeline will automate deploying it and shipping it to customers. That can cause a lot of problems we're not even aware of, so maybe we should write tests for it, same as we do for our code. Really, a pipeline is mostly just a bunch of steps, and each step executes some kind of code, a script or something like that. In my experience on previous projects, I've seen or used one of two approaches, and we're going to talk about both of them. The first approach is to treat each step as code you can unit test; we're going to get into each approach in detail.
If a step is a script, mostly done in bash, and it's getting too out of hand, maybe you can rewrite it in a language you can write tests for. I'm going to use Go as an example, but you can use whatever language has a testing library, and then write actual unit tests to ensure that the step is doing what it's supposed to do. That's one approach. The other approach is to have a framework that actually deploys your step, runs it, and makes sure it's executing what it's supposed to. Starting with the first approach: you use a language that has a testing framework. I've used Go before. When a bash script was getting too long, or it was dealing with a lot of different if conditions and different situations, it was worth replacing it with Go. The only overhead is that you have to compile it every time and put the binary into the container image that runs in the step. That's something to keep in mind: if it's something you're changing very frequently, it might not be a good candidate. For example (this slide might be too small to read), in the first section of this script you're checking a bunch of dependencies. You can replace that with Go code that checks all those dependencies, and the whole section becomes one line. That code is tested, you have unit tests for it, you know it's solid, and you feel comfortable that this one line in the script is not going to cause problems. This is an area that has caused problems for us before. The same goes for the other sections where you're doing something complicated: you can just call a Go program.
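The dependency-check replacement described above could look something like this minimal Go sketch. The tool names are just examples, not from the talk:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// checkDeps verifies that every required binary is on PATH.
// It collects all missing tools so the step fails fast with one
// clear message instead of a "command not found" mid-script.
func checkDeps(tools []string) error {
	var missing []string
	for _, t := range tools {
		if _, err := exec.LookPath(t); err != nil {
			missing = append(missing, t)
		}
	}
	if len(missing) > 0 {
		return fmt.Errorf("missing dependencies: %v", missing)
	}
	return nil
}

func main() {
	// Example tool list; a real step would list git, kubectl, etc.
	if err := checkDeps([]string{"sh", "ls"}); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("all dependencies present")
}
```

Because checkDeps is a plain function, a unit test can cover both the found and the missing cases without ever running the pipeline.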
In the extreme case, you can do everything in Go, or Python, or whatever language, and have just one line in your Tekton task calling it; then everything is tested, and you're more comfortable that it does what it's supposed to do. What are good candidates for moving into a Go program, or whatever language? Network stuff, for one: parts of our pipelines had to interact with an API. I don't know if any of you were in the previous presentation in this room, but they gave an example about their Dockerfile. It was pulling a script down from a partner and executing it in the Dockerfile to build the image; then the vendor replaced that file, the script wasn't there anymore, and their build was broken. Handling something like that in Go, with tests, checks, and validations, makes it much easier to catch these kinds of things. Other good candidates are processing large amounts of data, parsing log files, or looking for specific patterns. The Kubernetes project built its whole release tooling in Go and uses it to pull the release notes from the different commits and things like that. That's a lot easier, and more testable, in a language like Go than in bash. The other approach, which we ended up using in other projects, is having a framework. I haven't really seen existing frameworks that are flexible or generic enough, so in the project I'm working on now we have our own small framework, and I'll talk a little bit about what it does. This gives you the flexibility to get more coverage of your steps and tasks, and gives you room to easily add more as your pipeline evolves.
In our case, we're deploying Kubernetes resources onto a cluster, so we needed a way to validate them: that our CRDs are deployed, and to check the status of specific things in them. So we had to build our own matchers for that; we couldn't find anything existing that we could use. Our framework basically does these steps. It creates a cluster, which is easy to do using kind or something like that. It deploys our code, meaning the controllers we're trying to test, the ones we've changed. Then it applies the test input, for example a workload resource; the point is that the controllers, with the changes you've made, will reconcile that workload, and you can see how they updated the output resource. Then it validates the status of the resources that got created: does the status meet what you're expecting or not? If something goes wrong, you go check the messages; you can log whatever you need and have specific conditions. We had assertions for each of the different status values, for example making sure that this output resource got created with a label equal to some value, and things like that. Each of them is an assertion, so it's very easy and quick to catch what went wrong and fix it. And of course you clean up the environment after your test is done, so that the next test runs on a clean environment, and so on. You can run it locally on a kind cluster, or you can run it in your CI. It really helps when you're making changes to make sure your pipeline is actually working; we're treating the pipeline like code.
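A minimal sketch of the matcher idea described above, using a simplified stand-in type for a reconciled Kubernetes object (the real framework matches against objects fetched from the cluster; the type and field names here are invented for illustration):

```go
package main

import "fmt"

// Resource is a simplified stand-in for a reconciled Kubernetes object.
type Resource struct {
	Labels     map[string]string
	Conditions map[string]string // condition type -> status, e.g. "Ready" -> "True"
}

// Matcher is one assertion about a resource.
type Matcher func(Resource) error

func HasLabel(key, want string) Matcher {
	return func(r Resource) error {
		if got := r.Labels[key]; got != want {
			return fmt.Errorf("label %q: got %q, want %q", key, got, want)
		}
		return nil
	}
}

func HasCondition(typ, status string) Matcher {
	return func(r Resource) error {
		if got := r.Conditions[typ]; got != status {
			return fmt.Errorf("condition %q: got %q, want %q", typ, got, status)
		}
		return nil
	}
}

// Assert runs every matcher and collects all failures, so one run
// reports every mismatch instead of stopping at the first.
func Assert(r Resource, ms ...Matcher) []error {
	var errs []error
	for _, m := range ms {
		if err := m(r); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

func main() {
	r := Resource{
		Labels:     map[string]string{"app": "timeline"},
		Conditions: map[string]string{"Ready": "True"},
	}
	errs := Assert(r, HasLabel("app", "timeline"), HasCondition("Ready", "True"))
	fmt.Println("failures:", len(errs)) // failures: 0
}
```

Collecting all failures rather than bailing on the first is what makes it "easy and quick to catch what went wrong": one test run shows every mismatched label and condition at once.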
We always hear that we should treat our infrastructure as code, make sure it's checked in, and so on, but not a lot of people actually test it and make sure it's doing what it's supposed to do. And of course everything may work separately, but you also have to have some kind of integration test. We have end-to-end tests because we had a problem before, like the deploy example I talked about at the beginning: one step was working, the next step was working, but a piece of data was not being passed between them properly, and that caused the problem. If you don't have the actual whole pipeline being tested in your test environment, you might not catch things like that. So testing each step by itself, plus testing the whole thing end to end, is very useful. Of course, not everybody has the capacity to write all these tests; sometimes people don't even have the capacity to test their own code, let alone the pipelines that just deploy the code. So let's talk a little bit about which parts are most critical to test. These are the criteria we went by: the risk of failure, meaning which task has a higher chance of failing and causing a bigger problem; the cost of failure, meaning which task has a higher impact if it fails; and which task, if it works well, has a higher reward or benefit. That's a good set of criteria for figuring out priorities and which tasks to start testing first. If we go through this pipeline with those criteria: the get-code step is pretty basic, it's just pulling the code down. If it fails, you'll obviously know right away, and it's probably easy to catch, so you don't have to worry too much about testing it; it's low impact, low priority. Same with the build.
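Going back to the hand-off bug mentioned above, where each step worked alone but data was dropped between them: that is exactly what a whole-pipeline test catches. As a toy illustration (nothing here is Tekton-specific; the types are invented), model a pipeline as steps that each consume the previous step's output:

```go
package main

import "fmt"

// Step is one pipeline task; Run consumes the previous step's output.
type Step struct {
	Name string
	Run  func(input string) (string, error)
}

// RunPipeline wires the steps together the same way the real pipeline
// does, so an end-to-end test exercises the hand-offs between steps;
// each step can pass in isolation while a value is still dropped
// between them.
func RunPipeline(steps []Step, input string) (string, error) {
	out := input
	for _, s := range steps {
		var err error
		if out, err = s.Run(out); err != nil {
			return "", fmt.Errorf("step %s: %w", s.Name, err)
		}
	}
	return out, nil
}

func main() {
	steps := []Step{
		{Name: "build", Run: func(in string) (string, error) { return in + ":built", nil }},
		{Name: "deploy", Run: func(in string) (string, error) { return in + ":deployed", nil }},
	}
	out, err := RunPipeline(steps, "app")
	fmt.Println(out, err) // app:built:deployed <nil>
}
```

An end-to-end test asserts on the final output; per-step unit tests on each Run function alone would never notice a broken hand-off in the wiring.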
If your build is failing, you'll catch that pretty quickly, so you don't have to worry about it too much. I'd say the first two are lower priority to test. The test step is a bit more of a medium: you want to make sure you're running the right tests against the right environments, if there are multiple environments. If it's not tested, it could lead to problems where your code isn't really being tested and things are being missed, because once all this infrastructure is in place, we rely on it; if everything is green, we don't really notice when things fall through the cracks. So test is a medium priority: you need to make sure it's running against the right stuff. Deploy, I would say, is the most critical one. In our case, we were deploying to production based on a bunch of different testing steps. This was a small project, so we didn't have blue-green or canary deployments; we were deploying straight to production. So deploy has the highest impact if it goes wrong; I'd say it's pretty high. To recap this part: deploy has a higher risk of failure because it deals with the different environments and different configurations, and everything has to go right. Its cost of failure is bigger, and the value of it succeeding is high, because otherwise the end user is not able to use the app. Test is medium: the cost of failure is high, because your app could ship with something wrong in it, and the value of success is high, because your application ends up more solid, but the risk of failure is a bit lower.
Get code and build, I would say, are low, because both the risk of failure and the cost of failure are pretty low. So, to recap quickly: we need to find a way to test each task of our pipeline. For the complicated parts of bash scripts, it's useful to use a language with a testing framework. In my experience we used Go; it was convenient because you can easily compile it and put it in your container image, but any language you're already using in your project is fine. Or use a framework if one exists, or build your own, to deploy and test your tasks. In our case, because it was Kubernetes resources, it was fairly easy to build this small framework; most of the work was really in the matchers, checking the different statuses of the resources, while the deploying part is pretty straightforward. End-to-end tests are super important to make sure the interactions between the different steps and tasks work as expected. Start with the more critical tasks in the pipeline; we talked about how to figure out which ones are more important. I guess the message is: the pipeline should be a tool that makes your project more efficient and automates your work. It should not be a source of problems for you. So invest a little bit in making sure your pipeline is doing what it's supposed to do, and ideally have some kind of periodic job, nightly or whatever cadence, that runs these tests and makes sure your pipeline is continuously working the way you think it should. That will save you a lot of the debugging and issues you can run into when your pipeline has problems and you don't realize it. Anybody have any questions?
I think it went a little bit short, but the good news is it's about to be lunchtime, so at least, yeah. Yeah, please go ahead. Oh, hold on, let me give you that. Thank you. So the question is whether we run end-to-end tests in the same pipeline. You mean end-to-end tests for testing the pipeline itself? No, not in the same pipeline. The tests we're talking about here test the actual pipeline itself, and then we also have end-to-end tests that execute our code. They run separately, outside the pipeline. Okay, makes sense. Thanks. Sorry, there's a question on the other side. The question is: I'm assuming you run these tests every time you make changes to the pipeline; do you also run them in other situations, maybe on a schedule? So in our case, we have a periodic job that runs these tests, to catch things that change externally: external dependencies, stuff you're pulling down, images you're building, things like that. I think it's nightly; it just runs in the background, you don't have to worry about it, and sometimes it catches stuff. So yes, we do have a periodic run for these. So you run them in both cases? Sometimes; mostly we don't. I find the periodic one more useful, because you don't change the pipeline as frequently as your normal code, and the periodic run catches things that happen outside. That makes sense, thank you so much. Okay, if nobody has any more questions. Just one last one? Oh, sure, go ahead, I just didn't see you. The question is: does it make sense to have a test stage after each stage in the pipeline, or do you normally embed these tests within the stage itself? Do you mean in the normal pipeline, a stage and then its test, both in the same pipeline?
It might be an overhead in your pipeline itself, because if you're not changing these steps frequently, you don't need to be testing them every time if they're the same. That's the only consideration. You can try to do that, and I think it's fine, but it might add overhead to the pipeline if you're always re-running the same tests when nothing has changed. I see, so the focus is more on what is volatile, what is constantly changing? Yeah. Thanks. Okay. I think that's it. Thank you.