Thank you very much, Chris. I'm afraid we won't have time for jokes, because I have 30 minutes and like 90 slides. So hi, welcome to my talk. Can you hear me? Yeah, you can. My name is Sebastian Witowski, and today I want to talk with you about continuous integration, or, to be more specific, about how you can optimize your CI pipelines. Setting up a CI pipeline is not the easiest thing to do, because unlike with local development, you now have to debug things running on a server that you don't necessarily control. And on top of that, there is an additional layer of complexity: you have to configure various services together using some kind of configuration format that your CI provider requires you to use. But usually, with some templates that we can find on the internet, we can glue together a reasonable setup, and that might work well when we start our project. But as the complexity of the project grows, the complexity of the CI grows with it. We start adding more tasks and more tools. We start building different versions of our release packages.
We have more and more tests, and it starts to get frustrating to wait for the CI run to finish before we can merge our code, or, for example, to wait for half an hour only to see that your pipeline has failed because you have an unused variable that made your linter unhappy. So in this talk, I want to take a look at a few different ideas that you can consider when it's time to improve your CI setup. First we'll take a look at some improvements for Docker images. Then we'll talk about running things faster, for example by configuring jobs to not wait for other, unrelated stuff. Then we'll have a look at not running unnecessary things and stopping them earlier. And then, finally, I will share some miscellaneous tips and tricks, depending on how much time we have left. If we want to discuss continuous integration, we have to choose one of the existing implementations, because different CI providers, like GitHub Actions, GitLab CI, CircleCI and whatnot, all require you to use a different configuration setup. I mean, the general idea is the same: you write some kind of config file. But the way you write this config file differs between CI providers, so you cannot take, let's say, a GitHub Actions configuration, move it to GitLab CI, and expect it to work out of the box. So for the purpose of this talk, I chose GitLab, because this is the platform that I have been using most in my recent projects, and also, according to this Reddit survey, it's still the most popular option. But I am in no way affiliated with GitLab; I was not paid by them to come here. I know that they have paid plans, but everything I will be showing here can be used with the free plan. We also need some code that our CI will run on, so I created a simple project that you can find under the link at the top. I will have all the links to the slides and to this project at the end.
So don't worry. This is a Django project with a simple to-do app. If you don't know Django, don't worry, you don't have to know it to follow this talk. I used Django for the simple reason that a Django app is by default a bit more complex than a bare-bones Flask or FastAPI one. So we will have some migrations to apply, and we will also have more files lying around, so it feels a bit more real-world. But we just need this project to have something that we can run our CI on, so I'm not even going to explain the code; that's not important. What is important is that I have, for example, a bunch of random dependencies that I'm installing to slow down the build process. I have some tests that are sleeping or performing some large mathematical operations, so they are slow too. We also have a build process that uses Docker and Docker Compose. Docker Compose will set up two services: a Postgres database and a web container. The web container is built from a pretty standard Dockerfile: we start from a base image, we set some environment variables, we copy the requirements, run pip install, and then we start the server. Pretty standard stuff. This initial setup has a pipeline that takes around six minutes to finish. Here we can see the six minutes, and we have three jobs. First we build a Docker image, then we have a test job that runs the migrations and runs the tests. This stage is actually badly designed, because the first docker run command here will actually build the Docker image from scratch, so I will fix it as we go through the talk. And then, finally, we have a deploy stage that takes around 54 seconds, and all it does is print this command. So 54 seconds is the time it takes to just start the job container; keep that number in mind. And of course, this example project is simple for illustration purposes. It's not production-grade.
It's not production great So maybe don't use it in production As I mentioned that my example project is using docker Docker or containers in general are now very well supported by most of the CI providers So if you can use docker use docker because it will make your Development CI and production setups much closer to each other And Even if you don't do develop development with docker, it's quite easy to wrap a simple application in a simple docker setup So if you do use docker The first step to improve your CI is to actually take a look at your docker file and make sure that you're not doing some Obvious mistakes like make sure you're using layer caching properly during the build time make sure you use tags So you know which images you're actually using in your setup And speaking of images which one of those two images is better So we have slim buster that is a smaller Debian based image And we have alpine which is a pretty bare bone Linux image. So who here thinks that alpine is a better image as a base image Raise your hands. 
We have around 10 hands up. And who here thinks that slim-buster is the better image? Okay, more people. The answer is: it depends. Using the Alpine image means that you have to install a lot of additional Linux libraries yourself. This will make your Dockerfile a bit more complex, and the build process will be longer, but the final image will be smaller, because the Alpine image is very small. So it will be faster to push this image back to the registry and pull it in all the other jobs. On the other hand, slim-buster is twice the size of Alpine, which is still not that bad, because if I was using buster, that would be like 15 times the size of Alpine. But if you use slim-buster, chances are it has all the Linux dependencies already installed, so all you have to do is run pip install and you're ready to go. So yeah, it will be a larger image, and it will take longer to download it between all the jobs, but your build process will be much simpler. In the end, the choice should depend on whether your pipelines spend more time building the Docker image or actually pushing and pulling it between the registries. But slim-buster in general is a good choice.
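As a sketch of those Dockerfile basics, a pinned slim-buster base with the requirements copied before the application code, so the pip install layer stays cached as long as requirements.txt doesn't change. The tag, paths, and run command here are illustrative, not the example project's exact file:

```dockerfile
# Pin a specific tag, so you know exactly which image you're building on
FROM python:3.10-slim-buster

ENV PYTHONUNBUFFERED=1
WORKDIR /app

# Copy only the requirements first: this layer (and the pip install below)
# is reused from the cache as long as requirements.txt is unchanged
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# The application code changes often, so it goes last
COPY . .

CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]
```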
I would say. So, one way we can speed up our build time is to not build the Docker image in each job, but to actually build it once in the build step and pull it in all the consecutive jobs. Here we can see that even though we are building the Docker image in the build stage, in the test stage, when we run this docker compose run command, we will be rebuilding the Docker image from scratch, because jobs are independent. The test job doesn't know that the build job already built our Docker image. We can fix that by pushing our Docker image to the registry at the end of the build job and simply pulling it at the beginning of the other jobs. Here we can see that the build job is now slower, but the test job is faster, and we are talking about a few seconds of difference, so this might come from the fact that it just took longer to start the job container. But your mileage may vary. If your build process takes long and you have many jobs, then I would say it makes sense to pull the image from the registry. But if you have a build process that is simple and fast, then maybe it actually makes more sense to build the Docker image at the beginning of each job, because the savings you would get from pulling the image from the registry won't be that great.
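The build-once, pull-everywhere pattern might look roughly like this in GitLab CI. The job names are illustrative; the CI_REGISTRY_* variables are GitLab's predefined ones for the built-in container registry:

```yaml
build:
  stage: build
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    # Tag the image with the commit SHA, so every pipeline gets its own image
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

test:
  stage: test
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    # Pull the image built in the previous stage instead of rebuilding it
    - docker pull $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
    - docker run $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA pytest
```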
So you have to test different setups. Another thing that you can do, if you end up building massive Docker images, is to use a multi-stage build. A multi-stage build means that you start the build in a separate image: you copy a bunch of files, you install a bunch of Linux dependencies, you run the build process that creates some cache files and temporary files, and the size of that image grows big. But you don't really care, because in the end you just take the results of your build process and move them to a separate, much smaller image. The multi-stage build works much better in languages where the build step requires you to install a lot of additional Linux dependencies but the result of your build is a single binary, like for example in Rust. In the Python world, it doesn't matter that much. I mean, if I were starting with Alpine Linux and installing a lot of Linux dependencies, then I could get some benefits, but I used slim-buster, which had all the dependencies already installed. So in the case of my particular example app, there will be no difference. Let's leave Docker for now and talk about CI pipelines. How can we make them faster? As your pipeline starts to grow and your stages start to include more and more jobs, you'll realize that maybe some jobs are unnecessarily waiting for other jobs to finish before they can start. By the way, if you're curious, this is the CI setup of GitLab itself.
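A multi-stage build for a Python app might be sketched like this. The wheel-building approach and the paths are my assumption for illustration, not the talk's code; the point is that the heavy first stage is thrown away and only its output is copied into the small final image:

```dockerfile
# Build stage: compilers, headers, and pip's cache all live here;
# this image can grow big, it is discarded at the end
FROM python:3.10 AS builder
COPY requirements.txt .
RUN pip wheel --wheel-dir /wheels -r requirements.txt

# Final stage: a small image that only receives the built wheels
FROM python:3.10-slim-buster
COPY --from=builder /wheels /wheels
RUN pip install --no-index --find-links=/wheels /wheels/*
COPY . /app
WORKDIR /app
```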
GitLab's own pipeline is quite a big setup. So you might want to take a look at the structure of your CI configuration and move things around. For example, instead of running your jobs in separate stages, one by one, you can run some jobs in one stage, in parallel. This pipeline is ultimately going to finish faster. Except: what if one of the jobs in the stage fails? Now we are wasting computing resources running the other jobs in this stage, even though we no longer care about the results of static analysis or preparing the release, because we have to go and fix the tests in the first place. There's actually an open issue about cancelling pending jobs if one of them fails in a stage, but that issue has been open since 2018, so don't get your hopes very high that it's going to be solved anytime soon. So there are always some design considerations that you have to think about when you're structuring your CI setup. But in general, I would say that the time you waste waiting for a pipeline to finish is much more precious than the cost of the build minutes, which is a couple of bucks. So if you can parallelize things, you should parallelize things. And there are actually two ways you can move things between stages. One is called a directed acyclic graph, or DAG, and you might know this concept from other tools. With a DAG, we can configure one job to start after another job finishes, regardless of which stage they belong to. For example, let's say you have a Python package and you're testing it under different Python versions, and for some reason you don't want to use a tool like tox or nox that would let you set up different Python environments. You just build a Docker image with Python 3.8, run the tests there, and run the release; you build a Docker image with Python 3.9, run the tests there, and so on. So here we have a build stage that has to finish, then we have a test stage that can start after the build is done, and then, finally, we
have a release stage. In an ideal world, all the build jobs would run in parallel and finish in more or less the same amount of time. In the real world, they won't: some of the jobs will take longer to finish, and if you have a custom GitLab runner setup, you might actually have a limit on how many jobs can run in parallel, so some of those jobs might actually wait for their turn to start. And if the image for Python 3.8 is already prepared, what's the point of the Python 3.8 tests waiting for the Python 3.10 image to be built? We can start right away. This is where we can use the directed acyclic graph and connect some jobs with their dependencies, regardless of what stage they belong to. Here we can see that the corresponding test and release jobs start right after the previous job has finished. And in terms of code, all we have to do is add the needs keyword, which specifies which jobs have to finish successfully before this job can start. Directed acyclic graphs are cool, yeah, but if you have a lot of jobs and you start connecting them throughout your whole configuration file, it might be hard to follow what's going on. So instead of doing that, you can group some of your jobs together and create mini-pipelines that can run as a whole. For example, let's say you have a project that uses a different tech stack on the back end and on the front end. That could be Django REST Framework with React on the front end. Your back end code lives in the backend folder, your front end lives in the frontend folder, and your back end tests probably don't depend on having your front end up and running. Also, you probably don't need to run all the back end tests if you only changed something in the front end. So you might want to separate those two things. We can create two child pipelines: one is called frontend and the other is backend. They are pretty similar, so we'll focus on the front end.
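The two child-pipeline triggers described next might be wired up roughly like this; the file paths and job names are assumptions standing in for the slide:

```yaml
frontend:
  trigger:
    # The child pipeline's own configuration file
    include: frontend/.gitlab-ci.yml
    # If the child pipeline fails, the parent pipeline fails too
    strategy: depend
  rules:
    # Only trigger this child pipeline when front end files change
    - changes:
        - frontend/**/*

backend:
  trigger:
    include: backend/.gitlab-ci.yml
    strategy: depend
  rules:
    - changes:
        - backend/**/*
```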
So under the trigger keyword, we say that the configuration for this child pipeline lives in the front end's own .gitlab-ci.yml file, and we also specify the strategy: depend. If we don't specify the strategy and this child pipeline fails, the parent pipeline will continue running. But if we say that the parent pipeline depends on this child pipeline, then if the child fails, the parent pipeline will also fail. We also have the rules keyword here, which says that we only want to run this job when something changes in the frontend folder. And same here: we only run the other pipeline when something changes in the backend folder. What else can we parallelize? We can parallelize tests. Most of you who have used pytest are probably familiar with the pytest-xdist plugin. It will distribute your tests across multiple CPUs, which can give you a good speed improvement if you're running your tests on a server that has a lot of CPUs. But if you don't have one, you can run your tests across multiple runners instead. This is especially useful if you want to dynamically spawn new runners instead of keeping one large, expensive, multi-CPU server up and running all the time. Here, each runner can run in a separate VM. To do that, we need to install the pytest-test-groups plugin and then specify the parallel option in the GitLab config. You also need to provide pytest with two configuration variables: the test group count, which specifies how many groups you're going to have in total, and the test group, which specifies the index of the current group. So: are we in group one, two, three, four, or five, out of five total groups?
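Put together, that setup might look like this. GitLab sets the CI_NODE_INDEX and CI_NODE_TOTAL variables automatically for every copy of a job that uses the parallel keyword, and they map directly onto the two options of pytest-test-groups:

```yaml
test:
  stage: test
  # GitLab starts 5 copies of this job, each with its own CI_NODE_INDEX (1-5)
  parallel: 5
  script:
    # Each copy runs only its own slice of the test suite
    - pytest --test-group-count $CI_NODE_TOTAL --test-group $CI_NODE_INDEX
```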
And this setup is nicely supported by GitLab CI, because we have a predefined environment variable for both of those things. So what you see here is all the code you need to run your tests across five different runners in parallel. And this is how it looks when we enable this feature. By the way, my repo has different branches that correspond to the different things I'm talking about. If we go to the "parallel pytest in groups" branch, we can see that we are now down to five minutes instead of six in terms of time, but we basically used twice the amount of computing credits: I think previously it was like six or seven computing credits, and now it's 12. And here, in the test stage, we can see that we have five jobs running in parallel. So yeah, it's faster, but it's more expensive, because, as we saw at the beginning, just starting a job container takes around one minute, so starting five job containers costs us five computing credits. Another thing that you should pay attention to is making sure that your jobs are interruptible. That is, if you have a pipeline running but you push some new code, you want your pipeline to restart and actually run on the new code. There are actually two steps here, which is maybe not something everyone is aware of. First, make sure that in the settings of your project you select the auto-cancel redundant pipelines option. This will restart the pipeline when there is new code pushed to a given branch. This option is enabled by default, so you might not know it's there and take it for granted, but if you have a good reason, you might want to disable it. With this setting alone, though, your pipelines cannot be interrupted in the middle of a job. They have to finish the currently running job before they can be restarted, and that can be a pain if your jobs take a lot of time, because, let's say you have tests that run for half an hour: you have to wait for this half an hour to finish
running the tests, even though you actually want to stop and start running tests again on the new code. So you can mark your jobs with interruptible: true, and this will make them interruptible, so they can stop immediately when there is new code. You probably want to have this option enabled for the build and test jobs, for example, but you don't want it enabled for the deployment job, because you don't want to end up with partially deployed code on your server. In the case of deployment, you probably want to finish the current deployment and then start a new one. Another thing is to stop your job when it doesn't make sense to run it anymore. For example, if your full test suite takes half an hour to run and already the first test failed, what's the point of running all the tests if you know that you have to run them locally and fix them anyway? You can run pytest with -x; this will stop pytest after the first failed test, and then the next job can start. You know what else can make your CI faster? Not running things in the CI. One of the biggest revelations for me was that you don't have to run every possible check in every possible pipeline. If you have a bunch of slow integration tests, you can run them only on the main branches. For example, in my current project, we have some integration tests that take quite a bit of time, but they check that all our apps are working nicely together, so they are testing different integrations. Because of them, our test suite takes around 45 minutes. So we marked all of those jobs as slow and we moved them to a separate pipeline that is interruptible and that runs only on the staging and master branches. Now the pipeline for a merge request takes five minutes to finish, and that works fine. I mean, sure, we don't detect all the bugs right away, but the merge request pipeline finishes in five minutes, and we eventually get the feedback from the full test run. So that's fine.
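In config, that combination of ideas might look like the sketch below. The job names, the "slow" pytest marker, and the branch names are illustrative, not my project's actual setup:

```yaml
test:
  interruptible: true        # can be cancelled when new code is pushed
  script:
    - pytest -x              # stop after the first failing test

slow-integration-tests:
  interruptible: true
  rules:
    # Run the slow suite only on the main branches, not on merge requests
    - if: '$CI_COMMIT_BRANCH == "master" || $CI_COMMIT_BRANCH == "staging"'
  script:
    - pytest -m slow

deploy:
  interruptible: false       # never stop a deployment halfway through
  script:
    - ./deploy.sh
```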
And you can also run some particularly slow jobs manually or during the night. Now very quickly, because I'm running out of time, and we might actually have time for questions. I think I'm speaking faster than when I was practicing; I hope you can understand what I'm saying. My talk will be on YouTube, so you can play it at 0.75x speed. So, you probably know that you can cache things; for example, you can cache the pip cache between jobs, and that can give you a bit of a speed improvement. But you can also specify the cache policy. By default, each job that uses a cache will pull things from the cache, run the steps that you define in your job, and then push things back to the cache again. But maybe for some reason you don't want to do that, because, let's say, your job does something destructive to the cache. So there is a policy keyword that you can use to disable either pulling the cache at the beginning of the job or pushing the cache at the end of the job. And speaking of caching, you can also select the fastzip compression method, and that will allow you to specify the compression level for your artifacts and for your cache. Here we can select, for example, the fastest method. That will run very fast, but the resulting cache object will also be larger. So the caching itself will take less time, but downloading this cache object in consecutive jobs will take a bit longer. This works really nicely with the cache policy, because, let's say, you have a pipeline setup where you build your cache only once and push it, but then you pull it in all the other jobs. Then it actually makes sense to use the slowest compression method, which will take a bit longer to build the cache, but the resulting object will be smaller, so it will be fast to pull in all those next jobs. If Docker is too slow to build your images, consider using a different build system. There is Buildah, there is Kaniko. They have very similar commands, but a somewhat different set of features.
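The cache policy and fastzip compression settings described above might be combined like this; the cache key and paths are illustrative:

```yaml
variables:
  # Enable the fastzip archiver, which makes the compression level configurable
  FF_USE_FASTZIP: "true"
  # Small archive, cheap to download in all the later jobs
  CACHE_COMPRESSION_LEVEL: "slowest"

warm-cache:
  cache:
    key: pip-cache
    paths:
      - .cache/pip
    policy: push      # this job only creates the cache

tests:
  cache:
    key: pip-cache
    paths:
      - .cache/pip
    policy: pull      # later jobs only download it, never re-upload
```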
They actually have a bit of different features So maybe one of them is better for you You can also use your own runners And this is a very vast topic that could take a separate talk to talk about in in details But using your own runners first of all will save you cost because you're not paying for the computing credits of you For using gtlapci and also gives you much more flexibility For example in my current project We have to use runners because we have some proprietary code that we cannot really push to gtlap So instead of running tests on gtlap we set up runner on our server So the codes are running on the tests are running on our server And we only push the results back to gtlap and then gtlap handles all the displaying whether the job failed with all the logs And so on and just for fun. I checked How it is running a runner on my computer and I got actually some interesting results So the build job was faster and the deploy job was also faster. That's the build That's the deploy so deploy is twice as fast as it was on the gtlap VM But actually running test now took me like 10 minutes Which is super weird, especially that it was for those tests that were making a lot of mathematical operations So I would expect that my MacBook Pro with 16 gigs of RAM and 16 CPU cores Would be more powerful than a small VM in a cloud that gtlap CI is using but apparently Apple scum me But jokes aside this drives a point home that CI is a complex beast Especially if you don't do DevOps on a daily basis as I guess most of us don't do it because this is a Python conference So adding a custom runner gives you flexibility, but it also gives you another layer layer of complexity for HCI Okay, let's wrap it up So here are some key takeaways. I want you to remember from this talk learn concepts not tools Even though I was showing you how to use gtlap CI here. 
I think whatever I showed you here is universal: if you know that you can do something, it's just a matter of figuring out how to do it in your CI setup. There are no silver bullets in terms of a perfect CI setup. Should you use the Alpine image and install the Linux dependencies yourself, or should you use a Debian image and deal with the fact that the image is larger? Should you pull your Docker image in all your jobs, or should you build your Docker image from scratch in all your jobs? I don't know, it depends on the setup of your project. Not every check has to run in every pipeline: slow jobs can run manually or on the main branches. If there is new code available, interrupt and restart the job. And try to make your merge request pipelines fast and your main branch pipelines thorough. Also, if you think that you set up your CI once at the beginning of your project and that's the last time you touch it, think again. Not updating your CI is the same technical debt as any other technical debt in your code. It will make your code reviews slower, it will delay important feedback, and it will make your developers more and more annoyed. So do yourself a favor and check from time to time what can be improved there. But overall, a well-designed CI can be a great tool in your daily work. Thank you very much for listening. I think we have time for a question or two. Anybody wants to ask a question? There are microphones in the room, and anybody can come up and ask. If anybody has a question online... And if you don't have questions now, you can always find me online; here is the link to the repo, here is the link to the slides. Any questions on Discord? Okay, I knew I was talking too fast. Oh, there's a question. Can you maybe speak into the microphone, because I think it's recorded. In terms of the multi-stage Docker builds that you mentioned before, do you find them useful?
And in what scenarios? What problems are they solving for you, specifically? That's a very good question. I never use them. I mentioned them because I know that they exist, and in some particular cases, where you have a huge build step, they would make sense. As I said, I don't think they make that much sense in Python; they're mostly for languages like Java, Rust, and things like that. But yeah, it's a viable option. Thank you for the talk. You mentioned the custom runner, and it was running locally on your MacBook. So is it possible that you built a Docker image for Intel and then ran it on your MacBook, and it wasn't a multi-platform build? Your MacBook is ARM, Apple silicon, so is it possible it was so slow because it was emulated and not native? That's a good question. I knew I should have debugged what the issue was. I think I was building the Docker image fully from scratch there, so I think it was using whatever architecture I have. But I know that the runners have a lot of options: whether you want to run things in shell, whether you want to run docker-in-docker, and things like that. So yeah, I really don't know what the reason behind it was; I just checked it, I was surprised, and then I moved on with my life. Yeah, thanks. I think it might be the case: it's super slow when emulated, like when you build an image on a Linux server and then you pull it and run it locally on a Mac, it's super slow because it's emulated. Okay, I have to check it. Thanks. I think that's it. Thank you very much for coming. Enjoy the rest of the conference.