We'll start in three minutes for the recording and live stream, but yes, let me introduce Grzegorz, working for GitLab, who will talk about Docker, Kubernetes, and integration testing. A very interesting subject. Welcome.

Hello, everyone. I see that the room is full, so thanks for coming to listen to my talk. My name is Grzegorz Bizon and I work at GitLab as a senior developer, focusing mostly on the CI/CD area, but also on test automation. Some of you have probably never heard about GitLab. GitLab is a solution very similar to GitHub, but in the open core model, which means that we have GitLab Community Edition, free open-source software that you can install on your own server.

Today I would like to tell you about building an end-to-end integration testing framework. Okay, I will do my best. An integration testing framework called GitLab QA. End-to-end testing is a strategy used to check whether your software works as expected from start to finish. We can also use end-to-end testing to check whether the deployment or installation process is okay, especially when you distribute some kind of package, like a Debian package or a Docker image, which is what we do at GitLab.

We needed an end-to-end integration testing framework because we had some major problems in the past: users not being able to install GitLab, users not being able to update GitLab, broken major features like pushing to GitLab, and broken deployments to GitLab.com, which caused us a lot of trouble. So we knew we were going to need an end-to-end testing framework. In GitLab we use a merge request workflow, which means that we depend on code review. But code review is not a silver bullet either, because sometimes contracts between interfaces are not clearly visible in the code.
In the past we had a merge request which was basically a one-liner, reviewed by a few senior developers, and no one was able to spot the bug in the code. So we thought we would probably need to build an end-to-end integration testing framework.

We discussed our goals and decided that we needed to make it possible to run tests in a CI/CD environment to automate testing. We also decided that we needed to test the installation and deployment processes, that we wanted to make it easy to reproduce failures locally, and that we also wanted to run tests against staging, production, or any other installation, like an on-premises installation. These were our goals. But our ultimate goal was to make it possible to run end-to-end tests in merge requests, to catch bugs before even merging them into a stable branch or the master branch.

We knew we were going to use continuous integration to automate testing. Continuous integration is a software development practice where code is integrated frequently, and we were quite lucky because we had our own integrated GitLab CI/CD solution. It's an integrated product: whenever you push code to GitLab, you see continuous integration feedback in merge requests. It's free open-source software, part of GitLab Community Edition as well.

A few notes about GitLab CI/CD. The logo in the middle is a GitLab instance. When we describe the GitLab CI/CD architecture, we refer to the GitLab instance as a coordinator. The satellites around it are GitLab runners. You can add a runner to a project, you can have multiple runners, and runners poll the coordinator's API in order to pick up builds, process them, and eventually push results back to the merge request. GitLab Runner also supports workflows with Kubernetes and with Docker. So end-to-end testing is a great use case for using containers. We knew we wanted to cover scenarios like updating GitLab from the latest major version to the current version.
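The coordinator/runner relationship described above can be sketched in a few lines of Ruby. This is only an illustration of the pull-based flow, not GitLab's actual API: all class and method names here are hypothetical, and in the real system runners talk to the coordinator over HTTP.

```ruby
# Minimal sketch of the coordinator/runner relationship: runners poll the
# coordinator for queued builds and report results back. Hypothetical names.
class Coordinator
  attr_reader :results

  def initialize
    @queue = []
    @results = {}
  end

  def enqueue(build)
    @queue << build
  end

  # Called by runners when they poll for work; nil when the queue is empty.
  def request_build
    @queue.shift
  end

  # Called by runners to push a build result back.
  def report(build, status)
    @results[build] = status
  end
end

class Runner
  def initialize(coordinator)
    @coordinator = coordinator
  end

  # One polling cycle: pick a build, "process" it, report the status.
  def poll_once
    build = @coordinator.request_build
    return false unless build

    @coordinator.report(build, :success) # pretend the build passed
    true
  end
end
```

Because the runners pull work rather than having it pushed to them, you can attach as many runners as you like without the coordinator needing to know about them in advance — which is also what makes the parallelization discussed later cheap.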
Or using GitLab CI to test GitLab CI. We wanted to test a scenario where we could start a container with GitLab, register a runner, and see pipelines being processed correctly. But how do you do that? How do you design a framework like that? In GitLab, one of our core values is iteration, so we decided we were going to trust iteration again and start very simple.

We created a separate project called GitLab QA. We decided to use Ruby, Capybara, capybara-webkit, and RSpec, because these were tools that GitLab's developers were already familiar with. And we created our first end-to-end integration testing scenario. As you can see, it's very simple: we have a main page that we use to sign in with credentials, and then we expect to see the text "Signed in successfully" on the page. Page::Main is our first page object. Page objects are not a new pattern; they are widely used in end-to-end testing. Page objects are very useful for abstracting away the things you can do on a page, and later you can also build features on top of them. As you can see, our Page::Main page object is a plain old Ruby object mixing in the Capybara DSL and using Capybara to fill in forms, click buttons, and check whether there is some content on the page.

We also added a GitLab CI configuration file to run end-to-end integration tests on every Git push. It's also quite simple. You can see that we install some dependencies like WebKit, and that we have only one test job, for testing GitLab Community Edition, using a feature called services. We use services to start a container with the latest version of GitLab CE. We export some environment variables and then we just call RSpec with Capybara, which tests GitLab from the container. Later we refined our CI configuration file; it looks a little bit different now. We are not using services anymore — well, we actually use the Docker-in-Docker service, and bin/qa is a wrapper around Docker.
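The page object shape described above can be sketched without any dependencies. The real Page::Main mixes in Capybara::DSL; here a plain `session` object stands in for the browser so the pattern is visible on its own. Class names, field names, and selectors are illustrative, not GitLab QA's actual ones.

```ruby
# Dependency-free sketch of the page object pattern: the page object owns
# the knowledge of which fields and buttons exist, the spec only expresses
# intent ("sign in", "should see this text").
class Page
  def initialize(session)
    @session = session
  end
end

class MainPage < Page
  def sign_in_using_credentials(user:, password:)
    @session.fill_in 'user_login', with: user
    @session.fill_in 'user_password', with: password
    @session.click_button 'Sign in'
  end

  def has_content?(text)
    @session.page_text.include?(text)
  end
end
```

With Capybara in place of the stand-in session, the corresponding spec reads almost like the intent itself: sign in using credentials, then expect the page to have the content "Signed in successfully".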
We knew that end-to-end testing is not really fast and that we would eventually need to heavily parallelize our pipelines. So we decided to design the tool in a way that makes it easy to parallelize tests. We knew that tests have to be idempotent and isolated, but that is not easy to achieve when you are testing a stateful application. So we used a feature in GitLab called subgroups: whenever we boot up the GitLab QA test harness, we create a new subgroup, and then we also create a separate project for every test example. This is not particularly fast, but because we can parallelize everything, and because this is something we can optimize later, we decided to stick with this approach.

In the next iterations, we added some page objects and implemented new tests, and then we encountered our first major problem: build time taking almost two hours to build the GitLab package and Docker image that we later wanted to test. We thought that maybe we could run tests against nightly builds and nightly pipelines, but a feature like pipeline schedules was missing. Other features that we could use, like multi-project pipelines or multi-level container registry images, were missing too. But we were lucky again, because we could build the features that we needed to test GitLab. So we started building features — but first we had to solve the problem of merging GitLab Community Edition into GitLab Enterprise Edition, because the merge was happening once a week, which caused a fragile test problem.

There are multiple definitions of fragile tests. My definition is that you have fragile tests when your tests are not able to adapt to changes in your code base: whenever you change something in the code base, you constantly need to update the tests. So imagine a scenario where some developer changes a page layout in GitLab, and because we have good documentation, he or she goes to the GitLab QA project to update the page objects.
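The isolation scheme described above — one sandbox subgroup per harness run, one project per test example — can be sketched as a small naming module. Module and method names here are illustrative, not GitLab QA's actual internals.

```ruby
require 'securerandom'

# Sketch of test isolation on a stateful instance: every harness run gets
# its own randomly named sandbox group, and every test example gets its
# own uniquely named project underneath it, so no two examples ever share
# state. Names are illustrative.
module Namespace
  # One sandbox group per harness run (memoized for the whole run).
  def self.sandbox
    @sandbox ||= "qa-sandbox-#{SecureRandom.hex(4)}"
  end

  # A fresh, collision-free project path for each test example.
  def self.project_path
    "#{sandbox}/qa-test-#{Time.now.to_i}-#{SecureRandom.hex(4)}"
  end
end
```

Because no example shares a project with any other, the suite can be split across parallel runners without any ordering or cleanup concerns — which is exactly the trade-off mentioned above: fabricating a project per example is slow, but embarrassingly parallel.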
So we could test GitLab Community Edition, but because of the weekly CE-to-EE merge, GitLab Enterprise Edition was instantly outdated, and we had to wait an entire week for the next merge to be done. We decided to resolve the problem by moving all instance test scenarios into the CE and EE repositories. From that point on, all our page objects live in exactly the same repository as the views and selectors that the page objects depend on. We also changed the responsibility of the GitLab QA project, which, as I told you, was a separate project: it's now a test environment orchestration tool. We gemified it — it's now available on RubyGems for everyone to use, also to test their own instance. In order to make GitLab QA's life easier, we also had to dockerize our test harness.

After a few more iterations, our build team did a great job of reducing the build time down to 20 minutes. We are now merging GitLab Community Edition into Enterprise Edition every three hours, and this is an automated process. It's not really relevant for GitLab QA right now, because after we moved the test scenarios into CE and EE, Enterprise Edition lagging behind is not a huge problem anymore, but it's much easier to develop features when these two projects are synchronized. Then we built some features like pipeline schedules, manual actions and blocking manual actions, and multi-level container registry repositories. We found a workaround for the missing feature called multi-project pipelines, and we are now working on making it a first-class feature in GitLab.

Then again, imagine a scenario where a team of front-end developers decides to rebuild the top navigation bar in GitLab — something that actually happened a release ago. But they don't know that there is a QA subdirectory in the project holding all the page objects we have, so they forget to update the page objects. How do you solve that problem?
We decided to solve it by making it possible to couple every page object with the views and selectors from the project. This is not a perfect solution — we are aware of that. It was a 20% effort, 80% result solution for us, and it would obviously be much better to assert on elements' existence on a generated page, which is something we plan to work on in the next iteration. But this solution works pretty well for us now, because whenever someone pushes code to GitLab that changes views or selectors, a job in continuous integration verifies the contract between page objects and the views those page objects depend on. So it's instantly visible to the developer that GitLab QA is not synchronized with the code that has been changed.

We also decided to make it easier for developers to contribute, and we designed a concept called GitLab QA factories. This concept is very similar to factory_bot — formerly known as factory_girl — a project known to a lot of Ruby on Rails developers. When you design a factory, you can declare dependencies: for example, the project factory depends on a group being fabricated before the project is fabricated. So whenever someone in the code wants to fabricate a project but there is no explicit group assigned to the factory, we fabricate all the dependencies, including the group, for the project.

Then we added a manual action, and with that change we achieved our ultimate goal of bringing GitLab QA to merge requests. Now, whenever someone clicks the package-and-qa manual action, GitLab QA tests the code in the merge request from start to finish. Obviously, because manual actions in GitLab are optional — which means this stage can be skipped — it's still possible to merge a merge request without running end-to-end tests for your changes. But we have a feature called blocking manual actions: when you make an action blocking, it's not possible to merge the merge request until the action succeeds.
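The factory-with-dependencies idea described above can be sketched in a few lines, in the spirit of factory_bot. All names here are illustrative, not GitLab QA's actual DSL: the point is only that fabricating a project with no explicit group first fabricates its group dependency automatically.

```ruby
# Sketch of factories with declared dependencies: a subclass declares what
# it needs, and fabricate! fills in any dependency the caller didn't
# provide explicitly. Hypothetical names throughout.
class Factory
  def self.dependencies
    @dependencies ||= {}
  end

  # Declare that this factory needs another factory's product first.
  def self.dependency(name, factory:)
    dependencies[name] = factory
  end

  # Fabricate missing dependencies, then build the object itself.
  def self.fabricate!(attributes = {})
    resolved = dependencies.each_with_object({}) do |(name, factory), acc|
      acc[name] = attributes[name] || factory.fabricate!
    end
    new(attributes.merge(resolved))
  end

  attr_reader :attributes

  def initialize(attributes)
    @attributes = attributes
  end
end

class GroupFactory < Factory
end

class ProjectFactory < Factory
  dependency :group, factory: GroupFactory

  def group
    attributes.fetch(:group)
  end
end
```

With this shape, `ProjectFactory.fabricate!` transparently fabricates a group first, while `ProjectFactory.fabricate!(group: existing_group)` reuses one the test already created — which is what keeps contributing new tests cheap.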
So now our GitLab QA process looks like this: in the CE or EE project, someone triggers the QA action. The QA action starts a brand new pipeline in the GitLab Omnibus project. The GitLab Omnibus project is responsible for building GitLab packages, building Docker images, and pushing them to the registry. Omnibus later triggers a pipeline in the GitLab QA project. GitLab QA pulls images from the container registry, orchestrates the test environment, and runs end-to-end tests against it. Later we propagate the pipeline status upstream, directly into the merge request, to make it possible for developers to see the status.

And now, a live demo. I know that people at conferences love live demos — there are so many things that can go wrong, so let's hope it's going to work this time. I'm going to open a new console and make this text a little bigger. Okay, I think it's visible. Bigger. Okay, I think it's okay. As you can see, I have no containers running. Now let's try executing a command like `gitlab-qa Test::Instance::Image CE` and see what happens. Let's scroll up. As you can see, we attempt to pull the Docker image from Docker Hub, then we create a Docker network, and then we start GitLab with `docker run`. Because it's the first run, it will take a few minutes for GitLab to configure itself. That's what's happening here: we have the GitLab Omnibus project built into GitLab, and it's Omnibus doing its work.

I can show you another thing that might be interesting. As you can see, we have the test instance image CE scenario. Now let's go to slide number 14. You can see — I think it's visible — that in the CI environment we have pretty much the same command, and the gitlab-qa command is actually a wrapper around bin/qa. So we are running exactly the same command in continuous integration that we can run locally using the GitLab QA gem. This is the command-line interface that comes with the gem available on RubyGems.
Let's go down to the bottom of the build log. It's taking a few minutes for GitLab QA to start GitLab; GitLab needs to configure itself. Whenever we trigger a package-and-qa manual action in a merge request, the argument to gitlab-qa — CE — stands for Community Edition, obviously, but we can pass a URL to an image in the registry here, which means we can test any GitLab image this way.

Okay, so we are now starting all the services that are in the container. You can see that GitLab QA is now waiting for GitLab to start. Now I can go to my browser and enter the test instance address, and I should be able to log into GitLab. As you can see, we have no projects. Because the RSpec suite is randomized, RSpec decided to run the tests for creating an issue, which means an issue should have a dependency on a project, so we should be able to see that here after refreshing a few times. It's the first run, so we need to create all the subgroups for GitLab QA, and the GitLab QA sandbox top-level group as well, so it can take a few seconds. I think it should already be here. No, not yet. Maybe this is one of those things that can go wrong. Let's wait a little more. Okay, let's maybe go to groups. We have a sandbox created, we have a test namespace created — no project here yet. Oh, we have a project. Next we have tests for secret variables. We have quite a few tests here, but I want to wait until RSpec selects the tests for creating a new pipeline. Okay, now we have the tests for the API, creating a merge request, so we should see a project with a merge request. And we have secret variables.

[Audience] Can I ask a question in the meantime? Do your different tests rely on each other?

No. As I said, all the tests are isolated and idempotent, and because we create a separate project for every test example, these tests are completely decoupled and isolated. That's why I was telling you about parallelization, right?
We can parallelize the test suite, and it will eventually run much faster. Okay, so we now have a test for CI/CD pipelines. We are pulling GitLab Runner, running and registering it, and we are pushing a .gitlab-ci.yml. As you can see, it should be there. You can see that we have pipelines running, we have a pipeline with all the jobs, and one job managed to succeed — we have "echo okay", job succeeded. That's it. So we are now using GitLab CI to test GitLab CI, and we are using Docker, in Docker, actually in Docker. It can be quite complex, but it works. Okay, let's go back to the slides.

A few lessons learned. The first lesson we learned is that there is no silver bullet in end-to-end testing, because how you end-to-end test your software heavily depends on how you have shaped the processes around development, deployment, and installation. There is a great paper by Fred Brooks about essential complexity and accidental complexity, arguing that there is no silver bullet in software engineering. I think that is very relevant to end-to-end testing, because essential complexity is the deciding factor in what your end-to-end testing framework is going to look like.

Another lesson learned is that testing shows the presence, not the absence, of bugs — in other words, you won't catch bugs in code that you don't have test scenarios for. This is a great quote by Edsger Dijkstra, who is considered one of the grandfathers of modern software development and testing. I think that's true, and that we will eventually need something more at GitLab to test GitLab. I personally believe that canary deployments and Kubernetes are a huge thing. Being able to incrementally roll out your changes to your users, using tools like exception monitoring, performance monitoring, and security testing to guide you up to a 100% rollout, is a huge thing.
At GitLab, we are betting heavily on Kubernetes, and we are building a lot of features into GitLab to make testing GitLab easier — and because we are building these features into GitLab, our users can also use this strategy. So imagine a scenario where you can unit test your software, then feature test it, then build a package and test it with a tool like GitLab QA, and then start an incremental rollout guided by techniques like performance monitoring, exception monitoring, and security testing, until you reach a 100% rollout — and only then do you click the merge button in a merge request. Only with this strategy will you know that your feature is production-ready before merging it into the stable or master branch. I know that a few companies are doing this already, and at GitLab we are building a lot of features to make this workflow possible. It might take us a few months to get there, to be able to test GitLab this way, but we'll probably get there eventually, and we'll also build a lot of features that other people can use.

I believe that product quality is not only perfect code. Product quality is a combination of users' happiness, developers' happiness, product managers' happiness, and company well-being. I would like to end this talk with a nice quote from Tom DeMarco: product quality is a function of how much it changes the world for the better. And I believe that GitLab strives to change the world for the better. That's all. Thank you. Questions?

[Audience] Are you doing any testing with real browsers?

So the question is: are we doing any testing with real browsers? GitLab QA uses the Google Chrome browser in headless mode to click through GitLab. We can also disable headless mode and watch in real time what's happening and how GitLab QA navigates through GitLab. No, only Google Chrome right now. Well, it's a start — it's the first iteration, right? Yes?

[Audience] What about the future of GitLab QA?
The future — beyond the merge request workflow I described — is that we are now working on cloud-native Helm charts to make it possible to deploy GitLab to Kubernetes, and we are also going to use Helm with GitLab QA, because, as I said, we also want to test the deployment and installation process. Currently we are building a big monolithic Docker image which we test with GitLab QA, but eventually we'll have stable Helm charts that we'll also use to create a test environment, and we'll need to run GitLab QA against that kind of test environment. Does that answer your question? Okay, thanks.

[Audience] You mentioned that you are using several Docker containers. Do you have Docker running inside Docker?

Yes, we have Docker running inside Docker. Well, this is quite complex. Currently we are using the Docker Engine. In the CI environment, all our builds are processed inside Docker, so in order to make the workflow the same locally and in the CI environment, we have to use Docker in Docker in the CI environment — Docker in Docker is a service. Well, I'm not sure I understand the question. Okay, let's meet after the talk, but I will try to explain; maybe I'll manage to get it right.

In GitLab QA we use Docker in Docker, but the architecture is quite complex. GitLab QA is responsible for starting a container with GitLab — or multiple containers, for example in the test scenario where we test GitLab Geo, which is an Enterprise Edition feature, so we need multiple containers with GitLab. But we also need to start a container with the instance test scenario, which we build whenever we tag a new version of GitLab — whenever we build GitLab, we also build the corresponding integration tests Docker container. In order to make it possible to execute tests, we link the containers together, but we also have Docker available inside the tests container, which we achieve by mounting the Docker socket into the container. I'm not sure if this answers your question. Yes, we also do that.
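The container wiring just described can be sketched as the `docker` CLI invocations the harness would shell out to. Building the argument lists as plain arrays keeps the wiring visible and testable without a Docker daemon. The image names, network name, hostname, and scenario argument here are all illustrative, not GitLab QA's actual values.

```ruby
# Sketch of the two docker run invocations: one for the GitLab instance
# under test, one for the container holding the integration tests. The
# tests container joins the same network (so it reaches GitLab by
# hostname) and gets the host's Docker socket mounted (so the tests can
# themselves start containers, e.g. a GitLab Runner). Names illustrative.
module Docker
  NETWORK = 'test'.freeze

  # Start the GitLab instance under test on a shared network.
  def self.gitlab_command(image, name: 'gitlab-ce')
    ['docker', 'run', '-d',
     '--net', NETWORK,
     '--name', name,
     '--hostname', "#{name}.test",
     image]
  end

  # Start the integration tests container, linked to the same network and
  # with the Docker socket mounted from the host.
  def self.specs_command(image, gitlab_name: 'gitlab-ce')
    ['docker', 'run',
     '--net', NETWORK,
     '-v', '/var/run/docker.sock:/var/run/docker.sock',
     image, 'Test::Instance', "http://#{gitlab_name}.test"]
  end
end
```

Mounting the socket rather than nesting a second Docker daemon is what makes the local workflow work with a single Docker Engine, as discussed next.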
Well, it is, because in the CI environment this is fully Docker in Docker using privileged mode, but in the local workflow there is a single Docker Engine. Currently we only do click-driven testing, so whenever we want to create a project or something like that, we click our way through GitLab to create the project, the group, and everything else. We're thinking about making it possible to use the API to speed up the process a little, but currently we only test the API — we are not using the API to create resources. Our GitLab QA factories currently depend only on page objects.

Yes, of course we have unit tests, we have integration tests, we have feature tests — and these end-to-end tests sit on top of the testing pyramid, right.

We have a test scenario using GitLab QA which starts a container with the previous version, mounts some volumes and shares them, and after stopping the previous container we use the same volumes to start the new one; the code inside the container is responsible for updating GitLab. No, we don't — we don't use pre-seeded data. Okay.

[Audience] Writing a scenario requires knowing a bit of Ruby, yes? Have you ever met the case of people on QA, or people reporting a bug, who want to express the behavior of the bug or issue without knowing how to write it in Ruby?

Well, not really. Usually users that encounter bugs submit issues with a description, and we have something like 10,000 issues in our issue tracker, a huge part of which are bugs — which is perfectly normal for a project at that scale. But yes, GitLab QA is still in an early development phase. We plan to make it easier to write tests, to use a domain-specific language and things like that, but as I told you, we plan to iterate. And that's it.

[Audience] Are there changes that you made to your bug detection tools, to your oracles — deciding whether you found a bug — as a result of going through this exercise?
Well, yes. When we designed GitLab QA, we had to make it possible to fail the CI pipeline whenever GitLab QA encounters an integration problem. We do that by checking, for example, the exit status of the Docker Engine binary. Also, whenever we encounter an exception on a page, we fail the pipeline. So we had to incorporate a lot of techniques like this into GitLab QA.

Okay, so I think that's all. Thank you very much for listening. I really appreciate you coming. Thanks again.