Hello everyone, welcome to DevConf. Our next session is presented by David, Andrei and Jakub, with the topic "Promote Streams to Production Level with Argo and Tekton". Guys, the stage is all yours. Okay, thank you. So, hello everyone and welcome. I think we can move to the next slide. First I would like to introduce myself and my colleagues. My name is Jakub Stejskal, and with me are Andrei Babets and David Kornel. We all work at Red Hat as quality engineers, and we mostly take care of the quality of several different projects. We also prepared an initiative called Tealc, which we want to present to you today. This initiative basically puts together several projects that we work on in our jobs. Okay, let's go through the presentation timeline. First we want to introduce the Tealc project more closely. If you wonder why we call it Tealc: we are basically fans of Stargate, so that's the reason we picked that name. The second point will be about Argo and Tekton: why we chose those technologies for our project and what exactly we use them for. The last part of the theoretical presentation will be about Strimzi: what Strimzi is and why we chose it for our project. The second part of the presentation will mostly focus on several demonstrations. We want to show you how we deploy the Tealc project and how we basically use it. After that we will show you how we use Argo CD and how Argo CD syncs our applications. The last part will be about the Tekton operations we use in the project as well, and then we will do a short summary. You can also see the QR code in the bottom-right corner; you can scan it and join our Telegram group. If we are lucky and there is some tweet about DevConf or Red Hat in the following 40-50 minutes, you will be able to see it in the chat group, because that's basically the kind of workload we use in Tealc. Okay, that's all from me, so I hand over to David.
Okay, so what is the Tealc project? Tealc is the code name of our project, and Teal'c is also a character from the Stargate series. It is a collection of tools and deployments for running continuous testing of provided scenarios of Kubernetes applications. We created it mainly for continuous testing of automated upgrades of a running application with some load and configuration. We wanted to simulate a customer-like deployment on a Kubernetes cluster and see how the application behaves after a few days of continuous running and upgrades. We use modern tools like Argo CD, Tekton, Ansible, and, for monitoring, Prometheus with Alertmanager and Grafana. It also periodically runs a test suite against the software under test using a Tekton pipeline. The deployment of the application under test is stored in a Git repository and synced by Argo CD, and alerts are sent to our mailbox. Our reference application for the system under test is the Strimzi Kafka operator. So let's move to the overall architecture of the Tealc project. You can see we currently use two OpenShift clusters. The first one is called the infra cluster, and it mainly runs our control applications: Argo CD, Tekton, and monitoring of the Argo CD applications. On the right side we have a worker cluster, which runs our application under test: the Strimzi Kafka operator with a Kafka cluster, some Camel Kafka connectors, and of course monitoring for the Strimzi Kafka application. You can see that for our reference scenario we use Twitter as the source of data and Telegram for re-sending some interesting topics we get from Twitter. At the top we have our Tealc repository, which just stores our Ansible playbooks for installing the infrastructure, configuring Argo and Tekton, and so on. And on the right side is the Soka repo, which is the source of truth: a collection of our deployments and configuration for the Strimzi Kafka cluster, the Camel connectors, and so on.
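As a rough illustration of how such a sync can be wired up, here is a minimal sketch of an Argo CD Application pointing a cluster at a Git repository; the repository URL, path and namespaces are placeholders, not our actual configuration:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: strimzi-deployment          # hypothetical name
  namespace: openshift-gitops
spec:
  project: default
  source:
    repoURL: https://github.com/example/soka.git   # placeholder for the source-of-truth repo
    targetRevision: main
    path: strimzi                    # placeholder path with the deployment YAMLs
  destination:
    server: https://kubernetes.default.svc
    namespace: strimzi
  syncPolicy:
    automated:
      prune: true
      selfHeal: true                 # revert manual changes on the cluster back to the Git state
```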
This repo is periodically synced by Argo CD into the worker cluster, so we are sure that the configuration stored in Soka is always deployed to the worker cluster, and if someone changes the configuration on the cluster side, Argo syncs the source of truth into the worker cluster again. We also use the Strimzi Kafka repo, and we use it for pulling new configuration, new features, new CRD definitions and so on. Every time the Strimzi folks create new images and push them into Quay, Tekton is notified of this change and runs one of the Tekton pipelines, which takes all these images and updates the configuration files in Soka; then Argo pulls this Soka configuration and does the upgrade on the worker cluster. So that's the architecture; you will of course see in the live demo how it really works. So let's briefly describe what Argo CD and Tekton are. By definition, Argo CD is a declarative GitOps continuous delivery tool for Kubernetes. What does that mean? It means that your Git repository contains the source of truth defining the desired application state. For example, we have a Git repository with the YAML deployment files of a Kubernetes application, and Argo CD allows us to deploy these files on a Kubernetes cluster and sync changes from the Git repository. It also works as a guard for our deployed application: in case someone updates the configuration on the worker cluster, Argo reverts those changes and fixes the worker cluster again. Argo CD also provides a very nice way to monitor the state of deployed applications via, for example, its web UI. What is Tekton? By definition, Tekton is a cloud-native solution for building CI/CD pipelines. Tekton behaves like an extension of a Kubernetes cluster. It installs CRDs like Pipeline, Task, Trigger, PipelineRun, TaskRun, and so on. These resources define the building blocks.
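To give a feel for these building blocks, here is a hedged sketch of a Task composed of two steps, each running in its own container of the same pod; the task name, image names and repository URL are illustrative only:

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: clone-and-test          # hypothetical name
spec:
  workspaces:
    - name: sources             # workspace shared by all steps in the pod
  steps:
    - name: clone
      image: alpine/git         # small image containing only git
      workingDir: $(workspaces.sources.path)
      script: |
        git clone https://github.com/example/tests.git .   # placeholder repository
    - name: test
      image: maven:3-openjdk-11 # small image containing only Maven
      workingDir: $(workspaces.sources.path)
      script: |
        mvn test
```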
You can create and reuse these blocks in your pipelines, and you can create, modify, delete and run the pipelines with the kubectl CLI, the same way you work with other Kubernetes resources like Pods and so on. A Tekton pipeline runs on a Kubernetes cluster and contains tasks, which are composed of steps, and every step runs in its own container in the same pod. For example, let's imagine a simple pipeline for running JUnit tests. We know that we need Git to clone the repository with the test files and Maven to trigger the tests. So we create a Tekton task with two steps. The first one uses a Git container just to clone the repository into the workspace, which is shared between all containers in the pod, and the second step uses a Maven container to run the tests from the sources. This is an awesome feature, because we don't need to build a really large container image with all the tools we need for testing, like Maven, Git and so on; we just use a small container image for each specific task. And I'll hand over to Jakub, and he will describe what Strimzi is. Okay, so Strimzi is a CNCF sandbox project. Strimzi itself is basically a collection of operators which allow users to deploy and manage Apache Kafka on top of Kubernetes. We provide a Kubernetes-native experience, which means that users can very easily define their own Kafka instance with a custom resource, and the operator will do all the necessary steps to deploy it in your Kubernetes cluster. We chose Strimzi as the reference example in Tealc because it's a very nice and simple project and all of us work closely with it, mostly on a daily basis, so that's the main reason. In our deployment we also use Kafka Connect, and we decided to use the Camel Kafka connectors for sourcing data from Twitter and syncing it to Telegram. On the right side you can see the simple architecture of Strimzi.
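As a hedged sketch of what defining a Kafka instance with a custom resource looks like (the cluster name, sizes and listener setup are illustrative, not the Tealc configuration):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster              # hypothetical cluster name
spec:
  kafka:
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    storage:
      type: persistent-claim
      size: 10Gi
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 10Gi
  entityOperator:               # deploys the Topic and User Operators for this cluster
    topicOperator: {}
    userOperator: {}
```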
The main part here is the Cluster Operator, which basically manages Kafka-related resources based on the custom resources defined by the user. Among these resources are Kafka, KafkaConnect, KafkaMirrorMaker and also, for example, the Strimzi Bridge. There are also two other operators which deal with the users and topics of Kafka. Those operators are called the Topic and User Operators, and they are directly connected to a specific Kafka cluster, so each cluster has its own Topic and User Operator. As I mentioned Kafka Connect and the Kafka connectors, you can see them on the right side of the diagram. We basically use a source connector which is connected to Twitter, searching for specific tweets and sending them to Kafka, and another connector basically reads the data from Kafka and pushes it to Telegram. These applications are available in our GitHub repository; it's open source, and we basically just transform the messages from Twitter into a more, let's say, human-readable format on Telegram, to make them more easily readable. If you are more interested in Strimzi, my colleague Jakub Scholz will have a presentation later today about Strimzi and how to set up your own data analysis tool. If you are interested, I definitely encourage you to join that session. That's basically it about Strimzi, and I think we can move to the demonstration part. We will start with a simple demo of the infrastructure part, which is Argo and Tekton on the infra cluster. Let me open the YouTube video. I'm sorry to break in, is there any chance you can zoom it a bit, please? I don't think so. But anyway, we don't need to check what is at the top of this video, because it's just a run of an Ansible playbook. The more interesting part is here, where we install Argo and Tekton using operators from the operator catalog on OpenShift.
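Installing an operator from the catalog boils down to creating a Subscription resource; a rough sketch for OpenShift Pipelines might look like this, though the exact channel and package names depend on the catalog version:

```yaml
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-pipelines-operator
  namespace: openshift-operators
spec:
  channel: latest                        # channel name may differ per catalog version
  name: openshift-pipelines-operator-rh  # package name as found in the catalog
  source: redhat-operators
  sourceNamespace: openshift-marketplace
```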
Of course it will take some time, so I will jump through the video, because the more interesting parts are the Argo syncs and so on, which will be presented by Andrei and Jakub later on. Anyway, the full deployment is automated by an Ansible playbook. This playbook creates the subscriptions for the Argo and Tekton Pipelines operators, and it also configures the secrets we need for connecting to the worker cluster and so on. You can see that the OpenShift Pipelines operator, which is Tekton, is installed, and it will shortly install the Tekton instance. It seems the Tekton operator is installed, and the triggers controller as well. Then we install the OpenShift GitOps operator, which is Argo CD. Argo CD is starting and should be running, so we can go to the Argo CD web UI. We can log in using the OpenShift credentials, and we see an empty Argo CD UI because the projects are not deployed yet; that will be done later on. The last part is Grafana for monitoring Argo, which is installed again from the operator catalog. Grafana is deployed, so let's skip ahead and go to the Grafana UI to see the simple dashboard for Argo applications, which will of course be empty because there are no applications yet. We have prepared the infra cluster with Argo and Tekton. The next part is the deployment of Report Portal. Report Portal is a tool for storing and visualizing test data, the test results from our test suite. It's again installed from the operator catalog, so the Ansible playbook will again create a subscription and install the Report Portal operator from the catalog. It will take some time, because Report Portal needs to install a lot of components. You can see that the Report Portal instance is spinning up. After a while Report Portal is up and running and we can open the web UI. We are in, and you can see the empty dashboard because of course we don't have any test results yet, but once the test suite starts, all the data will be there. So that's it.
We have our infrastructure installed, and I will hand over to Andrei, who will show you how the Argo projects are deployed and how Tekton works. Okay, let me share my screen. So here is the first demo from me, and that is basically the deployment of the Strimzi infrastructure by the Argo project. We can start it up. We can see that we are again running an Ansible playbook, now for the Strimzi infrastructure, and after a while, once everything is configured, Argo CD starts deploying the applications for Strimzi. We can see there are already a couple of them, and more will come later. Let's jump ahead a little. Everything is pulling the deployment files from the Soka repository, which David already introduced. We can see that the Strimzi Cluster Operator is already deployed and now starting to run. Kafka is already starting up, but Kafka takes much more time to start, so we can jump forward a little. We can see that all the persistent volume claims have been created and now the Kafka pods are rolling out. Here we can see our pipeline runs; this is basically the Tekton dashboard, and these pipelines have been deployed together with the Strimzi deployment. It surely takes some time. These pipeline runs are basically for the image updates which David already introduced in the architecture: they are basically just pulling the images and changing the configuration in Soka, so everything is up to date. We can see that all the pods for the pipeline runs are currently running. Kafka is starting up. Now it's time to deploy the clients, which are basically running in the background. We can see one result of a completed pipeline here. So currently everything except Kafka seems completely deployed, but that is not entirely true, because Kafka is not deployed yet, so the external clients cannot connect to Kafka: there is no secret for them, there is no certificate, and we will see that some of the clients have failures in their logs.
Yeah, we can see it now. In this part, Argo CD has the self-heal option, which basically reloads the deployments if there is some failure during the deployment, and we can skip to the part where we can actually see what is going on. Yeah, we can see that Argo CD now takes care of the broken consumers and producers, because there was no secret, and after a while Kafka is deployed. We can see now that the secret is created and everything is rolled out again; it seems healthy, seems to be working. We can check the logs: the producer is starting, and after a while we can see it's already firing messages to Kafka. So everything is deployed now; that's basically everything for the Strimzi deployment demo, and we can move to the next part, which is probably even more interesting, and that is the deployment of our Twitter application, which, as Jakub already described, takes tweets from Twitter and later exposes them to Telegram. Now we can see it in real time. So again, the Ansible playbook is run for the Twitter application deployment, and after a while we can see the Argo CD applications deployed. For these Argo CD applications we also have two pipelines: one is again for updating the images, and the second one is for creating the secrets and copying them from the Kafka and other deployments to the right namespace. You can see we have all the Twitter applications deployed and the pipelines are running. The applications in Argo consist of several parts: one creates the topics for Kafka, a second the connectors, and other stuff which you can explore more deeply in our repository. We can move on. Yeah, we can see the pods are running and starting. Let's skip to the part where we can actually see something. Okay, the connector is still progressing. Yeah, now it seems to be running, and we can see that on Telegram the bot is already firing: a new tweet is synced from Twitter. So everything seems to be working.
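For illustration, a Camel Kafka source connector managed by Strimzi is declared as a KafkaConnector resource roughly like this; the connector class, cluster name, topic and options below are assumptions for the sketch, not our exact configuration:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: twitter-source                  # hypothetical name
  labels:
    strimzi.io/cluster: my-connect      # must match the name of the KafkaConnect cluster
spec:
  # Illustrative Camel Kafka Connector class; check the connector's docs for the real one
  class: org.apache.camel.kafkaconnector.twittersearchsource.CamelTwittersearchsourceSourceConnector
  tasksMax: 1
  config:
    topics: tweets                      # Kafka topic the tweets are written to
    camel.source.path.keywords: "devconf"   # illustrative search keyword
```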
Everything is deployed, and that's basically it for the Twitter-Telegram scenario. We can move to my last demo for you guys, and that is our automatic test suite, which is named Tor. As David described, there is a Report Portal instance which collects all the results. Our test suite is stored in a separate repository and is triggered by Tekton. It's triggered basically on a cron basis, but for our purposes we will just trigger it manually, so we don't have to wait for the cron trigger. You can see everything is deployed by Ansible, all the pipelines are created, and we have the Tor test suite pipeline here, which should be triggered by cron, but as I said, we will trigger it manually so we don't have to wait and we get some results. We can see the pipeline has already started, so there are, as David said, a couple of steps for cloning and so on. Now we can see that the tests are already running; the tests are finished. I have already triggered a couple more test runs, so we can see more results in Report Portal, and as all the test runs are completed, we can go to the Report Portal instance, and the previously empty dashboard is now filled with the test results. We can see the results from the last run, and if we go to the launches on the dashboard, we can see the five test runs reported. Everything is stored in Report Portal and everything seems to be on track. So this is basically how the long-running tests will run: they will be triggered by Tekton and everything will be stored in Report Portal. That's everything from my demos; I would like to hand over to Jakub. Okay, so let me share the screen.
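Triggering a pipeline manually, as in the demo, essentially amounts to creating a PipelineRun; a minimal sketch, with the pipeline and workspace names as assumptions, could look like:

```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: tor-test-suite-   # hypothetical pipeline name; a unique suffix is generated
spec:
  pipelineRef:
    name: tor-test-suite          # the Pipeline to run, created by the Ansible playbook
  workspaces:
    - name: sources               # assumed workspace name
      volumeClaimTemplate:
        spec:
          accessModes: [ReadWriteOnce]
          resources:
            requests:
              storage: 1Gi
```

Applying this with `kubectl create -f` starts a run, just like clicking the trigger in the Tekton dashboard.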
Okay, so basically the whole Tealc project is focused on testing. We want to make sure that, in this case, Strimzi is working properly; we want to run, let's say, long-running tests and see that Strimzi stays operational even after, let's say, one month of work. One of the main parts of this is automatic upgrades, not just of Strimzi itself but also of its components, and this will be the subject of this demo. Okay, you can see the Strimzi Cluster Operator up and running, so let's check the images used by the Strimzi cluster and its Kafka deployment. You can see that we currently use version 0.27.1, which is the latest version of Strimzi, and we use it in all the components we have deployed. Together with the Strimzi operators we also deployed Kafka, which basically contains Kafka, ZooKeeper, the Entity Operator, Kafka Exporter and Cruise Control. Now let's imagine that some change happened in Strimzi and new images were built, which triggered an event from Quay that was sent to our infra cluster, and this event triggered our Strimzi deployment image update pipeline. This pipeline basically changes the images stored in our Soka repository from the latest version that was used to a new one. You can see now in the pipeline that the pipeline itself clones the Soka repository and also the Strimzi Kafka operator repo, from where we take the new versions of the CRDs, cluster roles, role bindings, configuration for Strimzi, and the deployment configuration itself. You can also see that it found some different digests, so the images will be updated in the configuration files, and after that the files will be pushed into our repository, from where Argo CD will pick them up. Now let's check the commit which was pushed.
You can see it was pushed by the Tealc CI, and you can see that the floating tags were replaced by the SHA digests of the images which were pushed into Quay and which are basically the latest available in the Strimzi organization. After that, Argo CD will pick up the changes and should start the rolling update of Strimzi, and Strimzi after that should deal with the upgrade of Kafka and its components. Argo basically reacts in a few seconds, so shortly we should be able to see the Strimzi Cluster Operator rolling update start; this will basically just change the deployment file, update the images and spin up a new pod. Okay, it's already done, and we can see that the floating tags were changed to SHA digests in the configuration. When the Strimzi operator comes up, it starts the rolling update of the Kafka cluster. You can see it's already happening, so now it will take some time to do a rolling update not just of the ZooKeeper cluster but also of the Kafka cluster and all the other components here: the Entity Operator, Cruise Control and Kafka Exporter. This basically changes all the images used within the cluster and automatically updates the images and the project version from the previous one to the latest available in Quay. These steps are basically executed on every push into the main branch of Strimzi, so we are sure that in case there is something new in the images, we pick up the changes and work with the latest version of Strimzi. Now we can see here in the image field that it uses the SHA digests, so the update was successful. Okay, basically these Tekton operations are used not just for Strimzi itself but also for the clients, for our Twitter app, and for the Strimzi Drain Cleaner and Strimzi Canary, which we use in our deployments in our project as well.
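The change the image-update pipeline commits is essentially this kind of replacement in the configuration files (the repository path and digest value below are made up for illustration):

```yaml
# Before: a floating tag, which can silently move to newer builds
image: quay.io/example/operator:latest
# After: pinned to an immutable SHA digest (illustrative value)
image: quay.io/example/operator@sha256:0123456789abcdef...
```

Pinning by digest means the cluster state is fully reproducible from Git, and a rolling update happens only when the pipeline deliberately commits a new digest.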
Basically we want to make sure that everything is working, so we also implemented alerting: in case a rolling update is not working, or one of the components is not able to come up, we will receive a notification from Prometheus and be notified by email, but in the future we want to make it more resilient and notify us, for example, on Telegram or Slack. Okay, that's all, so let's go to the summary. Basically the main purpose of this project is to continuously test the upgrade process and the application under load. Within the Tealc project we make sure that Strimzi is working and that Kafka Connect is working fine, because we are able to see the messages on Telegram, and in case anything goes wrong we are notified very quickly and can see what's wrong. In our project we use zero-touch upgrades, which basically means, as you saw in the demo, that with every change pushed into Quay the images used within our project are automatically upgraded, which makes sure we always use the latest version; let's say we are basically the first users of Strimzi. We also have the automated testing, which Andrei showed, which basically periodically runs the tests that check that our deployment is working, that the messages are available in Kafka, and that no messages were lost during the upgrades and all the other operations. The results are in Report Portal, so in case there are some issues, you can check them in Report Portal, and you can also get notifications from Report Portal; so in case Prometheus is not able to find an issue, the tests usually will.
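The kind of alert described above can be expressed as a PrometheusRule; this sketch assumes kube-state-metrics is scraped, and the metric selector, namespace and threshold are placeholders:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: strimzi-alerts                # hypothetical name
spec:
  groups:
    - name: kafka
      rules:
        - alert: KafkaPodsNotReady
          # Fires when a pod in the (assumed) strimzi namespace is not ready
          expr: kube_pod_status_ready{condition="true", namespace="strimzi"} == 0
          for: 5m                     # tolerate rolling-update churn for a few minutes
          labels:
            severity: critical
          annotations:
            summary: "A pod in the Kafka namespace has not been ready for 5 minutes"
```

Alertmanager then routes firing alerts to a receiver such as email, or later Telegram or Slack.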
We also have metrics there, which are scraped by Prometheus, and we also display them in Grafana, so we are able to check the metrics whenever we want: we can check whether every broker is up and running, how big the load on them is, and so on. In the future we want to automatically test upgrades of the Kubernetes cluster as well, so we will basically check whether Strimzi stays operational without issues during an upgrade of the infrastructure, which is one of the big next steps. That's all from me, and if my colleagues don't want to add anything else, I thank you all for your attention. Thank you Jakub, David and Andrei. We have a few more minutes. Let me ask the audience: if you have any questions, please post them in the Q&A section. I have one little question myself: if one wants to contribute to the project, how can they do it? Yes, the project itself is open source; it's available on GitHub. We have all the links for the projects we use in our presentation, so I think contributors can take a look at the presentation, and at the end they will find the links and also the project on GitHub, and yeah, they can contribute there. They can also reach us via email or on the Strimzi Slack, for example. Okay, thank you, and I see Andrei posted the link to the GitHub project. Thank you, thank you guys. I don't see any questions in the Q&A section. Okay, thank you. Okay, thank you everyone. For our schedule we have a long break now. See you later. Bye bye.