All right, welcome back. I'd like to introduce our next session, Optimizing Apache Kafka with Cruise Control. My name is Mike Ward, and I'll be your moderator for this session. It's my pleasure to introduce Donato Marrazzo, a Principal Specialist Solution Architect here at Red Hat for Application Development Services, where he supports customers in evaluating and embracing cutting-edge technologies like microservices with Quarkus and event-driven architectures. He has extensive experience across industries such as finance, insurance, banking, government, oil distribution, and aeronautics, and can certainly bring us a lot of valuable insight on enterprise architecture, mobile development, and service design and integration. A couple of comments on logistics for the session. For those of you who have been with us, this will be familiar, but for those who just joined today: please post any questions you might have into the comments section, and we will cover them at the end of the session. If for some reason we don't have time at the end, we'll make sure to follow up and address your questions, as long as we have your contact information. With that, I'm going to turn it over to Donato and let him take it away. Thanks so much.

Okay, thank you, Mike, and thank you for the introduction. That really extensive introduction makes this slide almost useless, so the only thing left to say is that you can reach me on LinkedIn or via email, on these topics and also on others. I have more than 20 years of experience with Java, and a background in many things, like serverless workflow, which could be a topic for the next edition of this event. But today we are here to talk about Kafka, which is one of my specialties and what I've been focusing on over the last year. Okay, let me speak for a while about AMQ Streams.
AMQ Streams is the name we use at Red Hat for the subscription that supports our customers' critical workloads in production. With this subscription we cover two open source projects; as usual, everything we do at Red Hat is open source. The projects are Apache Kafka, which is a very popular product today, and Strimzi, which is also becoming quite popular because it's the easy way to deploy Kafka on Kubernetes and OpenShift. At Red Hat we support Kafka and Streams on OpenShift and also on bare metal and virtualized environments, but obviously we like to deploy it on OpenShift. Streams is really well known for its operators, which make your life easier when deploying Kafka on OpenShift. It's less known for other capabilities that are quite important as well: the integration with single sign-on, the HTTP bridge, which can be useful in some situations, and, maybe more importantly, Cruise Control, which is the topic of this session.

So, after this introduction, let's dive into Cruise Control's capabilities. Why do we need Cruise Control in the first place? By default, when you create a topic in Kafka, the topic is split into partitions, and the partitions are spread over the cluster, which can be three, five, or even hundreds of nodes. Kafka distributes the partitions in a round-robin fashion: one on the first broker, one on the second, and so on. This works pretty well at the beginning, as long as your cluster stays the same and your clients use the partitions in a homogeneous way. Unfortunately, that's often not the case: in day-two operations we frequently discover in customer situations that the cluster is not homogeneous, because applications have different needs and use some partitions more than others.
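The default placement described above can be sketched in a few lines. This is an illustrative model, not Kafka's actual code; broker IDs and the starting offset are made up for the example.

```python
# Sketch of Kafka's default round-robin partition placement:
# partition leaders are handed out to brokers in turn.

def round_robin_assign(num_partitions: int, brokers: list[int], start: int = 0) -> dict[int, int]:
    """Map each partition to the broker hosting its leader."""
    assignment = {}
    for p in range(num_partitions):
        assignment[p] = brokers[(start + p) % len(brokers)]
    return assignment

# 12 partitions over a 3-broker cluster: each broker gets 4 leaders.
assignment = round_robin_assign(12, brokers=[0, 1, 2])
per_broker = {b: sum(1 for v in assignment.values() if v == b) for b in [0, 1, 2]}
print(per_broker)  # {0: 4, 1: 4, 2: 4}
```

The placement is perfectly even by count, but says nothing about how heavily each partition will actually be used, which is exactly the gap Cruise Control fills.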
Another common situation is when a customer needs to scale out the cluster and adds a new node. The new member of the cluster is not used at the beginning, unless you create a new topic, in which case new partitions are created on it. Since images are always clearer than words, let's see this with a quick animation. This is what happens at the beginning: you create topics and the partitions are distributed over the nodes. This is what happens when your application grows and starts showing its normal behavior, where some partitions are used more than others: you get a workload that is not well balanced across the cluster. The imbalance is in terms of CPU, but more importantly in terms of disk usage and network; network and disk writes really matter here. Then we decide to add a new member to the cluster and create new topics. The new partitions are again distributed in a round-robin fashion, so some of them land on the new broker, but that's not what we would like to get: we would like to redistribute the existing workload as well, otherwise the new broker stays almost empty. This is the result we want to achieve. How? You can do it manually, but I can assure you it's a really complex and tedious job, or you can use Cruise Control, which does this job for you. Basically, it optimizes the way you use the different brokers by balancing the workload across them. One important thing to understand is that you don't have only one optimum: you can have an optimum in terms of networking, an optimum in terms of CPU, of memory utilization, or of disk usage.
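To make the "new broker stays empty" problem concrete, here is a toy greedy sketch of what a rebalance has to achieve: move partitions from loaded brokers toward the least loaded one until the spread narrows. This is purely illustrative and is not Cruise Control's actual algorithm; the per-partition "load" numbers are invented.

```python
# Toy greedy rebalance: repeatedly move the smallest partition from the
# busiest broker to the least loaded one, as long as that narrows the gap.
# load maps broker id -> {partition id: load units (e.g. bytes on disk)}.

def rebalance_step(load: dict[int, dict[int, int]]) -> bool:
    totals = {b: sum(ps.values()) for b, ps in load.items()}
    src = max(totals, key=totals.get)   # busiest broker
    dst = min(totals, key=totals.get)   # least loaded broker
    if src == dst or not load[src]:
        return False
    part = min(load[src], key=load[src].get)  # smallest partition on src
    size = load[src][part]
    # Only move if the src/dst gap strictly shrinks (i.e. size < gap).
    if abs((totals[dst] + size) - (totals[src] - size)) >= totals[src] - totals[dst]:
        return False
    load[dst][part] = load[src].pop(part)
    return True

def rebalance(load: dict[int, dict[int, int]]) -> dict[int, dict[int, int]]:
    while rebalance_step(load):
        pass
    return load

# Broker 2 was just added and is empty.
cluster = {0: {0: 60, 1: 40, 2: 30}, 1: {3: 50, 4: 40}, 2: {}}
rebalance(cluster)
print({b: sum(ps.values()) for b, ps in cluster.items()})  # {0: 60, 1: 90, 2: 70}
```

Even this toy version shows why there is no single "best" answer: balancing by disk load, as here, may leave CPU or network still skewed, which is why Cruise Control works with multiple goals.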
And when you do this reshuffling of your partitions, you want to make sure the algorithm takes care of the rack, that is, of the effective topology of your brokers. A rack can fail for some reason, and one of the first goals you want to ensure is that your topics, and therefore your Kafka service, are always available to serve your clients. That can be achieved only if your partitions are distributed in such a way that there is always a backup, a follower replica, available on a broker in another rack. That's really important.

So how does Cruise Control achieve this balance? First of all, as I said, it tries to follow goals with different priorities. The first class of goals are the hard goals: the goals you want to achieve above everything else, and that you never want to violate. In planning terminology these are called hard constraints, meaning they are of primary importance. And guess what: in this category we have rack awareness and the resource utilization thresholds. You certainly don't want to pile all your partitions onto one broker, because that broker has limits in terms of resources: you never want to exceed its disk capacity, its network capacity, and so on. Those are the primary goals you want to respect while reshuffling partitions. Then you have the soft goals, which you want to maximize as much as possible: the more you achieve those goals, the more successful Cruise Control is at the final goal of rebalancing the cluster. There are many soft goals; these are just some of the more important ones. The first is balancing resource utilization, so that you have a homogeneous distribution of your resources.
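For reference, Strimzi exposes this goal configuration in the Kafka custom resource. The fragment below is a hypothetical sketch: the cluster name and the particular goal selection are made up for illustration, while the goal class names are Cruise Control's own.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  # ... kafka and zookeeper sections omitted ...
  cruiseControl:
    config:
      # Hard goals: every proposal must satisfy these.
      hard.goals: com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal
      # Default goals (hard + soft): optimized in priority order, best effort.
      default.goals: com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.TopicReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaDistributionGoal
```

As the talk notes, overriding these lists is something to do only after studying the documentation; the defaults are sensible.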
Then you want to distribute each topic across the cluster, and finally, from a global point of view, all your partitions should be distributed in a really homogeneous way. With fine tuning you can also switch individual soft goals on or off. This is something I recommend doing only if you have studied the documentation carefully and gained some experience in a test environment. In general I would leave the default configuration, because it's quite rational and the goals are well balanced among themselves; they are all pretty important.

One last consideration before going into the demo. The first move that Cruise Control can make is moving the leader. As you know, among the replicas of a partition you have a leader and followers, and you can move leadership from one broker to another, basically promoting a follower to leader and demoting the leader to follower. The only side effect is on the clients, which have to take into account that there is a new leader. So it's a pretty cheap move, but a really effective one, because the workload driven by a leader is much higher than the workload driven by a follower. The other thing you can do is moving a partition: an actual move of real data from one broker to another. This is pretty heavy, because you have to copy data from one node to another, and it takes time. You usually do this at a controlled pace: there is a throttling feature to make sure the primary workload of the Kafka cluster always has the resources it needs.
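The difference between the two moves can be sketched on a partition's replica list, where by convention the first entry is the preferred leader. This is an illustrative model with made-up broker IDs, not a Kafka client API.

```python
# A partition's replica list; the first broker ID is the leader.

def move_leader(replicas: list[int], new_leader: int) -> list[int]:
    """Cheap move: no data is copied, a follower is just promoted."""
    assert new_leader in replicas, "can only promote an existing follower"
    return [new_leader] + [b for b in replicas if b != new_leader]

def move_replica(replicas: list[int], old: int, new: int) -> list[int]:
    """Expensive move: the partition's data must be copied to broker `new`."""
    return [new if b == old else b for b in replicas]

print(move_leader([2, 0, 1], new_leader=0))   # [0, 2, 1]
print(move_replica([2, 0, 1], old=1, new=3))  # [2, 0, 3]
```

A leadership change only reorders the list, while a replica move introduces a broker that must first catch up on the data, which is why the latter is throttled.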
And then you can slowly move the partitions from one server to another. The last important thing to know about Cruise Control is that it uses a heuristic optimization. If you have encountered planning problems in your life, you know that finding a global optimum can be pretty complex; it's an NP-hard problem, really hard to solve by computational brute force. Cruise Control implements a bunch of heuristics in its algorithms and is able to find a good solution: far from the global optimum, but good enough to solve our problem.

Okay, now let's have fun with the demo, with a small disclaimer. Since the demo is based on Cruise Control's heuristic algorithm, and because I have to simulate an unfair workload, sometimes it doesn't work, so I want to tell you this in advance. But I have a backup plan: in case it doesn't work, I'll show you a recording, so you can still appreciate the result. First of all, we have run the describe command on the Kafka topic. We have a topic called "event" with 12 partitions, and those 12 partitions are distributed evenly over a cluster with three Kafka servers. The leader of partition zero is on server two, the leader of partition one is on server one, and so on. This is the standard round-robin behavior I described before, and it's our starting point. I have prepared a client, a producer for this Kafka cluster, which is an unfair producer: it tries to use only partitions zero, three, and six. Why? Because this way we create a workload mainly for server two. Okay, let's start this workload: I'm going to start the consumer and the producer.
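The "unfair producer" idea can be sketched as a custom partitioner that ignores most of the topic's 12 partitions. This is a hypothetical stand-in for the demo's actual client (the function name and key format are made up); a real producer would plug similar logic into its partitioner hook.

```python
import zlib

# Only these partitions of the 12-partition "event" topic ever receive
# traffic, so only the brokers leading them do real work.
HOT_PARTITIONS = [0, 3, 6]

def unfair_partition(key: bytes) -> int:
    """Deterministically pick one of the hot partitions for a record key."""
    return HOT_PARTITIONS[zlib.crc32(key) % len(HOT_PARTITIONS)]

sample = [unfair_partition(f"key-{i}".encode()) for i in range(10)]
print(sample)  # every value is 0, 3 or 6
```

Skewing all traffic onto three partitions is what produces the lopsided disk-write and network graphs seen next in the demo.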
I have a script for this demo here in case I forget something, and I can share it with you: it's in a GitHub repository, and I'll give you the link later so you can reproduce the demo on your own. Now we have to wait a little bit to create enough messages on the cluster. In the meantime, if you have questions, please submit them and I'll try to answer during the pauses of this demo.

Okay, we have created a bunch of messages, which should be enough to simulate an unbalanced usage of the cluster. With this command we can check how the messages are distributed across the partitions, and we see that, as expected, we have messages on partitions zero, three, and six. We should also see this in the monitoring; let me reduce the time frame. In this dashboard we start to see that the workload on the different cluster servers is not uniform. Let me zoom in a little so you can see better: the workload on one of the servers is higher than on the others. We see some spikes here, a sort of glitch, but server two is clearly more used than the others. This is especially clear in terms of disk writes. In this demo we write only a small number of messages, so it's not always easy to see the difference between the servers, but in terms of disk writes you can see that one server is working ten times more than the other two. That is really exceptional and really clear. In terms of CPU usage it's less obvious, but you can still see a difference, and the same goes for networking: we have some spikes, but the constant is that Kafka server two is more used in terms of network than the others.
So we have simulated the unfair workload, and we have a situation where one of the servers is more used than the others. Now it's time to call Cruise Control into action. Actually, I have already enabled Cruise Control. This is really easy when you work with OpenShift, because it's just a matter of adding a line to the Kafka configuration. Let's see: this is the YAML file, and enabling Cruise Control is just a matter of adding this line saying that you want Cruise Control. Obviously you can put fine-grained configuration here, but as I said, you don't really need it in general. So Cruise Control is already up and running; if you list the pods, you see there is a pod dedicated to Cruise Control. This service gathers metric information from the Kafka cluster via a topic, and runs its optimization logic on top of these metrics.

How do we engage Cruise Control? It's a matter of creating a rebalance policy. I have already prepared it: you can see I created a KafkaRebalance object where I ask for a full rebalance of the cluster. At some point Cruise Control should start preparing an optimization proposal, and you should see it appear, hopefully. Let me show you the file; it's pretty simple. It's the KafkaRebalance custom resource that I have in OpenShift: I called it full-rebalance and I took all the default configuration, so nothing special here. Let's see if I have my proposal. It's still on its way, hopefully; let's give it a couple of minutes, not more than that. In the meantime, I can check whether we have questions.
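The two resources shown on screen can be sketched roughly as follows. This is a hypothetical minimal version: the cluster name `my-cluster` and the omitted sections are placeholders, while the `cruiseControl` stanza and the `KafkaRebalance` kind are Strimzi's own.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster
spec:
  # ... kafka and zookeeper sections as before ...
  cruiseControl: {}   # this one line deploys Cruise Control with defaults
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaRebalance
metadata:
  name: full-rebalance
  labels:
    strimzi.io/cluster: my-cluster   # ties the rebalance to the cluster
spec: {}   # all defaults: a full rebalance over the default goals
```

Once the KafkaRebalance resource is created, the operator asks Cruise Control for an optimization proposal and surfaces it in the resource's status.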
Okay, no questions so far. And we were lucky enough to have the proposal already here, in time for the demo timeframe, so it's working. What we have now is that Cruise Control has prepared an optimization proposal, but it hasn't applied it yet. In fact, the situation is still the same: one of the servers is working more than the others. So now it's time to approve this proposal and ask Cruise Control to do the real work. How do I accept the proposal? In OpenShift it's pretty easy: it's just a matter of adding an annotation on the KafkaRebalance object, and the annotation says rebalance approve. So I'm going to approve the proposal, and if we read the KafkaRebalance status, we should see that the rebalancing has started, so it starts moving the partitions around, and finally we should reach the Ready state, where the rebalance is complete. We've reached the first stage, which is Rebalancing, and that's fine. Let's wait a few more seconds; while we wait for the rebalancing task to complete, we should already see some effect on the dashboard. Let me see if we see something here. At some point the difference in workload, in terms of disk and CPU, should disappear: those lines should converge, because the servers start working in a uniform way, with disk writes and CPU used homogeneously. Okay, this is still running; let me check. It's still doing the rebalancing. Let's wait another 30 seconds, then I'll move to my plan B and show you the recording. Wait, I see something happening here; I hope it's not a mirage. Yes, something is happening: I can already see in the disk writes that something is changing. Okay, cool.
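The approval step mentioned here is done with Strimzi's `strimzi.io/rebalance` annotation; the fragment below shows the annotated KafkaRebalance metadata (the resource name matches the hypothetical example above, and the same effect can be had with `oc annotate kafkarebalance full-rebalance strimzi.io/rebalance=approve`).

```yaml
metadata:
  name: full-rebalance
  annotations:
    strimzi.io/rebalance: approve   # tells the operator to execute the proposal
```

After the annotation is applied, the resource's status moves through Rebalancing and, when all partition movements finish, ends in Ready.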
From the networking point of view as well, we are reaching the situation I explained before, where all the servers work in parallel and the workload is distributed evenly among them. So I think it did the job; in fact, we also see from the rebalance resource that the rebalance is complete. The demo worked as expected, and we can definitely see that the workload is finally well balanced. I'm pretty happy that my demo worked. I don't know if Mike has collected some questions from the audience.

We don't have any questions at this point in time. If anyone does have any, please feel free to drop them into the comments field here; it takes a few seconds for our stream to reach you and come back, so we'll give it a moment. Any additional thoughts you want to add? It doesn't look like anything is coming through, so I think you did a great job covering the topic.

Okay, cool. As I said at the beginning, you can reach me on LinkedIn or by email. I'm happy to discuss this topic, and if you want to go into more detail personally, I'm happy to have that conversation. So I think we can give some time back to the audience to relax before the next session.

Great. We appreciate your presentation, Donato: great coverage of a great topic. Just a couple of logistics for everyone on the back end. As a reminder, the presentation will be available on the YouTube channel several days from now, after all the sessions from the event have had the opportunity to be uploaded. You will receive an email with a link to where you can find this and all the other great sessions. Definitely hang around for the next session that's coming up: the next one on the deck is going to be Easily Connect Applications Across Clouds with Service Interconnect.
So that's another great topic coming up, or feel free to jump over to one of the other stages for one of the other sessions. At this point we'll wrap up here: we'll drop the broadcast for a moment to get the speakers switched around, and we'll be back shortly. Thanks a lot.