Welcome everyone to my talk. My name is Jakub and I work for Red Hat in the messaging and IoT team, mainly on something called Red Hat AMQ Streams, which is our distribution of Apache Kafka. One of the versions of Red Hat AMQ Streams runs on OpenShift, so that's why I'm going to be talking today about how to run Apache Kafka on Kubernetes as well as on OpenShift.

How many of you were at my talk yesterday, which focused more or less only on Apache Kafka? Okay, quite a lot, but not everyone, so I will try to give at least a quick introduction into what Kafka is and how it works.

Kafka was originally developed at LinkedIn and was later open sourced. It's designed from scratch to be scalable, distributed, fast, durable, available, all these cool buzzwords which are important, and it does most of these things through data partitioning. It basically shards the data into different partitions, which can then be distributed across a cluster of Kafka brokers and used by the clients to send or receive messages. Thanks to this sharding it is able to achieve high throughput and low latency, and it can handle quite large numbers of consumers and producers.

So, a few basics about how Kafka works. Messages are always sent to or received from a Kafka topic, but a topic always consists of one or more partitions, and most operations are actually done at the partition level. The consumers and producers usually connect directly to the partitions and send messages into a specific partition or read messages from a specific partition. The topic is more of a virtual object which groups the partitions together. Because the partitions are shards, each message is always written into only one of them. To achieve high availability, each partition can have multiple replicas: one of the replicas is called the leader replica, and that's where the clients connect, while the other replicas are called follower replicas and just replicate data from the leader. If the leader crashes for some reason, one of the followers takes over, becomes the new leader and continues serving the clients.

When the producer sends messages to the partitions, it usually does so based on a message key. The message key is hashed and the partitions are used as a kind of hash table, so every message with the same key always ends up in exactly the same partition. The partition is really just a commit log: messages are always appended at the end, so once they are written into the partition, their ordering is guaranteed for the consumers. When there is no message key in the message (a null key), the messages are more or less distributed round robin.

All the time I'm talking about consumers, and the API really is called the consumer API in Kafka, but in reality they don't really consume: when consumers receive messages from the partitions, the messages always stay in the partition and are not removed after they are received.
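To make the keying behavior from a moment ago concrete, here is a minimal producer sketch using Kafka's Java client. This is not from the talk; the topic name and bootstrap address are placeholders:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class KeyedProducerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Placeholder address, point this at your own cluster
            props.put("bootstrap.servers", "my-cluster-kafka-bootstrap:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                // Records with the same key are hashed to the same partition,
                // so their ordering is preserved for the consumers
                producer.send(new ProducerRecord<>("my-topic", "user-42", "first update"));
                producer.send(new ProducerRecord<>("my-topic", "user-42", "second update"));
                // A null key skips the hashing, records are spread round robin
                producer.send(new ProducerRecord<>("my-topic", null, "unkeyed message"));
            }
        }
    }

And since receiving messages doesn't delete anything, something else has to decide when messages go away, which brings us to retention.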
So the partition always has something called a retention configuration, where the user, when creating the topic and its partitions, can say: I want to keep the messages in this partition for some time, for example one hour, one day, one week, one year, or forever. Or the retention can be configured based on size: keep up to one terabyte of data in the partition, and when more data arrives, start dropping the oldest. These two can also be combined if needed. And then there is a special case called a compacted topic, where Kafka always tries to keep the last message for a given key in the topic and removes the older messages for that key (see the configuration sketch below). If you are, for example, writing updates about users or orders into a topic and you are only ever interested in the last state, this is very useful, because it can save a lot of disk space.

So that's a quick introduction into what Apache Kafka is. If you are interested in more details, I can recommend looking at the slides from my talk yesterday, where I describe it in a bit more detail, but this should be enough for us to talk about how to run Kafka on OpenShift and Kubernetes and what the main challenges are.

Now that we know what Apache Kafka is, maybe we should talk about why we would actually want to run it on something like Kubernetes or OpenShift. One reason is that Kafka is designed to be distributed and scalable, and the workloads using Kafka are usually distributed and scalable too, so it's quite a good fit for a platform like Kubernetes or OpenShift. But to be honest, this alone would probably not convince me, because if you came and asked me, hey, I want to run my Kafka cluster in production, how should I start, then probably one of the first things I would recommend is to get dedicated nodes instead of running the pods on the regular worker nodes shared with other workloads.

So there are other reasons as well. One of them is that Kubernetes provides a great abstraction. If we tried to write tooling for deploying and managing Kafka in all the different public clouds and on-premise environments, that would be really hard, because even though there is tooling, it's always a bit complicated and there are always differences. Kubernetes gives us this great abstraction where we can just use the Kubernetes primitives and not care so much whether we run on Amazon and use EBS volumes, or on OpenStack and use Cinder volumes as the persistent storage. We don't need to care; Kubernetes takes care of it.

There is also this notion of making everything feel like a cloud. When you use some AWS service, for example, you know how it works: you go to the page for the service, click something like new database, select some parameters, click apply or deploy, and in a few minutes you have a database available which you can just connect to and use. Kubernetes allows us to do pretty much exactly the same in all of these environments. Whether it's Kubernetes or OpenShift running in Amazon, Azure or on-premise, we can everywhere give users the feeling that they just request a Kafka cluster, it takes a few minutes, and the Kafka cluster is deployed and ready to be used.
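To put the retention part into configuration terms, these are standard Kafka topic-level configuration keys; the values here are just examples I picked:

    # Keep messages for one week (time-based retention)
    retention.ms=604800000
    # Or keep up to one terabyte per partition (size-based); both can be combined
    retention.bytes=1099511627776
    # Compacted topic: keep only the latest message for each key
    cleanup.policy=compact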
And then, last but not least, and I think this is one of the very important reasons for many of our users: a lot of companies and organizations are adopting OpenShift and Kubernetes more and more to run all their other workloads. And to be honest, why should they learn how to operate something new somewhere outside of OpenShift? If all their operations staff are trained to work with OpenShift and Kubernetes and understand it, then it's probably most efficient for them to run everything on OpenShift or Kubernetes and reuse this knowledge.

So that was the why, and now we are getting to the how. There have already been a lot of operator talks at DevConf and I'm quite sure there will be some more. How many of you have already been to a talk about operators? Almost half of you. So, like everyone else, we are using operators as well. Deploying software into Kubernetes or OpenShift is quite easy; you can really do that with just a bunch of YAML files or with some Helm chart. An operator can maybe do it a bit better in that you nicely configure everything in a custom resource, but it isn't really that different or that difficult compared to doing it directly with YAML files. Keeping the stuff running, that's the hard part. If you want to run something in production, whether installing it takes five minutes or one week isn't that important; the important thing is whether, after you install it, the software runs for five minutes, one week or three years. And that's where I see the biggest benefit of the operators, because they can help you through the whole life cycle of the application: with things like upgrades, certificate renewals, cluster balancing.

That's why our answer to running Kafka on OpenShift and Kubernetes is a project called Strimzi, which is of course an open source project. It provides container images for the different components, so Apache Kafka, Apache ZooKeeper and so on, but it also provides the operators, and our aim in this project is to give users a Kubernetes-native experience for running Kafka on Kubernetes or OpenShift.

And how many of you have used some operators, worked with them, or have some idea how they work? Not that many. So basically the way it works is that you install the cluster operator, which usually just runs as a deployment, as a pod, in your cluster, and then you create something called a custom resource. A custom resource is basically an extension of the Kubernetes API which allows you to configure and define new things; in our case, what you define with this custom resource is a Kafka cluster. Once you create this resource, the operator sees it and automatically starts deploying the cluster. Kafka has a dependency on ZooKeeper, so the operator first needs to deploy the ZooKeeper cluster and make sure it bootstraps, communicates and works. Afterwards it can go and deploy the Kafka cluster: the Kafka nodes connect with each other, but they also connect to the ZooKeeper cluster, which they use to find the other nodes and to store metadata and configuration.
And last but not least, we deploy something we call the topic and user operators, which are used later to manage topics and users in the Kafka cluster; I will talk a bit more about them in a few slides. So let's have a quick look at an actual demo which shows how this works in reality.

I have OpenShift running locally here. It's OpenShift 3.9, because in Strimzi we support OpenShift 3.9 and Kubernetes 1.9 and higher. First I need to install the Strimzi operator, and I will do that here using just a bunch of YAML files, so it's the usual stuff. There are some RBAC files which give the operator the different roles and access rights to the Kubernetes or OpenShift cluster. Then there are the custom resource definitions, the extensions to the Kubernetes API which allow you to define the Kafka cluster. And then the operator itself is really just a regular deployment like any other application. So all I need to do is run oc apply, and it creates all these resources and starts the cluster operator.

Now, when I want to deploy the Kafka cluster, I need to create the custom resource describing the cluster. I hope you can all see this; maybe I can try it without colors, the red one is not the best to read. What you can see here is that I'm creating just a regular resource. It looks like all the other Kubernetes resources, but the API version is actually something called kafka.strimzi.io and the resource kind is called Kafka. So that's our own resource which configures the Kafka cluster. And then you can see the different options which configure the Kafka cluster (I'll put a simplified sketch of such a resource below). For example, I say that I want three Kafka brokers in the cluster, I specify how much memory and CPU they should get, I configure some authentication and authorization, I set some specific Kafka configuration options, and I configure the storage. And because of ZooKeeper, I have to configure the ZooKeeper part as well. So the whole configuration is nicely summarized in this one custom resource. Then all I need to do is run oc or kubectl apply again, and the Strimzi cluster operator will see it. As you can see, it is already deploying the ZooKeeper cluster, so we have three pods which are already starting. The pods have built-in health checks, and once they are ready and the ZooKeeper cluster works, the operator moves on to deploying the Kafka cluster, which it is doing now, and so on. It works exactly as in the animation on the slides: one after another, it deploys the clusters and the pods.

Once the cluster is running, you can also decide to change some of the configuration. To do that, you basically just edit the Kafka custom resource and change whatever you want to change, and the cluster operator will see the change and automatically apply it to the cluster. For many of the changes, a restart of the different pods is needed, and in that case the operator takes the pods down one by one, to make sure the Kafka cluster is still running and available and your applications can use it while the update is going on. Only when the pod which was terminated has started again and is back in the cluster does it move on, shut down the next pod, start it again, and so on.
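Before we go on, for reference, a simplified Kafka custom resource along the lines of the one in the demo could look like this. The field names follow the Strimzi API of that time (v1alpha1); the sizes are illustrative, and I left out the authentication, authorization and resource settings mentioned above:

    apiVersion: kafka.strimzi.io/v1alpha1
    kind: Kafka
    metadata:
      name: my-cluster
    spec:
      kafka:
        replicas: 3                  # three Kafka brokers
        config:
          auto.create.topics.enable: "true"
        storage:
          type: persistent-claim
          size: 100Gi
      zookeeper:
        replicas: 3                  # ZooKeeper has to be configured as well
        storage:
          type: persistent-claim
          size: 10Gi
      entityOperator:                # deploys the topic and user operators
        topicOperator: {}
        userOperator: {}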
So, coming back to the update: this way you can change the configuration, and if you use topics which are replicated, which have multiple copies, your clients should have pretty much no problem working during the update, because when one of the brokers is shut down, one of the follower replicas takes over and serves the clients while that broker is stopping and starting again.

Let's have a look at how this works. With these custom resources I can really just do something like oc edit kafka plus the name of my Kafka cluster, and it gives me a text editor where I can directly edit the YAML. I will make a simple change. This auto.create.topics.enable option defines what happens when a client connects to the Kafka broker and tries to produce or consume messages from a topic which doesn't exist yet: whether the topic is created automatically, or whether you first have to create it through the Kafka API. I will change it to false, so that it doesn't create these topics automatically. And what happens now is that the Strimzi operator sees that there was a change to the configuration and starts rolling the first pod of the cluster. You can see that my-cluster-kafka-0 is already terminating, and once it terminates, which takes some time, the operator starts a new version of this pod, waits until it becomes ready, and then moves to the next one, so it will continue with my-cluster-kafka-1. You can see it's still terminating; in a moment we should see the new pod starting. Yeah?

Audience: Does this operator support a setup with two Kafka nodes and an additional ZooKeeper witness node, like a dedicated ZooKeeper node? It's the recommended setup, because if you overload ZooKeeper, the whole thing goes down. So I run Kafka as two nodes with Kafka and ZooKeeper on each of them, plus a third node which is just the additional ZooKeeper. Is the operator able to work in this setup?

So the question was whether the operator is able to run two Kafka nodes with collocated ZooKeeper and one additional node with a separate ZooKeeper. And yes, we are able to do this. I intentionally kept this custom resource a bit simple, but you can add affinity and toleration rules to it, so you can configure in detail, using the standard Kubernetes mechanisms, how the pods are scheduled (there's a rough sketch below). You can set up dedicated nodes for both Kafka and ZooKeeper, and you can configure yourself whether the Kafka and ZooKeeper pods should run on the same nodes or on separate nodes and so on. In the documentation we have procedures and guidance on exactly how to configure this, but we are pretty much using the Kubernetes and OpenShift primitives to do it. Yep?

Sorry, I didn't repeat the question for the video: the question was whether we support Helm. In the Strimzi project we have Helm charts, but they are basically for deploying the operator itself. We don't support deploying Kafka itself with Helm charts, only the operator.

You can see that the rolling update is still going on. The first broker, with index zero, is already running, and now it's terminating the second broker. It's quite boring to just watch the pods terminate and start, right? So let's get back to the slides.
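One more note on the dedicated-nodes question: here is a rough sketch of what such scheduling rules can look like inside the Kafka resource. Strimzi passes these through as standard Kubernetes affinity and toleration settings; the dedicated label and taint key are my own example, not something Strimzi prescribes:

    spec:
      kafka:
        replicas: 2
        # Only tolerate, and require, nodes tainted and labelled for Kafka
        tolerations:
          - key: dedicated
            operator: Equal
            value: kafka
            effect: NoSchedule
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: dedicated
                      operator: In
                      values:
                        - kafka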
And let's talk a bit more about the other operators we have. It's a bit like Inception, right? Operators deploying operators which deploy operators. But right now it seems that that's really the future, at least on OpenShift. In the Strimzi project we wanted to do a bit more than just manage the Kafka clusters, because if you run the Kafka cluster on OpenShift and use all these YAMLs and custom resources to deploy it, why should you then go and learn all the different Kafka commands for creating users or topics and run them the old-school way from some command line? That's why we created the user and topic operators, which give you a Kubernetes-native way of managing the topics and the users as well.

It works on the same principle as the cluster operator: you create a custom resource which defines the user or the topic, and the user and topic operators, which are always deployed per Kafka cluster, watch for these resources and, when you create them, automatically create the user or the topic. One of the advantages is that when you have some application which uses some topics and connects to Kafka, you have basically everything in YAML: a YAML with the application deployment if it runs inside Kubernetes, a YAML with the user definition, and a YAML with the definitions of the topics it uses. You can easily store all of these in GitHub and then just run oc apply on the whole directory, for example, or have it all in one big file, and everything is created and deployed for you. You don't need to run any other commands to install your application.

The topic operator is a bit special, because there are a lot of Kafka applications which connect to the Kafka brokers and create their topics on demand through the Kafka APIs. That means we cannot always treat the custom resources as the single source of truth for the topics. If an application connects to the broker and creates a topic named mytopic, and the operator saw, oh hey, there is no custom resource for a topic named mytopic, then the usual operator approach would be: there is no custom resource, so this topic should not exist, and it would go and delete the topic in Kafka. The application using this topic would not really like that, right? Maybe they would just play ping pong, creating and deleting the topic, or maybe the application would start crashing. So instead, the topic operator does a bidirectional synchronization between the Kafka topics and the custom resources. We don't treat the custom resources as the single source of truth: when an application creates a topic directly in Kafka, the topic operator creates the corresponding custom resource of the KafkaTopic type inside your Kubernetes cluster, so that you can use it to manage the topic as well. And there are mechanisms to reconcile things when they are changed in different places at around the same time.

So, to show a bit more of this, I have here a simple application. It's an address book application.
Those who were at my talk yesterday saw something similar, but this version is a bit different: it's not using a database, it's an address book which uses Kafka as the only state store it has. All the contacts in this address book application are stored inside Kafka as messages, and thanks to that we don't need any database. What's more important for this talk is what the YAML for deploying this application looks like (I'll put a sketch of both resources below).

You can see that I start with the API version kafka.strimzi.io and the kind KafkaTopic, and the topic is named address-book. Because the topic operator and the user operator run specifically for a given Kafka cluster, I always have to use a label to say that this topic should be created on the cluster called my-cluster. Then I just specify the details of the topic configuration: in this case I want three partitions and three replicas, I want the commit log segments for the topic to be 10 megabytes big, and I want to use the compacted cleanup policy, so that it always tries to keep only the latest state for each contact.

In a similar way I define the user, with the kind KafkaUser. The user is again named address-book, and I can specify that I want to use the authentication type tls, which means the authentication is done using TLS client certificates. Then I can specify a bunch of ACLs which say: this user should be able to write to the topic called address-book, it should be able to create this topic, and it should be able to read from it as well. Once these resources are deployed, the user operator sees them, and because we are using TLS client authentication, it automatically creates a secret where it generates the certificate for this user. When the application runs inside the same Kubernetes cluster, I don't need to export this certificate and load it somewhere else; I basically just map the certificate created by the user operator from the secret into an environment variable and use it to connect to the broker.

So with this YAML file prepared, all I need to do is oc apply, and it creates everything for me. I didn't need to log into Kafka, I didn't need to know any Kafka-specific commands; everything is described inside this single YAML. And when I go back to the console, you can see that the application is running and that I can add some contacts. So, add, and it takes a moment because it's still starting, but it will eventually be there.

So this was the marketing part of the presentation, which shows how easy it is to run Kafka and the applications. There are a lot more features which we support, and I don't want to spend all the time going through them. Just the main ones: apart from deploying Kafka clusters, we can also manage Kafka MirrorMaker and Kafka Connect, so it's not just the brokers but also the other Kafka components. And we can do all the different stuff: scale down, scale up, authorization, authentication, affinity, monitoring using Prometheus, everything. There are really a lot of features; if you are interested in them specifically, it might be best to have a look at the documentation.
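Here is roughly what those two resources from the demo look like. The field names follow the Strimzi API of that time (v1alpha1); the values are reconstructed from the description above:

    apiVersion: kafka.strimzi.io/v1alpha1
    kind: KafkaTopic
    metadata:
      name: address-book
      labels:
        strimzi.io/cluster: my-cluster   # which Kafka cluster this belongs to
    spec:
      partitions: 3
      replicas: 3
      config:
        segment.bytes: 10485760          # 10 MB commit log segments
        cleanup.policy: compact          # keep only the latest state per key
    ---
    apiVersion: kafka.strimzi.io/v1alpha1
    kind: KafkaUser
    metadata:
      name: address-book
      labels:
        strimzi.io/cluster: my-cluster
    spec:
      authentication:
        type: tls                        # TLS client certificate authentication
      authorization:
        type: simple
        acls:
          - resource:
              type: topic
              name: address-book
            operation: Write
          - resource:
              type: topic
              name: address-book
            operation: Create
          - resource:
              type: topic
              name: address-book
            operation: Read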
And if you have questions, talk with us on Slack, Twitter, email or anywhere else, because we will not manage to get through all of this in 20 minutes anyway. Instead, I want to talk a bit more about the challenges of running something like Kafka on Kubernetes and OpenShift. To be honest, running stateful applications is still quite hard on Kubernetes and OpenShift. I know some people might say: why is it hard? We have StatefulSets, they are called stateful sets, so they should make every stateful application very simple to run. But it's not really like that. StatefulSets basically provide just the primitives, just the building blocks. The main thing they give you is that your pods don't get a random address with some generated characters, but a stable name which is always the same. That's important for many stateful applications which use some custom clustering, because it makes it easier for them to find the other nodes of the cluster and connect to each other. But in general, most clustered applications differ in how the clustering is configured, how the different cluster nodes discover each other, how they elect the leaders and so on, so it's quite hard for Kubernetes to offer anything more.

In our case, Kafka uses ZooKeeper for pretty much all of these things, which is why we first need to deploy the ZooKeeper cluster before we deploy the Kafka cluster. And ZooKeeper forms its cluster in such a way that in the configuration file you actually have to list the addresses and ports of all the ZooKeeper servers which should be part of the cluster (there's a small sketch of this below). So this is something the operator helps to generate, and that's what brings the ZooKeeper cluster together. With Kafka it's then a bit easier, because Kafka just uses ZooKeeper for all these things like electing the leaders and discovering the other nodes. Once ZooKeeper is running, deploying Kafka is fairly simple: you just point it at ZooKeeper and it uses ZooKeeper to find all the other nodes. ZooKeeper really is quite challenging; in some configurations it's basically impossible to scale it up or down without losing the quorum, and when you lose the quorum, it's a problem, because it means some disruption, some outage of the Kafka cluster. So that was one of the challenges we had to overcome to make ZooKeeper work.

The next one is more philosophical. An operator like Strimzi needs to handle a lot of different use cases, and it's a really wide range. You want developers to be able to just start their Minishift and deploy a Kafka cluster with Strimzi on the few gigabytes of RAM they have on their laptops; that's one end of the spectrum. On the other end, we want enterprises to be able to deploy huge clusters with tens of nodes, dedicated Kubernetes or OpenShift hosts, a lot of memory, a lot of disks and so on; we're still waiting for this to happen. And another side of this is that you want it to be super easy for the developers to deploy and just run the Kafka cluster, so that they really do a single oc apply and they get all the features.
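To make that ZooKeeper bootstrapping point concrete: every server's configuration file has to list all the cluster members up front, so the operator generates something along these lines. The host names here are illustrative, built from the stable DNS names the headless service provides:

    # zoo.cfg fragment: peer port 2888, leader election port 3888
    server.1=my-cluster-zookeeper-0.my-cluster-zookeeper-nodes.myproject.svc:2888:3888
    server.2=my-cluster-zookeeper-1.my-cluster-zookeeper-nodes.myproject.svc:2888:3888
    server.3=my-cluster-zookeeper-2.my-cluster-zookeeper-nodes.myproject.svc:2888:3888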
But then, for production, every customer has slightly different requirements and a slightly different setup of their data center, so you also need to give the expert users a lot of flexibility, so that they can set up Kafka in the best way for their particular use case. And when you add all these options for flexibility, that's super great, but of course it sometimes makes the configuration a bit hard and complicated, because then you have all these options which I listed on one of the previous slides in the CRD. The CRD is actually quite huge and really hard to read, which makes it less user friendly. These are all things you need to balance when writing the operator, and sometimes you have to make a judgment call: you want to make it super easy for the users who just want to get it running somehow and don't care how, and at the same time keep it interesting for the experts who want to do some very specific setup.

Another challenge is which versions of Kubernetes and OpenShift you are going to support. Kubernetes is still very much work in progress in some areas, and every single release has a lot of cool new features. It's always very tempting to sit down, start playing with the new features in the latest Kubernetes release and get your operator to use them somehow. But at the same time, a lot of the users are not really running Kubernetes 1.13; if you have OpenShift users, there is no OpenShift version based on Kubernetes 1.13 yet, for example. Most users will be running older versions, so either you put the effort into the code to make the operator use the new features on new versions and only the old features on old versions, which can be a lot of effort because it's not always easy to do, or you say: we decided on compatibility with this version and higher, and the new features will maybe come over time as more users move to the newer versions. That's pretty much the decision we made in the Strimzi project: we basically said that Kubernetes 1.9 and OpenShift 3.9 are our baseline, we support that and everything newer, and that's why we don't always use the latest features from the latest releases. The good thing about Kubernetes is that in general it has very good compatibility, so if something runs and works on 1.9, there are only very few problems with using it on 1.13, for example. That's a very good aspect of how the APIs are handled.

And then one of the things we had a lot of discussions about is the question: custom resources or config maps? That ties into another problem, which is RBAC, the access control which is part of OpenShift. Using custom resources means we have a real Kafka resource type with a nice structure of the features, which makes it easy to use once it's installed. But to install the custom resource definitions, the user usually needs cluster-admin rights, and a lot of users on shared enterprise clusters don't have these rights. So that was another decision we had to make. We decided to support the custom resources, and I think it's worth it, but we had a lot of discussions about it, because it makes things really harder for some users.
And now, one of the big challenges for Kafka was accessing Kafka from the clients. If you are running something like web servers, that's usually quite easy: you have some service or some load balancer, the requests come to the service and are just routed randomly to one of the web servers. Sometimes it's a bit more complicated because you want sticky sessions and so on, but in general you often don't care much which of the web servers serves some static files with some images.

With Kafka this is a lot more complicated, because, if you remember what I talked about in the beginning, the producers and consumers need to connect to a specific partition to produce or consume messages, and the partition always lives on a specific node of the cluster. For example, imagine I have a topic called my-topic with three partitions: the messages sent to or received from partition zero will always be on this pod, whereas partition two will maybe be on this other pod. So when the producer wants to send messages to different partitions, it needs to maintain connections to all the pods, and it also needs to know: now I'm connecting to broker zero, now to broker one.

The way this works when you access Kafka from inside your Kubernetes cluster is that we have two services. One is the headless service, which pretty much just gives the pods stable host names, and the other is the bootstrap service, which is just a regular ClusterIP service. When the client tries to connect to Kafka, it first opens a connection to the bootstrap service, and the bootstrap service routes the connection to one of the Kafka pods. This is pretty much random; I drew the arrow here to the bottom one, but it can go to the one on top, it doesn't really matter, because this connection is only used by the client to request the metadata about the Kafka cluster. Whichever node it connects to will send it the metadata for the whole cluster, and the metadata says something like: partition zero of my-topic is hosted on broker zero, which has this particular address; partition one is hosted on another broker with this other address. The client takes these addresses and connects directly to the different pods, opening the subsequent connections which are then used to actually send and receive the messages.

Now, if I have a quick look in the console and open one of the Kafka brokers, at the beginning of the log we actually have the configuration file, and here you can see, I hope, I will zoom in a bit, the configuration of the advertised listeners (a simplified sketch follows below). These are the addresses which the Kafka brokers tell the clients to connect to. You can see that this broker tells the clients to connect to the address my-cluster-kafka-0.my-cluster-kafka-brokers.my-project.svc.cluster.local and so on. This is what the broker sends to the clients, and this is what the clients use to open the subsequent connections.

Outside Kubernetes it's a bit harder, because if your application runs outside Kubernetes or OpenShift, it cannot really use the internal Kubernetes DNS host names to connect to the pods. And unfortunately, a lot of users don't have all their applications running inside Kubernetes or OpenShift, right?
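Simplified to a single listener, that part of the broker configuration looks something like this; the real file in the demo has several listeners (replication, plain, TLS):

    # Broker 0: the address this broker advertises for direct client connections,
    # the stable per-pod DNS name provided by the headless service
    advertised.listeners=PLAINTEXT://my-cluster-kafka-0.my-cluster-kafka-brokers.my-project.svc.cluster.local:9092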
These users have a lot of legacy as well as brand new stuff which simply runs outside. So how do we do it from outside? We still have the headless and bootstrap services, but we also have per-pod services, and these per-pod services always route the client to one specific pod. They can be, for example, of the load balancer type, so that load balancers are created for them. When the client connects, it again goes first through the bootstrap service, randomly, to one of the Kafka brokers and gets the metadata. But for access from the outside, the address it gets in the metadata is not the host name of the pod; it is, for example, the IP address of the load balancer which was created for the per-pod service.

In general, in Strimzi we can do this in three different ways. On OpenShift we can use OpenShift routes to access the Kafka cluster. It's a bit of a hack, because Kafka traffic is TCP and routes are designed for HTTP, but if you pretend it's an HTTPS connection which is passed through the router into the Kafka pod, the router doesn't know there's plain TCP inside, so it works fine. Then we can use load balancers; there it depends on what your load balancers cost, because if you have 10 brokers in your cluster, you always need 10 plus one load balancers. And then we support node ports, which are the fastest performance-wise, but you have to expose the nodes directly to the connecting clients, so that's a bit insecure, or at least considered a bit insecure.

And that's pretty much it. One more thing: most operators are written in Go, but Strimzi is written in Java, so you can do operators in Java as well. The demos and the link to the slides can be found at this URL if you want to have a look later. That's it from me, and if you have any questions, I think we still have five minutes. Yeah?

So the question was whether we support, or plan to support, access from outside through Ingress. Yes, it is on our to-do list; we haven't gotten to it yet. There were also some issues with TLS passthrough in the Ingress controllers; I think they have been fixed in the latest versions, but that's what stopped us in the beginning. It's on our to-do list and I hope sooner or later we will support it as well. Yeah?

So the question was about other components, like the schema registry. To be honest, that's on our to-do list as well. We have to think a bit about licensing, because we want to use really open source components which everyone can use for whatever purpose, and the schema registry license, I think, recently changed to this Confluent license, which is not as easy as the Apache License 2.0 or something like that. But we have plans to support them; once it's implemented, it would either be specified as part of the Kafka custom resource or have its own custom resource to get deployed. We will get to it as well. Yeah?

So the question was basically why we did this in Java and why we put in the effort instead of using something like the Operator SDK. To be honest, when we started with this, there was no Operator SDK; this is not something we have been working on for just a month or two, it's really a long-term effort. And we chose Java because Java has good API support for Kafka; Go doesn't have such a nice admin API client for managing the different Kafka aspects. So that was one of the reasons.
And to be honest, the other reason was that most of the team members working on this were more familiar with Java than with Go. Maybe now, with the Operator SDK, it would be a lot easier to write the operator in Go, but when we were starting, I'm not sure whether writing it in Java really meant more effort than writing it from scratch in Go. Yeah?

Audience: Do you plan to extend this operator in Java to include, for example, operator metering?

So the question was whether we plan to extend it to include operator metering. To be honest, I need to learn a bit more about operator metering; we were looking into it, and I think sooner or later we will get there. For example, in OpenShift 3.11 we were already included in the operator hub, which was tech preview there, so it's easy to integrate with the Operator Lifecycle Manager, and I think in the future we will integrate with the metering as well, but we don't have any integration today. Yep? Sorry, once again?

The question was whether the operator can upgrade Kafka, and yes. It's not in the 0.9 version which I was using in the demo, but we have it implemented in master now, and we will release the 0.10 version of Strimzi with support for Kafka upgrades probably next week. It's a semi-automated, semi-manual process, because if you know how Kafka upgrades work, it's a bit more complicated: ideally you should upgrade the clients as well, and you should make sure the protocol version matches between the clients and the brokers and so on. Basically, the way it's implemented is that you use the custom resource to tell the operator which versions it should be using for the different parts. So you can say: now move the brokers from Kafka 2.0 to Kafka 2.1, but first keep using the old version of the protocol for storing the messages and communicating with the clients, and then in the next steps you upgrade the protocol as well. The operator does most of the things for you, but you still have to tell it, hey, please do the upgrade, because in the end you are the one upgrading the clients; the operator cannot do that for you. Yeah?

So the AMQ product from Red Hat contains many different components, and one of them is this AMQ Streams component, which is basically the Strimzi project offered as a product by Red Hat. And then there is, for example, the AMQ Broker, which is the more traditional message broker, and it really depends on the use case which one is better. That's a bit longer discussion, and I think we just ran out of time, so if you want, I will be outside after the talk and we can continue with this, as well as with all the other questions you might have. Thanks for coming.