Well, hello and welcome to another DevNation Tech Talk. We have an exciting topic for you today: we're going to be talking to Matthias. Matthias is coming from Germany, and he's going to give us a really awesome introduction to what you can do with Knative and Kubernetes, based on OpenShift, and this technology called Kafka. So you're going to see Knative, Kafka, and Kubernetes all woven into the same story. I'm actually very excited about it. This is a demonstration I really love seeing for myself, so we're going to see it directly from the source today, and that is Matthias. Let's turn it over to Matthias and get started.

Hello everyone, thank you for having me, Burr. Let's get started. So, as Burr was saying, today we have a quick session about serverless Kafka on Kubernetes, and that really means we run Apache Kafka in combination with the Knative project. A few words on me: my name is Matthias, and I work as a principal software engineer at Red Hat. I'm working on Knative; I'm an approver on the upstream project, specifically on the eventing parts, and inside of the OpenShift Knative team I'm leading our internal Knative eventing team.

Nowadays a lot of people have real interest in Apache Kafka, they love it, and they all want to run it on Kubernetes. The best answer to that on the market right now is the upstream Strimzi project. Strimzi basically gives you an operator which allows you a straightforward, Kubernetes-native way of managing and installing your Apache Kafka cluster, along with its ZooKeeper dependencies. What this looks like: the operator, once installed, watches for a specific Kafka custom resource, where you declare how many Apache Kafka nodes you need and how many nodes of your Apache ZooKeeper ensemble you need. Besides the cluster operator, it has another operator which gives you a declarative way to actually create users
that are allowed to access the Apache Kafka cluster. And it has a feature where you can declare your topics: in a YAML file you specify your Apache Kafka topic, give it a name, a partition count, a replication factor, and what have you. So that is the introduction part for the Kafka operator.

Today, Apache Kafka will be used behind the scenes of our serverless offering, and for that we use Knative. Let's do a quick recap of what serverless really means. What we see here is a definition from the Cloud Native Computing Foundation. The CNCF has its own serverless working group, and inside of that group they have specified what the terminology really means. I slightly modified this sentence with the Knative mindset applied to it. So what does serverless computing mean? It refers to the concept of running applications that do not really require server management. Serverless describes a deployment model where applications are uploaded to a platform and then executed, scaled, and billed in response to the exact demand at that moment.

So Knative is really a serverless platform on top of Kubernetes. The good news here is that you can bundle your application inside of a container and upload it to the platform. You don't really have a programming model here from a FaaS perspective; serverless here is really more of a deployment model. You can just bundle your containerized applications, and I'll show you in a few minutes what that really looks like. The nice part about some of the Knative components is that, based on the actual workload,
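As a concrete sketch of the Strimzi resources described a moment ago (node counts, names, and storage settings are illustrative, and field details vary by Strimzi version):

```yaml
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-cluster
spec:
  kafka:
    replicas: 3          # how many Apache Kafka broker nodes you need
    storage:
      type: ephemeral
  zookeeper:
    replicas: 3          # how many ZooKeeper nodes you need
    storage:
      type: ephemeral
  entityOperator:
    topicOperator: {}    # declarative topic management
    userOperator: {}     # declarative user management
---
apiVersion: kafka.strimzi.io/v1beta1
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-cluster
spec:
  partitions: 100
  replicas: 1
```

Applying the Kafka resource makes the cluster operator stand up the brokers and ZooKeeper nodes; the KafkaTopic resource is picked up by the topic operator.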
your container is scaled up as needed, or even scaled down to zero, and you are therefore billed for the exact amount of compute resources you need at the moment your service is being used by your customers. So that's a little introduction of the serverless part.

Now let's enter Knative. Knative consists of two key components. We have Knative Serving: as you can see here, a request-driven model that serves a container with your application and can scale it to zero. In Knative Serving you basically declare your application as a Knative Service. That means you give it some YAML-based configuration referencing a Linux container, you apply it, and then Knative takes care of the scaling. If your application is requested very often and has a lot of load coming in, the autoscaling mechanism from Knative sees that: it notices there is a lot of incoming HTTP traffic, and what it really does for you is automate traffic-based scaling. You will see a lot of pods based on your container coming up. It also recognizes when the traffic is relaxing, and then it scales the pods back down. And if, for whatever reason, there is no traffic at all, you get a really nice benefit: it can scale to zero. That means there is no actual running pod, so you're not burning money; you're not wasting compute resources on a program that would be sitting there doing nothing. So that is one key part of the Knative project.

Today we are going to focus on a few parts of the Knative Eventing project. Knative Eventing, you could say, provides a common infrastructure for consuming and producing events, which will basically stimulate your application. Knative Eventing can really be seen as a kind of event mesh: you have connectors, called sources, that can access a third-party
system, for instance Apache Kafka. They read the payload out of Apache Kafka and send it along the wire to a Knative Serving service, which at the end of the day is your application. So it really provides you plumbing infrastructure that you can interweave with a lot of events. You can build event-driven applications that, based on the scaling model, can even scale to zero. This is a very nice feature of Knative Eventing. Knative Eventing can really be seen as universal subscription, delivery, and management of events. It is all you need when you work with event-driven features and when you want to build a fully-fledged application that does event processing.

Knative has a few different modules here. On the right-hand side I listed the extensibility features: Knative on its own has a lot of eventing sources that you can use to access existing third-party systems. For instance, we have one upstream for GitHub, so you can combine GitHub events, like pull-request-related events, with your serverless application. For instance, when a pull request has been merged, it triggers an event; the Knative Eventing part receives this event, and it is forwarded to your application. We also have sources that allow you to integrate Camel integrations via the Camel K project. That means all of the integration knowledge you have that is applicable to Apache Camel can be used there as well.

Behind the scenes, the events are routed over an internal transport mechanism called a channel, and there are various channels available. There is the in-memory channel, which is the default you get installed all the time, and there is an Apache Kafka channel. So all the messages that are used internally as transport over this channel, in our case Apache Kafka, are actually persisted in an Apache Kafka
topic. So the benefit here is that you get all of the features you have from Apache Kafka, like replaying events, and you can even access the events from that particular topic with your vanilla Apache Kafka tools. And there is another channel implementation for GCP Pub/Sub. One of the big benefits here really is the aspect of event orchestration: being a Kubernetes-based platform, Knative has declarative APIs to distribute events. As I was already mentioning, it allows you to scale from just a few events to a fully-fledged live data-streaming pipeline, and we will see that later in a demo. Worth noticing is that the CloudEvents specification, which recently went 1.0, is a first-class citizen here. All of the Knative Eventing sources, when they read data from a third-party system, transform the payload into a CloudEvent and route that CloudEvent object forward to your application.

So the first demo that we're going to see is basically one of the very simple use cases that you can build: a source that is directly connected to a service. In this case, I'm using a Kafka source and connecting it straight to a service using the sink API of the Kafka source. This is basically the simplest way you can imagine to get a CloudEvent from a Knative source to a particular Knative Serving service. However, this very simple use case has a few drawbacks right now. There is no queuing support when the service is unavailable. It's a one-to-one relationship, so only one service can consume at a time, and since it's a straightforward, simple direct connection, there's also no back-pressure support.
Also, there's no specific filtering: all events that are produced by the source are received by that particular one service. So you have a one-to-one connection from the particular Kafka source to a Knative service.

All right, let's take a look at some code now. Before we do that, I would like to show you the OpenShift console. I'm using an OpenShift 4.2 platform here with our serverless product. The serverless operator gives you the Knative Serving project; I was using the operators that are integrated in the OpenShift 4.2 OperatorHub, so you can basically do a click-and-install of all the features you need to run the Knative Serving parts of the demo. In combination with that, we have community operators that we wrote, which basically give you Knative Eventing. This installs all of the Knative Eventing features: the ability to run different sources, the in-memory channel, and so on. And at Red Hat we also created another operator which bundles all of the Knative components for Apache Kafka. In our case, we are going to use the Apache Kafka source, and we are also going to use the Apache Kafka channel, and you get both by using these operators.

Now let me go to my terminal. I have the window split up: here in the top of the window, I'm in a watch loop that is basically getting all of the pods deployed in my OpenShift default namespace. And let me go into my first demo here. Okay, so I installed two different YAML files. Let's first start with the event display. That is basically a Knative Serving service that is going to receive my events. Being a Kubernetes-based platform, we have an API group, serving.knative.dev, and this is a kind Service. I have given it a name: it's called event-display. I have created some annotations here for scaling, so it can react to five concurrent requests in this particular case, because it's a little demo here.
I want to show the autoscaling feature. And what we see here: all I need to reference, in terms of where my code is, where my real application is bundled, is just a reference to a Linux container that I stored on my private Docker Hub account. I have already installed this one, as we can see. Let me do a quick `get ksvc`; `ksvc` is the short name for a Knative Serving Service. What we see here is that I have the event display already installed, but as you can see in this terminal, it's not yet running, because there's currently no load; nobody is accessing my event display application.

Okay, let's have some fun with this one. Now I'm showing you the source, and this is the interesting part. What we see here is a different API group: what I have here is sources.eventing.knative.dev, and the kind is KafkaSource. Inside of the metadata I have given it a very simple name: it's my-kafka-source. And inside of the specification I'm giving it Kafka-specific configuration values. I'm using a consumer group here, knative-redhat-developer-demo. I have a bootstrap server, and that is, by the way, the only reference I need to have for my Kafka. As you see here, in the default namespace of my project there is no Kafka broker running; I don't need that. Those are all infrastructure pieces that are shielded away from me; I only have to have the reference to the particular server. What I'm saying here is: I have a topic called my-topic, and all events that are read from this my-topic are routed to a sink, and as the sink I'm using a Knative Serving service with the name event-display.
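Put together, the two files just described look roughly like this (a sketch: API versions and field shapes vary across Knative releases, and the image and bootstrap-server references are illustrative stand-ins):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: event-display
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/target: "5"     # react to five concurrent requests
    spec:
      containers:
        - image: docker.io/example/event-display  # illustrative container reference
---
apiVersion: sources.eventing.knative.dev/v1alpha1
kind: KafkaSource
metadata:
  name: my-kafka-source
spec:
  consumerGroup: knative-redhat-developer-demo
  bootstrapServers: my-cluster-kafka-bootstrap.kafka:9092  # the only Kafka reference needed
  topics: my-topic
  sink:                                          # where the CloudEvents are delivered
    apiVersion: serving.knative.dev/v1
    kind: Service
    name: event-display
```

Note that the source only needs the bootstrap server address; the broker itself can live in a completely different namespace or cluster.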
So now let me apply this one. What we now see is that I created a Kafka source with the name my-kafka-source. It's created; we see here in the terminal a container being created, and once the status changes to running, we will see a few event display pods coming up. There's the first one now: the Knative activator basically saw that there was traffic coming in, and then the autoscaler noticed, well, there's a lot of traffic there, so let me create a few more pods, so I can guarantee that this application called event-display is not going dark. Based on the Knative autoscaling feature, I get a few pods.

If we are interested in how many messages I have in my topic, I prepared a little bit: I execute a little script that basically calls into Kafka, and I see I have some 203 thousand messages in my particular system. Now, as we speak, the payload is being processed, and eventually pods are being cleaned up. That means once these 200k messages have been distributed to the different replicas of my deployment, Knative will start to terminate pods as they are no longer needed. The autoscaler also reflects the amount of traffic, so when it is relaxing, it scales back as well. What we see here in our case is: once these 200,000 messages have all been processed, we will see some termination, and that just happened right now; it keeps just one instance running. The default configuration in my case is two minutes, so if we continue two more minutes without any traffic here, we will also see that this guy goes away.

Okay, that's the first demo: running Kafka payload through a Kafka event source to my application. You may now also wonder what this application really looks like. Well, I have a source code file here, and that's really all we have: it's a Go-based application.
It's just a nice display: the display function is used inside of main. The main function basically launches my binary, and what I'm doing here is I get a default client from the CloudEvents API, and with the client I start the receiver. This receiver gets a callback function, which is my display function. What the receiver does for me is basically start a web server; behind the scenes the thing is binding on port 8080, and events come in on the root route. That is the only requirement we really have here.

Okay, let me clean up the resources for my source. So now it says the source is deleted; it will terminate the source, and eventually my event display applications are also scaled to zero. So when we come back to this terminal, no more workload is around, and we will see that this last guy is also gone. Let's go back to the slides real quick.

All right, so the first simplistic demo that we had was a direct combination of source and service, basically tying it one-to-one to a particular service. As I was discussing before, there are a few drawbacks: I have only one service that can consume. So there's another, more advanced use case that we have here, and that is that one or more sources can basically emit their events to a channel. The channel is a generic API construct, and as I was saying before, we have multiple implementations, like Kafka or GCP Pub/Sub. Now we have the nice benefit that we can actually have multiple subscriptions on one channel, and all events that are on this channel are handled by all the subscriptions we have. So in our graphical overview, what we have is basically two subscriptions to one channel, and that means that both of these services will get the same events. So there is the advantage that I no longer have a one-to-one connection issue here;
I have multiple subscribers that can consume the same event. So service one that we have here could perhaps store the event in a database, and service two could do some data processing and could even reply to it. The nice thing here is that when we actually reply with a value, we can leverage the reply feature of the subscription: the HTTP response from my web server that is doing the data processing will go to another channel. On this reply channel, I could then again have a lot of subscriptions that are doing some other kind of processing. So the channel API and the subscription with the reply channel give you a powerful tool to build data-driven applications. There's still one drawback: if you don't want to use the default channel, you have to manually install the one you want, which I did very conveniently using the Red Hat operators I was showing you before. But still, there is no filtering. Filtering can be done by using the Broker and Trigger APIs, but that is a little bit out of scope due to time limitations today.

And now I'm going to my last demo for today. Okay, I'm back on the terminal, and as you see, I deleted the source. That means the source no longer has any events to emit, and Knative was so nice as to clean up all of the resources for me. I still have the deployment specification for my application, but I have no running pod, so I'm not wasting any compute resources.

Okay, let's go to my channel. With the channel, I created a different application. It's more or less the same event display; I just gave it a different name: it's called channel-display. We saw that code already. Let's take a look at the next file, which I actually already applied before. What we see here is a descriptive file for my Kafka channel. It's in the messaging.knative.dev API group, and it has the name kafka-channel-one. Here in the spec,
I'm giving some Kafka-specific configuration. If you had a different channel, for instance an AMQP-based channel, you would most likely put something inside of the spec that makes sense for AMQP. In the case of Apache Kafka, however, the number of partitions as well as the replication factor are things that you usually configure on your Apache Kafka topic. So this channel is called kafka-channel-one, and behind the scenes we basically have the Kafka topic. Let me verify this by showing you: what I'm doing here is listing all of the topics that are actually inside of my Apache Kafka installation, which is managed by Strimzi. We see we have a topic called my-topic, and we see a topic prefixed with knative-messaging-kafka. That is a prefix from Knative: the Knative messaging APIs prefix their stuff, and because we have a Kafka channel, it's hyphen kafka. Then the topic name has another part, "default", representing my namespace, and then I see kafka-channel-one, and this perfectly matches the name of my channel. Okay, so this is already there. Now let's see what else we have.

Okay, in order to actually get events from a source emitted to a channel, I ultimately need a source. Let me take a look at the source I have here. My source is a generic container source, which basically allows you to bring your own container. So inside of the container source, you can specify an image, and here I have my image:
that's a WebSocket adapter, and I have parameterized it with the argument of a source. This is basically using a secure WebSocket connection: my container source is running this WebSocket source, which connects to a WebSocket feed from the University of Newcastle, reading all of the sensor data they have, all of the IoT sensor data inside of their building, like temperatures, etc. It turns them into CloudEvents and routes them to my channel, which in this case, as you would have guessed, is an Apache Kafka channel called kafka-channel-one.

However, this goes outside of the cluster, so what we need here is a little bit of an egress rule. We have to have a ServiceEntry, that's some Istio stuff, where we basically allow the host we are accessing and the protocol HTTPS on the default port. As you saw, that was WebSocket Secure, but WebSocket Secure is established over HTTPS, so we see HTTPS here. And we have another virtual service for Istio, so that my application running inside of the cluster can actually talk to the University of Newcastle.

Okay, let me apply my source here. And now I see the container source is being created. The status here is "creating the container", and very soon it will be running, but nothing will happen, so let's wait a second until we get the running status. So now the source is actually already receiving messages from Newcastle University and throwing them at the channel. However, I don't have any subscription, so nobody is interested in reading the events right now. Now I'm using a subscription to basically route the events from the channel to a configured service. The subscription has the name subscription-one, and it says: I want to read from the channel, which is again my kafka-channel-one, and the subscriber is a reference to my service, the channel-display.
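Pulling the second demo's resources together, they might look roughly like this (a sketch: API versions differ across Knative releases, and the adapter image, feed host, and partition counts are illustrative stand-ins, not the talk's actual values):

```yaml
apiVersion: messaging.knative.dev/v1alpha1
kind: KafkaChannel
metadata:
  name: kafka-channel-one
spec:
  numPartitions: 10        # Kafka-specific settings live in the channel spec
  replicationFactor: 1
---
apiVersion: sources.eventing.knative.dev/v1alpha1
kind: ContainerSource
metadata:
  name: websocket-source
spec:
  image: docker.io/example/wss-adapter          # bring-your-own-container adapter
  args:
    - --source=wss://feed.example.org/stream    # illustrative secure WebSocket feed
  sink:
    apiVersion: messaging.knative.dev/v1alpha1
    kind: KafkaChannel
    name: kafka-channel-one
---
# Istio egress: allow the in-cluster source to reach the external feed
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: websocket-egress
spec:
  hosts:
    - feed.example.org
  ports:
    - number: 443
      name: https
      protocol: HTTPS      # wss is established over HTTPS
  location: MESH_EXTERNAL
  resolution: DNS
---
apiVersion: messaging.knative.dev/v1alpha1
kind: Subscription
metadata:
  name: subscription-one
spec:
  channel:
    apiVersion: messaging.knative.dev/v1alpha1
    kind: KafkaChannel
    name: kafka-channel-one
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: channel-display
  # reply:                 # optionally forward HTTP responses to another channel
```

A second Subscription pointing at a different subscriber service is all it takes to fan the same events out to two consumers.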
So now let's apply this one, and then we will see a lot of pods coming up for this particular subscription, if the demo gods are with me today. Looks like they are. So now, again, the same logic that was happening before: all of the messages from this source go to an Apache Kafka channel, and they are actually persisted in Apache Kafka. Once I have the subscription, the event delivery kicks in, and all of the messages from the particular channel are routed to the HTTP endpoint of my channel-display. What we see now is the system distributing all of the messages, and I get a lot of pods here. Now let me see if I can apply a second subscription to a different service. So I have a second subscription, and what we see in a few seconds is that a few more pods come up that handle events for subscription number two. So we will see channel-display and channel-display-two; those are completely different services, running against the same channel, but managed by different subscriptions. What we see here is that I have a lot of channel-display pods, all processing the data, and for channel-display-two, managed by the second subscription, containers are being created. The window is a little bit small here, but you see now I have a lot of pods running. All right, what you see as well is that now, after the buffering, the traffic is relaxing; a lot of the instances are being terminated, and only those that are relevant are kept alive by Knative. This basically concludes the demo.
I now have a little bit of a summary. So Knative, and particularly what we saw, the eventing part, together with the Apache Kafka infrastructure pieces, the Knative Kafka source and the Knative Kafka channel, really gives you a good flavor of how to run your Kafka-based workloads and deliver them to a serverless product. I would like to recommend two links: go ahead and play with Knative. There is an upstream tutorial that shows you how to install it on different clusters, including OpenShift. And for our Red Hat serverless product, I was already showing you the operators in the browser when I was showing you my cluster. I would also like you to reach out to this website, where you can learn more about our tech preview for OpenShift Serverless. And that basically concludes the demo; let me stop screen sharing here.

All right, so we're going to have some questions for you, but we only have a few minutes left, and I've tried to answer a lot of questions in real time. So we've got Matthias here on the screen. Okay, so Matthias, there are actually a couple of good deep-dive questions I want to make sure we touch on. One is: what about a fan-out scenario? You know, fan-out, the traditional messaging enterprise integration pattern. And then, what to do about making sure every message is processed?

Yeah, in terms of Kafka, we were using here the exactly-once delivery mechanism. The messages are basically read from the different topics; in our case we had a Kafka topic based on 100 different partitions, and the source, for instance, is implemented in a way that one Go routine matches one partition and delivers the messages to the HTTP channel, and then, based on the scaling mechanism, they are round-robined and distributed to the different services. But underneath, in
terms of Kafka, it's the exactly-once delivery mechanism that's applied here.

Yeah, exactly-once delivery, that's a good point. And then there are a couple of questions around the channel, and I think the channel is kind of confusing. Is the channel itself also a Kafka topic, or is the channel separate from being a Kafka topic? Can you speak to that?

Yeah, sure. The channel is basically an HTTP application. The incoming messages, for instance those emitted by a source to a channel, are received over HTTP. Our source that we saw in the second example was a WebSocket-based source: it reads messages from the third-party system over a WebSocket connection, and these events are turned into CloudEvents and thrown at the HTTP endpoint of my channel. The implementation of the specific Kafka channel then takes every incoming message that enters the channel through HTTP and persists it on a matching Kafka topic. So the messages that you see on the Kafka channel are actually persisted and made available for as long as you want, based on your configuration for Kafka, and you can also access them using your standard Kafka tools, because they are based on Kafka topics. And then they are basically rerouted as soon as you have one or multiple subscriptions, as I was showing in my demo. So the underlying storage mechanism for a channel, in our demo case, was Kafka, but the channel on its own, as an interconnection, has an HTTP endpoint which receives events; and then, once you have a subscription, the messages are read from the particular Kafka topic that's backing this particular channel and forwarded, if you will, to the particular consumer, which is the service in the subscription.

Okay, and we're out of time, and I apologize: there are more questions, but I tried to get to as many as possible. But the last one I'll throw out here,
Matthias, is: is Kafka significantly better than other queuing or messaging solutions that are out there? What's your opinion on Kafka versus, let's say, AMQP or some other messaging broker, just to put that in context?

I think that's really based on the use case that you have, so I would not recommend going blindly with Kafka. It's really based on your use case. Kafka is good for certain things: it's very good, there's good disk-oriented I/O around it, so you can distribute and scale very fast. And Kafka's scaling mechanism is very nice: if you increase the number of partitions on your topic, you can add more consumers to it, so you have horizontal scaling there. In other, more traditional messaging systems, that's slightly different. So, based on the use case, I would actually recommend going with the messaging system that makes sense for you.

Okay, and then, Kafka... Sorry, one more plug: Kafka also has some nice APIs around it, for instance for data processing. If you're interested in doing full stream processing on your Kafka messages, then you could use the Kafka Streams API, for instance with the Quarkus Kafka Streams extension. Then you have fully-fledged APIs that really make sense in a particular context. So it gives you a good tool for various things, but if you have certain use cases, double-check whether it makes sense to use Kafka or not; that's my generic recommendation.

Okay. And if you have a GitHub repo with some examples or anything like that, feel free to let us know about that. And if anybody wants to email me and ask your questions, you have my email address from the session. But we are totally out of time today. Thank you so much, Matthias, and thank you all for showing up. Yeah, awesome stuff. Well, thank you all. We'll see you at the next DevNation Tech Talk.