Good evening everyone, good evening. Today we are going to look at event-driven architectures, and everything will be based on open-source technologies. As you will see, I work at Red Hat, but not all the technologies we are going to use come from Red Hat. For that reason people sometimes worry that this is going to be some kind of sales pitch; it is not. So, first of all, a quick introduction.

As you can see here, I'm Carlos Arnal, I'm a software engineer, as I said, working for Red Hat. I am part of the Apicurio team. Here it says I work on Apicurio Registry. That is partially true: I do work on Apicurio Registry, but the Apicurio community is a bigger umbrella with more projects than just Apicurio Registry. In today's talk, though, Apicurio Registry is the project that is relevant to the data pipeline. As you can see, I'm also an active Quarkus committer, for those who are familiar with the Quarkus project. You are probably more familiar with the Spring project; Quarkus is an alternative to Spring coming from Red Hat, and a very interesting one. I'm also an associate professor at the Polytechnic University of Catalonia, known as Barcelona Tech for those who like fancy names.

First of all, let me make a small warning. This talk is going to be a bit different, in the sense that I don't like presentations with a lot of text. We will have 12 slides and I will do a live demo, and that demo will take most of the time of the presentation. I'm making that warning because I know that most talks take questions at the end. Since this is going to be a live demo, if you have any question in the middle of the demo, please raise your hand, ask it, and we will continue from there; I will do my best to answer it, okay? I think that's better, because once we are in the middle of something in the demo and we move on to the next thing, it's going to be hard to explain what happened and what didn't, okay?

So, continuing with the presentation, this is our agenda for the talk. First, I'm going to introduce the different components and why we chose them: their benefits, their pros, their cons, and why they are a good fit for an event-driven architecture. That's the discussion for this talk, okay? Then, as I said, I'm going to do a step-by-step walkthrough, deploying the entire data pipeline live in the talk, so I hope the demo gods are with us, because it's going to be a live demo, using Strimzi, Kubernetes, and other open-source tools, as I mentioned. And finally, I'll do a wrap-up, share any last thoughts, and open the floor for last-minute questions in case there are none during the live demo, okay?

So, let's start talking about the different components we have. The first component is OpenShift. OpenShift is an application development and delivery platform that allows you to streamline your workflows and get into production faster. It includes, for instance, building Jenkins pipelines, but it covers a wide range of technologies and capabilities, okay? You can also bring development, operations, and security teams together under a single platform to modernize existing applications. Imagine you have the classical monolithic architecture: you can bring your monolithic application to OpenShift and start breaking it into smaller pieces.
Or, you may already have a microservice architecture, which is the current trend, right? Although in software we live in a cycle: now people are starting to tear microservices apart and go back to monoliths. It's always a cycle. And, of course, with OpenShift you can also go to edge computing: essentially, you can deal with data and compute as close as possible to the device, to the source of the data, okay?

The next tool we are going to talk about is Strimzi. Strimzi is, essentially, a Kubernetes operator that allows you to deploy Kafka in an easy way. As you will see, and I'm going to prove it in the live demo, with a single YAML file you can have your Kafka cluster running in Kubernetes. In this case it's going to be OpenShift, but OpenShift is built on top of Kubernetes, so everything I'm going to show works on OpenShift and it works on plain Kubernetes as well. A typical deployment incorporating Kafka components, and that will be our case today, might include the Kafka cluster of broker nodes, the ZooKeeper cluster of replicated ZooKeeper instances, and also the Kafka Connect cluster for external data connections. That will be our case today: we're going to deploy a Kafka cluster, okay?

Then, another component, the one I'm most proud of, basically because I work on it with my team, is Apicurio Registry. Apicurio Registry can essentially serve two purposes; in our case today, the use case that is interesting to us is the schema registry use case. What is a schema registry? Imagine you have an application that is sending messages to Kafka. Usually those messages are serialized using a schema, in some format: Avro, Protocol Buffers, JSON Schema, whatever schema type you are using. So, instead of sending the whole schema with every message, you just identify it and externalize that information to the schema registry application, okay? That gives you the benefits you can see here. You can decouple the schema from your application: no more having the schema defined inside the application. You have a central location and a single source of truth for all your schema definitions. You can have different schema registry deployments, one for production, one for pre-production, and so on, or just one with different access management controls. And, of course, one of the very interesting things about a schema registry is that it is a centralized place for schema evolution. You can define rules to ensure that your schemas evolve in a safe manner, in the sense that you can define forward and backward compatibility rules for your schemas, and if you try to define a new version of your schema that breaks those rules, your producer won't be able to publish new messages, in this case to Kafka but really to any messaging platform, preventing those messages from being corrupt for the consumers. Okay?

Then, another very interesting tool, for those who are not familiar with it, is Debezium. Debezium, essentially, is a change data capture platform. And what's that? In our case, we are going to be taking information from a Postgres database, publishing the changes in that database to Kafka, and getting those changes into Elasticsearch.
How does Debezium do that? Debezium lets your apps react every time your data changes, in the sense that you don't have to change your application to react to the data. It continuously monitors the database, basically reading the transaction log, as you can probably imagine, and streams every row-level change, so you can see all the changes and deal with them as you see fit, right? So, as you can imagine, you can take that functionality out of your application and put it into a different service. As you can see in the second point, there is no more need for a single application to deal with everything: I am going to update the database, I am going to update the search indexes, I am going to flush the cache, I am going to send the notifications. No more need for everything in the same application, right? You can have that code and that functionality separated into a different place. And, as I was saying, it's looking at the transaction log, and that is a very important part, because without that point none of the others makes any sense. No matter what happens to your applications, once Debezium is processing the changes from the database, it will continue from where it left off, so you are not going to miss a single one of your changes. Okay?

Now we have, finally, the last tool, so to speak, that we are going to use in this presentation, which is Elasticsearch. Elasticsearch, as most of you know, is basically for searching; we are going to use it to build search indexes for applications. Some of its characteristics: it's very easy to configure, and I can prove it in the live demo; it has very good scalability; and you can ingest data from almost anywhere, right? From a database, from a lot of other places.

And now, getting into the interesting part. As I was saying at the beginning, we are going to use a PostgreSQL database in this presentation. That PostgreSQL database is already populated with a few tables; if I recall correctly, because it's been some time since I last looked at the database (I'm kidding), it's five tables: one with customers, one with product items, and so on, a classical database. We are going to deploy a PostgreSQL source connector that is going to take the changes in that database and publish those changes, serialized, into Kafka. How is it going to do that serialization? It's going to serialize that information using the serialization support provided, in this case, by a schema registry: Apicurio Registry, the project I work on. So it's going to use that to serialize the changes in the PostgreSQL database and send that information into Kafka using Avro. Then we are going to take those changes with the Elasticsearch sink connector, and we are going to build a search index with those changes. You can see another element in the middle: the Kafka UI. The Kafka UI is a very interesting tool, another open-source tool that allows you to browse your Kafka topics, your Kafka messages, everything that is related to Kafka, and we are going to see it in the presentation. Okay? And now, if you scan this QR code (it's a bit unusual, but if you scan it), it will take you to the GitHub repository that is the base for this presentation.
If you're curious after the presentation, you can go to the GitHub repository and do exactly the same thing I'm going to do in the presentation, okay? Exactly the same. So I can prove that it works, and you can deploy the exact same thing I'm going to deploy. So let me actually exit the presentation. Let me change this thing here, just a moment, so we can see the same thing, because on the other screen it's a bit too small for me. One second, because I want to have the other part here. Okay, I hope it's readable. It's fine, I can make it bigger. Okay? Great.

So let's start. This demo basically starts with the very first component. Why? Because it's the component everything else depends on; you can see the information here, in case you don't know what I'm talking about. The very first component we are going to deploy is Strimzi, the Strimzi operator. Remember that we are sending changes from the Postgres database to Kafka, so if I don't have a Kafka cluster to send those changes to, I can't do anything, okay? For this presentation I already have an OpenShift cluster deployed and available; it's basically very similar to a Kubernetes cluster in that sense.

The first thing I'm going to do, for those of you familiar with OpenShift, is go to the menu here, to the OperatorHub. It's the place where operators live, and I'm going to install the Strimzi operator, which is the Kafka operator. And what is happening now? This is creating a new deployment in Kubernetes, in the OpenShift cluster. It takes a bit of time to install Strimzi; it's probably the slowest part of the presentation, sorry about that. But, as you can see, the operator has been created. Now we are waiting for the operator to be ready, and once it's ready I will continue with the presentation. Basically, this gives us a set of custom resource definitions for the Kafka cluster, for the Kafka connectors, for the Kafka Connect cluster, and so on. Then I will be able to see all the information for the Kafka cluster: whether it's ready, whether something is not working, and so on. I promised it would be a couple of minutes.

If I look at the operator, as I said, here you have all this information. There is a tab for Kafka; that is for the Kafka cluster. Obviously, I haven't created a Kafka cluster yet, I haven't defined anything. There is another tab for Kafka Connect, and the same for Kafka connectors. Those are the three elements we are going to use today: the first one is the Kafka cluster, the second one is the Kafka Connect runtime, and the third one is the connectors that get deployed into Kafka Connect.

Continuing with the presentation, the first thing I'm going to do is deploy the Kafka cluster. I have the custom resource for it defined in this file; once you have cloned the repository, you can see what is in it. I apply it from the terminal, and the deployment consists of three things: one is ZooKeeper, which is a coordinator for the cluster; another is the Kafka broker; and then the operator deploys one more thing, which is the entity operator for this particular cluster.
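This is the single YAML file I was talking about when introducing Strimzi. As a rough sketch of what such a Kafka resource can look like (the cluster name, replica counts and ephemeral storage here are illustrative assumptions; the exact file is the one in the GitHub repository):

```yaml
# Minimal Strimzi Kafka custom resource: three brokers, three ZooKeeper nodes
# and the entity operator, matching the three kinds of pods described above.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster            # illustrative name
spec:
  kafka:
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    storage:
      type: ephemeral         # a real setup would normally use persistent storage
  zookeeper:
    replicas: 3
    storage:
      type: ephemeral
  entityOperator:
    topicOperator: {}
    userOperator: {}
```

Applying a resource like this with `oc apply -f` is all it takes; the Strimzi operator then reconciles it into the pods we are about to see.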
So, as you can see, we have three ZooKeeper pods; ZooKeeper is usually deployed in a clustered way. And there are also three pods for the Kafka brokers. When those are ready, the operator creates one more pod, which is the entity operator for this cluster, and once that one is ready, the Kafka cluster is ready.

Now, what I want to show is that, if I go to the installed operator, you can see that there is now a Kafka cluster created. I created the Kafka cluster manually, using the oc apply command in the terminal, but the Kafka cluster is still managed by the Strimzi operator. And, as you can see, the condition of the Kafka cluster has now changed: we have a Kafka cluster deployed.

Let's continue with the presentation. The next thing I'm going to deploy is a user interface for the Kafka cluster. What is that? It's the Kafka UI I mentioned before. I'll execute the command and talk in the meantime. This is the UI for the Kafka cluster: it lets me browse my topics, my messages, the schemas, everything. Okay, so now, if we go to my namespace and go to the deployments, I created a deployment for the Kafka UI; you can see that there is a deployment and a pod for the Kafka UI, so the interface is running. Now, if we go here, to the routes... give it a moment for the interface, but you will see that there is now a very nice UI for the Kafka cluster. In what sense? Obviously, there are no topics here yet, apart from the internal topics, because I haven't done anything, but you can see that this user interface is already connected to my Kafka cluster.

Now, let's deploy the second thing. Let me go back to my workspace. This is going to deploy the Apicurio Registry operator. In the same sense, it's a Kubernetes operator that lets you manage Apicurio Registry: just as Strimzi is for Kafka, this is the same thing but for Apicurio Registry, okay? So, let's go and deploy the Apicurio Registry operator. Same thing: fine, I have the operator installed, but I don't have any Apicurio Registry instance available yet, okay? So now, the same thing as I did with Strimzi. The only difference is that for Strimzi I installed the operator from the OpenShift console, and for Apicurio Registry I'm installing the operator manually; the operator is also available in OperatorHub for Apicurio Registry, it's just more convenient for me to do it this way, okay?

So, as you can see here, this is going to create an instance of Apicurio Registry, and you can see two things. The very first thing is that the new Apicurio Registry instance has been created, and the second one is that a Kafka topic, the KafkaSQL storage topic, has been created. What does that mean? It means that Apicurio Registry can work in two different ways, okay? The first one is the more traditional one, if you will, in the sense that for storing all its information it uses a relational database; it can be Postgres, it can be SQL Server, it can be MySQL, you get the idea. Or, as the alternative, for those who already have a Kafka cluster available, like us in this case, you can use a Kafka topic for storing all the information: all your schemas, all your rules, everything that Apicurio Registry manages, okay? And now, if I go here, you can see that I have another route, for my Apicurio Registry instance. The first time it takes a bit of time; the cluster is a development cluster, so it's not really, really fast.
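While that route comes up, this is roughly what the ApicurioRegistry resource I just applied looks like when it is configured for Kafka-based (KafkaSQL) storage. Treat it as a sketch: the instance name and the bootstrap address, which assumes the Kafka cluster from the earlier sketch, are placeholders, and the real file lives in the repository.

```yaml
# Apicurio Registry instance backed by a Kafka topic instead of a relational database
apiVersion: registry.apicur.io/v1
kind: ApicurioRegistry
metadata:
  name: my-registry           # illustrative name
spec:
  configuration:
    persistence: kafkasql     # store schemas and rules in a Kafka topic
    kafkasql:
      bootstrapServers: my-cluster-kafka-bootstrap:9092   # Strimzi bootstrap service for the cluster above
```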
So, you can see here that I now have my Apicurio Registry instance, without any information in it, but it's available. For now we are only deploying the different components; we are going to use them later in the demo, okay?

So, the very next thing: for those who don't remember the beginning of the presentation, one of the last things we need is Elasticsearch. For deploying Elasticsearch, the very first thing I need are the CRDs for Elasticsearch, okay? So that I can create an Elasticsearch instance and deal with all of that. So, let's go ahead. You are seeing "unchanged" here, and I wanted to point that out for the presentation: you are seeing "unchanged" in this case because I already have the CRDs for Elasticsearch installed in this cluster, okay? You would have seen "created" instead of "unchanged". And the second step, as with Strimzi and as with Apicurio Registry, is to install the Elasticsearch operator, and in this case you can see that all the resources required for the Elasticsearch operator are being created. Now, as I was saying, the very same thing: I need to create my Elasticsearch cluster, so I am going to go ahead and create an Elasticsearch instance. As you can see here, Elasticsearch has been created, and a route has also been created so it is reachable. Okay, now, if I go to routes and I go to Elasticsearch, let's see if it's already available; if not, I will go to the pod itself to see why it's still not available. So, you can see here that it's being initialized. Now, if I refresh this... okay, now it's available. The classical hello-world message, in this case for Elasticsearch, is "You Know, for Search". It's quite obvious, and this way you know that the Elasticsearch instance is working, okay?

So now, what else are we missing for this presentation? I said at the beginning that the very base of the presentation was going to be the PostgreSQL database; without the PostgreSQL database there is no presentation, okay? So, the next step is going to be to install the PostgreSQL database. For convenience, as I said at the beginning, this PostgreSQL database already has a few tables created, five tables in this case, and some data, a few rows for each table, okay? So, now I have the PostgreSQL database. Fine, it's there, I don't need to check it now.

And the very next thing: when I was talking about Strimzi, I talked about the Kafka cluster, about ZooKeeper, and I also talked about Kafka Connect. Kafka Connect is a runtime that allows you to deal with external data; in our case, it allows us to deal with the PostgreSQL data and to send data to Elasticsearch. But to be able to deploy my connectors, first of all I need a Kafka Connect cluster, okay? So, let's go ahead and create a Kafka Connect cluster. This is going to create, basically, two pods: the first one is going to build and configure the cluster, and the second one is going to be the actual Kafka Connect cluster. Okay? So, let me filter this here. As you can see, the first one, the build pod, is running. Once this pod is done, the Kafka Connect cluster will be configured, and another pod will appear here, which will be the actual Kafka Connect cluster. As you can see, this one has completed, and, as I promised, another one has appeared: that is the actual Kafka Connect cluster.
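Going back one step, the Elasticsearch instance created above is also just a custom resource. Assuming the operator installed here is Elastic's ECK operator, a minimal single-node sketch (version and name chosen arbitrarily) looks something like this:

```yaml
# Minimal Elasticsearch cluster managed by the ECK operator
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch          # illustrative name
spec:
  version: 8.5.0               # any version supported by the installed operator
  nodeSets:
    - name: default
      count: 1                 # a single node is enough for a demo
      config:
        node.store.allow_mmap: false   # avoids tuning vm.max_map_count on the host
```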
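And the Kafka Connect cluster that is starting up right now is defined by a Strimzi KafkaConnect resource. A sketch of what it might contain, assuming it uses Strimzi's build mechanism to bake the Debezium PostgreSQL plugin and an Elasticsearch sink plugin into the Connect image (the cluster names, image location and plugin URLs below are placeholders; the actual file is in the repository):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect-cluster     # illustrative name
  annotations:
    # lets connectors be managed declaratively through KafkaConnector resources
    strimzi.io/use-connector-resources: "true"
spec:
  replicas: 1
  bootstrapServers: my-cluster-kafka-bootstrap:9092    # assumes the Kafka cluster sketched earlier
  build:
    output:
      type: docker
      # placeholder: where the freshly built Connect image is pushed
      image: image-registry.openshift-image-registry.svc:5000/demo/connect-with-plugins:latest
    plugins:
      - name: debezium-postgres-connector
        artifacts:
          - type: tgz
            url: "<placeholder: Debezium PostgreSQL connector plugin archive>"
      - name: elasticsearch-sink-connector
        artifacts:
          - type: tgz
            url: "<placeholder: Elasticsearch sink connector plugin archive>"
```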
Now we need to wait for that second pod to be ready. But, in the meantime, same as I said with the Kafka cluster itself, if you go here to Kafka Connect, you can see that the Kafka Connect cluster also appears under the Strimzi operator. So I created it manually again, but it is still managed by the Strimzi operator, okay? Now, this status is not ready yet; let me check the Connect pod. Usually it's just that I'm not patient enough to wait for it to be ready, but sometimes it's worth checking the logs, as you can probably imagine. For now everything is going smoothly; I actually think it might be ready. So, that was it: I'm just not patient enough to wait for it to be ready, okay?

Now, getting into the interesting part of the presentation, okay? I'm going to deploy the very first connector, the PostgreSQL source connector. Source in what sense? Sorry, let me deploy it first. Source in the sense that this is the application that is going to take all the information from the PostgreSQL database, serialize those changes, and publish them into Kafka, so all the changes from the PostgreSQL database will be available in the Kafka cluster, okay? Now, if I did everything correctly, if I go to installed operators, the Kafka connectors are at the end, okay? You can see that I now have a PostgreSQL connector with the condition Ready.

And now, getting into interesting things. What happened here? Yes, please. [Audience question: regarding the changes collected from PostgreSQL, is it only inserts and updates, or also table changes?] It takes the changes; we will see it in the presentation. Taking the changes, okay? So, as you can see here, for each table in the database (as I promised, there were five tables) a new topic has been created in the Kafka cluster. And, for instance, I already had some data in the database. If I go here, to the messages, we have two interesting things. The first one is that, here, the Kafka UI is not able to deserialize the key for this Kafka message. What is going on? It's telling me that the information is a string, and, as I said, that is not true: all of this is Avro binary data. And not only is it Avro binary data, the schema is not present in the message itself; it's present in the schema registry. So, if I change this, if I change the key serde (you can see that the value serde is already set to SchemaRegistry), you can see that the key can now be deserialized on the fly in the Kafka UI. And if I go to my Apicurio Registry UI and refresh it, you can see that, for the keys and the values of those messages, new schemas have been created in the schema registry. I have this one, which is the customers table I was showing, and if I go to the content, you will see that this is an Avro schema that has been inferred from the information in the database. This is the schema that is now present in my schema registry, and it is the schema that is going to be used to serialize and deserialize all the Kafka messages on this topic. This enables what I was saying at the beginning: it enables schema evolution, in the sense that if I add a new column to the database that is in some way incompatible with the evolution rules, that message will never end up in the Kafka cluster, because the producer won't be able to serialize that data.
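For reference, the PostgreSQL source connector I just deployed is itself a KafkaConnector resource, handled by Debezium inside Kafka Connect. A sketch of it, with placeholder connection details and registry URL, and assuming the Apicurio Avro converters are wired in as just described (the real configuration is in the repository):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: postgres-source-connector
  labels:
    strimzi.io/cluster: my-connect-cluster   # ties it to the Kafka Connect cluster above
spec:
  class: io.debezium.connector.postgresql.PostgresConnector
  tasksMax: 1
  config:
    # placeholder connection details for the demo database
    database.hostname: postgresql
    database.port: 5432
    database.user: postgres
    database.password: postgres
    database.dbname: demo
    plugin.name: pgoutput                    # logical decoding plugin commonly used with plain PostgreSQL
    topic.prefix: demo                       # Debezium 2.x naming; older versions use database.server.name
    # serialize keys and values as Avro, with the schemas kept in Apicurio Registry
    key.converter: io.apicurio.registry.utils.converter.AvroConverter
    value.converter: io.apicurio.registry.utils.converter.AvroConverter
    key.converter.apicurio.registry.url: http://my-registry-service:8080/apis/registry/v2
    value.converter.apicurio.registry.url: http://my-registry-service:8080/apis/registry/v2
    key.converter.apicurio.registry.auto-register: "true"
    value.converter.apicurio.registry.auto-register: "true"
```

The auto-register flag is what makes the inferred Avro schemas show up in the registry without any manual step.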
Okay, that is an interesting part. And now, let's continue with other interesting things. We have two separate parts, okay? The first one is the information from the Postgres database going to Kafka, and the second one is the information from Kafka going into Elasticsearch. For doing that, I need to create the Elasticsearch sink connectors. Connectors, in plural, because I have five topics, one for each table, so five sink connectors, okay? So, if I apply them, you can see: one, two, three, four, five. And under the Strimzi operator you can see that my sink connectors are already present there, okay?

Now, the very interesting thing: here it says what I was already telling you, that you can see them in Strimzi. And the other interesting part is that this is an extension that is mentioned in the guide, so don't worry if you don't quite follow what's going on; the extension I'm going to use is mentioned in the GitHub repository and in the guide, okay? So, if I go to this extension, here I already have my Elasticsearch cluster connected. If I go to indexes, what happened here? I already have five Elasticsearch indexes created, one per topic, one per table in my PostgreSQL database. So, if I go to customers (I'm showing customers just because it's convenient for this case), you can see that I have four customers; that is the information I have available in the database. If I go to the Kafka UI, you can see the very same thing: there are four Kafka messages, there are four customers, okay? It's the very same thing. So, I have the information in PostgreSQL, that information is flowing to Kafka and finally getting into Elasticsearch. The whole data pipeline is finally connected, okay?

But now, let's do something more interesting. This is fine: I took the data from the database, the data is in Kafka and the data is in Elasticsearch, but let's actually use this, right? Let's see how fast it is to send information from one place to another, okay? So, what I'm going to do first is connect to my PostgreSQL database. Okay. And, as we saw in the Kafka topic and in Elasticsearch, if I do this, I have four customers, okay? Fine, great. What happens if I do this, basically inserting a new one, okay? What is going to happen here? It's already in my Kafka cluster, it's already present in there. And what is going to happen over here? It's already in Elasticsearch as well. A new row has been inserted into the database and, in a matter of milliseconds, the data has moved from the database to Elasticsearch; I'm actually slower typing and moving from one place to another than the data flowing from one side to the other, okay? But let's make it a bit more interesting: let's update this customer's email to something actually different. And you can see that the email has already changed in Elasticsearch. I'm not even going to the Kafka UI, I'm not even going to Postgres; it has already changed in Elasticsearch. That is to show, and to prove, how fast this kind of setup, this kind of deployment, is.
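As an aside, each of the five Elasticsearch sink connectors deployed at the start of this step is another KafkaConnector resource. A sketch for the customers topic, assuming the commonly used Confluent Elasticsearch sink plugin and the Debezium record-unwrapping transform (the connector class, topic name, service URLs and options here are assumptions; the demo repository may use a different sink):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: elasticsearch-sink-customers
  labels:
    strimzi.io/cluster: my-connect-cluster   # same Kafka Connect cluster as the source connector
spec:
  # assumption: the Confluent Elasticsearch sink; other sink plugins use different config keys
  class: io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
  tasksMax: 1
  config:
    topics: demo.public.customers                       # placeholder Debezium topic for the customers table
    connection.url: http://elasticsearch-es-http:9200   # placeholder Elasticsearch service URL
    key.ignore: "false"
    # unwrap Debezium's change-event envelope so only the row state gets indexed
    transforms: unwrap
    transforms.unwrap.type: io.debezium.transforms.ExtractNewRecordState
    # same Apicurio Avro converters as on the source side, so the messages can be read back
    key.converter: io.apicurio.registry.utils.converter.AvroConverter
    value.converter: io.apicurio.registry.utils.converter.AvroConverter
    key.converter.apicurio.registry.url: http://my-registry-service:8080/apis/registry/v2
    value.converter.apicurio.registry.url: http://my-registry-service:8080/apis/registry/v2
```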
And finally, something quite similar, but I want to show something a bit different in the Kafka UI, okay? I'm going to delete this customer in particular. Fine: if I refresh Elasticsearch, it has disappeared, no more. And if I go to the Kafka UI, something interesting happens: instead of just one, now I have two Kafka messages for this customer in particular. What's happening here?

So, the first one is pretty obvious. I have the information related to this message, and this is the information I had: the first name, John, Doe, and this is the email. And then the data afterwards: obviously, because I have deleted this customer in particular, there is nothing; this customer is no longer here. The second message, and that is the interesting part, is a tombstone message, in the sense that when Kafka runs log compaction, it's going to detect that all the messages associated with this identifier can be removed, because they are no longer relevant for dealing with this data. There is no more John Doe in the database, so it doesn't make any sense to have John Doe in the Elasticsearch index. So, why keep those messages in Kafka? Whenever compaction runs, it just gets rid of them; otherwise we would be using storage for no reason. Okay?

And this is basically all I wanted to show in the live demo. The conclusion, as you can see here, is what I was saying: I'm actually slower typing and moving from one user interface to another than the data actually flowing from one component to another. The human, in this case, is slower than the components; that is the very interesting part. And, as I was saying, you never miss a beat: you have messages in Kafka for every single update. And that is it. I mean, this is the live demo. As I said, in the GitHub repository you have exactly the same set of commands I have executed; the only thing you need is either an OpenShift cluster or a Kubernetes cluster, and you can do exactly what I have done in the presentation.

And now, let me go back to the presentation, just because I want to show the classical one last thing. In this presentation I have talked about a few projects. The first one was Strimzi; Strimzi is already a CNCF sandbox project. I haven't yet mentioned one of the cool integrations of these projects, which is Keycloak. Keycloak is already a CNCF incubating project, and I mention it here because it has a nice integration with both Apicurio Registry and Strimzi. As for Apicurio Registry, we are a younger project than Strimzi and Keycloak; we are already present in the CNCF landscape, and last week, if I remember correctly, I submitted the sandbox proposal to the CNCF.

And that's all. Thank you very much for attending the presentation. We have a few minutes left, so if there is any question...

[Audience question: I didn't understand exactly where Debezium was involved in the demo.] It's in the Kafka Connect cluster; it adds capabilities to Kafka Connect.

[Audience question: Is Apicurio usable with streaming storage other than Kafka?] No, not yet. We are working on a new version of Apicurio Registry, and we are aiming to have support for other streaming platforms; I'm actually working on the issue for supporting Apache Pulsar. We are not there yet, but we are trying to get there. Thank you.

If you don't have any questions right now, feel free to drop me an email or anything. Everything is open source; you can even go to the GitHub repository and open an issue if you want.