Welcome everyone. This is our talk, Metamorphosis: when Kafka meets Camel. Speaking here today, we have Jakub Scholz, who is a principal software engineer on the Red Hat messaging team, and myself, Otavio Piske, who works as a senior software engineer on the Red Hat Fuse team. On the agenda today, we will discuss some of the challenges of application integration, cover the basics of the project that we want to introduce to you, and finish with some live demos and closing comments.

As application developers, one of the challenges we always face is integrating systems. We continually have to make our systems talk to other systems and to the external world. This is a challenge that keeps evolving, and with the introduction of tools such as OpenShift, Kubernetes, and Kafka, there is a new set of functionality that we can use to simplify the job of integration. That is what we are going to cover today.

So, what is Apache Camel? Apache Camel describes itself as the Swiss army knife of integration. It is one of the largest and most active Apache projects, and it is widely used all around the world to support many products and projects. Camel is an open source integration framework that is very flexible and supports many data formats, protocols, products, and tools. Developers using Camel can create routes that implement enterprise integration patterns to interconnect systems, and can define rules to filter, transform, or process the exchanges going from one system to another, or to multiple systems. Enterprise integration patterns such as the content-based router, the splitter, and the aggregator can be used to solve problems such as: how do we move data from system A to system B or C based on the contents of the data being transported? Or: how can I combine multiple pieces of data and send them as one to a REST API?

To interconnect these systems, Camel uses what it calls components. Components are the primary extension points that allow Camel to talk to external systems. Camel has more than 300 components supporting many products and protocols. For example, Camel has components to talk to Salesforce, AWS S3, AWS SQS, and so on, as well as protocols such as AMQP, OpenWire, HTTP, FTP, and so on.

Now that I have given you an overview of what Camel is, I would like to present a quick demo that shows how Camel can be used to transport data from a JMS broker to a Kafka broker. This is not a live demo; it is something I recorded. Camel has multiple domain-specific languages that can be used to create the routes that do the work of moving data from one point to another, and this is a very simple example to show you how Camel can be used for that. It contains basically two classes, and the work done here is rather simple. The application reads some parameters from the environment to know where to read data from and where to send data to. After reading those parameters, we use them to instantiate the JMS-to-Kafka route, which is what actually configures the route in Camel to set up this JMS-to-Kafka exchange. Once created, the route is passed to the Camel Main runtime, which is then configured with it. The important part is the route's configure method, which basically does three things. First, it configures the sjms2 component, passing it the information about the broker that is going to be accessed, that is, the address of the broker. Then it formats the URL used to configure the route. And finally, it defines the actual route with a from and a to.
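A minimal sketch of what such a route class can look like, assuming a recent Camel 3 with the sjms2 and kafka components and the Artemis JMS client; the class, queue, and topic names are illustrative, not the exact code from the recording:

    import javax.jms.ConnectionFactory;

    import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;
    import org.apache.camel.builder.RouteBuilder;
    import org.apache.camel.component.sjms2.Sjms2Component;
    import org.apache.camel.main.Main;

    // Bridges a JMS queue on an Artemis broker into a Kafka topic.
    public class JmsToKafkaRoute extends RouteBuilder {
        private final String brokerUrl;    // e.g. tcp://localhost:61616
        private final String kafkaBrokers; // e.g. localhost:9092

        public JmsToKafkaRoute(String brokerUrl, String kafkaBrokers) {
            this.brokerUrl = brokerUrl;
            this.kafkaBrokers = kafkaBrokers;
        }

        @Override
        public void configure() {
            // 1) Configure the sjms2 component with a connection factory for the broker.
            ConnectionFactory connectionFactory = new ActiveMQConnectionFactory(brokerUrl);
            Sjms2Component sjms2 = new Sjms2Component();
            sjms2.setConnectionFactory(connectionFactory);
            getContext().addComponent("sjms2", sjms2);

            // 2) Format the Kafka endpoint URI used by the route.
            String kafkaUri = String.format("kafka:demo-topic?brokers=%s", kafkaBrokers);

            // 3) The route itself: consume from the JMS queue, produce to Kafka.
            from("sjms2://demo.queue").to(kafkaUri);
        }

        public static void main(String[] args) throws Exception {
            // Read the endpoints from the environment and hand the route to Camel Main.
            Main main = new Main();
            main.configure().addRoutesBuilder(
                    new JmsToKafkaRoute(System.getenv("BROKER_URL"), System.getenv("KAFKA_BROKERS")));
            main.run(args);
        }
    }

In the recorded demo this is split across two classes, the route and the entry point; the sketch folds them together to stay short.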
Now, with this up and running, we can launch an environment running this application, with a mock set up to simulate data flowing from the JMS broker to the Kafka broker. In this example, I have an environment set up using Docker Compose which brings up the JMS broker, the integration tooling, and the Kafka environment with Kafka and ZooKeeper, plus one tool that simply reads from Kafka and displays the messages on screen. It takes a little while to get started, and this shows the route that was configured. If you keep the configuration outside, in a separate file, this is what you would have configured there. We are using two Camel components here, sjms2 and the kafka component, and they execute the interconnection between those systems.

With this up and running, we can access the management console of our JMS broker to insert a message and simulate some traffic flowing through the JMS broker. We have configured this queue, called demo queue, which is the source of this exchange. Here we add some data to simulate a message and send it. Done. Basically, what is happening is that Camel reads the message from the JMS broker and executes the route to Kafka, where it is then read by that custom client.

Now, this is nothing overly complex, and probably many people here already know it; the idea was to show the steps and the process needed to configure an integration solution based on Camel. If you are planning to run this in support of a product, or to run it for your customers, there are many other qualities you would need to take into consideration: security, scalability, reliability, and many other things. Although Camel has excellent support for all of that, these are still things you would need to think about and add to your application. If you are a Kafka shop and your expertise lies more on the Kafka side, you may not have the experience, or may not be comfortable, adding those features. And this is what we will build on in the subsequent slides.

Right. So, that was a great introduction to Camel. Now we will do a quick introduction to Kafka itself. How many of you have heard about Apache Kafka? I hope all hands will be up. And how many of you have actually used it or are using it? Right, that's not so many. A lot of people know Kafka as a messaging broker or streaming platform, and they don't realize that Kafka is more than just the broker; it's an ecosystem rather than the broker itself. The broker is, of course, at the center of it, but there are also the consumer, producer, and streams APIs. There's MirrorMaker, and today MirrorMaker 2 as well. There are a lot of third-party integrations: if you use Spark or other frameworks, many of them have Kafka support built directly into them. And one of the components of this ecosystem is Kafka Connect. How many of you know or have used Kafka Connect?
Right, that's significantly fewer hands than we had for Kafka itself. So, Kafka Connect is an integration framework which is part of the Apache Kafka project. The idea is that you have some external system, you use Kafka Connect to get the data from the external system into your Kafka broker, you do something with it, of course, and then you use Kafka Connect again to dump the data out to some external system. The Kafka Connect framework uses things called connectors, which are basically plugins that do the actual integration. So it's really just a framework where you plug in your own connectors, or connectors you downloaded that someone else created, and then you use them.

You can go and write your own connector; it's really quite easy. You basically just implement two interfaces in some Java classes. A lot of people ask whether it really makes sense to write your own connector when you could write your own consumer or producer instead. The advantage of Kafka Connect is that it does some things for you. First, it's distributed and scalable by default, and I will talk a bit more later about how it scales and distributes. It has automatic offset management, so you don't need to worry as much about reliability and commits as you would if you wrote an application directly against the consumer or producer API. It has built-in support for simple transformations when you need to transform the messages. And it works very well for transitioning between things like streaming and batch processing.

Here are some key Kafka Connect concepts which will be useful later when we show the demos. We already talked about the connectors. Connectors are always either sink connectors or source connectors. In this picture, source connectors are used as a source of data for Kafka, and sink connectors are used to dump the data somewhere else. These are always separate connectors, so if you just want to ingest data from something into Kafka, you don't need to write both sink and source; you can write just one of them. The connector itself is a bit of a virtual entity; it has something called tasks, which again are either sink or source tasks, and the tasks are what actually does the integration. Two more things to mention are the key and value converters and the transformations. A transformation transforms the messages when you use it. The key and value converters convert the data you are ingesting into Kafka, or sending out of Kafka, so that inside Kafka the records are usable as regular Kafka messages, and in the external system they are usable in whatever format is native there.

Kafka Connect runs in two different modes. One of them is standalone mode: there is always a single worker, and there is a local configuration file which the worker reads to create the connector, and it stores the offsets of the Kafka messages it has processed locally, and so on. It's useful, for example, if you have some application running on a VM and you want the application to read some data from a file, but you don't have the data in a file; you have it in a topic in Kafka instead. What the standalone Connect can do, for example, is run on the same VM, read the messages from Kafka, and simply dump them into a file, and then the application can read them from that file. That's one of the examples where standalone Connect can be quite useful.
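For reference, that file-dumping scenario maps onto the FileStreamSinkConnector that ships with Apache Kafka; a sketch of how the standalone worker would be started, with illustrative paths, topic, and file names:

    # Standalone mode: a single worker, driven by local property files.
    bin/connect-standalone.sh config/connect-standalone.properties file-sink.properties

    # file-sink.properties - read a Kafka topic and dump it into a local file.
    name=local-file-sink
    connector.class=org.apache.kafka.connect.file.FileStreamSinkConnector
    tasks.max=1
    topics=my-topic
    file=/tmp/my-topic.txt

The worker configuration file is where things like the bootstrap servers and the local offset storage file are set.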
In the OpenShift or Kubernetes world, you can, for example, imagine this as a sidecar pattern with multiple containers in the same pod.

The more interesting mode is the distributed mode, where you really have multiple workers running as part of a Kafka Connect cluster. One of them is always the leader, and they synchronize through the Kafka broker itself, so you don't need any additional software for the leader election, synchronization, and so on. They also store everything inside the Kafka broker itself: the offsets, the configuration, and so on. The workers themselves are basically stateless. The way it works is, in this example the cluster has three workers, and when you deploy a connector, the connector has one or more tasks, and the tasks are scheduled onto the different workers. You can see here that connector one has just one task running; connector two has three tasks, actually; connector three has two tasks; and so on. If one of the workers dies, its tasks can be rescheduled somewhere else and continue working there. This was especially interesting in the time before everyone was using Kubernetes and OpenShift. To some extent, today it competes a bit with them, because a lot of this functionality can be had even without Kafka Connect; you can just have pods which scale, restart, and so on. But it's still quite useful, and in the last demo at the end I will show how you can use it on Kubernetes and OpenShift as well.

Okay. So what happens when you mix these two super flexible, super powerful projects? Let's meet the Camel Kafka Connector. The Camel Kafka Connector is a Kafka connector built on top of Apache Camel. At Red Hat, we started it as a proof of concept to evaluate whether this would work, and after the successful proof of concept we donated the code to the Apache Software Foundation, where it became a sub-project of Apache Camel. The idea is that it makes it possible to reuse Camel components in a very simple way. Basically, all you need to do is write a properties file, feed it into the Kafka Connect runtime, and that's it. As I said, Camel has support for more than 300 components. Initially, for the Kafka connector, we are focusing on a selected list of 11 components. It's very likely that many more of the Camel components already work out of the box; however, we need to ensure that tests are executed and there is some quality assurance, and that's why we are starting with these 11 components. As a project, as I said, it was recently donated to the Apache Software Foundation and was well received by the community. The project is already receiving contributions, and we are working towards a first stable release, going through our issues and to-dos to make sure the project has all the flexibility and quality assurance that would be expected of something available to the Apache community.

I would like to show you a demonstration of the Camel Kafka Connector running in standalone mode. It's basically the same thing I showed in the first video, but this time running through the Camel Kafka Connector.
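A sketch of the kind of properties file this uses, assuming an early camel-kafka-connector release; the camel.source.* and camel.component.sjms2.* option names are illustrative and may differ between versions:

    # Connector-specific segment: name, class, tasks, converters.
    name=CamelJmsSourceConnector
    connector.class=org.apache.camel.kafkaconnector.CamelSourceConnector
    tasks.max=1
    key.converter=org.apache.kafka.connect.storage.StringConverter
    value.converter=org.apache.kafka.connect.storage.StringConverter

    # Integration segment: endpoints plus component-specific options.
    topics=demo-topic
    camel.source.url=sjms2://demo.queue
    camel.component.sjms2.connection-factory=#class:org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory
    camel.component.sjms2.connection-factory.brokerURL=tcp://localhost:61616

    # The bootstrap servers for the Kafka cluster live in the worker configuration,
    # e.g. bootstrap.servers=localhost:9092 in connect-standalone.properties.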
As I said, the idea is that you configure the integration in one configuration file and feed it to the Kafka Connect runtime. The configuration file, if you look at it, has two big segments. The first one configures the connector-specific things: its name, how many tasks, the connector class, and the converters that we explained a little earlier. The second segment configures the integration itself. It points to the endpoints you are going to use, the components that will be used for the integration and, of course, some component-specific configuration where needed, which is the case for the sjms2 component, where we needed to configure the connection factory and the remote URI of the broker. One additional piece of configuration is the bootstrap servers, which points to the Kafka instance you are connecting to.

Again, with an environment similar to the previous one, we can go to the JMS broker and send a message to simulate traffic. Very much the same thing as before happens: once we add the message in the Artemis console to simulate the traffic, it is consumed, this time by the Camel Kafka Connector instance, and the message is displayed on the screen. So we add the contents and hit send, and we should have the message here. As you can see, it is significantly simpler to get started with integrations in this format, especially if your expertise is in Kafka; it makes it quite easy to get started. However, for anything beyond a simple solution, this alone would probably not be enough, and that's where Jakub's demo comes in. So, just give us a minute to switch laptops.

Right, so what I will show now is how you can run this whole thing on OpenShift, or any Kubernetes for that matter, but I'm using OpenShift 4 on AWS. I will try to do it live, so maybe we will have some fun if we run into issues. I will use a project called Strimzi, which is something I'm working on. It's a project that is part of the Cloud Native Computing Foundation, and it's a set of operators for running Kafka on Kubernetes. I actually already have my Kafka cluster deployed; you can see the ZooKeeper pods, the Kafka pods, and so on. That's already ready, because it always takes a few minutes, so I saved some time, but I haven't deployed Kafka Connect yet.

To deploy Kafka Connect with the operator, we need to do several things here. I will create some Kafka topics, which I will be using later with the connectors: a topic called telegram-topic and a topic called s3-topic, which might give you an idea of what I will be showing later. I also create a user; this demo uses authentication and authorization, so I need to create a user that Kafka Connect will use to connect to the Kafka cluster. And last but not least, I create the Kafka Connect deployment as well, and this will be the distributed Kafka Connect deployment. To be honest, the cluster is not that big, so I'm actually running just one worker node, and I will not run enough tasks to actually need more anyway, but it is the distributed deployment. And because I will be using things like AWS, and I don't really want someone to start mining bitcoins on my AWS account within five minutes, I will use secrets to pass the credentials for AWS and the API keys for Telegram into the Kafka Connect connectors. So, what I need to specify here is to mount two secrets with the credentials, and I will later use them to configure the connectors.
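A sketch of what that KafkaConnect resource can look like in Strimzi, with illustrative names; the externalConfiguration block is what mounts the two secrets into the worker pods:

    apiVersion: kafka.strimzi.io/v1beta1
    kind: KafkaConnect
    metadata:
      name: my-connect
      annotations:
        # Let the operator manage connectors through KafkaConnector resources.
        strimzi.io/use-connector-resources: "true"
    spec:
      replicas: 1                                   # one worker is enough for this demo
      bootstrapServers: my-cluster-kafka-bootstrap:9093
      tls:
        trustedCertificates:
          - secretName: my-cluster-cluster-ca-cert
            certificate: ca.crt
      authentication:                               # the KafkaUser created above
        type: tls
        certificateAndKey:
          secretName: my-connect-user
          certificate: user.crt
          key: user.key
      config:
        # Enable Kafka's FileConfigProvider so connectors can read the mounted files.
        config.providers: file
        config.providers.file.class: org.apache.kafka.common.config.provider.FileConfigProvider
      externalConfiguration:
        volumes:                                    # mount the credentials as files
          - name: aws-credentials
            secret:
              secretName: aws-credentials
          - name: telegram-credentials
            secret:
              secretName: telegram-credentials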
So, to deploy all of this, all I need to do is kubectl apply. The Connect pod always needs some time to start and to pull the image, so in the meantime we can have a look at how we will deploy the connector itself. We again use the operator pattern to deploy it. If you were running this outside of Strimzi, somewhere on VMs and so on, there is a REST API that you can use, and the JSON you would put into the REST API looks very similar to this, because the spec section of the custom resource mostly copies the REST interface. What I'm doing here is saying, okay, I want to use the Camel source connector, the source connector which is based on Apache Camel. I want to run only one task of it; that's here. Then I need to specify the key and value converters; to make it a bit easier for us to see what we are sending, I will just use the string converters, so we get the payloads as strings. Then I need to specify the topic where the messages will be sent. In this case, we will connect to Telegram, if you know it, the instant messaging application; we will read messages from a bot there and then push them into a Kafka topic. And then there is a very important part, the camel.source.url, which is how we tell the Camel source connector that we are going to use the Telegram component. And this last part just takes the API key for Telegram from the secret and uses it in the configuration of the connector.
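Put together, the resource looks roughly like this; the Telegram endpoint options and the FileConfigProvider reference into the mounted secret are illustrative, and they assume the secret holds a small properties file with the token:

    apiVersion: kafka.strimzi.io/v1alpha1
    kind: KafkaConnector
    metadata:
      name: telegram-source
      labels:
        strimzi.io/cluster: my-connect      # ties it to the Connect cluster above
    spec:
      class: org.apache.camel.kafkaconnector.CamelSourceConnector
      tasksMax: 1
      config:
        key.converter: org.apache.kafka.connect.storage.StringConverter
        value.converter: org.apache.kafka.connect.storage.StringConverter
        topics: telegram-topic              # where the Telegram messages end up
        camel.source.url: "telegram:bots?authorizationToken=${file:/opt/kafka/external-configuration/telegram-credentials/telegram.properties:token}"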
Let's check if the Connect pod is already running; we can see that it's there and it's ready. So I can just do oc apply on the connector file, which creates the connector inside the Kafka Connect cluster, and then I can check the status of the connector. When I scroll down, I should be able to see that the task is running. The connector says unassigned; it always takes a few seconds to change, but the task is already running, which is what matters. Let's go to another window and open a consumer inside the OpenShift cluster that connects to Kafka and reads the messages. Now I can simply switch to the Telegram web client, where anyone can create these bots. I have created one of them, and we can send it a message. When I switch back to the terminal, we should be able to see it, and here we can see the incoming message with the hello world text. If you have Telegram on your phone, you could try to send some messages too, but please don't, because it would probably kill the cluster. So, that works. We can try another message just to make sure it really works; let's try one in Czech. We should again see it delivered, and yes, here it is. So, we now have the messages from Telegram in Kafka.

Now we should do some super sophisticated processing, right? We should have some AI doing customer support or something like that. I'm not really going to do that; I will pretend that someone did, and I will just take the messages. Oh, and I can see now someone is sending something. So, I take the messages and, let's say I'm using some AI service from Amazon AWS, I want to dump them into an Amazon AWS SQS queue. So, I just create another connector. If you look at the configuration, it looks very similar; the only difference is that this time it is a sink connector, so we take the data from Kafka and push them somewhere else. The source of the messages is the same topic where we pushed the messages from Telegram, and here in the URL we specify that we want to push them into an Amazon SQS queue called MyQ. And here, again, I load things like the AWS credentials from the secret. So, let's do oc apply, and we can again double-check the status; we can see that it got to running quickly. Here I have the AWS console; you can see that the queue already exists, and there are already a bunch of messages in it. So let's start polling for messages. We have six of them; let's delete them without reading, because they were all just test messages anyway, and let's try to send something new. Let's send HelloSQS, and we can see it here already: HelloSQS, received in the SQS queue. So, you can really easily use the power of Apache Camel to power these connectors, and use just the connector operator and Kafka Connect to integrate things very easily.

To show you a bit more about some of the challenges, I want to show one more connector, this time for the Amazon AWS S3 storage. Let's look at the configuration; it looks very similar to the SQS one, and I want you to notice one thing: here we are again using these string converters. So we want to load the files from AWS S3 and send them into the Kafka topic as strings. Let's create it. Before we go back to the browser, let me first show you the sample file which I will try to upload into S3; then let's see how we receive it. Let's kill the Telegram consumer and start another client that reads the messages from a separate topic used for the S3 files. That is ready now, so I can go to the browser and upload a file: upload sample one, upload, and we can see that the file is there. When I refresh, the file disappears, because right now the connector is configured to delete the files from the S3 storage after they are pushed into the Kafka topic. But when we look at the console, what we got is this: some S3ObjectInputStream object reference. It's nice that we know the class name, and it's nice that we know some ID or hash or whatever that is, but it's not entirely useful, I guess, at least not for me. What we can do about it is fix the connector by specifying the right converter. Here we say the value converter should no longer be the string converter shipped with Kafka Connect by default; instead we want to use the org.apache.camel.kafkaconnector.converters.S3ObjectConverter. So, let's apply this change and make sure the update was propagated: we can see that it is running and that it's generation two. Let's upload another file, which is, again, a very sophisticated text file: upload sample two, upload it, and it should disappear again. Okay, it disappeared. And when we go back here, we can see that now we got the right text: the second sample file for the S3 upload.
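For reference, the fix is a one-line change in the connector configuration; a sketch, where the converter class name is taken from an early camel-kafka-connector release and may have moved in later versions, and the bucket name and options are illustrative:

    apiVersion: kafka.strimzi.io/v1alpha1
    kind: KafkaConnector
    metadata:
      name: s3-source
      labels:
        strimzi.io/cluster: my-connect
    spec:
      class: org.apache.camel.kafkaconnector.CamelSourceConnector
      tasksMax: 1
      config:
        key.converter: org.apache.kafka.connect.storage.StringConverter
        # The plain StringConverter only stringifies the S3ObjectInputStream reference;
        # this converter reads the object content instead.
        value.converter: org.apache.camel.kafkaconnector.converters.S3ObjectConverter
        topics: s3-topic
        camel.source.url: "aws-s3://my-bucket?deleteAfterRead=true"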
So, that shows that while it's super easy to take whatever URL or component you have and use it in Kafka Connect, you need to be careful to make sure you have the right converters and that you are able to convert the messages. Otherwise, you might get the object reference instead of the payload, and the connector would be kind of useless. That's also one of the things which is still a work in progress: from the beginning we decided to focus on a few selected connectors, to make sure we figure out the best way to do these converters and to make sure these things are working.

Okay, so back to the slides. I hope that you liked what you saw, because we are quite happy and excited about this project. It combines the features of two great Apache Software Foundation projects, and as a guy who is more on the Kafka side, it's amazing for me, because it opens the door to many new connectors and many new integrations for Kafka, and it also brings the stability, maturity, and experience of the Camel project into this connector world. If you work with Kafka Connect, you know that there are a lot of great, professional connectors supported by companies, but there are also a lot of connectors that someone wrote two years ago and put on GitHub; they worked at the time, but nobody maintains them, they don't work anymore, or it's tricky to get them working. In Apache Camel we have around 300 components, connectors which are there, ready to use, and the community has huge experience maintaining them and keeping them working. On the other hand, what I think Kafka Connect brings into the game is a bit more of the simplicity and the distributed nature that Kafka Connect can offer; for a lot of people, especially those already using Kafka Connect, running these connectors in Kafka Connect is easier and better than, for example, writing their own Java code for the Camel integrations. So as a Kafka user you get a lot of new options, and as a Camel user you also get a jump start into the Kafka world, because you get first-class Kafka Connect integration.

So that's it for this talk. We wanted to mention one more thing: we are hiring. And we don't mean just Red Hat in general; we have some open positions in the teams working on these things, so if anyone is interested, you should definitely get in touch with us after the talk. So that's it, and if you have any questions, now is the right time to ask.

Is the demo code available somewhere? I can make the demo code for OpenShift available; I can add the link to the slides and to the talk on the DevConf page. What about the code from the videos? Yes, the code from the videos is also available; I can make it available in the same place. So actually, yes, this is where you should be able to find the slides, and if you give us until this evening or tomorrow, we will add the links to the code and so on; you should be able to find the slides on the DevConf page as well. Any other questions? Then I guess thanks for coming and staying here for so long. I think this is the end for today, so thanks. Thank you.