Hello everyone, this is Feynman from Beijing, China. I'm so glad to be here. This is my second time attending KubeCon EU, and I was looking forward to attending in person since we have a couple of friends and partners in Spain. Unfortunately, due to the pandemic, I can only give this talk in front of a screen. I hope everything recovers soon so that we can meet everyone and have a cup of coffee in person.

Before we get started with this presentation, let me briefly introduce myself. My name is Feynman Zhou. I'm from the KubeSphere team, where I'm a senior community manager at QingCloud. I'm also a CNCF ambassador, a CDF ambassador, and a Fluent community member. My skills include, but are not limited to, Kubernetes, Linux, Fluent Bit, Fluentd, DevOps, and serverless, and I really enjoy technical writing, advocacy and outreach, and hosting events.

In this talk, I will demonstrate how to build a cloud-native logging pipeline on the edge with Fluent Operator. I will walk you through the challenges of logging in Kubernetes, especially in enterprise environments. Next, I will introduce two popular logging solutions you might have used or heard about: Fluent Bit and Fluentd. Then I will introduce and demonstrate how Fluent Operator empowers Fluent Bit and Fluentd, and take a deeper dive into Fluent Operator's architecture and workflow. Finally, I will give a live demo and talk about its use case in KubeSphere.

When it comes to the challenges of logging in Kubernetes, we constantly receive demands and complaints from different teams for security and compliance reasons. For example, our developers say: we have a huge amount of logs produced every day, coming from different data sources and in different data formats. Our administrators say: you have to make sure we can troubleshoot with a lightweight and secure logging solution.
You also need to keep everything traceable. And our security teams request: could you ship the logs to multiple destinations and outputs so we can audit and visualize them? All of this is chaotic, right?

In a real enterprise environment, logs come from different places, such as bare-metal servers, virtual machines, embedded devices, the edge, containers, pods, TCP, or UDP. All of that data arrives in different formats, such as JSON logs, Apache logs, NGINX logs, or container logs. In some typical cases, you might want to ship them to different destinations such as Elasticsearch or OpenSearch, Loki, Splunk, MongoDB, or S3. And considering data security and reliability when processing logs in a multi-tenant environment, you have to isolate the log data and make it visible only to a specific user or within a specific namespace.

So, how do you debug your Kubernetes workloads in your daily job? I think native methods such as kubectl commands are the most popular way to retrieve the logs of a specific container. Apart from that, if you are running your applications and infrastructure on a public cloud, for example Azure or AWS, you might also adopt the logging solutions offered by the cloud providers, such as Stackdriver or CloudWatch. You might also find SaaS solutions such as Splunk, Sumo Logic, Datadog, et cetera. And we see a lot of popular open source solutions such as ELK, Loki, Fluent Bit, and Fluentd. In this talk, we will focus only on the open source logging solutions Fluent Bit and Fluentd.

You know, handling data collection at scale is complex. Collecting and aggregating diverse data requires a specialized tool that can deal with scenarios like different sources of information, different data formats, data reliability, security, flexible routing, and multiple destinations.
That is where Fluent Bit comes in. Looking back at its history, the Fluent Bit project was started in 2015, and it is now a sub-project under the umbrella of the Fluentd ecosystem. Fluent Bit is written in C and is a lightweight, zero-dependency project. It is also pluggable, with around 70 plugins available. Last but not least, Fluent Bit is very light on resources, with low CPU and memory usage. Fluent Bit also has monitoring and stream processing capabilities that are not listed on this slide. So far, Fluent Bit has reached 1 billion downloads and has been adopted by thousands of organizations, such as AWS, DigitalOcean, Microsoft, KubeSphere, and so on.

As you can see from this graph, a data pipeline represents the flow of data through inputs, filters, and outputs. Input plugins are used to gather information from different data sources. Parsers are used to convert unstructured data into structured data. Filters are used to match, exclude, or enrich logs with specific metadata. And outputs are used to define the destinations for the data, for example remote services, local file systems, Loki, Kafka, or something like that.

Fluentd, on the other hand, is a data collector which allows you to unify data collection and consumption for better use and understanding of data. You will find that Fluentd is more mature than Fluent Bit: it was started in 2011 and has since become a CNCF graduated project. Fluentd is written in C and Ruby. It is also pluggable, with around 1,000 plugins available. To summarize, Fluentd allows you to build your own unified logging layer, and this layer allows developers and data analysts to utilize many types of logs as they are generated. Most importantly, it mitigates the risk of bad data slowing down and misinforming your organization.

At this point, let's take a look at a comparison of Fluent Bit and Fluentd.
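To make the input, parser, filter, and output stages described above concrete, here is a minimal classic-mode `fluent-bit.conf` sketch; the paths, tag, and Elasticsearch host are illustrative placeholders rather than values from the talk:

```ini
[INPUT]
    # tail container log files on the node (a typical path; adjust as needed)
    Name    tail
    Path    /var/log/containers/*.log
    Tag     kube.*

[FILTER]
    # enrich each record with pod, namespace, and label metadata
    Name    kubernetes
    Match   kube.*

[OUTPUT]
    # ship everything to an Elasticsearch endpoint (host/port are placeholders)
    Name    es
    Match   *
    Host    elasticsearch-master
    Port    9200
```

Each section corresponds to one stage of the pipeline shown on the slide.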
This table compares the two projects in different areas. You can see that both Fluent Bit and Fluentd can work as aggregators or forwarders; they can complement each other or be used as standalone solutions. That is why Fluent Operator comes in and supports managing both Fluent Bit and Fluentd. Fluent Operator was born to facilitate the management of Fluent Bit and Fluentd, and it allows you to manage their entire lifecycle.

Before we dive into it, let's look back at the history of Fluent Operator. The project was open sourced as Fluentbit Operator by the KubeSphere team in January 2019. After eight version iterations, it was donated to the upstream Fluent community in August 2021. After Fluentd was integrated into the operator, it was renamed Fluent Operator in March 2022. In April, Fluent Operator reached 1.0, which marks the maturity of the project.

This is the initial reason we founded the Fluent Operator project: Fluent Bit cannot reload its configuration gracefully and does not support dynamic configuration. It requires users to restart the Fluent Bit pod and reload the configuration manually, which is not practical, especially in production environments. That is where Fluent Operator comes in.

Let me give a general introduction to Fluent Operator. First, as mentioned earlier, Fluent Operator deploys and tears down the Fluent Bit DaemonSet or the Fluentd StatefulSet automatically. Second, it offers custom configuration, which allows you to select plugins such as inputs, filters, and outputs via labels. Third, dynamic reloading is the most important feature supported by Fluent Operator: it can update the configuration without rebooting the Fluent Bit or Fluentd pods. Multi-tenant log isolation has also been considered in Fluent Operator.
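As a rough sketch of the label-based plugin selection just mentioned, a cluster-level Fluent Bit config could select its plugins like this; the label key and selector field names follow the patterns in the Fluent Operator samples, so treat them as assumptions and check the CRD reference for your version:

```yaml
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFluentBitConfig
metadata:
  name: fluent-bit-config
spec:
  service:
    parsersFile: parsers.conf
  # any ClusterInput/ClusterFilter/ClusterOutput carrying this label is
  # picked up, rendered into the final config, and stored in a Secret
  inputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
  filterSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
  outputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
```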
You know, Fluentd supports multi-tenant log isolation through the label router plugin. Last but not least, Fluent Operator supports deploying the components independently: either Fluent Bit or Fluentd can be deployed separately.

Both Fluent Bit and Fluentd are able to collect, process, and then forward logs to different destinations, but they have their own strengths in different aspects. Fluent Bit plays the role of a logging agent on each node, since it is super lightweight and efficient, while Fluentd is more capable of advanced log processing because of its rich plugin system.

As mentioned before, Fluent Operator managed only Fluent Bit at its inception. So if you only enable Fluent Bit, the workflow is quite simple. As you can see from this diagram, the FluentBit CRD defines the Fluent Bit DaemonSet and its configuration, and Fluent Operator provides a customized Fluent Bit Docker image. The FluentBitConfig CRD selects the input, filter, and output plugins and generates the final configuration into a Secret.

So how does Fluent Operator manage Fluent Bit and its CRDs to make them work well with Kubernetes? As you can see from this diagram, each CRD, such as ClusterInput, ClusterFilter, and ClusterOutput, represents a Fluent Bit configuration section, and they are selected by ClusterFluentBitConfig via label selectors. Fluent Operator watches those objects, constructs the final configuration, and finally creates a Secret to store it, which is mounted into the Fluent Bit DaemonSet. The entire workflow is shown in this graph.

At this point, to enable Fluent Bit to pick up and use the latest configuration whenever it changes, a wrapper called fluent-bit watcher is added to restart the Fluent Bit process as soon as configuration changes are detected.
In this way, the Fluent Bit pod does not need to be restarted to reload the new configuration. The configuration is reloaded this way because Fluent Bit itself has no reloading interface. You can learn more details from the links below.

As we introduced before, Fluentd is more capable of advanced data processing because of its rich plugins. So we added Fluentd support to Fluentbit Operator and renamed it. Now you can receive logs over the network, via HTTP or syslog for example, process them, and send them to final destinations such as Elasticsearch, Kafka, and S3. Fluent Operator provides three modes that you can use as you wish: Fluent Bit only, Fluent Bit plus Fluentd, and Fluentd only. Fluent Operator includes CRDs and controllers for both Fluent Bit and Fluentd, which allows you to configure your logging pipeline in any of these three modes.

Fluent Bit only mode means that if you just need to collect logs and send them to final destinations, all you need is Fluent Bit. So let's take a look at Fluent Bit only mode. As you can see from this graph, the Fluent Bit CRD ClusterFluentBitConfig selects cluster-level plugins and generates the final configuration into a Secret. The other plugins, such as ClusterInput, ClusterFilter, ClusterOutput, and ClusterParser, are selected by ClusterFluentBitConfig with label selectors. Let's take a look at an example use case of collecting Kubernetes application logs and sending the output to Kafka. You define the plugins you want via labels and selectors; for example, this one defines filter plugins such as the kubernetes, nest, and modify plugins.

Next, let's take a look at the Fluentd only mode. If you only need to receive logs over the network, via HTTP or syslog, and then process and send those logs to final destinations, you just need to enable Fluentd. So let's take a look at how those Fluentd CRDs work.
The Fluentd CRD is used to define the Fluentd StatefulSet and its configuration. A customized Fluentd image is required to work with Fluent Operator for dynamic configuration reloading. The FluentdConfig CRD is used to select cluster-level or namespace-level plugins, such as inputs, filters, and outputs, and generates the final configuration into a Secret. Similarly, the ClusterFluentdConfig CRD selects cluster-level plugins and generates the final configuration into a Secret. The other components, such as Filter and Output and their cluster-level counterparts, are similar: they define namespace-level and cluster-level configuration respectively.

Again, let's take a look at an example use case of using Fluentd to receive logs over HTTP and send the output to stdout. You define your plugins via labels and selectors, similar to the previous sample.

Apart from the Fluent Bit only and Fluentd only modes, there is also a strong combination of Fluent Bit and Fluentd in Fluent Operator. It gives you more flexibility, since you can leverage their respective strengths. If you also need to perform advanced data processing on the logs, or route them to more destinations, you just need to enable Fluentd and choose the Fluent Bit plus Fluentd mode. With its rich plugins, Fluentd plays the role of a log aggregation layer and is able to perform more advanced log processing. You can forward logs from Fluent Bit to Fluentd with ease using Fluent Operator.

Next, let's take a look at a real case study from our team. KubeSphere is an open source container platform built on Kubernetes. KubeSphere has a built-in logging console, which allows users to search logs and configure log collectors such as Kafka, Fluentd, or Elasticsearch. KubeSphere adopts Elasticsearch as the backend logging service, with Fluent Bit as the log collector. It runs a Fluent Bit DaemonSet on each node to collect container logs and application logs.
With this setup, different tenants can search the logs in a unified logging console, and you are able to make the logs visible only to specific tenants or within a specific workspace.

To help you get started with Fluent Operator, we have prepared a workshop with step-by-step labs. The workshop covers a lot of typical and interesting use cases. It starts with installing Fluent Operator and helps you deploy Fluent Bit and Fluentd. Next, it covers the three modes in which you can leverage Fluent Operator to send logs to different destinations. For instance, we will use the Fluent Bit only mode in this demo to collect Kubernetes application logs and send them to Kafka or Elasticsearch. You can also enable the Fluent Bit plus Fluentd mode; that section includes some typical use cases, especially for multi-tenant scenarios. Finally, you can leverage the Fluentd only mode to receive logs over HTTP and output them to stdout. Feel free to try it on your local machine or your edge devices.

In this demo, we will collect Kubernetes logs and forward them to Elasticsearch by using Fluent Operator in Fluent Bit only mode. As mentioned earlier, we have prepared a step-by-step workshop that walks you through how to deploy Fluent Operator and play around with it. You can check out this lab from GitHub and try it yourself. All of the demos, sample code, and documentation are available on GitHub.

All right, let's get started with the lab. I prepared a K3s cluster and a minikube cluster before this lab, and to make the lab more efficient and convenient for you to set up on your local machine, I chose the minikube cluster for this demo. Any other Kubernetes distribution, or upstream Kubernetes, is also supported in this lab. You can clone the demo repository to your local machine to get all of the shared scripts. As the first step, we will deploy a Kafka cluster and an Elasticsearch cluster.
So you can choose to forward the logs to Kafka or Elasticsearch as you wish. After a couple of seconds, we can verify whether all of the Kafka cluster resources are running. Once they are ready, we can go ahead and deploy the Elasticsearch cluster with its Helm chart.

It might take a couple of minutes to set up the Elasticsearch cluster, so meanwhile we can take a look at the Fluent Operator manifests and see what has been defined in the YAML files. Since we chose to collect Kubernetes logs and forward them to Elasticsearch using Fluent Operator in Fluent Bit only mode, we dig into this folder. As we saw in the Fluent Operator architecture diagram, all of the Fluent Bit CRDs, such as FluentBit, ClusterFluentBitConfig, ClusterInput, ClusterFilter, and ClusterOutput, are defined in these YAML files. The destination, Elasticsearch, is defined in the output CRD. If you want to define other destinations, such as Kafka, Loki, or S3, you can define them here.

Let's go back to the command line and see whether Elasticsearch is running. All right, everything looks okay. The Elasticsearch cluster has been deployed successfully, and it should now be ready to store the logs. At this point, both destinations, Elasticsearch and Kafka, are ready to use. For the next step, we are going to deploy the log forwarder and processor, Fluent Bit, to collect Kubernetes logs and ship them to these two destinations. We can use the Fluent Operator deployment script to set it up and wait a few seconds. Once we see the Fluent Operator pod running, we are ready to use Fluent Operator to deploy Fluent Bit or Fluentd to process logs. Let's enter the manifests folder and choose Fluent Bit only mode. As mentioned earlier, all of the CRD YAML files are defined in this folder. Next, you can use kubectl apply to create all of those resources.
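For orientation, the Elasticsearch destination in those manifests is expressed as a ClusterOutput along these lines; the host, match pattern, and label here are illustrative rather than copied from the demo repository:

```yaml
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterOutput
metadata:
  name: es
  labels:
    # the label the ClusterFluentBitConfig selector matches on (assumed key)
    fluentbit.fluent.io/enabled: "true"
spec:
  # forward records tagged by the Kubernetes input to Elasticsearch
  matchRegex: (?:kube|service)\.(.*)
  es:
    host: elasticsearch-master.elastic.svc
    port: 9200
    logstashFormat: true
```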
Here we point kubectl at the target folder, apply it, and wait a few seconds. You will find that all of the CRD resources have been created. So let's verify them. First, let's verify that the Fluent Bit DaemonSet has been set up and is running. The Fluent Bit DaemonSet looks fine, so let's go ahead and check the Fluent Bit CRD components. As you can see from the Fluent Operator diagram, you might want to verify the status of those CRDs. Let's first check the status of Fluent Operator itself, and then get the status of the ClusterFluentBitConfig. Next, we can check the status of the ClusterInput. Then we copy and paste the command, change it to ClusterFilter, and we can see that the kubernetes plugin has been defined in the ClusterFilter. As we defined Elasticsearch as the destination of this logging pipeline within the output plugin, the result looks exactly as we expected.

At this point, we can scroll down to the guide on how to check the status and configuration of Elasticsearch and see whether the logs and index have been created there. At the same time, we can also see that the Fluent Bit pods are running, since we are using Fluent Bit only mode here. Next, let's verify the Fluent Bit configuration generated by Fluent Operator. We copy this command line and paste it into our terminal. Yeah, we can see the plugins defined in this Fluent Bit configuration: the input, filter, and output. We have defined Elasticsearch as the destination in the output, and you can also change it to Kafka or other destinations as you wish. At this point, we can check the logs from Elasticsearch, so we copy this command line, paste it into our terminal, and wait a second.
After querying the Kubernetes namespace bucket in Elasticsearch, we can retrieve the key-value logs and data. That means the logs have been collected and shipped to the final destination, which is Elasticsearch here. We can also double-check the result in the Elasticsearch indices: normally, a new index is created for the logs received from Fluent Bit. Last but not least, as we already deployed Kafka as an optional destination at the beginning of this demo, you can switch the destination to Kafka in the output CRD component.
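That swap could look roughly like the following ClusterOutput; the broker address and topic are illustrative placeholders, so check the Kafka output plugin reference for the full field list:

```yaml
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterOutput
metadata:
  name: kafka
  labels:
    # same selection label as the other outputs (assumed key)
    fluentbit.fluent.io/enabled: "true"
spec:
  # route the same records to Kafka instead of Elasticsearch
  matchRegex: (?:kube|service)\.(.*)
  kafka:
    brokers: my-cluster-kafka-bootstrap.kafka.svc:9092
    topics: fluent-log
```

Because the operator watches ClusterOutput objects and regenerates the configuration Secret, applying this resource should reroute the logs without manually restarting the Fluent Bit pods.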