Hello, everybody. Welcome to the CNCF webinar about Fluent Bit and the upcoming release of Fluent Bit 1.9. My name is Eduardo Silva. I'm one of the creators and maintainers of this project called Fluent Bit. I'm also a lucky founder of a company called Calyptia, and I have been a CNCF maintainer engaged with the community for a long time. In the past, I was a software engineer at Oracle and a principal engineer at Treasure Data.

Before getting started with the news about Fluent Bit 1.9: I'm sure that many of you are new to Fluent Bit. Some of you may come from the Fluent ecosystem, or maybe you're just learning about logging and metrics. So I will take a few minutes and a few slides to share some concepts about Fluent Bit and why it is important. The first thing everybody has to remember is that we are a CNCF project. We are graduated in the CNCF under the umbrella of Fluentd: Fluentd is the parent project, made originally for logging, and Fluent Bit is a subproject; both are in graduated state.

One of the problems that Fluent Bit aims to solve, pretty much like Fluentd since we are from the same family, is that you often have a lot of data coming from different sources and you want to do some data analysis on it. But when you're talking about distributed systems, where you have applications on bare-metal servers or applications in Kubernetes, how do you extract all this information in a smooth way so you can perform your data analysis later? And when people do data analysis, many of them like to aggregate the data into one single place, like Elasticsearch or Kafka, so they can have different subscribers extract that information. So if you think about the complexity of having different sources of information from different places, plus different formats, you might conclude that you need a solution for this, and that solution is Fluent Bit.

Fluent Bit is a lightweight agent that is installed on-prem, in your cluster and your Kubernetes nodes, or just on your regular VMs or bare-metal machines. You can configure it to perform data collection from different sources, like files or systemd, or to receive data from the network, like TCP, Forward messages, or syslog. Having this is really important because, as a developer, one of the challenges is first to build the application. The next stage is to deploy the application, but after deploying it you need to monitor it. There are many areas of monitoring an application, or of instrumenting it. Fluent Bit, as of today, cares about two spaces of monitoring, or observability. One of them is logs, which is pretty much the text information that the application writes to the standard output interface, to a log file, or to syslog. The other is metrics, which it can collect from pretty much any endpoint that the application exposes.

Fluent Bit has been getting a lot of traction, primarily, I would say, because of its performance and its low system resource usage. A tool that processes thousands of messages per second while consuming two or three gigabytes of memory in your system is not the same as one doing that while consuming less than 100 megabytes. Every megabyte counts, I would say. And as I said before, Fluent Bit is a CNCF project under the umbrella of Fluentd.
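As a minimal sketch of what such a data-collection setup can look like, here is a hedged example in Fluent Bit's classic configuration format; the log path is a hypothetical placeholder you would adjust to your own environment:

```
[SERVICE]
    flush     1
    log_level info

# hypothetical application log path; adjust to your environment
[INPUT]
    name tail
    path /var/log/app/*.log
    tag  app.logs

# print collected records to standard output
[OUTPUT]
    name  stdout
    match app.logs
```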
Now, going up a layer rather than doing a deep dive into the project: Fluent Bit is essentially an engine that supports different input sources of information, which could be logs or metrics. It takes the incoming packets, decodes their representation, and generates internal events. Those internal events can then optionally go through a filter chain, where the data can be enriched or modified. One of the use cases, for example: if you're running in AWS and you're processing logs from application X, you would like to add some important context, not just the application data itself, but also where the application is running, the hostname associated with it, the instance ID, the instance name, the zone where it was deployed. All of those kinds of modifications can be done in the filter phase.

After that, in the pipeline, we have the concept of buffering. As soon as we get the data and filter it, we store it temporarily, either in memory or on disk, because this data needs to be ready to be delivered to one of multiple destinations. Buffering is really important because when you send data, many things can go wrong: DNS failures, network outages, or sometimes your remote endpoint is down for a couple of minutes or a couple of hours, and you don't want to lose your data, right? So we need this buffering to make sure that everything we have is persisted until it can be sent. Of course, we provide mechanisms to say: I want to dedicate just a few gigabytes for buffering, but no more than that.

The output destination side is where we provide connectors for different backends: Kafka, Elasticsearch, OpenSearch, Prometheus remote write, there are plenty of them. In general, we have around 100 plugins between inputs, filters, and outputs. So Fluent Bit is a very versatile agent, we call it a Swiss Army knife, because it can be deployed in a Kubernetes cluster, on a virtual machine, or on any kind of older system that you need to take information out of.

Now, who's using Fluent Bit? As I said, Fluent Bit is used by all the major cloud providers. For example, if you go to Google Cloud and deploy a Kubernetes cluster, and you inspect what pods are running on your node, you will see that Fluent Bit is there. Same case for Microsoft. AWS has its own distribution of Fluent Bit, called AWS for Fluent Bit, which comes with specific connectors for AWS and more custom, not optimizations, but custom setups for AWS customers. And it's not just cloud providers, right? We have many companies using it or integrating with it: Splunk, New Relic, LogDNA, Datadog. What is important here is that Fluent Bit is a truly vendor-neutral solution. What we aim for as a vendor-neutral solution is to let you deploy it, choose your vendor and your backend, and switch the next day to any other backend that you want. It gives you the freedom to take your data, control your data, send it where you want today, and change that destination tomorrow.

Now, about the Fluent Bit updates: we have a lot of news to share. Actually, I'm pretty excited about this release, and the whole team, the community and the companies working together on this release, has done tremendous work. And it's not just the code base; there are many other areas, and some of them we're going to discuss here in this webinar.
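Before diving into the news, here is a hedged sketch tying the pipeline stages just described together in the classic format: input, a filter that enriches records with the node hostname, filesystem buffering, and a capped output. The paths, the Elasticsearch host, and the 2 GB limit are illustrative placeholders:

```
[SERVICE]
    flush        1
    # enable filesystem buffering under this directory
    storage.path /var/lib/fluent-bit/buffer

[INPUT]
    name         tail
    path         /var/log/app/*.log
    tag          app.logs
    storage.type filesystem

# enrichment phase: add the node hostname to every record
[FILTER]
    name   record_modifier
    match  app.logs
    record hostname ${HOSTNAME}

# illustrative backend; cap buffered data for this output at 2 GB
[OUTPUT]
    name                     es
    match                    app.logs
    host                     elasticsearch.example.com
    storage.total_limit_size 2G
```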
One of the biggest pieces of news is that we just crossed one billion deployments, if you go by our Docker Hub registry. This is a huge accomplishment. I would say only a few projects hit this mark or run at this scale. It took a few years, and nowadays, I would say, Fluent Bit is deployed one to two million times per day, which is insane. Of course, that means more traction, more bug reports, more hands needed, but also a bigger community, and that's what we aim for.

Part of the development process of a project is not just writing code, right? We want to make sure that things get done right. A few years ago we got many complaints that sometimes we pushed a feature that broke things, or about the lack of automation in the development workflow. In the last year, the community team has taken this seriously and has created a fully automated test and release process based on GitHub Actions. Now we have staging builds and a test process before performing a release. Years ago that did not happen: we just had a code base, created a tag, created a tarball, pushed images and packages, and that was it. But now we have a full system that makes sure that every bit we ship out to users, to customers from a company perspective, and to partners is, in a sense, certified, to avoid regressions. We have a lot of smoke testing for package installations. We now test that if you install one version and upgrade to another, the upgrade process does not fail. Those kinds of tests were not there before, and now they are, with Fluent Bit 1.9.

Now, if you're using containers: we make sure to ship a lightweight container image. By lightweight, I mean it's not a big image carrying data that is not needed. For that, we chose distroless some time ago. Distroless is a kind of container base image that contains only what is needed: there's no shell, and there are no extra binaries that could also raise security concerns in your deployment. But distroless was only there for x86_64. Now we have extended distroless support to all the architectures we build for: ARM32, ARM64, and x86_64.

Also, when you're contributing to Fluent Bit on GitHub by submitting a PR, we now have more than 30 or 40 checks for that PR, running linters, shell checks, builds across compiler versions, and making sure that every contribution from the community will not break anything. This goes from CentOS 7, which is a very old distribution, to the latest ones, across all major distributions. And we have integrated a new security system: every time we push an image through the development workflow and a new container image is created, we run a security scan, trying to catch any kind of problem like a CVE in an external library that we might be consuming or that is part of the distroless image. So I would say this brings a new era of automation, making sure that things get done right and that you can go to sleep without problems once you deploy a new version.

One of the other big things: Fluent Bit had a very old, archaic website. With help sponsored by the community, we were able to create a new website which is, in my opinion, outstanding. Congrats to Ari, the designer of this website.
And we created a website that continues in the same open-source spirit, but the framework is now Hugo. It used to be Jekyll and Ruby, but it has been migrated to Hugo, and the site's full source code is available on GitHub. So if you are interested in contributing documentation or articles, we now have all the pieces in the framework to support that.

Now let's get to the other fun part of this release, 1.9. I'm sure you want to know what's coming out, what the new features are, what new things are around. Okay, the first one: we have implemented a new configuration mechanism. Actually, it's not new; we refactored the whole configuration mechanism. Now we support not just the classic Fluent Bit mode for configuring pipelines, but also YAML. And you may be asking, why YAML? Well, for most integration services, or when you want to integrate with APIs and connect or deploy Fluent Bit, the classic mode is not so friendly: it has special indentation, and it gets complex. In the cloud-native space, where everything is YAML, Fluent Bit was kind of harder to manage, and we wanted to change that. Now we support YAML natively, alongside the classic mode. If you provide a YAML file, it will work the same way as in classic mode.

Also, you can see here in the example that we have implemented the logical concept of a pipeline. You can separate things that logically belong together: data generated from some source with a tag must go to a specific output, right? We have found users who have huge configurations with different logical pipelines, but all of them mixed together. We wanted to fix that, and now you can define multiple pipelines, so when it's time to maintain a pipeline, it will be easier from every perspective. Also, from the YAML perspective, we support includes. We have supported includes in Fluent Bit for a long time, but we made sure that when you provide a YAML file to Fluent Bit, you can include other files, in case you segment your configuration or have different pipelines to include.

And there is the always-expected performance work. As I said some minutes ago, we care about performance and low memory footprint, without sacrificing any resource: if you're going to use something, it must be right and optimized. When the project started, it was a single-threaded product, fully asynchronous, event-driven, and all of that. Now, when you have events coming from the network, I/O, timers, and the scheduler waking up coroutines, because we also run with coroutines, you get a saturation of events in the main event loop. AWS has done an outstanding job analyzing how we can improve this, and they have contributed the concept of priority queues to Fluent Bit. Simply put, it is a way to prioritize which kinds of events reported by the kernel need to be processed first, second, and third. For example, events that come from the scheduler, a coroutine that needs to be flushed, or a task that needs to be dispatched: those are high priority, and those events will be processed first. Second, we process all the network I/O events. This might sound confusing, but all the tests and integrations show that this has been the optimal way to increase performance even further, by two to three times.
And finally, everything about task initialization, flushing, and other normal events got the lowest priority. This has been a really, really good improvement. So if you were surprised by how much Fluent Bit scaled before, this will surprise you more.

Also talking about performance: last year, in 2021, when we released Fluent Bit 1.7, we implemented the concept of threading. As I said, we didn't have any threading; it was single-threaded, fully asynchronous with coroutines. We implemented this threading model in the output plugins. Everything about converting the payload from MessagePack, which is our internal representation, to, say, the external JSON expected by Elasticsearch or Splunk was being done in one single thread. You can imagine that that adds a lot of latency and delays things in the pipeline. So we created threads, and we moved all those expensive tasks to separate threads. Every thread can handle multiple coroutines, thousands of them. And we demonstrated that we can scale performance up to five times using a few threads. But this feature was not enabled by default, so the user had to discover it and set, in the output section of the configuration, workers 1, workers 2, or whatever number they wanted to scale to. We found that this was not ideal: most users were struggling with performance. So we changed the defaults, and the majority of the output plugins now run with a default number of worker threads. This is a very lightweight mechanism; don't think that you're going to use double or triple the memory, it's not like that. For example, plugins like HTTP, stdout, file, Splunk, Elasticsearch, and OpenSearch now run in separate threads by default. You can override that behavior: set workers 0 and you'll be fine. The defaults are optimal for the majority of use cases, and everybody who wants to tune performance further can adjust the values without any problem.

Now let's talk about the input plugins, and let me prefix these as log input plugins: as I said, we are in the logs and metrics space, and shortly we're going to jump into traces. So I'm going to describe a bit of what we have done in the log input area. Tail is the main plugin that allows you to follow files on the file system. Every six months we find that users have bigger challenges, and some of them said: I have 50,000 files, but when Fluent Bit starts, it takes two or three minutes. We never thought we would hit those use cases. So we optimized the tail input plugin, and now getting started processing 50,000 files takes just a few milliseconds, less than a second. These kinds of optimizations are quite important. We have users who just tail files from the end, but we have others who say: I want to process all the files I have, from the beginning. This enhancement addresses that specific use case.

We also had an issue, number 300, which is from 2017 if I'm not wrong, asking for a Kafka input plugin. Users wanted Fluent Bit to behave as a Kafka subscriber: they were sending data to Kafka and they wanted to consume from it too. So we implemented it, and I'm saying it is experimental because we are testing a new architecture for this plugin, but Fluent Bit 1.9 ships with an input plugin for Kafka, and it's working quite well.
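As a hedged sketch tying a few of these 1.9 items together, here is what the experimental Kafka input could look like in the new YAML configuration format, with an explicit worker override on the output; the broker address, topic name, and backend host are hypothetical placeholders:

```yaml
pipeline:
  inputs:
    - name: kafka
      brokers: kafka:9092   # hypothetical broker address
      topics: app-logs      # hypothetical topic name
  outputs:
    - name: es
      match: '*'
      host: elasticsearch   # illustrative backend
      port: 9200
      workers: 2            # override the new default worker count
```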
You can subscribe to many topics and get all your data in. We even found some interesting cases where users are sending data to Kafka using Fluent Bit, and on the other side they have another Fluent Bit consuming it, doing filtering, modifying the information, and generating a new topic. So they're also using Fluent Bit as a kind of stream processor for Kafka. We're really happy to hear about this use case, because for us stream processing has always been something really interesting where you can add a ton of value. And this is not about replacing Kafka; actually, it is meant to add more value to it. We had thousands of users connecting to Kafka, and now this brings a new level of integration.

For our Windows users, who are not a small group: actually, you might be surprised that many financial institutions run hundreds of thousands of servers with Windows services, and they face the same issue: how do I collect the log information from my system? We used to have, and we still have, a plugin called winlog, which allows you to get logs from the Windows Event Log system, but it was limited to the classic channels. The new plugin, called winevtlog, allows you to pull and consume data from the non-classic channels on your servers. Many companies and users were waiting for this implementation, and I'm happy to say it's already out with 1.9.

Now, let's jump to the filter plugins. This is not the full list; I'm just going to talk about one new filter, because as I'm recording this, the release is happening in just a few days and we still have more filters to add to it. The new filter is called Nightfall. Nightfall is a vendor, a specific service that makes sure that if your records contain any sensitive information, like API keys or PII, it can perform data redaction on them so that you're not going to ship any sensitive data. This is a third-party service, and the filter is a contribution from Nightfall to Fluent Bit. So thanks, Nightfall, for contributing this. The filter is ready to go: just head to the documentation and you can start trying out this service.

Now, on the output side for logging, there's some news too. One of the biggest items is that the Fluent Bit project has partnered with the OpenSearch team. One of the missing pieces of OpenSearch, which is a fork of Elasticsearch, was the lack of an official agent for its users. From another angle, some of our own users were migrating to OpenSearch; they have their own reasons. For us, being a vendor-agnostic, framework-agnostic project, we want to make sure that our users have the possibility to switch to a different service or a different project as a backend, while having the right connector implementation for it. So Fluent Bit is now shipping a new OpenSearch connector, based on the old Elasticsearch connector, which gives our whole community a first-class-citizen connector. If you're using OpenSearch, please switch to this new connector, and I'm sure you're going to have a good experience with it.

Also, thousands of users rely on Fluent Bit to send data to Amazon S3 buckets. And we got a really interesting use case: the data they were shipping was going to be consumed for analytics, and they needed to have it in Apache Arrow format.
So a contribution from a partner company implemented Apache Arrow encoding for the S3 connector. Thanks for that. Now, if you rely on Apache Arrow and you want to use S3 buckets, you can do it with Fluent Bit.

Now, another interesting angle: Fluent Bit and metrics. We always get this question: what is Fluent Bit doing about metrics? What do we think about metrics? Part of the story is that when we started Fluent Bit, years ago, for embedded Linux, some of the first plugins we wrote were about collecting CPU metrics, disk I/O metrics, and so on. But at that time we handled all that information as raw log records, not as real metrics with a schema. Since about a year ago, we have been implementing native support for metrics payloads, which allows Fluent Bit to connect to other ecosystems. Our vision is that Fluent Bit sits at the core of all these observability ecosystems, and we provide all the tooling and all the connectors so you are able to connect different systems, different protocols, and different frameworks, while remaining vendor-agnostic. That's our mantra: keep being vendor-agnostic and talk to everybody. We look at Fluent Bit as an ecosystem where all these new frameworks, metrics instrumentation libraries, and distributed systems can be connected through it.

One of the new input plugins for metrics collection is the NGINX plugin. It is quite interesting because NGINX, the web server, ships with, or can expose, metrics in JSON format. What this new NGINX metrics plugin does is connect to the NGINX service, retrieve the JSON payload, convert it into a metrics payload, process it in the Fluent Bit pipeline, and send it out to any kind of metrics endpoint. And it's not just NGINX OSS, the open-source version: we also support NGINX Plus, the enterprise edition from NGINX.

From another angle: we have been talking about metrics and logs, but now we are also able to collect metrics from Windows natively. Right now this experimental plugin can collect CPU metrics from the Windows system, and during the 1.9 release cycle we're going to add other kinds of metric samples, for example disk, memory, storage, file system, and so on. This is ongoing work, and if you're interested in some specific collector for Windows that is not there yet, please open an issue and we will try to prioritize it, especially if many people are interested in the same thing.

Another interesting angle for metrics, and this is my personal view, this is Eduardo talking: I see that the monitoring industry, the metrics space, runs on Prometheus. That's a fact. When Fluent Bit started integrating with metrics, the first approach for us was to understand the challenges of the users: what are the challenges of Fluent Bit users doing metrics collection with different agents and different systems? And we got a lot of feedback saying: I am using Fluent Bit, but I'm also deploying another kind of agent that does metrics scraping and collection; why can't I have the same functionality inside Fluent Bit? And this is the result. As of today, with this release, we have a new plugin for Fluent Bit called Prometheus Scrape, which allows you to scrape metrics from your own applications or from remote endpoints that expose Prometheus metrics.
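As a minimal sketch, assuming a hypothetical local endpoint on port 9100 exposing Prometheus metrics, a scrape setup in the new YAML format could look like this:

```yaml
pipeline:
  inputs:
    - name: prometheus_scrape
      host: 127.0.0.1        # hypothetical target exposing Prometheus metrics
      port: 9100
      metrics_path: /metrics
      scrape_interval: 10s
      tag: app.metrics
  outputs:
    - name: stdout           # print the scraped metrics for inspection
      match: app.metrics
```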
In addition, and this is not new, it was already there, we did a couple of enhancements to our Prometheus exporter, which sits on the output side. On one side we can collect metrics locally, for example we have a plugin to collect node metrics; then, on the output side, we can expose those metrics in Prometheus exposition format so other services can scrape them. Or, if you want, you can use another plugin called Prometheus Remote Write, which allows you to take all these metrics and dispatch them to a third-party service, to a Prometheus agent, an OpenTelemetry Collector, or whatever is waiting for that format. We are really happy to be working together with the Prometheus team, because this is not about separating logs, metrics, and traces. If you go to any production environment, you will find that most of these technologies are there, different projects; there is no single tool for everything. But all of them share the same need, which is integration, and Fluent Bit aims to solve that.

The next really interesting topic is OpenTelemetry. We get many questions about the position of Fluentd and Fluent Bit with respect to OpenTelemetry, and I would like to take this space to make some clarifications. For us, it has never been one project against the other. If I look at the Fluentd story: when Elasticsearch was the default backend for logs in open source, and Logstash was kind of a competitor to Fluentd, we had a bunch of users integrating with both, and we built plugins for Logstash and Beats at that time. Everything is about giving the user the flexibility to solve the problem they have; it's not about replacing technology. When we talk about OpenTelemetry, we see this vision of a unified framework for telemetry, and to us, that looks great. But now we have a responsibility: we have thousands of users, and a big part of them may jump into OpenTelemetry. So how do we extend our scope to support that journey for them?

In Fluent Bit 1.9 we are launching our first connectors for OpenTelemetry. One of them is the OpenTelemetry input, which as of today supports only metrics over OTLP; during the 1.9 release cycle we are going to add support for traces. On the output side we also support only metrics for now, but it's the same story as on the input side: we are going to be able to ship traces as well. You might be asking: does this mean you want to replace OpenTelemetry? I would say there will be many features that overlap with one another, but our intention is that the user gets the flexibility to integrate both systems. We have seen in production, and with customers from a company perspective, that switching telemetry and switching agents is not something that happens from one day to the next; it takes one to two years. From our standpoint, we want to make sure that we solve the problem today, but also open the door to integration with whatever the market standardizes on. This is really interesting, and as the Fluent project we are going to start participating more in OpenTelemetry. From a company perspective, we are talking with partners to understand the needs of Fluent users who are also jumping into OpenTelemetry, because there are many gaps they want to fix for their own specific use cases. So if you want to talk about OpenTelemetry: we are really happy to say that we are jumping into it, supporting and embracing OpenTelemetry, to make sure that project can succeed too; we are both part of the CNCF.
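Pulling the Prometheus pieces above into one place, here is a hedged sketch: node metrics collected locally, exposed in Prometheus exposition format for scraping, and also pushed via remote write; the remote-write endpoint and TLS settings are hypothetical placeholders:

```yaml
pipeline:
  inputs:
    - name: node_exporter_metrics
      tag: node.metrics
      scrape_interval: 2
  outputs:
    # expose the collected metrics for scraping by a Prometheus server
    - name: prometheus_exporter
      match: node.metrics
      host: 0.0.0.0
      port: 2021
    # ...and push the same metrics to a remote-write endpoint
    - name: prometheus_remote_write
      match: node.metrics
      host: metrics-backend.example.com   # hypothetical endpoint
      port: 443
      uri: /api/v1/write
      tls: on
```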
And yeah, this is just an interesting journey, observability. Fluent Bit aims to be around the whole observability space: metrics, logs, and traces. Right now, I would say our primary focus for this year will be metrics and traces. And, as I said at the beginning, we want Fluent Bit to become part of this central nervous system, where we can take any kind of payload, connect to or scrape information, and deliver it in a high-quality format to whatever destination is expecting it. You might hear a lot about these kinds of implementations, POCs, and things already running in production, and we are really happy to hear from you about your use cases, or even concerns, or anything we can add as value to the project.

Well, that was the quick webinar, the quick presentation about Fluent Bit 1.9. I'm sure many things could happen in the few days between recording this and the release going live. But if you have any questions, please feel free to write me an email; I'll be happy to sync or jump into a Zoom call. Also, a reminder that we have our Fluent Bit community calls every two weeks; you can find the information on our website, fluentbit.io. Okay, thanks so much for attending. Enjoy your day, your evening, or your night, and I hope to see you soon. Thank you.