Hello, hi everyone, how are you? Happy to be back at KubeCon. Who is here for the first time? Maybe you can raise your hand. Oh, really? Oh man, that's great. That's great. In the meanwhile, they are bringing all the audio equipment, so we are going to do a quick conversation. Well, first of all, welcome. I think that KubeCon is one of the best conferences around, not just because of Kubernetes, but because of the technology and the community. So, really happy to see that this is the first time so many of you raise your hand. Yeah, that's awesome. Okay, maybe some questions: who is a Fluentd or Fluent Bit user already? Okay. And who's interested in metrics? Okay, let's get started. Is this working too? Yeah, awesome.

Okay, so, hey everyone. My name is Anurag. I'm one of the co-founders and maintainers of the Fluentd and Fluent Bit projects, and of course we have Eduardo, creator of the Fluent Bit project, as well. When we started looking at Fluent Bit and Fluentd, the way we began to think about these projects is really as a Swiss army knife for observability: how do we look at all this data, start slicing and dicing it, and make it very accessible as part of the ecosystem? And how do these projects start to fit together in the broader ecosystem? Many folks are at KubeCon for the first time, and there are so many projects going around, so how do we look at this ecosystem at a bit of a broader level?

Yeah, so one of the important things is this: when you are thinking about observability in general, it's because you want to analyze how your applications and all this stuff are working, right? At the end of the day, when managing infrastructure or managing the services that run behind the scenes, you don't want to be worried about that. It's like using a Linux system: you don't want to know how glibc is working, right?
You just want to make sure that the application is running and getting the right resources, memory allocation, everything. And if we go to a higher level, with observability your final goal will always be: how do I analyze my data? But in order to analyze your data, you need to get the data first, most of the time in a central place. And this data comes from different sources and different kinds of natures: you can think about logs, you can think about metrics, and you can think about traces.

I guess that most of you are working in companies, and I'm pretty sure that 99% of you would say "we don't have just one solution for everything." You don't just have MySQL; maybe you also have PostgreSQL, you have Redis, and your architecture always grows. It always evolves, and through all this transition there are always more components being added. From a development perspective, the same thing happens: you will get some applications written in Java, some others in Node.js with JavaScript. So you face this problem of different natures, different origins, and in our case of observability, different sources of information. At some point you will find that you are instrumented with primitives; maybe you're using Fluentd, maybe you're using Fluent Bit, but the goal is always the same: how do I analyze my data? But you have to solve that first problem first. So we try to see that, and to communicate that the Fluent ecosystem, as the name says, is fluent: it aims not to compete with other projects, even from an agent perspective or a company perspective.
We don't try to compete as a drop-in replacement. We try just to be a solution that can be integrated into your architecture, into your own environment. So no matter where you're running your applications, no matter what you're using for instrumentation, in the end you can use Fluent to connect different ecosystems.

And our ecosystem has a really interesting thing. The first one, well, everybody knows: it's all fully open source. But one of the things that you might not realize, and you might be facing this in the enterprise, is that sometimes there exists the concept of vendor lock-in. Since your goal is to do analysis, it makes sense that you're going to say, "oh, I'm going to invest in Splunk, Elastic, OpenSearch," and then they will tell you, "these are the tools that you need to do all this data collection." But some of those tools are not really open source, and they will tell you, "okay, if you use this, you get married to this technology, to this stack." That is vendor lock-in, because after one year, when you want to switch, it will be a huge pain. What about if you now think, "oh, there's a new fancy database that I want to use"? What would be the solution? You don't just need to replace the database.
You also have to replace all your agents, and that becomes a problem. The Fluent ecosystem tries to avoid this vendor lock-in for you, and as I said a couple of times, we are agnostic: no matter what the product, the project, the ecosystem, or the tool inside the same CNCF stack, we try to be compatible with all the standards.

In the Fluent ecosystem, well, if you are using it, here is some extra information about who uses Fluent Bit: there is a standard set of Fluent Bit users, Amazon, Google, Microsoft Azure. When cloud providers choose something, it is because it is production grade.

And now a quick recap of the past, present, and future of the Fluentd project and the ecosystem that we have. You know, everything started with Fluentd almost 11 years ago, a solution to collect logs for our ex-company and send all the data to the cloud. We made it open source and it really succeeded; people built around a thousand plugins, it is a huge ecosystem. But it was not quite ready for cloud environments, because Fluentd is written in Ruby, which is heavier. So when you deploy this on hundreds or thousands of machines, using 200 or 400 megabytes of memory per agent is really expensive, and CPU intensive in general. At the same time, we started doing some innovation and came up with the next-generation tool, which was Fluent Bit: part of the same family, but really optimized for performance, making sure that there was a really good solution for everybody, with our same philosophy.
We didn't want it to be a drop-in replacement. We said that Fluent Bit is going to integrate with Fluentd, or you can use them separately, just independently. And now one of the trends that we see is that all the cloud providers migrated from Fluentd to Fluent Bit as the de facto standard, and most users are following the same pattern. And I would say that nowadays a user does not need a thousand plugins. Most of you might be creating your own microservices, your own applications, and you are just instrumenting by sending the data to a custom HTTP endpoint. For example, Fluentd supports around a thousand plugins, connectors between sources and destinations; in Fluent Bit we support around a hundred, but as of now I think we have not gotten many more requirements. We support Splunk, Elasticsearch, OpenSearch, HTTP, and many others.

And the highlight of it all: production-grade high performance. When you run an agent, and you see that this agent, while doing nothing, is using 600 kilobytes of memory, that means something, right? Of course, when you get more data and start processing it, you need more memory to handle that load, but we always try to keep resource consumption very low. You can write a tool that can process millions of messages per second, but maybe you need a cluster of five machines to process that, right? So we always try to optimize for performance and see how far we can go with Fluent Bit. And we have demonstrated that: in the last year we got pretty much 4x or 5x performance improvements by doing threading, optimizing the core memory allocations, and so on.

And how can you use Fluent Bit?
There are a couple of distributions. Of course, the upstream version is what most people use; you can get it from Docker Hub, and you can get packages for Ubuntu and CentOS. AWS has their own distribution of Fluent Bit, which is similar to upstream, but the difference is that they optimize for their own Amazon services connectors; they provide different connectors, it's more friendly. So if you're running on AWS and you are a customer, yeah, you can use that one. Calyptia, the company that we founded with Anurag, which sits on top of the Fluent ecosystem, has its own Fluent Bit distribution too, but this is mostly positioned as an LTS version, enterprise ready. So if you're going to run Fluent Bit and you want to make sure that there are no breaking changes for the next 18 months, you can use Calyptia for Fluent Bit. And Google Ops Agent: Google created its own agent for its own customers, which has two components: Fluent Bit for all the log management, and a small OpenTelemetry agent for metrics and traces. They ship this for their own customers, so if you are a Google customer, you can use Google Ops Agent. But at the end of the day, you can do the same with the upstream version, right? So we're not trying to impose what to use; just go with upstream, and then if you need something more specific, yeah, it's like: do I use vanilla Kubernetes, or do I use OpenShift? It's a matter of personal decision.

And yeah, this is Fluent Bit. We aim to collect data from different sources and send that data to multiple destinations. We do buffering, retry logic; we back up in the file system in a very optimized way. Now, as a community, Fluent Bit has been deployed, as far as we know, more than one billion times in total. We didn't put the exact fraction here, but this is on the order of millions, right?
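As a rough illustration of that collect-buffer-deliver model, here is a minimal sketch in Fluent Bit's classic configuration format. The file path, tag, and destination host are made up for illustration; the `storage.*` and `retry_limit` keys are the real options behind the buffering and retry behavior mentioned above:

```ini
[SERVICE]
    flush          1
    # Directory where filesystem-buffered chunks are persisted
    storage.path   /var/log/flb-buffer/

[INPUT]
    name           tail
    path           /var/log/app/*.log        # hypothetical application logs
    tag            app.logs
    # Back up chunks on disk, not only in memory, so data survives restarts
    storage.type   filesystem

[OUTPUT]
    name           http
    match          app.logs
    host           logs.example.com          # hypothetical backend
    port           443
    tls            on
    # Retry failed deliveries up to 5 times before discarding the chunk
    retry_limit    5
```

With filesystem storage enabled, a backend outage does not immediately drop data: chunks queue on disk and the retry scheduler replays them when the destination recovers.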
So last year it was pretty much 600 million, and you can see how the number of deployments this year is going. This is just insane, and it is thanks to the community and the people who are here. Actually, most of the features that you see in Fluent Bit and Fluentd exist because of the feedback that you provide after these sessions at the conference: everything, the Kubernetes filter, Lua scripting, and, I don't know, Elasticsearch, OpenSearch, all of that. So don't think that you're just going to consume information here; you can also help us build the roadmap for the future of the project, and that's really important.

So, investments. Everybody cares about this: oh, this is a program written in C, right? Memory safety is an issue. What about other languages? It's a common topic. So the thing that we can do is make sure that every version gets more tested and keeps improving, so we invested a lot in CI/CD, regression checks, sanitizers, making sure that it runs fine on all architectures. We have a security team working with Google's OSS-Fuzz technology; there was a talk about that at FluentCon, which was our conference this Monday. Pretty much, Fluent Bit is being tested with random input on different interfaces and different functions, trying to make it crash. Yeah, the first year it crashed like crazy, right? But this has been running 24/7 for more than one year, and most of these corner cases, these bugs, have been fixed already.

And we started by solving the logs problem, but then we said, ah, we just solved logs; we think that metrics are interesting too, and there are more problems. So we started extending our scope to metrics and other areas. And that raises a question: oh, metrics and traces, but there are other projects, right? And that is a primary topic of this presentation, so I'll let Anurag elaborate more on that. Yeah, awesome.
Thanks, Eduardo. And yeah, talking about logs, metrics, and traces: when we talked to the community, what we were finding is that folks had a bunch of common use cases. As folks were gathering logs, many times there were metrics in those logs, things they wanted to extract, and what we found was folks writing gigantic Lua scripts, extracting all these small data points from logs, doing all sorts of additions, and then trying to hack it together with node exporter or something else to get it into a Prometheus format. So we took our philosophy of integrating with these ecosystems and making things easier for all our users, and that's really what set us off last year on integrating much, much deeper on the metrics side. And we'll talk about traces as well.

So first, logs. This is what we've been doing for 11-plus years with Fluentd. When we look at it from a project side, logs are unstructured data. You can have unstructured logs, where someone might write to a logger "Hi, my name is John" or "Hello, my name is Jill." We have structured logs, well-known things like NGINX access logs, or a structured schema like Syslog. You can have schemaless logs. And as you are gathering these logs, you might want to do some processing. The most common use case is, if you are gathering these logs in Kubernetes, you want to enrich them with container, pod, namespace, all of these contextual clues, to help you debug and troubleshoot much, much faster. The same can be said where you might want to reduce that data. One of the big use cases that we see from a community perspective is folks gathering petabytes and petabytes of data who might not want to pay for those petabytes, or who want to send them to a less expensive, less used data store so that things can be a little bit cheaper. And that's one of the other big use cases that we see with logs: how do we filter those out?
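The enrichment and reduction steps just described might look roughly like this in a classic-format configuration. The Elasticsearch host and the debug-level pattern are made-up examples; the `kubernetes` and `grep` filters are the real plugins involved:

```ini
[INPUT]
    name     tail
    path     /var/log/containers/*.log
    tag      kube.*

[FILTER]
    # Enrich each record with pod, namespace, container, and labels
    name     kubernetes
    match    kube.*

[FILTER]
    # Reduce volume: drop records whose "log" field matches a noisy pattern
    name     grep
    match    kube.*
    exclude  log debug

[OUTPUT]
    name     es
    match    kube.*
    host     elasticsearch.example.com   # hypothetical destination
```

A second `[OUTPUT]` section with a different `match` or destination is all it takes to fan the same stream out to a cheaper secondary store.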
How do we reduce them, or how do we send them to multiple locations?

And I think metrics are also really fun, because when we started Fluent Bit back in 2015, its initial use case was embedded Linux, and when you're running on embedded Linux, those things are like IoT devices, wind turbines, robots within warehouses. The main information that folks wanted to capture at the time was CPU, memory, thermal, kernel, all sorts of metrics from those, and those are actually the first plugins that Fluent Bit had; it didn't have a tail plugin, it couldn't read log files. So that was something we had way, way back then, and we kept seeing folks use it. But when we saw those log-based metrics being used last year, we realized, hey, those are not quite what folks need, and in this new day and age, the standard we see is Prometheus being used almost everywhere, with OpenMetrics as that compatibility layer. So we went in and made sure that Fluent Bit can speak the language of those metrics: counters, gauges, histograms, summaries. We created a library called CMetrics; it's in the project, you can go and look at it, and it makes all these things exportable. So all these metrics that we collect, we're able to export into various formats.

And what we've gone and done this year is said, you know, OpenTelemetry is coming in with a giant wave; let's make sure that Fluent Bit can also speak that and be compatible. The first step we took there was OpenTelemetry metrics: making sure that if you're using the OpenTelemetry metrics SDK and collecting those things, we'll be able to collect it and send it out. And we have a small demo that we can show here as well. So, what does this look like in actuality when you look at the Fluent ecosystem?
It's really made of all these different plugins, inputs and outputs. On the input side, we've added a Prometheus scraper, so you can scrape your custom metrics; if you're running as a sidecar, right next to your Kubernetes applications, you can go ahead and grab those metrics. We also mimic node exporter metrics. So, you know, we're not doing the full breadth of what the Prometheus node exporter team is doing; that's an awesome piece of software that does fantastic things, but there was a lot of commonality with what we had done with our plugins, and we wanted to make sure that we collect, you know, 80% of those and give you a good dashboard, if you're using things like Grafana or other tools to visualize on top. And then on the output side, what are the two main ways to get these metrics out? The Prometheus exporter, so if you're scraping those metrics, you can just plug into what already exists; and also Prometheus remote write. A lot of services these days, whether third-party services or projects like M3, Thanos, Cortex, and I'm sure some others, have the ability to accept remote writes directly.

And we are not being exclusive. Another big thing is that we didn't want to say you must only do remote write with scrape, or only do node exporter metrics with the exporter. You can choose, combine, send it to three or four different locations; use five exporters if you want, send it to seven places with Prometheus remote write. We are very flexible, in the same way that we are with logs and other data sources. What does this look like from a configuration standpoint?
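As a sketch of the kind of pipeline being described, the node-exporter-style collection with both output paths might look like this. Port 2021 comes from the talk; the remote-write host and URI are hypothetical:

```ini
[INPUT]
    # Host metrics in the style of the Prometheus node exporter
    name             node_exporter_metrics
    tag              node_metrics
    scrape_interval  2

[OUTPUT]
    # Expose the collected metrics on an endpoint Prometheus can scrape
    name             prometheus_exporter
    match            node_metrics
    host             0.0.0.0
    port             2021

[OUTPUT]
    # Optionally also push the same metrics to a remote-write backend
    name             prometheus_remote_write
    match            node_metrics
    host             metrics.example.com     # hypothetical backend
    port             443
    uri              /api/v1/write
    tls              on
```

Both outputs match the same tag, so the same metric stream is simultaneously scrapeable and pushed, illustrating the "choose, combine, send to several places" point above.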
So, metrics are also unique in Fluent Bit in that they don't go through the same pipelines that logs go through; metrics are almost treated as an independent type. You can see in this configuration example that, hey, we're collecting node exporter metrics, and then we're outputting them on port 2021. We could easily add a Prometheus remote write as another output if we'd like, add our TLS settings, our service, and whatever additional label tags we might want for our dashboard.

Yep, the Prometheus scraper: this one is actually brand new in Fluent Bit 1.9, which was released in March of this year. It allows you, just as you would with any scraping, to go ahead and grab something in the Prometheus format; in this case we're scraping HashiCorp Vault, and then we're using an output to Prometheus remote write with additional labels.

And with OpenTelemetry, yeah, similarly, we started off with HTTP: input and output, so you're able to ingest over HTTP and output over HTTP. We're looking at additional protocols there, as OpenTelemetry supports more than just HTTP. And then, of course, on top of that, there's all of the tracing side as part of the roadmap, which we'll cover here in a bit.

So what does this look like in actuality when we put it all together? We have a quick demo, and it's very, very simple. I have a JavaScript application, and I'm instrumenting it with the standard of the day, which is OpenTelemetry, all the great work that that team has done building those libraries. We send those metrics via the metrics SDK to Fluent Bit, and then we have Prometheus already in place, so it scrapes that via the Prometheus exporter, and then we visualize it with Grafana.

So let me go ahead and close this out and switch over; let's do a couple of things, I'm going to go tab by tab here. First, the application. This is my JavaScript application; you can see, let me increase the size here, I'm using the OTLP OpenTelemetry protocol metric exporter, and I'm sending to Fluent Bit on its 8080 port. Within Fluent Bit, I have an input reading from the HTTP OpenTelemetry side, and then I have an exporter, which is that Prometheus exporter I mentioned earlier. And if I go to the 2021 port, what does it look like raw? This is really all it is, just a simple counter from OpenTelemetry, a test up/down counter. Then Prometheus: we look at Prometheus and its scrape config, and it's grabbing from Fluent Bit on the 2021 port that we were exposing. And finally we come back to Grafana and visualize all of that. So it's a very simple pipeline, something that is very easily replicable, and something that we're going to keep adding more and more features to.

But of course, that's what we are here at this conference for, and we are interested in your feedback on what is needed, what we can add, and what we can make sure works really, really well. And especially when we look at some of the use cases that have made Fluent successful in the log space, with filtering, reducing data, enriching data: how can we bring some of those same ideas to how we scrape and modify these metrics from OpenTelemetry and OpenMetrics?

So with that, let me switch back to the slides here, and let's talk a little bit about the roadmap. The first big one is traces. When we look at logs and traces and metrics, because we treat them as independent data sources, one of the ideals is how we can start to make some really meaningful correlations or interactions between those pieces of data. And especially with Fluent Bit and what we've done with logs, we've had these notions of stream processing, so you can write SQL today on top of your logs. A common use case might be: I'm using NGINX access logs and I want to group by error codes with a SQL query, select star with a group by on the error code or response code, and then pump out a smaller amount of logs. Similarly, we are trying to bring some of that
logic over to metrics. And in order to do that, we need to introduce the same style of, hey, here's a brand new format for traces. Our idea is to bring that in Q4 of this year, and also to extend the current OpenTelemetry input and output, so instead of having to say OpenTelemetry metrics or OpenTelemetry traces, it all lives within that same configuration set you saw before.

Then the other biggest one, something that is always continuous, is performance improvements. When we look at Fluent Bit, and again at the success we've seen within containerized Kubernetes environments, it's really that high performance and low footprint, and being able to scale it. We have banks that run it on 100,000 servers; we have super small startups that run it on k3s, super small micro Kubernetes clusters. So we want to make sure that performance is maintained, and to do that we want better handling of how we treat all these events, because we're not just going to be collecting logs anymore: we'll collect metrics, we'll collect traces, and we want to make sure that all of that is async and has good, higher performance.

We're also going to be introducing more threading support. We added that on the output side in Fluent Bit, so you can add more threads; if you're, for example, sending terabytes of data, you can add the workers setting, which is now on by default in 1.9, and really pump that data through. And now we're finding the bottleneck coming on the tail side, when reading the data. If I'm reading a petabyte per day, how do I make sure that I can read from maybe 10,000 files, or, as container density increases, from the growing number of files on that node? So we're going to be introducing input plugins in a separate thread, and of course optimized core operations. If you're really interested in some of these performance improvements, we had a couple of talks at FluentCon, which are recorded and already on YouTube: a talk from AWS about improving the event loop, and also talks from, again, the folks on the security side who are doing some of the fuzzing and ensuring things are compatible and sound.

Developer experience. So again, we know Fluent Bit is written in C, and it might not be the most appetizing thing to go and try to conquer, but we do have Golang interfaces for the output side, and we're going to bring that to the input side as well. When we think of some of those use cases, especially in the context of metrics and traces, these can be things like SaaS data sources, things with APIs that have really great Go SDKs, and you might want to scrape that data, collect it, bring it in, do some correlations, and generate some Prometheus metrics, some OTel metrics, or whatnot from it. So yeah, that experience is coming in Q3. We'd love to talk to folks who are interested in writing more of those plugins; we are always interested in getting community feedback there.

So yeah, this last one: when we think of our ideals and how we keep building the project alongside the community, with Bruce Lee we think of "be like water": be Fluent, my friend, be fluid. So with that, I think we have quite a bit of time for questions, and with this super packed room we'd love to answer as many as we can.

Oh, okay. Thank you. Is that okay, mic? Hey, wait, I don't know. Thanks, yeah, do we have a mic?
Okay, first of all, applause for the speakers. Good job. Amazing, amazing. And yeah, we have a couple of questions, actually, virtual ones, so we'll try to mix them in. Maybe we can start with in-person questions; does anyone have a question? We already have one. Okay, I will try to bring you the mic.

So, thank you first. You mentioned traces, which sounds great. I guess my question is: have you thought about doing it the other way around, which is having Fluent Bit emit its own traces, so that we can follow the path of our logs? Kind of like knowing that when a log is emitted, it is sent somewhere, and then, with distributed traces, being able to know the full path until the log reaches the database, and how long that takes, basically.

Yeah, can you hear me, right? Yeah, okay. So, whether we handle Fluent Bit's own traces, its internal information; that's primarily the question. When we talk about traces: traces are hard in general, right? So the approach that we're starting with right now, because this is in development, is not to do trace processing or trace correlation in the agent, because we might need to do buffering, some kind of indexing, something more complex. So with traces, just to clarify, what we're going to do initially, as of now, is take the traces in a raw format and allow sending those raw traces to destinations and backends that have all the intelligence to process and correlate them, like Grafana, or some cloud services that accomplish that. On internal Fluent Bit traces, we have not made any decision, but this is really interesting; we might think about how to ship the internal events of Fluent Bit. Yeah, we are shipping the outside information, but we are not shipping the internal information.

So I'm smiling, because a few weeks ago, in one of our community meetings, we talked about this idea of being able to peer in to what a plugin is doing, what information is flowing through. So we started a discussion on it in GitHub.
So we'd love participation in there, to try to spec out what those requirements look like. We have some ideas, but of course there's a lot to go back and forth on around security; you don't want people just peering in over an HTTP port at all your logs if they're sensitive. So there's a lot of stuff there, but we'd love more community feedback on what we can do. Nice, nice. So, observability for observability. Yeah. Any other questions in this room?

Hi, first of all, thank you. You said that in Q4 Fluent Bit will get traces support, but I think right now we can use the forward plugin, the forward output, for sending traces from Kubernetes to maybe OpenTelemetry or other places. Would you suggest that? Will that be okay, or would you suggest we wait for Q4 and the legit traces support?

Yeah, good question. I will answer a little and then hand it to Eduardo. So the question is: hey, should we wait till Q4, or should we use some of the existing integrations that are already there? To clarify, some of those existing integrations are from Fluentd and Fluent Bit using the forward protocol, which is a protocol of MessagePack over TCP, to the OTel collector. And that's a great way; there are some good sessions on that, or recorded content, and there's a guide in the OTel documentation on how to do it. But this is actually different: this would be receiving the traces into Fluent Bit. So in Q4 you'll be able to receive those traces if you're using the Fluent forward protocol, via an app or via Fluentd or Fluent Bit, and you can already integrate with the OTel collector. Great, great, thank you.

Okay, any other? Maybe a last person from the room; we have other questions as well. Great, let's go.

Hi, the last time I checked the project, I remember there was no Yocto layer for your project, for Fluent Bit; I think there was only a recipe hanging around in the documentation somewhere. So is there a reason why you don't provide your own layer, or have you been thinking about it?
Yeah, as I mentioned briefly, Fluent Bit started as a project for embedded Linux, originally, right? And then we switched the focus to the cloud, because at that time in the embedded world there were no standards and there was no community around it, but the container space evolved quickly. Now, those BitBake files for Yocto are there for historical reasons, and we aim to support them, but we as maintainers are actually so busy with so many requirements that we can no longer maintain that recipe. It might be broken; we just try to update the version. But if you are using Yocto or BitBake files and you can fix it, please submit a PR, and I'm happy to get that processed through. There's no more reason than that. It's maintained, yeah.

Okay, let's take maybe a few virtual questions. So, first, about OTLP: how about adding OTLP support into Fluent Bit, in and out? Yeah, that's what we are doing. Actually, where's my camera? I guess it's very generic as well, because OTLP also means metrics, logs, and traces, and there's another question about logging. Okay, so OTLP for us is just another protocol that we integrate with. We integrate with syslog, forward, well, I don't know, we have MQTT, we have a bunch of protocols that we support. So for us, OTLP is another protocol that we're starting to support right now, initially with metrics, because we were experimenting with metrics; the next case is traces, but we're going to just route traces in, yeah. So we support that already on the input and on the output side. Yeah, we did all the protocol conversion into our CMetrics layer, and that is functional already. What about OTLP logging?
Yeah, we just heard that the OTLP log spec was just released, but there's not much we can add at the moment, and here's why. The way we handle the project and manage it as a community is that we try to optimize for what is the standard in the industry. For example, when we got started with metrics, the first question was: okay, what is the standard for metrics in the industry? And what is it? Prometheus. So we started supporting Prometheus. For example, if I ask what the standard for traces is: OpenTelemetry, right? Now, if I ask today what the standard for logging is, I think there is no standard, right? OTLP is heading in that direction, so as soon as it gets more traction in the community, with more use cases, we're going to start supporting all those layers.

Great, great answer. Essentially, it has to grow organically. Yeah. Thank you. Okay, last question, maybe from the room.

Hi, thanks for the talk. Did you think about having Fluent Bit as its own input source for metrics? For example, for counting logs and putting that into a metric?

Yeah, so Fluent Bit has its own way of monitoring today. You can expose its metrics over Prometheus, that's been there for two or three years, over a port, and it gives you input logs, input bytes, output bytes: great ways to analyze whether you're sending data, whether you're removing data, and how much data you're sending to particular endpoints, where you might not have an easy log file. That exists today. And we did introduce, a few months back, Fluent Bit metrics: it's its own plugin that takes those metrics, usually exposed over HTTP, and ingests them as part of the pipeline. So that does exist today; you could remote write those metrics, or export them on a custom port if you need to.

Okay, that's it for today, and thank you. We have a 20-minute break. Applause for the speakers again.