Hi everyone. I'm Sonia Kulcher and I'm a software engineer at Microsoft. I work on the Azure Container Upstream team, which focuses on a number of open source projects, and I specifically work on Open Service Mesh. Today I'm here to talk about how we added logging to Open Service Mesh using Fluent Bit, and how you can use it to create a really customized logging experience for whatever product or service you're working on. Real quick, I'm going to give you a little intro to what Open Service Mesh is, and then talk a little bit more about why we chose Fluent Bit as our solution. Then I'm going to dive deeper into how we actually integrated Fluent Bit into Open Service Mesh. Finally, I'm going to go into a three-part demo to show you just how much you can do by changing a few toggles in Fluent Bit, and hopefully that'll give you some insight into what you can do in your own project. Just to start off, what is Open Service Mesh, or OSM? Think service mesh, but open source. Open Service Mesh is written in Go, and it's a CNCF Sandbox project, which is something it has in common with Fluent Bit. It's also super lightweight and supports the SMI spec. Now, if you're not familiar with service meshes, what basically happens is that once you onboard a number of microservices onto your mesh, you can control ingress and egress traffic as well as access control between the apps in your service mesh. In our case, we use an Envoy-based control plane that injects an Envoy proxy into every pod to enforce policies and routing rules. Why did we choose Fluent Bit? Fluent Bit is a lightweight solution, and that really aligned with OSM's goal of keeping things lightweight. But more importantly, it's super customizable and pluggable, which means we're able to offer our users a pretty vendor-neutral approach to logging: there are so many input, filter, and output plugins available in Fluent Bit that users can really configure their own customized logging experience. There are also tags and matching, which allow users to control how data flows through the logging pipeline, meaning they can send logs to various outputs and change the nature of the logs in their pipeline. It's also important that Fluent Bit is cloud-native, which made integration very seamless in a Kubernetes environment. And then, most importantly, it's governed by the CNCF. As I mentioned before, Open Service Mesh is also a CNCF project, so we can offer our users a commitment that both projects are vendor-neutral and quality-gated and will uphold the same standards, and we can also rely on long-term supportability because of this governance. So those are some of the reasons we chose this solution. But now let's go deeper into what's going on under the hood: how did we plug Open Service Mesh and Fluent Bit together? Fluent Bit exists as a sidecar to the OSM controller, so it's basically added to the OSM controller deployment. In the pod spec, we've added the Fluent Bit container and also defined a number of volume mounts that grant the Fluent Bit container read access to the log files.
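To make that concrete, here's a rough sketch of what that sidecar wiring can look like. Treat it as illustrative only: the container names, image tags, and host paths are assumptions, not OSM's exact manifest.

```yaml
# Illustrative sketch of a Fluent Bit sidecar on a controller deployment.
# Names, images, and paths are assumptions, not the exact OSM manifest.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: osm-controller
spec:
  selector:
    matchLabels:
      app: osm-controller
  template:
    metadata:
      labels:
        app: osm-controller
    spec:
      containers:
        - name: osm-controller
          image: osm-controller:latest      # the main controller container
        - name: fluentbit-logger            # the logging sidecar in the same pod
          image: fluent/fluent-bit:latest
          volumeMounts:
            - name: var-log-containers      # read-only access to the pod's log files
              mountPath: /var/log/containers
              readOnly: true
            - name: fluentbit-config        # the config map we'll look at next
              mountPath: /fluent-bit/etc
      volumes:
        - name: var-log-containers
          hostPath:
            path: /var/log/containers
        - name: fluentbit-config
          configMap:
            name: fluentbit-configmap
```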
The bigger piece in this is really the Fluent Bit config map, and I would say that's where the magic happens, because users can change a lot here with really minimal onboarding. It's really easy to configure all of the different parts of the Fluent Bit config map without spending too much time on the configuration itself, so users can focus on figuring out which pipeline is best for them. So let's take a quick look at what the controller deployment actually looks like. As you can see here, this is just the piece of the OSM controller deployment that adds in the Fluent Bit container. The first line shows you that we have Fluent Bit as an optionally enabled sidecar, and the value that enables or disables Fluent Bit is tied to a CLI flag that we added, so the OSM CLI has control over whether Fluent Bit gets deployed or not. Next up, we have the environment variables. This section is really great because, firstly, it allows outbound proxy support. We were looking at a scenario where, if a customer is running Fluent Bit behind an outbound proxy server, they should still be able to send their logs wherever they want to send them. Fluent Bit makes that really easy to do: you just define these values in a secret and mount them as environment variables for the container to have access to. These HTTP and HTTPS proxy values can be set up with a URL and, optionally, a hostname, a username, and a password. The other piece of this is that if you want to set any other environment variables for your config map to have access to, you can set them right here and then reference them, and we'll see more about how those values are used when we take a look at the config map. The most important piece of this is the volume mounts, and this is where the Fluent Bit container actually gets access to the pod logs. A number of volume mounts are required because, depending on the cluster this is deployed on, there can be different symlinks to the log locations, so these are the volume mounts we have defined. And I haven't shown this, but the corresponding volumes for these volume mounts are also included in the controller deployment. So now, on to the section that I said holds all the magic: the Fluent Bit config map itself. Just to note, I am not going deep into the service and parser sections here. Those two sections aren't as customizable because they're pretty much set to how we want Fluent Bit to function in this situation, so I'm going to focus on the sections that users can really plug in and out. Firstly, there's the input section. The input section in our case is the tail plugin, which tails a custom log path in the Kubernetes pod. Here is also where you tag the logs with a certain tag, and that really defines what their journey is; I'll show you what that looks like in just a minute. Next up, there's the filter section. The filter section is really great because this is where you can decide what goes into your logs: you can add or remove keys and modify the logs that already exist. The best part is that you can stack filters together, which allows you to do a lot more with your logs than the information they naturally contain. You can also refine whether or not you want to forward a particular log at all.
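As a quick illustration of filter stacking, here's a hedged sketch of two filters applied to the same tagged stream: a modify filter that adds a key, followed by a grep filter that decides which records get forwarded. The tag, key, and pattern are just examples, not OSM's actual configuration.

```ini
# Illustrative only: two filters stacked on the same tag.
[FILTER]
    Name    modify
    Match   kube.*              # apply to everything the input tagged
    Add     environment dev     # example key/value pair

[FILTER]
    Name    grep
    Match   kube.*
    Exclude log debug           # drop records whose "log" field matches "debug"
```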
And then, again, you use matchers to match the tags you set in the input section. You can also use the tags to reroute where the logs go, which we will see in our demo. Lastly, there's the output section, and there are a lot of outputs. As you will see in a minute, this really speaks to how truly vendor-neutral Fluent Bit is, because there are just so many available output plugins, and that list is consistently growing. In the output section, you define one or more outputs that you want to send your logs to, and then use tag matching to route your logs to the different outputs available. So this is just a quick look at the Fluent Bit config map. I'm not going to spend too much time here, just because we will be looking at this for quite a while in the demo. The two things I do want to point out, however, are, first, that the path is picked up in the input section; this is where the log file actually resides, based on the volumes already mounted in the OSM deployment. Secondly, there's the environment variable piece: right here, you can see the modify filter adding a key-value pair with the environment variable that we set in the OSM deployment. This is absolutely optional, but I just wanted to show that it's possible to do.
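Those two points look roughly like this in the config map. The path, tag, and environment variable name here are illustrative assumptions, not OSM's exact values.

```ini
# Illustrative sketch of the two points above.
[INPUT]
    Name    tail
    Tag     kube.*
    # The path points at the log file exposed through the volume mounts
    # in the controller deployment; the exact location is cluster-dependent.
    Path    /var/log/containers/osm-controller-*.log

[FILTER]
    Name    modify
    Match   kube.*
    # CONTROLLER_POD_NAME is a hypothetical variable set in the deployment's
    # env section; Fluent Bit expands ${...} from the container environment.
    Add     controller_pod ${CONTROLLER_POD_NAME}
```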
So now let's get into the demo. Just to go over what this demo is going to show you, there are three pieces to it. Firstly, we're going to take a look at a very basic pipeline: a simple input, which is just the tail input; one filter that adds a single key-value pair, which is the modify filter; and finally, we're going to output our logs to the Datadog plugin, and I have already set up a Datadog instance to send these logs to. I just want to stress that this is not the only route possible; it's one of many possibilities, but it does give you a basic demonstration of what you could do with your data pipeline. After we've done that, we're going to add a little spice to that pipeline: we'll add an additional filter that adds a bunch of Kubernetes metadata and also sends the same logs to a second output. And then lastly, we're going to make things even more fun: we'll reroute a certain portion of the logs, based on their tags, to a different output plugin. That way you get to see two examples of a data endpoint versus one where you can send messages, and Slack is a good example of that. So I'm just going to get started with the demo, and we'll make things more complex as we go. Before we dive in, I just wanted to take a quick look at the Fluent Bit documentation. Right here, you can see the data pipeline that shows you all of the available inputs, filters, and outputs. As you can see, there are a lot of available inputs, but we're going to be focusing on tail because it's the most intuitive and makes the most sense in a Kubernetes environment; feel free to explore any of the others based on your use case. Next up, there are the filters. It looks like there are fewer filters than inputs, but these filters are very, very powerful, and each of them has a lot of functionality within it. And then there are, like I said, a lot of outputs, which really shows how vendor-agnostic Fluent Bit is; we're going to look at three different output plugins in this demo. So let's get started with the Datadog pipeline that I showed you in the slide. Like I said, we're going to use the tail input, and we're going to add one key-value pair: the type. Imagine our type is test data versus production data, so it just delineates what kind of logs these are. Let's take a look at our Fluent Bit config map and keep the documentation up with us at the same time. Like I said, there is the service section. All I'm going to point out here is that we've turned the daemon off. One alternative, if you wanted to set up Fluent Bit in a Kubernetes environment, is to use a DaemonSet; we decided that it made more sense in our use case to tail logs from just our one controller pod, but you can actually get logs from every single pod if that's what you need. The next thing I'm going to point out is that there is a tag being set here. This kube.* tag is used with a matcher in all of the other sections, and it effectively expands to include the entire path that the log is actually coming from. The path referenced is based on that volume mount. Now let's take a look at this modify filter. There are quite a lot of things you can do with the modify filter, but we're just going to add one key-value pair: the key is type, and the value is test logs. Next, let's move on to the Datadog output. There are, again, a lot of defaults already defined that we can continue to use, but there are three values of interest that we need to set for our instance. One is the API key; that's something you can get once you set up Datadog, and it's automatically set up for you when you enable logs. Then there are the Datadog service and the Datadog source. These are just human-readable values, extra keys added on to tell you in your output where these logs are coming from. Something I'm going to do behind the scenes is replace this placeholder with my real API key; I obviously don't want my credentials out there, but you should know that's happening in the background. Now let's go check out our Datadog instance. As you can see, there are no logs there yet, so we can get started with OSM: let's install OSM and enable Fluent Bit with the config map I just showed you. Now that OSM is installed, let's see what's going on back in our Datadog log viewer. Right here, you can see that all of the logs started flowing in pretty quickly, with all of the keys and values we defined, including the original values like the message and the log level, as well as the file the logs came from. All of that is visible here, and all of the extra things we added, like the source, the service, and the type key, have been added as well. Those really make it easier to query your data once it's in the log output.
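Putting part one together, the whole pipeline looks roughly like this. Again, treat it as a sketch: the path and tag are the illustrative values from earlier, and the credentials are placeholders.

```ini
# Illustrative end-to-end sketch of the part-one pipeline.
[SERVICE]
    Daemon     Off               # run in the foreground inside the sidecar

[INPUT]
    Name       tail
    Tag        kube.*
    Path       /var/log/containers/osm-controller-*.log

[FILTER]
    Name       modify
    Match      kube.*
    Add        type test-logs    # delineates test vs. production data

[OUTPUT]
    Name       datadog
    Match      kube.*
    Host       http-intake.logs.datadoghq.com
    TLS        on
    apikey     <YOUR_DATADOG_API_KEY>   # placeholder; set from your Datadog account
    dd_service osm-controller           # human-readable extras for querying
    dd_source  osm
```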
So now let's get back to our pipeline. What we saw a second ago were pretty simple logs, and what we're saying now is that we want something more robust. We want to add Kubernetes metadata to our logs to learn more about the pod these logs are coming from. Secondly, we want to use a second output plugin, the Log Analytics output plugin, just to show you that the same data can be sent to different places. Right now we're not rerouting anything; we're just using a pretty straightforward pipeline with a second, stacked filter. Coming back to our config map, you can take a look at the Kubernetes filter documentation. It's actually pretty detailed, and we can rely on the default values defined there. There's a lot more to learn from that documentation, including exactly how this filter is able to pick up the information it needs from the logs, but we're just going to use the defaults for now. Secondly, there's the Azure Log Analytics output. This one is actually pretty simple: it only requires four values to be defined by you, and two of those are values you just pick up from the Azure portal. When you take a look at your agents management page in the Azure portal, you should be able to get your customer ID and shared key. So right here, I'm ensuring that my matcher matches the tag I set in the input, and then I'm going to add in the customer ID and shared key; these correspond to the workspace ID and primary key in your Log Analytics workspace. Again, I'm going to make those changes offline because I don't want my credentials on the internet, but definitely make sure to get those values and add them in here. Now that we have that set, let's take a quick look at our Log Analytics instance. Right now there's no data appearing there; it's pretty empty. So we're going to go apply the config map we just created, and that should have the data flowing pretty quickly. I'm going to apply the config map, and the other thing I'm going to do is delete the existing OSM controller pod so that it restarts and we can be sure it picks up all of the changes in the config map. This isn't strictly necessary, but it's a good safeguard in this situation. And as you can see, the logs in Datadog are now a lot more robust: they have all of this Kubernetes information added on as annotations, and you can learn a lot more about the environment these logs are coming from. These logs just got a lot more useful, and as you can see, the original key-value pair we added with the type is still part of them. Let's also go check out Log Analytics and run that basic query again. As you can see, all of the logs have shown up, and all of the Kubernetes metadata we saw in Datadog is present here as well. Cool. Now that we have that going, I want to envision another scenario. Looking back at Datadog, you can see there are some errors and emergencies and warnings, some different kinds of logs. What we want to be able to do is say: if we have error logs, we want to alert someone, whether it's yourself or somebody on your team; you want somebody to know something's wrong. So let's branch out our existing pipeline. Where normally the kube.* tag would carry the logs straight through, we're going to use a rewrite tag filter that allows us to check for a certain condition. The condition we're going to check for is the word error, and we're going to retag those logs with a pager tag and send them to Slack. I just want to note that this doesn't stop those logs from being sent to Log Analytics and Datadog; they simply get sent to three places instead of two. Coming to the rewrite tag filter, the really important piece here is the rule, and the rule has four parts. First, there's the key, so it looks for a key; then it checks for a regular expression as the value; it then rewrites the tag to the new tag; and finally, it asks whether you want to keep the original record and send it to the original outputs, or just discard it. We're going to stick with keeping that data so that we're not getting rid of anything that could be useful in our other databases. So let's quickly add that rewrite tag section right here. It's a new filter, and it's fairly basic: all we're going to do is add the name, still match the same kube.* tag, and re-emit these logs with the new tag. Now let's take a look at the Slack output. The Slack output plugin requires you to set up an incoming webhook for your Slack channel, which I already did; it took me three minutes because of great documentation, we love that. All I need to do is add these three lines and I'll be good to go. So I'm going to add the output and also set the match, and we need to make sure that we're now matching the new tag, which is the pager tag we just set. And then, again, behind the scenes, I'm going to change this webhook to the one I already created for the Slack channel I have set up.
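Taken together, the pieces we've layered on top of the part-one pipeline in parts two and three look roughly like this. As before, this is a sketch: the tag, credentials, and webhook are placeholders, and the error pattern is just an example.

```ini
# Illustrative sketch of the stanzas added on top of the part-one pipeline.
[FILTER]
    Name        kubernetes           # part two: enrich records with pod metadata
    Match       kube.*

[OUTPUT]
    Name        azure                # part two: second output, Azure Log Analytics
    Match       kube.*
    Customer_ID <WORKSPACE_ID>       # from the Azure portal (agents management)
    Shared_Key  <PRIMARY_KEY>        # placeholder; keep this out of source control

[FILTER]
    Name        rewrite_tag          # part three: branch matching logs to a new tag
    Match       kube.*
    # Rule format: $key  regex  new-tag  keep-original
    Rule        $log .*error.* pager true

[OUTPUT]
    Name        slack                # part three: alert a human via an incoming webhook
    Match       pager
    webhook     <YOUR_SLACK_WEBHOOK_URL>
```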
Let's go through the same steps again: apply the config map and then delete the pod. So now let's go back to Datadog. Oh, actually, let's check out Slack first. We can already see some of the logs coming into Slack, and you can see that they're fatal or contain the word error. We'll come back to those in just a second, but let's check out what's happening in our other outputs. We still have output coming into Log Analytics, but how do we know it's the correct output? I'm just going to quickly check using the metadata we already added. As you can see, this pod identifier matches the pod identifier in Log Analytics. And let's take a look at Datadog: you can see now that the metadata is pretty useful for making sure you're getting logs from the right place. Our Slack channel is now pretty full; we have a lot of logs flowing in, and probably somebody getting really disturbed if they're the person who has to deal with all of those alerts. So coming back to this pipeline: as you can see, we were able to stack filters, send the logs to a variety of outputs, and change the route of the logs along the way. Looking ahead, I hope you were able to gain something from watching this talk today. There are so many output plugins constantly being added, and the rapid development cycle of Fluent Bit has made sure that the new features we've needed along the way have been consistently added. A great way to engage with what's going on with Fluent Bit is to join their Slack, and if you want to know more about what's going on with OSM, we have a monthly community call that you can definitely join. I've also included links to the OSM and Fluent Bit docs so that you can take a look at some of the code I referenced today. I just wanted to say thank you for watching today, and definitely feel free to reach out; my Twitter handle is included at the bottom of these slides. This has been fun.