Okay, we're going to get started. Taylor, could you please turn on the recording? Ready to go. I'd like to thank everyone who's joining today. Welcome to today's CNCF webinar, "Using Machine Learning for Autonomous Log Monitoring." I'm Sanjeev Rampal, a principal engineer at Cisco and a CNCF ambassador, and I'll be moderating today's webinar. We'd like to welcome our presenters: Larry Lancaster, the founder of Zebrium, and Gavin Cohen, Vice President of Marketing at Zebrium. A few housekeeping items before we get started. During the webinar, attendees are not able to talk. There is a Q&A box at the bottom of your screen; note that this is different from the chat window. Please feel free to drop your questions in there and we'll get to as many as we can at the end. This is an official webinar of the CNCF and as such is subject to the CNCF Code of Conduct, so please don't add anything to the chat or questions that would violate it, and please be respectful of your fellow participants and presenters. With that, I'm going to hand it over to Larry to kick off today's presentation.

Thanks so much. Hey everybody. This is a really exciting opportunity for me to talk about something whose time I think has come, and that's autonomous log monitoring. So let's get right into it. Machine data is my life. I've spent most of my career taking telemetry from products in the field and turning it into tools, business intelligence, and deliverables back to end users. What I'm going to talk about today is motivated by that history: having built platforms on top of machine data so many times, I got to the point where I felt this should mostly just happen automatically. After doing it so many times, you get sick of doing it.

So let me talk a little about what motivated Zebrium. For me it started with looking at the state of log monitoring, which leaves a few things to be desired. The biggest problem from my perspective is twofold. One is the mess of the data itself: log data isn't formally structured with a grammar, so it's hard to build on top of it. You end up always maintaining things, and you have to be aware of the semantics of the data you're working with, so it's a very manual process. And when I look at what comes out of that in terms of monitoring solutions, I see that it often ends up being slow to figure out what's happening when there is an incident, because you still have to go digging through logs yourself anyway. That's annoying and suboptimal. The fragile part I touched on, but I'm not sure how many of you have ever had to build scripts or regexes or parsers or tools on top of log data.
What will happen oftentimes is you'll have something that's working, and then a developer does a really nice thing. That developer may be somewhere else in another ecosystem, because you have a deep stack, or they may be close to you but not owe you a notification, and they'll do something nice like fixing a spelling mistake. The next thing you know, stuff's breaking. There's all this kind of thing that makes dealing with logs, and using them, a manual process. And finally, alert fatigue is another problem, and I think it stems from how little visibility the tools have into the actual semantics of what the data is talking about. Typically, people set up alerts like, "if I get more than 1,000 errors an hour, page me." I was talking to one guy who said, essentially, "if it's noon and I've gotten more than 5,000 alerts, I know something's wrong; if it's less than 1,000, I know everything's okay." That's the state of things, and it annoys me.

And the reason is that, at least for me, logs have always been used for root cause. It's almost always the case that you're going to go to the logs at some point; in fact, if it's an incident of a new kind, the odds are very good you'll end up in a log file. Given that they clearly have the information we're looking for, why aren't logs better at helping us monitor things in the first place? I think it boils down to this manual quality. When I say logs are stuck in "index and search," what I mean is that there's a person doing the searching, and there's a problem with that. It hasn't always been a problem, especially when it's your own app and you understand things. You're intimately familiar with the stack, you can quickly search through things and figure out problems.

So let's look at that model. Twenty years ago, when I started in the valley, there was a shrink-wrap model. Of course it wasn't always delivered in a box, but you get the idea. You had an incident, and there was a user or a customer, or maybe a few users at one customer, that had hit a bug, and there was a support department. The world was completely different back then. You'd have an incident on a monolithic application; there might be up to ten log files, and ten would be pushing it, that people would need to be familiar with and look through. You would have them indexed, you would search through them, you would figure things out, and that was great. But things have changed. Nowadays an incident can affect tens of thousands of users. You've got dozens of services at least, and potentially a thousand log streams if you count all the different containers and services you're running. It's kind of a zoo, and yet, with the pressure of 10,000 times as many people annoyed at you, you're still stuck indexing and searching through the data to try to figure out what's going on. To me this is unacceptable, and I believe the future doesn't have to be like this.
I feel like we've gotten to the point, at least to a large degree, where we can let systems help us pinpoint the root cause indicators in the vast swaths of log data, without us having to go through the same process we've been doing for decades. Think forward 20 years: do we really want to still be doing this by hand? My answer is no, so let's get started on that. What I wanted from a tool is something that will characterize incidents before I notice them. That's a really ambitious statement, so let's talk about what it means and what it takes. What it means is automatically detecting incidents as they're happening, when things start getting weird. For example, if you hired someone new into a DevOps role and they're checking out the system, watching the alerts and looking around, and they start seeing things go haywire, they'll typically have a good sense that something is going wrong, even if they don't yet know what. So let's start with that: let's automatically detect incidents without a whole bunch of alerts and rules being configured. And then let's go find the things within those logs that are germane to root cause indication. What's interesting from this perspective is that, as we've accumulated probably 110 or 120 real-world incidents across over 30 stacks that people have been generous enough to share with us, we've found there are some fundamental ways that software behaves when it's breaking that let a system go in and find root cause clues for you.

I talked a little about why this is so hard from the perspective of monitoring software. Essentially, you've got ambiguous parses, you've got formats changing, and you need experts to interpret the data, because apps are bespoke. What do I mean by all that? If you go look at a typical log file in /var/log on your Linux machine, you'll see there can be various different formats; some of those entries may also be repeated in syslog, some may not, depending on your configuration. What you'll find is that the connectors and pre-built parsers that tools give you are mostly concerned with the prefix, the stuff to the left of the log line. Sometimes you'll have PIDs or function names, you'll have timestamps, you'll have severities; some of these prefixes are long and complicated, sometimes they're short and simple. That's all very important information, and we need to grab it. But everything to the right also needs to be structured if you want a machine to be able to tell you, "this number in here is going up and it isn't usually this high," or "this very particular kind of event is happening a lot more often." You need to programmatically get at the semantics of that data, and that's what's been hard about it.
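To make that concrete, here is a minimal sketch of the kind of structuring being described: peel off the prefix fields, then type the variables to the right of them and use the masked-out body as the event type. The line format, regex, and field names below are invented for illustration; this is not Zebrium's parser, just a toy in Python.

```python
import re
from datetime import datetime

# Toy example line, loosely modeled on a Postgres checkpoint message.
LINE = "2020-04-17 09:12:03 INFO postgres: checkpoint complete: wrote 341 buffers in 2.95 s"

# One possible prefix pattern: timestamp, severity, source, then free-text body.
PREFIX_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<severity>[A-Z]+)\s+"
    r"(?P<source>\S+):\s+"
    r"(?P<body>.*)$"
)

NUMBER_RE = re.compile(r"\d+\.\d+|\d+")

def structure(line):
    """Split a log line into prefix fields plus a typed view of its body."""
    m = PREFIX_RE.match(line)
    if not m:
        return None
    # Everything numeric in the body becomes a typed parameter (float vs int),
    # and the body with numbers masked out serves as the event type.
    params = [float(t) if "." in t else int(t) for t in NUMBER_RE.findall(m["body"])]
    return {
        "ts": datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S"),
        "severity": m["severity"],
        "source": m["source"],
        "event_type": NUMBER_RE.sub("<*>", m["body"]),
        "params": params,
    }

print(structure(LINE))
# -> prefix fields plus event_type 'checkpoint complete: wrote <*> buffers in <*> s'
#    and params [341, 2.95]
```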
So the first thing we do is structure the logs to a relational level, such that if I wanted to run queries about what was going on in my logs, I could. Now, we don't make you do that; we have a database on the back end that holds that information, and customers can get access to it, but generally we build things on top of it. The point is, imagine in your mind a table that gets created from a certain kind of log line. Here's a very simple example of a specific kind of log: you can see that over time there are some numbers changing, and it's fairly clear to the eye what those should be called. What we want is software that creates a table, at least logically, just for those kinds of events, with typed columns for each of those variables. Based on that, you can start to do much more interesting anomaly detection.

From the user's perspective, I think what's most important is that you shouldn't have to give a whole bunch of prior information to the system. You shouldn't have to sit there and configure it all day, you shouldn't have to get something right or wrong, and you shouldn't have to go check whether somebody has a connector for X, because for your own application logs, at least, nobody is going to have a connector. So you want a system that can just come in and grok the grammar like a person would, and then iterate on its understanding of that grammar in the background without you having to worry about it. That way you can embrace free-text logs. Structured logging is cool, and we can certainly handle it, but I think it's annoying, at least from a developer perspective, because structured logs are really hard for a human to read. There's no reason we should have to retranslate our entire infrastructure for machines; why can't they translate it for themselves? A big chunk of your stack is always going to be free-text logs, and we believe that's important and valid.

Once you've done that, you can do anomaly detection on that data. Like I mentioned earlier, once the data is structured properly, we have ways of looking at it that have yielded amazing results in terms of the things that just tend to happen when software breaks. People always ask what kind of models we use. Without getting too much into it: in this part of our software stack we're not doing any deep learning, because this is all real time. We ingest logs inline, we structure them inline, and everything just gets better in the background; there's no batch processing, so it's not going to cost you an arm and a leg to run the service. The way we're actually able to pull these incidents out is by looking at point-process statistics on the event types. In a given log file there may be a thousand unique event types, so those tables I mentioned, there may be a thousand of them, virtually speaking: the kinds of things that can happen that are expressed in the logs.
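As a rough illustration of that "one logical table per event type" idea, here is a toy sketch: each event type accumulates rows of typed parameter values, which is already enough to ask whether a new value is unusual for its column. The event type string, column index, and thresholds are invented, and a real system would be much more careful about what counts as enough history.

```python
from collections import defaultdict
import statistics

# One "logical table" per event type: rows of typed parameter values.
tables = defaultdict(list)

def ingest(event_type, params):
    tables[event_type].append(params)

def value_is_unusual(event_type, column, value, z=3.0, min_rows=20):
    """Flag a value far outside this column's history for this event type."""
    history = [row[column] for row in tables[event_type] if len(row) > column]
    if len(history) < min_rows:          # not enough evidence yet
        return False
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history)
    return stdev > 0 and abs(value - mean) > z * stdev

# Usage: after many ordinary checkpoint events, a huge buffer count stands out.
etype = "checkpoint complete: wrote <*> buffers in <*> s"
for i in range(100):
    ingest(etype, (300 + i % 10, 2.9))
print(value_is_unusual(etype, 0, 25_000))   # True
print(value_is_unusual(etype, 0, 303))      # False
```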
If you think of each of those event types as a point process, something that happens with a particular frequency, with particular values, and in relation to other kinds of events, and you build up that matrix, you can very quickly narrow down the coincidences that are anomalous. That's the approach that has yielded amazing results for us. Again, beyond the structuring and the anomaly detection, we don't want you to have to tell us much about application behavior. In our UI you can of course set up alerts, even very complex ones if you like: this event has to happen, and then that kind of event with the same pod name, and this parameter has to be higher than that within the event, and there have to be three of them. You can do all of that, but you shouldn't have to do it to get anomaly detection working, to get autonomous monitoring on top of your stack. And the great thing about taking this generic approach is that it works on anybody's app.

I said I'd talk a little about other people approaching the same space, and I think it's a fascinating space; there are a lot of brilliant people out there now looking at ways to structure this kind of data. There's what I'd call a community of academics that looks at it one way, and there are folks in industry who look at it other ways. The deep learning one was interesting for me because I have a story. There's a large tech company, you'd all know the name, and I went and spoke with the CTO of their services division. He said, "this is a problem for us: we've got all these logs from all these different products, we gather them together, and we decided we wanted to go do some learning on that data and try to understand what's happening in customer environments. So we went out and bought all of our senior engineers DGX-1 workstations, sent them for training on deep learning, and set them loose on it. After six months we abandoned it, because we found they were spending all their time structuring the data rather than actually building models on top of it." So it's been my learning that if you structure the data right, you have a lot of options, and you don't have to jump to the most expensive, trendy one right away; you can try other methods as well. So we take a kind of Swiss Army knife machine-learning approach. And you might find this interesting about log data: if I take a terabyte of some application stack's log files out of some environment and look at it, typically about half of the event types I see in that corpus I'll only see once or twice.
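Backing up to the point-process framing at the start of this passage, here is the simplest possible slice of it: treat each event type as its own occurrence stream, learn a typical per-window rate, and flag windows where a type fires far more often than usual. This is purely illustrative; the real approach also looks at cross-correlations between streams, value distributions, and more, and the window size and factor below are invented.

```python
from collections import Counter, defaultdict

WINDOW = 60  # seconds per time bucket

def bucket_counts(events):
    """events: iterable of (timestamp_seconds, event_type) -> counts per type and bucket."""
    counts = defaultdict(Counter)
    for ts, etype in events:
        counts[etype][int(ts // WINDOW)] += 1
    return counts

def rate_anomalies(counts, factor=5.0, min_buckets=10):
    """Flag (event_type, bucket, count) where the count is far above that type's typical rate."""
    anomalies = []
    for etype, per_bucket in counts.items():
        if len(per_bucket) < min_buckets:      # too rarely seen to have a baseline
            continue
        # Median count over the buckets where the type appeared at all.
        typical = sorted(per_bucket.values())[len(per_bucket) // 2]
        for bucket, n in per_bucket.items():
            if n > factor * typical:
                anomalies.append((etype, bucket, n))
    return anomalies

# Usage sketch: a type that normally fires once a minute suddenly fires 40 times.
events = [(60 * i, "db connection pool exhausted") for i in range(30)]
events += [(1800 + s, "db connection pool exhausted") for s in range(40)]
print(rate_anomalies(bucket_counts(events)))
# -> [('db connection pool exhausted', 30, 40)]
```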
What does that statistic tell you? It tells you that if I have this vision in my head where I'm going to onboard someone, start looking at their data, and suddenly have this massive learned corpus that tells me exactly what each event type looks like, with all of its permutations and distributions, I've got another thing coming. It doesn't work that way. So what you need instead, and what we have, is roughly a four-stage pipeline, where depending on how many times we've seen an event type, a different stage of that pipeline has the primary effect. There's a layer where, if I've only seen an event type once and there are numbers in it, I'm going to assume those are parameters until proven otherwise. The next step is basically reachability clustering: these lines look alike, so I'm going to assume they're the same kind of thing until proven otherwise. Once I've got a few examples, a naive Bayes classifier kicks in with a global fitness function that basically says, okay, here's the blob of stuff that's related to each other, and here are the columns, and now we become really sure of it. On the back end we shuffle things around among these buckets, and eventually it hardens into a structure. The great thing about doing it this way is that you start out with something that works, but it gets better the more data you feed it. And the great thing about using point-process statistics to cross-correlate among event streams is that the anomaly detection gets better the more complexity you have, the more cross-correlated streams there are.

With that, I'm going to hand it over to my colleague Gavin, who is an absolute whiz with the demo and can do it in a time-efficient manner. Then we'll come back and answer some questions. So I'm going to go ahead and stop sharing.
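Before the demo, a quick illustrative aside on the "lines that look alike" stage Larry just described. The greedy token-overlap clustering below is a crude stand-in for the real reachability clustering and the naive Bayes stage that follows it; the similarity threshold and merging rule are invented for illustration.

```python
def merge(template, tokens):
    """Keep tokens that agree; wildcard the positions that differ."""
    return [a if a == b else "<*>" for a, b in zip(template, tokens)]

def similarity(template, tokens):
    same = sum(1 for a, b in zip(template, tokens) if a == b)
    return same / max(len(template), 1)

def cluster(lines, threshold=0.6):
    """Greedy clustering: fold each line into the first template it resembles."""
    templates = []          # list of token lists
    for line in lines:
        tokens = line.split()
        for i, tpl in enumerate(templates):
            if len(tpl) == len(tokens) and similarity(tpl, tokens) >= threshold:
                templates[i] = merge(tpl, tokens)
                break
        else:
            templates.append(tokens)
    return [" ".join(t) for t in templates]

print(cluster([
    "connection from 10.0.0.12 port 51762 accepted",
    "connection from 10.0.0.97 port 40211 accepted",
    "database system was shut down at 2020-04-17 04:17:02",
]))
# -> ['connection from <*> port <*> accepted',
#     'database system was shut down at 2020-04-17 04:17:02']
```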
Okay, just making sure everyone can see my screen. This is Gavin Cohen. Assuming everyone can see this, what you're looking at is the overview screen that appears immediately after logging in. To set some context for what I'm about to demonstrate: we've deliberately ingested just a tiny data set, 24 MB, about 180,000 events, and there was absolutely zero manual configuration. No one built rules, there was no pre-learning, and there was zero knowledge of this data set until it came in. Our ML simply went through the events. What we surfaced at the overview level is a bunch of exceptions and events with error or high severity, but the really interesting stuff is here: there were 155 anomalous events, events that broke pattern compared to what we would expect based on that small data set, and then one incident. The incidents are the things we really care about. These are the things that bubble up as correlated sets of anomalies that we believe are not happening by chance; they're happening because something is changing in the behavior of the software. So I can click on the incident, and you're taken to a root cause description. It calls it "postgres stopped," which seems reasonable, and that title comes directly out of one of the events.

If I click on it, you get the detail of what we found, so let me explain this a little. The data was collected from a set of Kubernetes pods running the Atlassian stack in AWS. What we found is that in this pod, postgres-master, we see a couple of messages saying that Postgres stopped, and then in a different pod we see a correlated set of events coming from Jira, a different application, in particular this one saying it's having trouble connecting to the Postgres database. Clearly a related set of events. So out of the 180,000 events we ingested, we detected just five that encompassed this incident, and it turns out this was exactly the problem that occurred in this situation: something shut down Postgres, and immediately afterwards the rest of the applications running in different pods started noticing. So that's what we've detected.

Now, if you take this a little further, we make it really easy to confirm the diagnosis we've found and to troubleshoot more or find out what really happened. I click on "browse the incident," and I'm taken into an interesting view, almost a log manager view, but filtered at the moment. What you're seeing at the top of the screen is a set of visualizations of the data set; the entire space of this data set is 180,000 events, and we break out some visualizations that I'll come back to in a minute. Because I clicked on the incident, I'm filtered to just those five events that make up the incident. We can see them all together, much like they would look if you grepped those lines out of the different log files, in this case two of them: syslog and the Jira log. From here I can do two one-click workflows. The first is "peek," which lets us look at the surrounding events, that is, the event in the context of its source log stream. It's the particular syslog stream that came from the pod where this event originated, and I can see all the other events around it; maybe that yields more information about the incident. I can also go back to where I was and do something different, which is to unfilter: I turn off the incident filter, stay in the same place, but now see those incident events surrounded by all the other events from all the other pieces of the environment. In this case you see log entries from Confluence, and if you scroll around you'd see all the other Atlassian components emitting messages, but now you can see the incident in the context of everything. This is another mechanism to quickly understand and troubleshoot what occurred around the incident itself.

Now, maybe just to peek under the covers for a moment, there's a pretty cool visualization of the data I've just brought up here that we call an x-ray. The x-axis is time, the y-axis is the event space spanning all the different log sources, and we draw a colored rectangle everywhere we find an anomaly, placed on the y-axis according to where, or from which particular events, it came. The brighter the little rectangle, the more anomalous the event.
The most anomalous are these very bright white colors you see just above my mouse pointer. Now, what you see here is very typical: there are always anomalies, there are always going to be events that break pattern, and you don't want to alert on those or create incidents from them, because there's really not enough context to say whether they are or aren't a problem. Sometimes they're problematic; sometimes they're just events that break pattern, or new events that occur for whatever reason. But if you go to this section here, which is where we found the incident, you see this really tight band of correlated anomalies. That's our trigger: when we see sets of very high-likelihood anomalies, the very bright rectangles, correlated across different parts of the application, or different log files, log types, or log streams, that's our indication that there's an incident. All of these things broke pattern, they're all anomalous, they're tightly correlated, and there are some very high-probability anomalies in here. That's how we picked these five events out of 180,000 to make up this incident: we saw this band, we pulled out what was most anomalous, and that became the incident. The root cause is then identified roughly by the leading edge of that band, the first anomalous event that seemed to trigger everything else. If you remember, in the incident we showed you the root cause, Postgres stopped, and then the symptoms, which in this case were Jira noticing it couldn't talk to the SQL database. And actually, if you go further and look at some of the yellow anomalies around it, you'll find they relate to some of the other applications that also started having problems once Postgres stopped. So that gives you a good sense of what's happening under the covers.
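A toy rendering of what Gavin is describing, grouping scored anomalies into time windows, keeping only windows that span multiple streams and contain a very strong anomaly, and reporting the earliest event in the band as the root cause hint, might look like the sketch below. The window size, thresholds, and example events are invented; it is only meant to convey the shape of the idea, not the actual detection logic.

```python
from collections import defaultdict

# Each anomaly: (timestamp_seconds, log_stream, score, message)
WINDOW = 120          # how tightly anomalies must cluster in time
MIN_STREAMS = 2       # an incident needs anomalies from more than one stream
MIN_PEAK = 0.9        # and at least one very high-probability anomaly

def find_incidents(anomalies):
    """Group anomalies into time windows; keep windows that span streams and
    contain a strong anomaly; report the earliest event as the root-cause hint."""
    by_window = defaultdict(list)
    for a in anomalies:
        by_window[int(a[0] // WINDOW)].append(a)

    incidents = []
    for window, group in sorted(by_window.items()):
        streams = {a[1] for a in group}
        peak = max(a[2] for a in group)
        if len(streams) >= MIN_STREAMS and peak >= MIN_PEAK:
            group.sort(key=lambda a: a[0])
            incidents.append({
                "root_cause_hint": group[0][3],   # leading edge of the band
                "symptoms": [a[3] for a in group[1:]],
            })
    return incidents

print(find_incidents([
    (1000.0, "syslog/postgres-master", 0.97, "postgres stopped"),
    (1030.0, "jira/app", 0.92, "cannot connect to database"),
    (1060.0, "confluence/app", 0.70, "request timed out"),
]))
# -> one incident: root_cause_hint 'postgres stopped', the other two as symptoms
```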
The very last thing I'll show you speaks to what we did to get to this point; Larry spent a bit of his presentation talking about how we structure the data, and there are a few interesting things you can see here. If you look at this log line as an example, wherever there's a blue piece of text, we discerned that it was a variable part of the event; all of these blue pieces are variables. In particular, this is the Postgres "stopped" message, and the word "stopped" is itself a variable, meaning we've seen an event of this type somewhere else with a different value in that position. So I can chart it, and what you see nicely here is that over here, at 4:17, we get the "stopped" for Postgres, and then two minutes later we get a "starting" and a "started": the same event type, categorized by the machine learning, with different values for that variable. To get this, I didn't have to parse anything or tell our tool what to look for; it automatically picked that out as a variable because it saw a bunch of similar events with the same structure. That's one idea. Obviously, to aid in what we're doing here, you also need to be able to do things like search. As an example, I can do full regex searches; I'll search here for the text "milliseconds." Sorry, I mistyped something there. What I'm doing is getting taken to an event at the point where there's a match for the text in my search bar. Once again, you can see an example of the event it found in the search; the blue is the variable text, and this time you see there are a couple of metrics it has identified as variables. I can pick one of them, display a chart, and get an interesting plot of that value across all the events of that type and how it changes. I might want to look at this one that looks like an outlier, click on it, get taken there, and look around. This is all about being able to learn the structure of the underlying events and then pull out this data without having to manually build any parsing rules. There's a whole lot more, but I'm going to stop here and hand it back to Larry. Thank you.

Awesome, thanks Gavin. Let me go ahead and share my screen again and wrap this up. So let's talk a little about where we're at right now. We're picking out application incidents, Kubernetes incidents, and even some security types of incidents, and this seems to be working pretty well. Recently we've had some exciting validation: MayaData, who manage Kubernetes clusters among other things they do, an amazing company, reproduced a slew of real-world incidents using Litmus, a chaos engineering tool they're involved in. We were able to pull out 100% of the incidents they recreated, with a root cause indicator placed on the incident page, without anyone having to tell the system anything. So this vision I've outlined for you, while it sounds ambitious, is actually coming true, and that's been very exciting for us.

The next stop for us is to bring in more data. Now that we've got a baseline of autonomous monitoring that seems useful, in the sense that if I weren't me, I would still want to use it, the question is: what next? The evidence you would use to root-cause an incident doesn't stop at the log file. Logs are a very rich source, and probably a great place to start, but the next thing we're doing is bringing in Prometheus metrics. It's very easy to deploy in Kubernetes especially: we deploy a scraper, then we look for anomalies in those time series and cross-correlate them with log events. That will be in a very soon upcoming release. What we're doing there is saying this should be a one-stop shop for incident root cause detection for the unknown unknowns, the things for which you may not have created rules, or for which it may not be reasonable to do so. That's the kind of thing we feel it's time to do, so that machines can start helping people do their jobs, and people can level up and do more strategic work than digging through log files. So thank you very much for your time. Here's my contact information; we'd love to hear from you. And we're going to open it up for Q&A now.

That was great. Thanks, Larry and Gavin. We now have some time for questions. If you have a question, please drop it into the Q&A tab at the bottom of your screen. I see one question there from Nikhil.
So Larry, the question is: what is the actual learning algorithm being used? Are you using some kind of neural net?

Right, I touched on this a little during my presentation. There are really two separate sets of analytics, two sets of machine learning, that we do. The first is about the structure of the data, and I talked about the Swiss Army knife we use there: a continuum of approaches, each of which applies more or less depending on the frequency of a given event type. The first stage is heuristics that relate to nesting indicators, numbers, and special types you might imagine, such as floats versus ints, a lot of things like that. The next step is reachability clustering, which takes a more global view of the lines that have been seen. The next stage is a naive Bayes classifier with a global fitness function. And finally, when we go back and amend the structure learning we've done, we like to use LCS, longest common subsequence. LCS is actually kind of the state of the art for learning log structure, but it has some weaknesses on low-cardinality data, so for me it's like polish on a car: it's the last thing you do. With that palette of tools, we've found it to be very effective.

Then there's the anomaly detection part of the question. As you saw, there are a couple of phases there. The first phase is determining that something is anomalous in and of itself; by that I mean an event of a specific type happened anomalously in isolation. If it's an event type where you have lots of examples, you have a good sense of its periodicity and of the distribution of the parameter values within that event, so you can speak specifically to that. There are other things too: for example, severity is a free parameter for us, so you may see an event of a certain type happen with a different severity than usual. There are various dimensions along which that anomalousness can be sussed out. But then the interesting stuff happens when you look at these event channels as independent point processes and develop statistics from those processes: their autocorrelations, their cross-correlations, and also the correlations of their activity as a sequence of events versus the values you see in the parameters. What you end up with is something where you can really start to hone in on incidents. So hopefully that gives you a good explanation of what works for us.
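For readers unfamiliar with LCS in this context: published log parsers such as Spell use the longest common subsequence between lines of the same type to decide which tokens are constant and which are parameters. The sketch below shows that refinement step in isolation; it is a generic illustration of the technique, not Zebrium's implementation, and the example lines are invented.

```python
def lcs(a, b):
    """Longest common subsequence of two token lists (standard dynamic programming)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = dp[i][j] + 1 if a[i] == b[j] else max(dp[i][j + 1], dp[i + 1][j])
    # Walk back through the table to recover the common tokens in order.
    out, i, j = [], m, n
    while i and j:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1]); i -= 1; j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]

def refine_template(line_a, line_b):
    """Tokens shared (in order) stay constant; everything else becomes a parameter."""
    a, b = line_a.split(), line_b.split()
    common = lcs(a, b)
    template, k = [], 0
    for tok in a:
        if k < len(common) and tok == common[k]:
            template.append(tok); k += 1
        else:
            template.append("<*>")
    return " ".join(template)

print(refine_template(
    "database system was shut down at 04:17:02",
    "database system was shut down at 04:19:45",
))
# -> database system was shut down at <*>
```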
Thanks, Larry. Maybe I'll tee up a question of my own here. You mentioned you're going to be ingesting Prometheus metrics, so how do you see the Zebrium product in combination with those technologies? What would be your target configuration: does it coexist with a Prometheus-based stack, or an Elasticsearch kind of stack? Where do you see this coexisting with those technologies?

Yeah, I'm not sure if you mean that in a tactical way or in a product-positioning way.

Maybe technical: what would this be designed to complement among those technologies?

Yeah. Really, the value we're trying to deliver is: sure, you can have a log manager, you can have all that functionality, but let us do the work. That's a completely different value proposition, and part of making it better is taking data wherever we can get it, and applying it not to the problem of being a log manager, and not to the problem of being a metrics alerting dashboard, but to the problem of root-causing an incident. When you look at it from that perspective, you start thinking a little less about how we overlap, because nobody else is doing quite that; they're not trying to be that. You might say there are anomaly detectors in the metrics space, at least, and that's true, but I'm not sure how far I'd take that in terms of root cause detection for new issues if I don't have logs. So I don't really view those as competitive. If someone is using Prometheus just to browse through their environment, we're happy for them to keep doing that; we may take their Prometheus configuration, if they share it with us, and use it for our collection, because we're going to want to know what's exporting and have access to all of that, and of course Kubernetes makes it very easy to get that information. In general, we're not trying to come in and say "just use us instead"; that's not where we're coming from. In the end, if you find you don't need to buy a certain other tool that's costing you money, because now you've got something that will display the data for you, let you search it and chart it, and is finding incidents for you, that's great, that would be success for us. But it's not really where we're focusing; we're focused on building the value of finding the root cause and giving it to you first.

Excellent, thank you. Any more questions? Well, it looks like that's it, so thank you, Larry and Gavin, for a great presentation. Thank you to everyone for joining us; the webinar recording and the slides will be online later today. We look forward to seeing you again at a future CNCF webinar. Thanks and have a great day. With that, we are signing off.