Good for me to start? Cool, thank you. Thanks everyone for joining me today at my first FluentCon. I'm happy to be here talking about a topic I've spent a lot of time with in the last couple of years, even though it's not really my background. I wanted to share some of the findings and work I've seen happening out there in security, and how Fluentd and Fluent Bit can be used to build use cases around security and SIEM.

My background is more of an operations type person, a practitioner for a long time; I've run various types of organizations. I switched in my career and became an analyst at Gartner for a few years, covering what's called observability today; back then it was monitoring and IT operations. I'm currently the CTO at Logz.io. We're an open source SaaS company: we build a lot of different technologies for operations and security based on an open source platform, and we run a lot of different teams there. I'm based in Florida, so it's good to be on the West Coast; it's been far too long.

So today I'm going to talk about all the great food we're going to eat in LA over the next few days. I'm sure all of you have very big agendas of Asian food and Chinese food and, of course, Mexican food. There are lots of tacos in my plan, because on the East Coast we don't understand Mexican food, so it's great to be back here where the best Mexican food is. But besides that, we're going to talk about SIEM. We're going to talk about data collection, parsing, and enrichment, things that are really critical to a SIEM strategy; then ingestion and storage; then correlation, use cases, and users, and what we hear from security practitioners; and how to build your own, if you want to do this. I talk to lots of companies that are doing that, or trying to, at various parts of the journey. And I'm going to talk about food a little bit too, because that's kind of the theme of what I'm doing in LA this week besides talking tech.

So what SIEM is: it's a generally legacy way of trying to collect all the signals from your different security technologies and bring them together so that you can understand the threat level in your organization. You try to aggregate and correlate the data, because when someone is attacking you, or you're trying to understand an attack that's occurring, there are going to be different fingerprints left across different devices. One could call these logs or events, depending on what you want to call them, but generally they're the same thing. You want to store them centrally, analyze them, alert people that bad things are happening, and then folks investigate. The tool can help people do threat hunting, forensics, and other types of investigation. Some SIEMs, especially the newer style, store raw data: all the logs and information. Older systems would correlate it and store just an event, as in "this was something bad that happened, but I'm not going to keep all the data that went into it." So SIEM doesn't necessarily mean you're storing everything all the time, and we're always trying to figure out ways to make this more efficient, because obviously it's super expensive to store all of this security data. So there are a lot of challenges in the market.

Just like there are a lot of different types of food, there are a lot of use cases; I've outlined many of the ones I hear about regularly. And then there are a lot of users, different users. I know most folks here have different types of use cases in operations, but these are totally different users.
Some companies will tell you DevSecOps is a thing. I can tell you that out of roughly 200 of our customers using our SIEM, and over 1,800 using us for DevOps, we have about two customers doing both, one of which is a really large company. So by and large, this is not a thing that people do, even though it would be way more efficient for them to store this data once and use it for two use cases. It's not what really happens, because the buyers are different. These users down here have a separate budget and a separate set of needs, and the things they're doing are also quite different.

To pick out one of the use cases as an example, let's say threat hunting. One could ask how threat hunting is any different from doing analysis of an incident or an outage. The issue is that when I do threat hunting, I want other information brought into the tool that gives me more context from a security perspective. What was the attacker's IP address? What network did they come from? What's the AS number? Do I know anything else about when that same attacker did anything to my network? These are totally different things than we think about in ops. So those are some of the challenges: different use cases, different users.

As for the market dynamics, for those interested in SIEM, this is the latest Magic Quadrant from Gartner. You can take a look at almost all of the vendors on here; I think one and a half of these vendors actually build cloud products. Most of them are on-prem, legacy appliances. This is the market, yet people spend $3 billion a year trying to solve this problem. It's crazy how old the technology is: proprietary agents, and no open source to speak of aside from one company, maybe one and a half, that do a little bit of open source. So it's a really legacy market, obviously ripe for disruption.

And you say, well, you could build this yourself, right? I can use ELK; I can use some of the other open source tools out there. The challenge is that there isn't a lot out there; there are probably two really interesting projects. Elastic has obviously gone a proprietary route over the last year, so they don't really have an open source option anymore. Logstash is commonly used, but we always recommend people use Fluentd and Fluent Bit for these types of use cases. There's also a new Elasticsearch replacement called OpenSearch, and a lot of people are starting to use that if they want to keep things in the open source community. There's also a project I'll talk about called TheHive, a really cool tool that helps with incident management workflow and some enrichment, but it's not really a SIEM. So you have to do a lot of things yourself, and I'll give you some ideas of how you could do something like that.

Similar to a lot of the discussions today about enrichment, where there were some really interesting talks and great insights that I picked up: most of the data types for SIEM are still syslog. We're talking UDP, 1980s technology here. And then there are log files. But there's a lot of syslog out there still, especially from network devices, and it's still a big challenge. So this stuff is very old technology in general, what people want you to support in your system.
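To make the syslog point concrete, here's a minimal sketch of what receiving that 1980s-style traffic involves: a UDP listener that splits the RFC 3164 priority value into facility and severity. This isn't how any particular SIEM does it, and the port is an unprivileged stand-in; in practice you'd point a Fluent Bit syslog input at the traffic rather than writing your own listener.

```python
import socket
import re

# RFC 3164 messages start with "<PRI>" where PRI = facility * 8 + severity.
PRI_RE = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)

def parse_syslog(raw: bytes) -> dict:
    """Split a raw syslog datagram into facility, severity, and message."""
    text = raw.decode("utf-8", errors="replace")
    match = PRI_RE.match(text)
    if not match:
        return {"facility": None, "severity": None, "message": text}
    pri = int(match.group(1))
    return {"facility": pri // 8, "severity": pri % 8, "message": match.group(2)}

def listen(host: str = "0.0.0.0", port: int = 5140) -> None:
    # Port 514 needs root; 5140 is a common unprivileged stand-in.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        datagram, peer = sock.recvfrom(65535)
        event = parse_syslog(datagram)
        event["source_ip"] = peer[0]  # often the only reliable metadata you get
        print(event)

if __name__ == "__main__":
    listen()
```

Notice what you don't get: no delivery guarantee, no authentication, and no schema beyond the priority byte, which is exactly why the parsing and enrichment stages downstream matter so much.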
And then there's metadata, which I'll talk about now. In security, threat feeds are critical. There are all these companies that create feeds saying that these domain names, these IP addresses, these subnets or AS numbers, depending on how you want to look at it, constitute botnets and people that are doing attacks on networks. With SIEM, you want to take this data and stitch it in, in real time. So threat feeds are really critical to SIEM because they provide great context.

And then there are a lot of lookup lists. GeoIP is the easy one; we do that. But you can get into some pretty tricky ones around data center management and CMDBs. There are a lot of different ways you want to do lookup tables in real time. Anurag and I have talked about this a lot (he was on a podcast that I sometimes host), and lookup lists are an interesting topic I'll touch on. It's an area where I think Fluentd and Fluent Bit can help a lot in federating lookup lists, because right now it's really difficult to do this at scale.

There are also a lot of different data sources in SIEM. They're usually security or infrastructure focused, and a lot of these are traditional infrastructure; when you move to the cloud, a lot of this changes quite a bit in terms of the technology itself. But you're talking about the perimeter, meaning the firewalls, the IDS, load balancers, proxies, reverse proxies, and various things like that; the infrastructure itself; and then the hosts. You want to make sure that hosts are not compromised; you have to understand the integrity of the host. These are all critical things in compliance, and they're critical in security, so that you understand when changes occur.

And then there are lots of cloud data sources, everything from the typical cloud providers to really specific ones: I want to get my Salesforce data and understand how it's being changed, or I want to understand my Office 365. And then there are so many federated identity systems that people use today. It's a big challenge, and there are more and more of these things. We have a huge number of plugins and technologies to make this easier. There are some weird ones where we still have to go to Logstash, because they have plugins that do things like Office 365, where unfortunately there aren't good options in the open source world today. And many of these sources get pulled via API: if you want to get audit data from Salesforce, you've got to hit the API and grab the data, and it gets a bit tricky. With some of the newer systems out there, and we do this on our platform, you can subscribe to a stream instead, and I know AWS pushes this a lot. It's way cheaper to subscribe to a topic and get information than to query an API over and over again, where they charge you every time you hit it. So this kind of stream processing is very new; almost no one's doing it today. But I think it's the way of the future, because of the volume of data and the fact that polling APIs is extremely inefficient and expensive, and the cloud providers are making us pay for that. So anytime we can stream data in, we should.
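Going back to those threat feeds and lookup lists, here's a rough sketch of what that real-time stitching looks like, assuming the feed is just a file of known-bad IPs and using a tiny in-memory table as a stand-in for a real GeoIP database. In practice you'd use something like MaxMind's data via the maxminddb library, and the feed would refresh continuously; all field names here are invented for illustration.

```python
def load_threat_feed(path: str) -> set[str]:
    """Load a one-IP-per-line feed into a set for O(1) lookups.
    In production this would be refreshed on a schedule."""
    with open(path) as fh:
        return {line.strip() for line in fh if line.strip()}

# Stand-in for a real GeoIP/CMDB lookup; real tables have millions of rows.
GEO_TABLE = {
    "203.0.113.7": {"country": "NL", "asn": 64511},
    "198.51.100.9": {"country": "US", "asn": 64496},
}

def enrich(event: dict, bad_ips: set[str]) -> dict:
    """Attach threat-intel and geo context to an event before it's stored."""
    src = event.get("source_ip", "")
    event["threat_match"] = src in bad_ips
    event.update(GEO_TABLE.get(src, {"country": "unknown", "asn": None}))
    return event

if __name__ == "__main__":
    bad = {"203.0.113.7"}  # imagine load_threat_feed("feed.txt") here
    print(enrich({"source_ip": "203.0.113.7", "message": "denied"}, bad))
    # -> {..., 'threat_match': True, 'country': 'NL', 'asn': 64511}
```

The lookup itself is trivial; the hard part is keeping these tables fresh everywhere they run, which is the federation problem I think Fluentd and Fluent Bit are well placed to help with.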
Looking at all the ingredients that go into our guacamole over here on the right, all the important stuff, you have to understand schema. A big challenge in SIEM, and in logging in general, is understanding what a message is. If I get the same type of message from a firewall, a switch, and a host, how do I know they mean the same thing? The answer is that it's completely inconsistent, and there have been a lot of attempts at creating schemas. ECS, the Elastic Common Schema, is open source, and it's one way people try to normalize the data. OpenTelemetry, which we just incorporated, and I do some work on OTel as well, is going to be another way to do the same type of thing: normalize the data so that I can say "this is a user that's authenticating," and we can start to understand different types of telemetry without necessarily needing to build custom parsers.

In security, there's also a ton of frameworks for compliance; here are a few examples. There are a lot of compliance needs that organizations have, and there are many others specific to an industry, like PCI for payments or HIPAA for healthcare, and it goes on and on. When customers come to us under a compliance requirement, they want out-of-the-box reporting and best practices in the SIEM, and that's really hard to implement. The building of rules and parsing and views is a big challenge once you get into compliance. But everyone's under more and more regulation; that's just the state of security today.

So, I mentioned enrichment. The most common kind is the one everyone here deals with already, and it's definitely a big focus for OpenTelemetry: we're just starting to make recommendations on using eBPF with OpenTelemetry to provide enrichment. There's a lot of great work happening where logs will soon be enriched with additional data automatically, and we're going to see more and more information come in: who a user is, what group they're in, what type of employee they are, full-time, contractor, or external. When we enrich the logs with this data, the logs become way more descriptive and useful, and it shortcuts the forensics we do, because we inject additional context about the user in question. All of this increases the size of the data, but it makes the data way more valuable, similar to the spices and herbs in the picture here: we put herbs into our food because they give it way more depth of flavor, and this is the same type of thing.

The challenge with enrichment is that some of this data you want to handle centrally: when the data comes in, I enrich it right before it's stored. But in many other use cases it's better to do it at the edge, because that's more scalable. So, depending on the scale of the environment and how much data you're bringing in, the question becomes: how do I update these lists? When I'm looking at threat intelligence or asset information, how do I update that data and keep it relatively real-time? It's a hard problem to solve either way, centralized at one point or out at the edge, and I think we're going to need to do more and more of this on the edge.

The same goes for filtering log data, which is something that is not done well today in the industry. I'll give you an example. If a user authenticates and it's okay, should I keep every authentication? Maybe not. Should I keep maybe 1% of the authentications? And maybe I want to keep it when one user fails several times and then authenticates, because that could show they figured out something with a password, or got around some other type of two-factor authentication. We have to think about these use cases, because if we do that filtering on the edge, we can actually reduce the data volumes we're dealing with.
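Here's a minimal sketch of that edge filter, assuming events arrive as dicts with user, outcome, and timestamp fields (names invented for illustration): keep every failure, keep a success that caps a recent burst of failures, and otherwise keep only about 1% of successes as a sample.

```python
import random
import time
from collections import defaultdict, deque

FAILURE_WINDOW_SECS = 300   # look back 5 minutes for prior failures
FAILURE_THRESHOLD = 3       # "several times" from the example
SUCCESS_SAMPLE_RATE = 0.01  # keep ~1% of routine successes

recent_failures: dict[str, deque] = defaultdict(deque)

def should_keep(event: dict) -> bool:
    """Decide at the edge whether an auth event is worth shipping."""
    user, now = event["user"], event.get("timestamp", time.time())
    failures = recent_failures[user]

    if event["outcome"] == "failure":
        failures.append(now)
        # Expire failures that fell out of the window.
        while failures and now - failures[0] > FAILURE_WINDOW_SECS:
            failures.popleft()
        return True  # failures are always interesting

    # Success: keep it if it caps a burst of recent failures...
    while failures and now - failures[0] > FAILURE_WINDOW_SECS:
        failures.popleft()
    if len(failures) >= FAILURE_THRESHOLD:
        failures.clear()
        return True  # possible guessed password or bypassed 2FA

    # ...otherwise keep only a small sample for baselining.
    return random.random() < SUCCESS_SAMPLE_RATE
```

The design question is where this state lives: per-agent state like this is cheap, but it's blind to one user failing across many hosts, which is part of why correlation still tends to happen centrally, as we'll get to.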
So there are a lot of things we have to think about as we move more computing to the edge, and how we deal with some of these challenges with enrichment. There are lots of problems for the community to figure out, and lots of cool features we can build in open source to create unique ways of dealing with the data. So I think enrichment is going to be a key theme in open source; it's an interesting area for sure, especially with cost reduction and cost control. We all deal with those challenges.

Now, a little bit about ingestion and storage. I had to pick someone eating a taco for ingestion. The filtering is important, and the pipeline itself is pretty simple. A lot of folks, us included, use a lot of Kafka to help manage pressure on the back end and to do various things in the pipeline. We use it for building machine learning models, and we use it for real-time alerting, to get around the fact that otherwise we'd have to store the data first and then analyze it; we try to analyze things in the stream as much as possible. Then there's everything around getting the data in and searching it: how do I archive data and restore it? Because under compliance regimes like PCI, you have to keep seven years of transactional data. That's expensive. I want to put that on Glacier and keep it really inexpensive, but I need to be able to reconstitute it when I have to. Ingestion pipelines are starting to provide a lot of archive capabilities now, and I hope that advances a bit and becomes more useful for these use cases. But OTel is very early in logging today; we're going to keep working on it, and hopefully it will improve over the next year as well.

Most organizations keep 90 days of log data; that's kind of the standard. If you were to ask me what our customers keep, it's usually 90 days, indexed, searchable, and available. That doesn't mean they're going to search it all the time. But if someone is sending you five terabytes of data a day, then 90 days of that is a huge amount of storage and a lot of cost; it gets really bloated. So some of the things we're thinking about, and some of the capabilities going into the OpenSearch project, which is open source, are: how do we pull back data for incidents going back a long time? How do we filter it? How do we make the way we store it more intelligent? Then, of course, there's moving things down to lower cost tiers: SSD to spinning disk to S3 to Glacier, to use Amazon as an example, because that's where I do most of my work. Then, do I actually have to restore it? How can I make things searchable? We're working with Amazon on UltraWarm capabilities in the open source; this lets you store data at low cost but keep the indices in hot storage, so you can still do some basic searching. It's a big challenge, and I think open source tooling is going to get a lot better here, because everyone complains about cost.
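As a concrete example of that tiering on the archive side, this is roughly what the S3-to-Glacier step looks like with boto3. The bucket name, prefix, and exact day counts are invented, and a real deployment also has to manage index snapshots, not just raw objects.

```python
import boto3

# Hypothetical archive bucket where the pipeline lands raw log objects.
s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="example-siem-archive",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-and-expire-logs",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                # Hot, searchable retention is handled in the cluster; after
                # 90 days the archived copies move to cheap cold storage.
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
                # Roughly seven years, for PCI-style retention requirements.
                "Expiration": {"Days": 2555},
            }
        ]
    },
)
```

Restoring from Glacier is the painful half: it's a per-object restore request plus a wait, which is why approaches like UltraWarm, low-cost storage with the indices kept hot, are so appealing.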
The other big challenge is correlation; let's talk about that for a couple of minutes. The way you have to think about data coming in (and you can substitute the word "event" for "log") is that you basically have an event, and then you have a set of rules that help you classify whether the event is something I care about or want to be alerted on. There's usually some type of mapping into a framework, and then the question of whether it's going to generate an alert. The challenge is that if we just do those first three steps, we end up with a lot of alerts. So we have to start doing correlation: is this alert I'm about to send related to another alert that I've already sent, or that I'm detecting? You have to do correlation and grouping so that you don't overwhelm your system, create too many tickets, or page too many people for the same thing.

It starts to get really tricky, because how do you normalize all of that data? Today this is usually done centrally, but I think it needs to be done in a more distributed manner. I think we have to start thinking about correlation at the edge, because it's a big challenge; everyone does it centrally today, and I don't think that's very efficient. It can also take a while, and when you think about being under attack, if the delay is one or two minutes before you know about it, they could already have your information; they could already have complete control of those critical assets. So it's a scale problem with correlation, and we've been doing it the same way for 20 years in this industry; I haven't really seen it change much.

Most of the correlation today is rule-based. You can go into the exhibit hall tomorrow and talk to all the vendors that tell you they have some magic AI correlation system, and they may have somewhat more advanced correlation, but there's always a rule set underneath, and the rule set is generally regular expressions or some other type of string parsing. In open source, there's an alerting system that's part of Open Distro, and it does some pretty nice anomaly detection and some interesting pattern matching, so if you're looking at building something in open source, it's a pretty cool way to go. But it's still missing a lot of things, like multi-query correlation. We've actually built some of this in our product, but I don't see anything in open source that does the more complex alerting today. So there are still a lot of things you have to build yourself if you go that route. I think open source is going to continue evolving; there are a lot of folks working on OpenSearch right now, so that's definitely a good thing to keep an eye on.

So how does the user use the data? I want to touch on a couple of use cases here. The goal is to simplify the problem, which is billions of log messages or events, down to hundreds of alerts, down to dozens of incidents: make it digestible for the security operations team. A good example, over on the right, is what we call user behavior anomaly detection. We have all the authentications; there could be billions of those. We maybe have hundreds of authentication failures, and maybe a dozen times where users logged in from two different geographies. Maybe the user connected to a VPN and just came out somewhere unexpected, and that's fine. Or maybe the user's credentials have been stolen, and there's some nefarious organization in another country trying to compromise the account.
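A toy version of that two-geographies check, with invented event fields: remember the last country each user logged in from, and flag a success from a different country inside a short window. A real implementation would reason about distance and travel time rather than simple country inequality, but the funnel shape is the same.

```python
import time
from typing import Optional

# Flag a login when the same account succeeds from two different
# countries within this window (crude "impossible travel").
WINDOW_SECS = 6 * 3600

last_login: dict[str, tuple[str, float]] = {}  # user -> (country, timestamp)

def check_login(user: str, country: str, ts: Optional[float] = None) -> bool:
    """Return True if this login looks like a two-geography anomaly."""
    ts = ts or time.time()
    previous = last_login.get(user)
    last_login[user] = (country, ts)
    if previous is None:
        return False  # first sighting: nothing to compare against
    prev_country, prev_ts = previous
    return prev_country != country and ts - prev_ts < WINDOW_SECS

if __name__ == "__main__":
    t = time.time()
    assert check_login("alice", "US", t) is False       # first sighting
    assert check_login("alice", "RO", t + 600) is True  # new country, 10 min later
    assert check_login("bob", "US", t) is False
```

Out of billions of authentications this yields a handful of candidates, and deciding whether each one is a VPN exit node or stolen credentials is exactly the judgment left for the analyst, or for the automation layer we'll look at next.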
So that's an example of how you distill the problem down, and there are lots of ways to try to identify those incidents. Zooming out, the idea here is that there's the SIEM, and then on the bottom is SOAR, which is basically the orchestration and automation platform for security operations. You'll see that many vendors that build SIEMs today are also building or buying SOARs, to try to own not only the incident investigation but also the automation that remediates the incident; that's what SOAR is about. The idea is that the SIEM is the intelligence, the brain: it takes all the data, distills it down, creates an incident, and opens a ticket or an incident in the tool. SIEMs often have some basic incident management built in, or you can use an external tool, maybe ServiceNow or Jira, or security-specific tools; there are many out there that do ticketing and workflow, like TheHive, which I mentioned. And then you can create automation. A lot of our more advanced customers want to change a firewall rule, lock down a port, or do something on an end user's machine; you would do that with the SOAR in security, and that's really where a lot of the automation comes in and how these link together.

So if you want to build your own with open source tools, you can try. I've deleted a lot of the secret stuff that we do, but this is some of the architecture of our back-end systems. It's not easy to do, it takes a long time, and you have to figure out how to scale it and manage it, do all the alerting, and everything else in between; it's never easy. Or you could use open source technologies and do it yourself: there are lots of good ways to ingest data and lots of ways to put stream processing in place to aggregate it. Most folks using OpenSearch or Elasticsearch at scale run Kafka, so all of this comes together.

It's really exciting to see Kafka simplifying. I'm sure all of you hate ZooKeeper, and it's finally almost gone from Kafka: they just released a non-production version of Kafka 3 that has no ZooKeeper. I really want to upgrade our stuff to it, because ZooKeeper is a bit archaic, to say the least. Kafka is making some awesome improvements thanks to the community and the folks at Confluent.
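If you do go the build-your-own route, the glue between collection and everything downstream usually looks something like this sketch, which assumes the kafka-python client and a broker on localhost; the topic name and event fields are invented.

```python
import json
from kafka import KafkaProducer  # pip install kafka-python

# Enriched events go to one topic; downstream consumers handle stream
# alerting, archiving to object storage, and indexing independently.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",  # assumption: a local broker exists
    value_serializer=lambda v: json.dumps(v).encode(),
    acks="all",  # don't lose security events on broker failover
)

def ship(event: dict) -> None:
    # Keying by host keeps each source's events ordered within a partition.
    producer.send("siem.events", key=event.get("host", "").encode(), value=event)

if __name__ == "__main__":
    ship({"host": "web-1", "message": "auth failure", "severity": 4})
    producer.flush()
```

Everything downstream, stream alerting, archive, indexing into OpenSearch, then hangs off that topic as independent consumer groups, which is what makes the Kafka-centric architecture flexible.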