Good afternoon. Two quick notes before we start. First, the disclaimer: this talk reflects my own views, not those of my employer. Second, we are going to move quickly through an awful lot of material, so please don't worry about taking in everything on the slides. Instead, treat this talk as field notes to the deck, which I will make available later on Twitter. Sorry that I haven't had a chance to do that beforehand; it's been a busy day.

A big thank you to the AI Village for having me here today to offer a buyer's guide to the market promise of automated, AI-enabled detection and response. I originally thought about titling this presentation "At Least It's Not Blockchain", because it feels like we're at a point where we can now have a sensible discussion about AI. Those of us who have been early adopters of technology promising AI-driven wonderment have been through the painful early phases of the Gartner hype cycle. Buzzword saturation has long since passed. If we've been accountable for getting value out of said technology, the excitement of our mission has soon turned into a harsh realization that we need to build our own manual if we're going to succeed, because if the vendor had given us one, this is what it would have been called. As we have moved through the phases of anger, denial, acceptance and possibly some bargaining, we have found ourselves wrestling with problems that are as much about the perceived value a technology has in the eyes of those who have nothing to do with making it work as they are about the technology itself.

With that, I thought this would be a good place to take a quick audience poll. By a show of hands, how many vendors are there in the room who sell technology to IT teams or people in big corporates? Okay, cool. Now I know who I have to run away from at the end of the talk to avoid physical abuse. And how many are here from the buy side? I will assume most others. Okay, good. Together we are stronger.

So, can anyone name this event? No, not that event. This is Bobby Fischer playing 50 concurrent games of chess in 1964, and this is what we as buyers face. While vendors only have to play their game against us, we have to play our game against lots of vendors. That means we have to process lots and lots of patterns of play, lots of possible outcomes, both short and long term, and how different moves on different chess boards affect all the other games we are considering. Unfortunately, there is no guidebook for doing this, even in areas of technology that are fairly mature. And when it comes to data science, there is even less guidance about how to build a stage-appropriate capability that is the right fit for your organization.

So the first goal of this talk is that, if you're thinking about doing AI, hopefully by the time we're finished you'll have something more than the equivalent of this empty pamphlet holder to work with. The second goal is to help you quickly slice through sales narratives like this, so that you can work out which vendors are innovating versus those who believe R&D stands for rebrand and deceive. And the final goal is to make sure that when you've found the vendor who you think just might have something that can solve a pressing problem, you have a decent set of questions to establish how they are approaching some of the key challenges in operationalizing and scaling data science techniques. So why these goals? Here's a story from outside the world of cybersecurity to explain.
Most of us are familiar with car ads; they follow a fairly well-worn formula. Last year, Lexus decided to let an algorithm write a car ad. The only prior evidence I could find of someone using AI to write advertisements turned out to be a hoax by a comedian, so full marks to Lexus for pushing the boundaries. However, when you read the review this article gives the project, it does not fill you with confidence that this was the best use of time and money, or that it was any more successful than paying a comedian to make some stuff up. Which rather leaves you wondering: could they have got a better result more efficiently? And that leads us to the key takeaway from this talk. If you are going to purchase AI technology or develop your own in-house, the cost-to-value ratio you experience is likely to be high cost for low value for a long time. This means your business has to be committed for the long haul if it is going to do this. And if you follow data scientists on Twitter, you see a continuing trend in their commentary which suggests that, for most of us, there might be better ways of efficiently unlocking business value, namely the careful manual analysis of a few high-quality data sets.

So with that, let us commence our journey through the five sections of this talk, starting with: what is this AI thing everyone is so tired of hearing about anyway? There is no shortage of snarky definitions, but snark can sometimes be fairly accurate. If we translate this tweet, what it's saying is that AI is how you describe what you are trying to achieve, and ML is the stuff you need people to do to actually get you there. This is echoed in this definition from Danny Lange, who has run AI programs at places like Amazon and Uber: AI is the expression of behavior that we recognize as intelligent. Underlying all of that is ML. Every single piece on its own doesn't feel intelligent, but when you put it all together, it's AI.

So where are we today with progress in the wider AI landscape, as opposed to security? We're going to zoom out a lot and look at AI trends in general across the last five years, for around the next 10 to 15 minutes. First, let's categorize where ML and AI sit in the context of a slightly modified OODA loop of sensing, sense-making, decision-making and acting. As Sounil Yu, who developed this slide and presented it at RSA earlier this year, points out, AI capabilities are distinct from robotic automation capabilities. It's a very important distinction, because a model generated by training an algorithm over data does not lead to automated decision-making and acting. It may over time enable that, but as this talk will assert, we are an awfully long way away from that today. So when you read things like this: buyer beware. As an example, answering the question "is this a cat?", which is not a decision, is very, very different to answering "what should I do if this is a cat?" in a context where the answer to that question matters.

Second, in anything to do with data analytics, and AI is indeed part of that, we need to acknowledge that the quality of downstream possibilities in the OODA loop is almost entirely dependent on the data you have to play with. And as we will soon see, there are lots and lots of trade-offs to be made in the different environments we face, which will affect what data we can gather, analyze and transmit.
And although upstream limits on data will indeed constrain what ML and AI can deliver downstream, there are also mathematical limits we run into, even when data is abundant. Here is a representation of the data and algorithm interdependencies we have to look at, taken from a talk by Tesla's head of AI. What the graph is showing is that, over the years, computer vision models and the availability of data sets at scale have constrained what he calls the possibility frontier to the zone of "not going to happen". Note the optimistic hockey-stick curve at the end of the graph, and remember it as we look at this picture. This shows how deep learning categorized images that were rotated to various degrees on an axis: a school bus and a motor scooter categorized as a parachute. As Gary Marcus, whose blog these images come from, states: if deep learning is failing at the very area it's supposed to be best at, then maybe we need to adjust our expectations about its ability to replicate general intelligence. Moreover, if this is the case in areas where there has been a lot of focus from the data science community, we can assume there are going to be real challenges in more complex problem domains, for example the world of government intelligence.

So, to paraphrase the question from the head of the CIA's technology unit here: if you cannot use millions of cat photos to start solving your problem, how do I know what inputs I need to collect to be in with a cat's chance in hell of trusting the end product? Now, if you don't know what data you need until you see the output, we might be thinking: does that mean we're stuck in a catch-22? Well, perhaps. But in terms of the intelligence cycle, the question on the prior slide is really only asking for a rational feedback loop between the planning of data collection and the material currently being provided. The challenge in using ML to inform complicated decisions is that either the data may not be, and may never be, available at the scale required, or ML may not actually be able to learn fast enough over the data to provide analysis in the required time frame. Cue this fantastic blog from Uber, where they explained that if there is a variable that changes faster than you can train a model, then rules are actually better than ML, as they found when they were developing fraud analytics.

So when we apply the optimistic Tesla curve to areas with more nuance and less data, what may happen? What will probably happen is that we'll find a lot of data scientists asking the question: where is our benchmark data set? For example, in cybersecurity we lack an ImageNet equivalent to work with for something like intrusion detection, despite the fact that that technology has been around for an awfully long time. Tesla's head of AI makes the point in his talk that there is a key shift from algorithms to data once you start applying AI where it has real-world consequences. And this point is reinforced in a fascinating article published at the end of last year, which talks about the way the military are using AI to develop robotics. As the gentleman quoted in the article puts it, the success of machine learning up to now has been based on the availability of enormous amounts of good, clean, nice data. Talking about some of the robotics videos you've no doubt seen on YouTube, he continues: learning to walk and crawl are themselves truly difficult tasks.
If you go into more complicated environments, and this is admittedly a military scenario, for example tunnels and holes in urban rubble, tele-operation becomes exceedingly difficult. The army will have to collect the data it needs to teach robots to manoeuvre each of these obstacles: tunnels, manholes, urban rubble, hills and forests, craters and anything else someone might reasonably expect on a battlefield. The last line of this quote is particularly important, because it recognises that human-machine cooperation is likely to be a far better model to pursue than machine-only, and it states: developing to both the limits and strengths of autonomous machines means devising new ways for humans to collaborate with the robots working alongside them. This is one of my favourite quotes from the article, because it really elegantly explains why we are going to be waiting a very long time for autonomous anything powered by AI. We know from past history that even when automation is achieved in very safety-sensitive contexts and can be observed, it needs to be understandable before it can be trusted.

I'm sure you've been absorbing all that with no problem at all. I did say it was going to be a bit of a roller coaster ride, and it doesn't slow down; sorry about that. So let's keep this in mind as we look at how DARPA categorises where we are with AI, and indeed where we may be going next. We've gone past what DARPA calls the first wave of AI, in which there was no learning ability; this is effectively handcrafted knowledge. We are currently in the second wave, whose dirty secret, as they characterize it, is that second-wave neural nets are spreadsheets on steroids. And in the third wave, DARPA states, AI will be able to explain why it has made decisions and reason through ambiguity. But as we saw a few slides earlier, in problems that are not cat-related, the data sets and explanations necessary to do this have rather more complexity wrapped around them than the problems today's second wave is dealing with. Further, as this fantastic paper from the IEEE explains, there are a huge number of prerequisites for AI that functions side by side with humans. Not least among these is the concept of common ground: each party can comprehend the messages and signals that help coordinate joint actions. The paper draws out ten automation challenges and is well worth half an hour to read through. Of particular note is the requirement for joint activity, where agents must be able to adequately model the other participants' intentions and actions vis-à-vis the joint activity's state and evolution. The authors conclude that right now, today, we are at a crossroads between AI as emulator versus AI as capability enhancer.

So do we need to worry about the robot army taking over any time soon? Not according to Judea Pearl, who states that deep learning is stuck at the level of curve fitting and is nowhere near invoking causal models. He goes on to say that if he were to say this to his mother, she'd slap him, so clearly this is something he feels very strongly about. But the point here is that his conclusion about what will work and deliver the greatest value is very similar to what we heard just a few moments ago about what the military are finding with robots for combat simulations, namely combining expert judgment with relevant data.
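Combining judgment with data is also the cheap fix for the Uber point from a moment ago: when behaviour shifts faster than your retraining cadence, a rule an analyst can update today beats a model trained on last month's data. Here is a minimal sketch of that retraining-lag problem; everything in it, the amounts, the threshold and the "model", is invented purely for illustration and is not Uber's actual fraud system.

```python
# Minimal sketch (invented data, not Uber's system): when behaviour shifts
# faster than your retraining cadence, a hand-written rule an analyst can
# update today may outperform a model trained on last month's data.

import random
random.seed(7)

def make_transactions(n, fraud_amount):
    """Generate toy transactions; fraudsters currently favour 'fraud_amount'."""
    txns = []
    for _ in range(n):
        is_fraud = random.random() < 0.05
        amount = fraud_amount + random.gauss(0, 5) if is_fraud else random.uniform(5, 400)
        txns.append({"amount": amount, "is_fraud": is_fraud})
    return txns

# "Model": a threshold learned on LAST month's data, when fraud clustered near $900.
stale_model_threshold = 850

# Rule: written by an analyst THIS week, after noticing fraud now clusters near $95.
def analyst_rule(txn):
    return 80 <= txn["amount"] <= 110

# This month's traffic: the fraudsters have moved to ~$95 transactions.
current = make_transactions(5000, fraud_amount=95)

def recall(predict, txns):
    """Fraction of actual fraud that a given detector catches."""
    frauds = [t for t in txns if t["is_fraud"]]
    caught = [t for t in frauds if predict(t)]
    return len(caught) / max(len(frauds), 1)

print("stale model recall:", round(recall(lambda t: t["amount"] > stale_model_threshold, current), 2))
print("fresh rule recall: ", round(recall(analyst_rule, current), 2))
```

The analyst can move the rule the same day the behaviour moves; the model has to wait for the next training run, and that gap is the whole point of the Uber blog.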
So if we are stuck at the level of curve fitting today for ML and AI, where does that leave us when assessing today's vendor space? Here is a simple representation of the prerequisites for AI. While the layers, I would argue, are not to scale in terms of the effort required at each stage, this picture is very helpful when we consider that most product marketing starts at the top of the pyramid, and our job as buyers is to work through the layers underneath to see whether the claim being made is credible, and what design choices have been made in pursuit of the demo we are being shown in which everything works fantastically. Following the data science community on Twitter is usually a very good place to find hints about the kinds of questions that can burrow through the layers of sediment between the demo and the truth. As Tim alludes to, there will be an ecosystem that supports the technology you are being shown, and by understanding the issues and trends within that ecosystem we can make informed judgments about the maturity of the product we're looking at, as well as the product choices vendors are making and how they perceive this ecosystem.

As this view of the AI infrastructure landscape shows, the domains of ETL and deployment have really noticeable gaps, and the accompanying analysis by Work-Bench, a New York-based venture capital firm, runs as follows. Most investments made in ML and AI are typically aiding BI reporting. Certainly the automation of PowerPoint is a wonderful thing; I'm not going to stand in the way of that. But they go on to say that innovation is happening in tackling new engineering problems like data access management and pipeline and model development, all very challenging if you've had to work with big data systems, and core to this is building and managing data pipelines.

To pick up on one of those issues: data pipelines are one of the major challenges for achieving AI at scale, and this is definitely something to consider if you are thinking about hiring an in-house data science team. Because if you have not built the infrastructure they need, they will spend most of their time building the data pipeline and platform engineering plumbing rather than actually doing the analytics you hired them for. This will be incredibly expensive for you and incredibly frustrating for them, and they will no doubt leave after about six months to go somewhere with more abundant data. Many companies embarking on data science unfortunately don't realize the upfront investment and effort it will take, nor do they have a guide for how their teams should build it out; more on that later.

Another way to assess the market is offered by this fantastic paper from MMC Ventures, who have published a very comprehensive framework they use to assess investment in startups. The good thing about this framework for us as buyers is that it really corresponds to how we look at technologies on the market, because it helps them assess specific verticals as a solution, as opposed to vendors attacking wide problem sets. The first important point they make, which very much echoes that of Judea Pearl, is that ML is not suited to unbounded problems or causal inference. Indeed, they state that for ML to be effective, problems need to be sufficiently self-contained.
The second point they raise is that most startups developing ML-led offerings face significant deployment effort to get their product operational; indeed, they see that more than a third of startup teams' effort is spent helping get this stuff deployed. That's very interesting in terms of the money these startups have to spend before they even begin to do the automagic stuff we all think they do. So if you are looking for a cheap and easy set of canary questions to start challenging vendors with, you could do a lot worse than these: What cases will the system fail on? What is the best baseline? Where is your published research? Does this pass the business fundamentals test? I'm not going to dwell on these questions; they'll be available in the slide deck afterwards. But there's some great detail here. This is a blog that I think was published about seven years ago, and it has a really good set of things to look at, including whether the business plan ultimately only works because free human labor is quietly replacing the automated component. A really good set of critical thinking for us as buyers to take on board into the security market.

The TL;DR of all this is that when models hit real data, that's where the truth comes out. Unfortunately for us, most of the negative things we hear about ML technologies can be attributed to deployment failure just as much as product failure. So assuming you've asked your best questions and got some good answers, sometimes your only choice is to plug it into production and give it a go.

Does all this uncertainty mean that building your own is a better bet? Slides like this paint a fantastic picture of the kind of visionary roadmap your business might think about developing or following. It all sounds wonderful. However, the market in AI is probably a little more like the market in security, where both buyers and sellers lack the knowledge of what it takes to create effective solutions. This grid is from a paper called "A Market for Silver Bullets", and what it states is that neither buyers nor sellers in our market have the information they need to understand what a good solution to their problem looks like. I would argue that is definitely the case in data analytics, because it is a multi-dimensional problem. And as those of us who have tried to tackle multi-dimensional problems in large businesses know, there is sadly a high probability of resume-generating events occurring. Often that's because it is just as hard to work through all the considerations your data engineers and data scientists need to think through as it is to integrate your technical teams with the people who must take the output of what is built and actually start using it. And that's before we've got to caveating the analytics you're going to present to decision makers to avoid an unrecoverable loss of trust. I thought the last talk was absolutely fantastic in terms of the practical lessons it shared about what it's really like to start working with data and ML; the lessons there were brilliant. And of course, if you've ever presented data to execs, typically if there's one data point that's wrong, that's used to invalidate the whole analysis. You have to go back and solve for those edge cases.
And the ratio of good to bad answers, I think, is something we're still very much working on, even for basic counting in security, given the data that comes out of the APIs of some very well-known vendors we have to deal with. So if you do decide to go and build your own, here are some dynamics to think about. Building your own means preparing for large amounts of people time and budget to maintain platforms for analytics and then to actually develop those analytics in-house. If you have the budget to do this, sadly, you also probably have a change control process that means it simply won't be possible. Asking vendors to develop analytics on your platform offers very little incentive for them, because they don't get to own any of the IP. Side note: interestingly, some security vendors are actually opening up the data they collect to other vendors to build on as a marketplace, and that's definitely a space to watch. If you rely on a vendor platform, you will eventually incur massive amounts of spend for either data ingest or running analytics, and at that point your business will actually be disincentivized from doing data analytics at all. Price caps will be imposed by CFOs, especially if the value from analytics efforts isn't clear. Someone from a bank once told me about the six months they'd spent ingesting and trawling through NetFlow to try to find something bad. That was about a £600,000 cost for absolutely no return they were able to show. So the things that seem interesting can sometimes end up not being interesting at all.

This means that most of us, when we go out to the market and buy AI solutions, are here. And what's on offer here is a set of generally applicable but very narrow use cases, which may or may not solve the problem that matters most for our business. The exception is if you work with a startup who will build custom analytics at your request, but that's really more like data science as a service. So while there's always vendor lock-in in that last scenario, building your own is just as easily its own lock-in, and you also have to deal with the support costs. So maybe think twice.

Right. We can now take a deep breath and reset our minds. That is our analysis of the marketplace done. Let us now turn our focus back to the world of cybersecurity and a framework for technology selection by the scientific method. A caveat: this talk looks through the lens of an archetypal security operations team responsible for detecting and responding to evil, or possible evil, or the alerts spewed out by the many poorly configured technologies with questionable coverage and operational consistency, a.k.a. the security Frankenstack, which everyone who does this work has come across at least once in their career. Bonus punchline for this section of the talk: the problems that cybersecurity vendors tell us ML will solve are in fact the symptoms of weak strategic and operational patterns in a discipline that is multi-dimensional in scope, cuts across the activities of many other business functions it needs help from and impacts, and has poor-to-slow feedback loops, mostly learning from mistakes both reactively and proactively. Cybersecurity is, I think, what is known as a wicked problem.
It has a huge number of moving parts with lots of interdependencies, and while there are lots of generic best practices for how to arrange those parts, there is no handbook for tailoring them to the reality of our business. A former boss of mine drew this picture, which I then made far messier, but effectively what it shows is that the loop within each control area is a feedback cycle of data, and those feedback cycles also need to pass data to, and receive data from, other controls. Whenever a vendor comes to offer you a solution, it's worth remembering that this picture, or something like it, is what their technology needs to fit into.

So the last part of this talk divides into two sections. First we're going to look at how vendors sell in the detect space and the problems they have in meeting their promises, and second we're going to look at how we as buyers can avoid spending a lot only to realize low returns on our investments. Off we go. There are three flavors of vendor you are likely to find in the threat detection space: anomaly detection, attack step detection and time-series risk scoring. They will align to one of three main deployment models, and in order of least pain to deploy these are: software that plugs in on top of something you've already got, a platform you need to feed with data, and a sensor you need to install. The USPs, or as I like to think of them, undistinguishable selling points, that every vendor's marketing material talks about are as follows. The prize for the best oxymoron I found among the copy goes to "find the unfindable".

But architecture aside, are vendors drilling into the right pain points, as the industry lingo goes? Mostly yes, but with caveats. You have too many alerts? Yes we do, but it's due to the poor configuration and operational process around our tools. You should see signature-based approaches as legacy? Well, yes, sometimes, but not always. You need more context around the alerts? Yes we do, but that context often comes from data that is unobtainable to your solution, like the diaries of our execs that tell us it isn't impossible travel, it's just that they've flown off to do new business. You need more visibility? Well, yes, obviously we need more visibility, but have you ever tried actually achieving it? And you need to automate at scale? Yes, absolutely we need to automate, but the idea of automating actions in production is, I think for most of us, a laughable distance away. If you've ever felt the impact of changing something in production that caused an outage, imagine a security vendor's AI making those kinds of decisions for us; I think that would be reckless at best, based on the false positive rates most of us currently experience.

Which is to say: just because vendors have correctly identified the problems, that does not mean they have found good ways to solve them. If you want evidence of this, try asking the vendor these questions: try to get them to articulate what environments their solution is relevant, effective and efficient within, and try to get them to demonstrate that they can deliver a strong cost-to-value ratio even after several years of being in production. It simply isn't that easy.
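On the context point above: the detection itself is usually trivial to write; the context that resolves it is what the vendor's algorithm can't see. Here is a minimal sketch of an impossible-travel style check with a made-up login pair and a made-up travel calendar. None of this is any vendor's actual logic; it only illustrates why the exec diary matters.

```python
# Minimal sketch: a naive "faster than ground travel" check fires on an exec
# who simply got on a plane. The logins and the travel calendar are invented.

from math import radians, sin, cos, asin, sqrt
from datetime import datetime

def km_between(a, b):
    """Great-circle distance between two (lat, lon) points in km (haversine)."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

logins = [  # (user, timestamp, (lat, lon))
    ("ceo", datetime(2019, 8, 9, 8, 0), (51.50, -0.12)),    # London office
    ("ceo", datetime(2019, 8, 9, 17, 0), (40.71, -74.00)),  # New York, nine hours later
]

# The context the detection tool never sees.
travel_calendar = {("ceo", "2019-08-09"): "LHR -> JFK for client meeting"}

(_, t1, p1), (_, t2, p2) = logins
speed_kmh = km_between(p1, p2) / ((t2 - t1).total_seconds() / 3600)
key = ("ceo", t2.strftime("%Y-%m-%d"))

if speed_kmh > 500:  # faster than any ground travel, so "suspicious" on its own
    if key in travel_calendar:
        print(f"~{speed_kmh:.0f} km/h between logins, but the diary says: {travel_calendar[key]}. Suppress.")
    else:
        print(f"~{speed_kmh:.0f} km/h between logins with no travel record. Raise an alert.")
```

Without that calendar, this pages someone every time an exec flies; with it, the alert quietly goes away. The hard part is that the calendar usually lives somewhere the vendor's platform will never be allowed to reach.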
So, to be fair, this isn't all their fault. They have to meet aggressive revenue targets, which are there to achieve ambitious valuations, and that means casting as wide a net as possible for prospects, and hence generalized use cases that can show value in 30 days, if you're out with that kind of salesperson. Second, they are trying to solve a problem that is unbounded in nature; things change an awful lot, and as we saw earlier from MMC, this is the worst place to try to apply ML and AI: the areas where they are least suited to deliver value. This slide from a fantastic talk by Google provides more detail on why. The TL;DR is that where there is significant change in how a population acts, in this case how threat actors respond to defender actions, is probably the worst place to try to use ML. Finally, even if the tech works, the problems at the top of our long list of things to solve in data analytics are largely outside the vendor's control. So even if they do a fantastic job, we've still got the job of taking that insight, communicating it in a way people trust, and delivering actions people can take that deliver value.

The TL;DR on this is that there are lots of undesirable consequences in all these classes of technology which we have to try to avoid, and unfortunately, as a result, the total cost of ownership life cycle for buyers at the moment looks more like this. Please tell me if you disagree, based on your experience of trying to deploy ML solutions. That's not to say there aren't some very good vendors out there; however, they are in the minority. If a vendor's marketing sounds like this, and this is legitimately from a website, I'm just going to suggest you probably want to avoid them. I'm still not sure if what this vendor was trying to tell me was that the demo would prove their technology sounded like magic, or communicate something more valuable.

So with that background, how can buyers separate the wheat from the chaff? A health warning about this approach: it does increase upstream cost and effort significantly, albeit with the goal of achieving downstream advantages. So that you can evaluate whether it's worth it, let's take a scenario. Here it is: a 90-day, sorry, 60-day time frame to establish whether technology Blah is going to help us solve a problem. The POC is an additional project for our team, which adds workload to BAU, and we're already stretched. What to do? Here are some typical choices we may have in front of us. Our team decided that option E was the one we were going to go with, because the other options stack the odds in favor of the sell side, who have one of two objectives: to get us to swap out something we already have, or to get us to add something into our already bursting-at-the-seams stack of technologies that don't quite work right. This means they have little to no incentive to be clear about the possible effort, cost and time of all the things that determine total cost of ownership. Frictionless, lightweight, uses no memory? Well, in that case it probably doesn't run. And if they did tell the truth, the picture they would hang on the wall might risk the sale. So in our decision landscape of remove, swap out and add, it's a little more complicated: we are trying to answer this question across the different dimensions that concern us, namely business, tech, threat, and the ecosystem of how all of them combine.
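Since total cost of ownership is doing a lot of work in that decision, here is a minimal sketch of the kind of multi-year arithmetic worth doing before any POC. Every number below is invented for illustration; the point is only that the licence is rarely the biggest line item, for buying or for building.

```python
# Minimal TCO sketch with invented numbers. Swap in your own figures; the
# structure of the sum is the point, not the values.

YEARS = 3
FTE_COST = 130_000  # assumed fully loaded annual cost per analyst/engineer

options = {
    "buy platform": {
        "licence": 250_000,             # per year
        "data_ingest": 120_000,         # per year, grows with log volume
        "deployment_one_off": 180_000,  # integration, pipelines, change control
        "tuning_fte": 1.0,              # analysts tuning detections
        "run_fte": 0.5,                 # platform care and feeding
    },
    "build in-house": {
        "licence": 0,
        "data_ingest": 80_000,
        "deployment_one_off": 450_000,  # pipeline and platform engineering before any analytics
        "tuning_fte": 2.0,
        "run_fte": 1.5,
    },
}

def tco(o):
    """One-off costs plus the recurring yearly costs over the planning horizon."""
    yearly = o["licence"] + o["data_ingest"] + (o["tuning_fte"] + o["run_fte"]) * FTE_COST
    return o["deployment_one_off"] + YEARS * yearly

for name, o in options.items():
    print(f"{name:>14}: ~£{tco(o):,.0f} over {YEARS} years")
```

Whatever figures you plug in, running this before the POC is what stops "frictionless and lightweight" from becoming the only cost model anyone wrote down.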
It should come as no surprise that the dearth of information available to help us with this from vendors is, for the most part, just as bad as in every other vertical of cybersecurity, which is why it's my pleasure to announce to you today: technology selection by the scientific method. This has two components: first, canary questions; second, testing claims in production. Let's look at the canary questions first. These are designed to do a few things: judge the maturity of vendor processes for providing helpful information which, you would think, they should have in their back pocket; find out how transparent they are about strengths and weaknesses; get a feel for how they think about and manage their roadmap; and assess whether a presales rep can actually use the technology they're selling in reasonable real-world scenarios. What we're measuring here is very simple: the number of hours it takes them to get us this information.

So let's look at an example regarding functional requirements; we're in the buy phase. Note that for this to be effective, it is very important that you understand all the relevant dimensions of the technology space the vendor plays in. It's also very important to tell the vendors you're working with that this exercise is about understanding their strengths and weaknesses, and that you don't mind if they don't do certain things; it's okay to score a one or a zero in certain areas. This is an example of 10 lines of an approximately 180-line criteria assessment that we ran for EDR, endpoint detection and response. The vendors who came across this were, I would say, impressed by the level of detail, and the only reason I was able to get that level of detail was because I hired someone who had worked as a consultant for an EDR vendor and brought them in to take the knowledge in their brain and put it into a big spreadsheet, so that I could run a proper assessment against all the domains I needed someone to know about. This kind of information is currently locked in the brains of experts, and as an industry I would argue we have to find a way to create and share this kind of documentation so that others can benefit from it, because if we don't, we are at the mercy of, typically, the vendor's spreadsheet for what we should be considering. If you're interested in working on open-sourcing this kind of material, please come talk to me afterwards, because this is one of the key efforts in documenting our industry knowledge so that we can level the playing field of expertise, which we need to be better able to share. A few guide points for developing this kind of assessment for a given problem: sweat the detail, aim to capture everything, and hire an expert to build and run it. You might be thinking, John, that sounds like an awful lot of effort, either for the vendor or for me. Yes. But for the vendor, if we select them, they're going to make a whole load of money over multiple years. And in terms of impact on our team, yes, there will be impact; we're going to put in a lot more effort. But personally, I would much rather go through this than have my team dealing with a technology that isn't a good fit six months down the line.
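If you want to build a sheet like that yourself, the scoring mechanics are trivial; the hard part is the expertise that fills in the rows. Here is a minimal sketch of the arithmetic. The criteria, weights and scores below are invented examples, not rows from the real 180-line EDR assessment.

```python
# Minimal sketch of a weighted vendor-criteria scorecard. The criteria, weights
# and scores are invented; the real value is the domain expertise that fills
# the sheet in, not the arithmetic.

criteria = [
    # (domain, requirement, weight 1-5)
    ("telemetry",  "Process creation events captured with full command line", 5),
    ("telemetry",  "Offline endpoints queue events locally and forward later", 4),
    ("response",   "Remote host isolation that preserves our management channel", 5),
    ("response",   "File retrieval from endpoint for forensics", 3),
    ("operations", "Role-based access control maps to our team structure", 2),
]

# Vendor scores per criterion: 0 = absent, 1 = partial, 2 = meets requirement.
# It is fine, and useful to know, when a vendor scores 0 on something.
scores = {
    "vendor_a": [2, 1, 2, 2, 0],
    "vendor_b": [2, 0, 1, 2, 2],
}

def weighted_total(vals):
    return sum(w * s for (_, _, w), s in zip(criteria, vals))

max_total = sum(w * 2 for _, _, w in criteria)
for vendor, vals in scores.items():
    gaps = [req for (_, req, _), s in zip(criteria, vals) if s == 0]
    print(f"{vendor}: {weighted_total(vals)}/{max_total} weighted; gaps: {gaps or 'none'}")
```

The useful output is not so much the total as the explicit list of gaps, which is exactly the "a score of zero is okay" conversation described above.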
A second example of canary questions: this time we're after answers about success criteria for tuning algorithms to detect evil. Here is an email thread between our team and a nameless vendor, from when we were evaluating them, that shows how difficult it can be to get good metrics defined. I'm not sharing this to bash anyone. I'm simply saying that if vendors cannot provide us with good metrics to judge the efficacy of their technology, and they are top-right-hand corner, then what the hell is going on? Because if they're not gathering these metrics from their customers, how are they understanding how to improve the solutions we rely on to protect the businesses we're entrusted with securing? Personally, I find that situation unacceptable.

So here we had asked a vendor for their input weeks before, and there was radio silence. We'd sent them a long list of suggestions about the metrics we wanted. I won't bore you with the non-replies that came back, but the upshot was that we sent them another email that said: look, if you don't like our metrics, we've taken the time to write you a detailed email, so please change it and tell us what we've got wrong. Rather than do that, they decided we should switch the conversation and do testing under a simulated attack scenario. I completely agree with that; you should always red team your scenarios in production. But that will not answer the question of noise in detections during what I would call peacetime, when we'll sometimes tolerate high levels of false positives for certain detections if the detection is of high enough fidelity. Sure, comparing hosts in steady state versus during a red team attack is definitely of interest, but on the flip side, what about what's not being alerted on in steady state, and how do I use that to understand patterns of life across our hosts and network if I'm unable to get that data? As we pointed out to them, TCO is made up of a lot more variables than just when you're under attack, and we wanted to understand the ratio of signal to noise before we made a business case to our execs. Sadly, no further help came our way. We tested the product, and there were several misses. We shared this data with their data science team, and they were what I would call defensive about the results. They said our test was invalid because their product focuses on advanced APT attack step detection, and that was not the kind of test we ran. That's perfectly okay; if that's where their product plays, I have no problem with it. I just wish they'd told me before I ran the POC; it could have saved us all a lot of time.

A final example of canary questions: asking the vendor, prior to the demo, to walk through a few scenarios that reflect day-to-day operational needs. Here are some examples we used for our EDR assessment. Several vendors sent people to our offices to work through them. One of them couldn't actually demonstrate how their own product worked against our minimum requirements. Bear in mind we'd sent them these two weeks before; this isn't some kind of weird test, right? Here's what we want you to show us, here are the specifics of the scenario, we'll put aside two hours for you, so get together all the right people to showcase it. This guy, unfortunately, and he must have felt pretty bad, spent half an hour searching for the how-to PDFs on their website in front of our team, which he couldn't find. I'm just saying, top-right-hand-corner people: you've got to do better.

Okay, now for the fun part. Let's say your vendor satisfactorily answers your canary questions, which is few and far between. It's then time to put their technology through its paces in a test to destruction.
That means we are not just looking to verify that it does what we need; we are looking to understand if and where it enters a failure mode, and why. To do this, we're going to use attack paths to structure the test of our candidate technology. If you haven't come across attack paths, and I'm sure most of you have, it's well worth reading the section of the Verizon DBIR that introduced attack paths and attack path analysis. It's really exciting to see this added to the DBIR, and I can't wait to see where they take it next year. This, to me, is a key part of what we need to start industrializing in our businesses and in the security industry. Here is one way to think about attack paths: from a threat actor's perspective, they are a series of actions taken to pursue an objective, and for us they are a series of opportunities to detect, monitor and respond to those actions. This is our red team's representation of how we enumerate attack paths, the way we record the various steps in a red team, and the way we use that data to analyze control strengths and weaknesses across our environment. And this is a sample picture of what it looks like once you've actually graphed out a red team and the corresponding red and blue elements you can begin to draw on.

Testing attack paths is ideally about triangulation, i.e. we want to measure control efficacy along multiple paths. But if we don't get the opportunity to do that and can only test across one path, what we want is to take what the data shows us about our solutions across identify, protect, detect, respond and recover, and extrapolate those answers out across all our controls, so that we can use a single attack path to understand, and get much more value out of, how our various controls are functioning across our business. Threat actors can take any fork in the road they want as they progress along an attack path. But as I say, if we can only test one, let's make sure we can take those single actions and understand what will have the biggest impact at the least cost, to give ourselves the greatest uplift in our ability to protect our business. That is ultimately where money and resources get put, and where we need to make the case with data to our operational and executive teams, showing them the reasons for what we're asking, as opposed to either doing the crazy-man rant of "we must do this or it's going to be absolutely terrible", or going with assertions and assumptions that can easily get shot down.

So here's a high-level guide on how to design a test to do this. We're going to walk through all the bullets on this slide now with some helpful visuals, so don't worry about reading it. Here are some examples of the technology-specific output you're looking for; again, I'm not going to dwell on this, it will be available on Twitter in the slides. And here are some questions you're going to want to answer. Again, I think most of these are questions we've all asked and written down in our own businesses; I've just documented them here so that hopefully people can criticize them, share them and update them as we go forward. So here's a little visual guide. I want to give a huge shout-out here to Faros Security; the concepts we drew on were only made possible by the work we did with them.
If you're not speaking to them, their model for optimizing a security program is phenomenal, and I would recommend giving them a call to come talk to you; it is very, very good. So first, we want to select an attack surface. There are multiple. Threat actors will not have access to all of them, but depending on where they start, they need to cross different boundaries in your enterprise. Once we have selected our attack surfaces, we want to create a credible sequence of threat actor activity that mirrors a credible attack path for our business, hopefully based on incident data or data that we have. Here's a handy translation guide from CrowdStrike to keep this in plain English. And now the fun part. What you're going to do is ask the vendor you're POCing to map their detection models to the stages of your scenario. Then, wherever there's a mapping, you're going to ask them for a plain-English description of what you expect their algorithm to detect. With that information, you then go to your red team handbook of tactics you can mimic and select a set of tests to run at a specific level of threat sophistication. Note: always start these tests from the most basic level of sophistication; more on why in a minute. Side note: if you want a more comprehensive list, you can of course go to MITRE ATT&CK. MITRE isn't right for everyone, though, and there's no dimension in MITRE that necessarily captures the skill set needed to execute a technique, the deployability of detection against a vendor's mapping to ATT&CK, or the level of sophistication the techniques calibrate to.

With our techniques selected, we map the red team techniques to the algorithms, so we know where those algorithms should detect the technique we're going to use and where that detection should fall within the attack path. Then we can start simulating our attacks. Now hopefully, as we move from external to internal to internal privileged, we're going to be all green. And if we are, that's great news: we've passed the most basic level, and we now know we can prevent, detect and respond to TTPs all the way through that test. We can also use the Cyber Defense Matrix to begin to understand at what phase our controls are functioning. If you haven't come across it, this is Sounil Yu's brainchild, and it's a fantastic tool with lots of different ways you can spin it to begin understanding your security program in better depth. Then we can move on to the next level of our test.

Ah, what's happened? Well, we've encountered failure modes. For example, here we're seeing that there were failures on detections on the internal side once we got there, and then effectively all detections were missed once we got to internal privileged. Why is that? Even if we have caught detections early in the attack path life cycle, I still want to understand whether something is failing later on, because I want to know: if I miss something, if we don't see that alert, if for any reason it gets drowned out by other alerts, where do I need to look later in the attack path, again calibrated to that specific level of threat sophistication, so that I can ensure the attention of my security operations team is tuned to the right place?
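One way to keep those results usable is to record, for every step on the path, the attack surface it sits on, the technique the red team ran, the vendor's plain-English detection claim, and what actually fired. Here is a minimal sketch of that record; the steps and outcomes below are invented for illustration, and the structure is the point rather than the specific techniques.

```python
# Minimal sketch of recording an attack-path test: each step keeps the vendor's
# claimed detection next to what actually happened when the technique was run.
# Steps and outcomes are invented for illustration.

from dataclasses import dataclass

@dataclass
class Step:
    surface: str        # external / internal / internal-privileged
    technique: str      # what the red team actually ran
    vendor_claim: str   # plain-English description of the expected detection
    detected: bool      # whether the tool actually fired

path = [
    Step("external", "Phishing mail with macro document",
         "Anomalous child process of an Office application", True),
    Step("internal", "Credential dump from LSASS",
         "Suspicious access to LSASS memory", True),
    Step("internal", "Lateral movement over SMB admin share",
         "Unusual internal SMB connection pattern", False),
    Step("internal-privileged", "Staged exfiltration to cloud storage",
         "Unusual outbound data volume", False),
]

# Roll results up by attack surface to see where along the path coverage fell over.
for surface in ("external", "internal", "internal-privileged"):
    steps = [s for s in path if s.surface == surface]
    fired = sum(s.detected for s in steps)
    print(f"{surface:>20}: {fired}/{len(steps)} expected detections fired")
    for s in steps:
        if not s.detected:
            print(f"{'':>22}missed: {s.technique} (claimed: {s.vendor_claim})")
```

From a record like this you can roll the results up by attack surface, by identify/protect/detect/respond/recover, or by control, which is what lets a single tested path say something useful about the wider environment.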
So this is a real, live attack path that we graphed during one of these tests; it was to test Darktrace's detection capability, and we worked with Darktrace on it. They flagged some very noisy detections early on, at this basic level. But later, there was a missed detection for a 100-gigabyte data exfil that we ran over eight hours. Why? It turns out that in their logic, after a model has hit a certain number of detections, they pause those detections because they don't want to overwhelm the SOC. And that is a really, really sound product choice. However, what it does mean is that you will then effectively miss detections on those devices. We actually found from this test that we had a significant number of machines paused in our environment. We presented these findings to Darktrace, and we've written a blog with them about the test we ran. For me this is a fantastic example of where we can, in partnership with a vendor, use data generated by real-life attack scenarios to drive improvements in their roadmap. I'm delighted to talk about this, because this is the kind of collaboration, and the kind of presentation, I want to be able to come and give: how we're working with these vendors to help other people understand how they can use data to do this better, and how we can actually make changes in the industry if we have the right data and can present it in a sensible way.

So, back to our attack path graph. We can now start asking all those questions we mentioned earlier. Do we need to make a strategic change? Do we need to make an architectural change? What changes, sorry, what teams need to be involved? What are the costs and efforts? We can consider the candidate technology in that context, ask a set of questions about tech, headcount, business, threat, operating environment and all the other things I mentioned before, and begin to map those to the actions and effort we're going to have to put in. We can then actually begin showing our business, in agile terminology if that's what they use, where we're going to need to put our resources over the next two months, for example, so that they can fund a project to do that. And hopefully, by doing that, we can create a shift in the TCO life cycle that so many of us experience today.

To conclude: why do we, in our team, go to all this effort? If I come across as angry, I'm not; I'm just really disappointed. I used to work on the vendor side, and when I did, I made every effort to be as transparent as possible. Since shifting to the buy side, what I've found is that vendors are making a lot of cash out of information asymmetry. That lack of information means it's very easy for us to spend an awful lot of our business's money on things that don't work. And when that happens, we get that budget taken away. It directly affects us, our team, our headcount, and our ability to improve our own lives and shift the sometimes nonstop dumpster fire of incidents coming at us, so that we don't burn out and we don't run into serious issues. This is the kind of information we are given in product marketing today: "It is the best. Protects against all threat types. Uses all the data that already exists." It's really bad. Ideally, we'd have something like this without having to run these kinds of tests independently. But until we do, I think the good news right now is that we have a set of amazing frameworks we can put together to start driving the change that many of us so badly need.
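A footnote on the pausing behaviour described earlier: the vendor's actual logic isn't public beyond what's described above, but the general failure mode is easy to sketch. If alerts for a device are capped to protect the SOC, the later event that really matters can be silently dropped. The numbers and logic below are invented purely to illustrate that; they are not Darktrace's implementation.

```python
# Invented illustration (not any vendor's real logic): if detections for a device
# are paused after N firings to avoid flooding the SOC, a later high-severity
# event on that same device never produces an alert.

MAX_ALERTS_PER_DEVICE = 3
alert_count = {}

def maybe_alert(device, event, severity):
    """Return what the console would show for this event, given the per-device cap."""
    count = alert_count.get(device, 0)
    if count >= MAX_ALERTS_PER_DEVICE:
        return f"SUPPRESSED on {device}: {event} (severity {severity})"
    alert_count[device] = count + 1
    return f"ALERT on {device}: {event} (severity {severity})"

events = [
    ("laptop-42", "unusual DNS query", 2),
    ("laptop-42", "unusual DNS query", 2),
    ("laptop-42", "new admin tool observed", 3),
    ("laptop-42", "large outbound transfer over 8 hours", 9),  # the one that mattered
]

for device, event, severity in events:
    print(maybe_alert(device, event, severity))
```

A cap like this is a defensible product choice; the buyer's problem is knowing which devices are currently paused, which is exactly what the test above surfaced in our environment.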
Ultimately, a better market depends on all of us being a much, much harder sell, and I hope some of what you've seen here gives you inspiration about ways we can all drive the market to a place of greatness. We're open-sourcing this methodology for how to go about doing these kinds of tests, and we're more than happy to talk to any other team that wants to give it a try about the lessons we've learned, what's been effective and what hasn't. I'm very sorry, I probably don't have time for questions, given I think I've run four minutes over. But hopefully this has been helpful, and thank you so much for putting up with the speed of delivery. Thank you.