So I'm going to be talking about how we can use automated planning for automated red teaming. My name is Andy, I'm a researcher at MITRE. Are you guys familiar with MITRE? I see a lot of head nods and I like that. We're also going to talk about ATT&CK and Caldera, which is a lot of where this talk came from. The way I like to frame this problem is that as a defender it's really hard to assess what we're missing in our networks. It's easy to fall into this castle mentality: our networks are castles and adversaries are never going to break in. But they are going to break in, they're going to find ways around your walls, and as defenders, if we're not assessing our own networks, we're not going to know where those holes might be or where we should actually be looking to improve our defenses. To try to solve this problem we can run offensive assessments. That's a catch-all term, and I'll get more specific in a bit, but the idea is to stress test your network by executing a real attack and seeing what actually happens. Did you detect the attack? Can you see how far they got? Did they make it inside the walls? Did they get to the treasure? What did they actually do? Then you try to understand how you can improve your detection and your prevention. Offensive testing is great: penetration testing, red teaming, adversary emulation, all these things are wonderful. However, they're hard. They cost a lot of money to run. They require a very significant time investment. Results depend on the capabilities of the personnel executing the attack: if you run two red teams at different times with different personnel, you're going to get very different results. Exercises can also be hard to repeat. This is a problem because you're going to want to see how your environment has changed over time, how your security has changed, what defenses have changed.
If you can't repeat your exercises, you're going to have a problem. And designing red team exercises is also a time-consuming, investment-heavy process. Automation can solve a lot of this. If you can automate your offensive tests, you lower the cost to run an exercise: you just need a tool to run it. They're less time intensive: you push a button and it goes and does its thing. You're now dependent on an attacker model rather than on the actual personnel, so things are more consistent. You can repeat the test at the push of a button. And best of all, designs can be saved, so you can again hit a button and run the same test you ran before. Indeed, a lot of people in the community have realized that automation is good. This is a snapshot of just a small slice of the open source projects out there that do automated offensive testing to various degrees. I come from the MITRE side, so that's Caldera at the top; Caldera is open source and you have the link there. There's also Atomic Red Team, which aligns to ATT&CK; Metta, which I think is by Uber; Red Team Automation, I think that's the name, by Endgame; and Infection Monkey, which is by Guardicore. On the right are a number of tools that circle around BloodHound, which does automated attack path generation, essentially plumbing Active Directory to figure out what the trust relationships are, plus a couple of projects that build automation off of that. So anyway, automation is fantastic, and I keep saying offensive testing, but what do I actually mean by that?
I'm going to talk about three different categories. The first is pen testing. That's what we talk about a lot: penetration tests are helpful for determining the state of our vulnerabilities; you test methods of gaining access and look for weaknesses. This tends to be exploit heavy, and from the castle analogy, this is really going up against the walls of a network. Red teaming is a little more expansive: red teams actually simulate the goals of an adversary. This is much more end to end. You gain network access, you move about the network, and you pursue an objective. As opposed to just targeting the walls, now a lot more is in scope: we have the holes in the walls, and I'll even include the catapult off to the side, which might be the red team's custom set of techniques that they like to use. Then over here, this is adversary emulation. This is like red teaming, but you want to simulate the actions of a specific adversary; you could call it threat-based red teaming. Unlike general red teaming this is a lot more constrained: we're going to talk about one specific path and one specific threat actor. I'm going to talk mainly about adversary emulation throughout this talk, because I think adversary emulation is cool.
I'd say that a lot of the concepts in adversary emulation and red teaming generalize to pen testing; I'd argue that some of the ideas from pen testing don't necessarily generalize back to adversary emulation, but that's a different argument. If we want to do successful automated adversary emulation, we need a set of goals. The first is to make it real. You don't want to run a notional, hypothetical adversary emulation exercise; you want to actually use the same techniques, tools, methods, and goals as a real attacker. You also want to do this end to end. You don't want to look for one-offs; you want to actually compose attacks, because that's what real adversaries do: they do A, they do B, they do C. They don't just do A and stop; they go throughout your network. Repeatability is important, as I mentioned in the beginning. And the last one is extensibility. We don't want to come up with a model where everything is hard-coded and it does the same thing over and over again. You want some variability, you want to extend the TTPs in your tool, you want it to grow as needed. And if we really want to realize these goals, we're going to need some form of advanced decision-making to chain things together and figure out what we should be doing. So let me talk about the adversary model I use throughout this talk: the MITRE ATT&CK framework. Out of curiosity, how many people here are familiar with ATT&CK? Yes, awesome, a good number of people. ATT&CK is an awesome framework that describes what adversaries tend to do after they compromise networks. It breaks things down into the adversary's tactical goals, things like persistence, privilege escalation, lateral movement, and exfiltration, and then the techniques for how the adversary achieves those goals. This is more focused on
enumerating adversary behaviors rather than specific attack patterns or IOCs. If you're familiar with the Pyramid of Pain, this is really targeting the top of the pyramid, the TTP-style enumeration of what adversaries do. And in the ATT&CK framework we don't just enumerate techniques as a nice little picture in a matrix: for each technique we have a description, a list of examples of adversary usage, and software that executes that technique. I'm not trying to spend too much time talking about ATT&CK — apparently orange is a bad color here. Some of the nice things about ATT&CK are that it's grounded in real data from cyber incidents: all the techniques we have in ATT&CK are backed up by real, publicly available threat intelligence. And my favorite is probably the third one: ATT&CK decouples the problem of understanding what adversaries really do when they attack networks from the defensive solution you want to deploy. So we can use ATT&CK any way we want; it's just an enumeration from the adversary's perspective, and that gives us a lot of flexibility. Just to give an example, here's a slide I stole from a colleague, where he went through and analyzed a threat report and pulled out some of the strings in a piece of malware. You'll note that all of these things map to ATT&CK (that's the enumeration on the left), but a lot of them are really just normal Windows things: there's ipconfig, net localgroup administrators, net localgroup administrators /domain, netstat, ping, nbtstat is in there as well, dir — all these normally benign things that adversaries use when they actually compromise networks. I bring this up mainly to say that when we talk about adversary emulation, we're not talking about things from a purely exploit-driven perspective. This is more the end-to-end picture, where we're
talking about living off the land as well, and that's going to play an important role later in the talk. Now I'm going to pivot a little and talk about automated planning; I think I have to, it's in the title. Automated planning is old school, and it's relatively simple to describe: given the state of the world, an end goal, and a set of actions, how do you compose those actions together to achieve your end goal? Here's a totally contrived example: I wake up and I want to eat breakfast, so I'm going to get my bowl, get my milk, get my cereal, then eat my breakfast. I'm not going to go get the leash and walk the dog if I want to eat breakfast; if I try that, it doesn't work. As I said, automated planning is old school. One of the biggest and most well-known solvers in the planning field is STRIPS, the Stanford Research Institute Problem Solver, and that's from 1971 — again, old school. These are relatively straightforward representations: you have a set of atoms, which are boolean variables, a set of actions, an initial situation, and a goal situation, and the challenge is to figure out what sequence of actions, starting from the initial situation, achieves the goal situation. The thing about the STRIPS formulation is that for each action we define the name; the preconditions, the atoms that must be true for us to execute that action; the add list, the atoms that will be true after we execute the action; and the delete list, the atoms that will be false after we execute it. Jumping ahead, here's a simple example that a lot of people in the planning community talk about: the blocks world. The colors aren't showing well, but the basic idea is that we have a bunch of blocks on a table — B is on A, C is on spot 2, A is on spot 3 — and we want to achieve
this end state where A is on 1, B is on 2, and C is on 3. It's relatively straightforward: we have an action that moves one block onto another, we have some logical description of how we can move blocks between locations, and a plan is just the sequence of moves that gets you to the goal. At its core, planning is really just a pathfinding problem over a graph: the states are nodes and the actions are the edges. That's a horrible oversimplification, but it's a big part of what planning is — figuring out how you navigate that graph — and there's a lot of research on doing this more efficiently. Three buckets down here: heuristics, which analyze a node in the graph to decide whether you should actually keep exploring that branch; landmarks, which find those landmark actions you know you'll need to execute at some point, so you can throw them into your plan; and helpful actions, which help guide you along the way. There's a ton more here; I have a reference down there that is probably illegible. So classical planning is fun, but it doesn't handle uncertainty at all, and here the planning field has really exploded — there's lots and lots of research, and this is only a snapshot. At one level you have conformant planning, which goes a bit beyond classical planning: your initial state is unknown, and instead you're given a set of possible initial states. Contingent planning is a more advanced form of dealing with uncertainty: now you have actions that can have non-deterministic outcomes, like probabilistic planning. Even more involved are partially observable Markov decision processes, POMDPs: here actions yield observations that inform our belief about the resultant state, and we always maintain a belief distribution over which state we think we're in.
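To make the STRIPS formulation and the graph-search view concrete, here's a minimal sketch in Python. This is my own toy illustration, not something from the talk or from Caldera: actions carry preconditions, an add list, and a delete list, and planning is just breadth-first search over the state graph, using the breakfast example from earlier.

```python
from collections import deque

# A STRIPS-style action: preconditions, add list, and delete list,
# each a set of ground atoms represented as plain strings.
class Action:
    def __init__(self, name, pre, add, delete=()):
        self.name = name
        self.pre = frozenset(pre)
        self.add = frozenset(add)
        self.delete = frozenset(delete)

def plan(initial, goal, actions):
    """Breadth-first search over the state graph: states are nodes,
    actions are edges; returns a shortest action sequence or None."""
    start = frozenset(initial)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if frozenset(goal) <= state:
            return steps
        for a in actions:
            if a.pre <= state:                    # preconditions hold
                nxt = (state - a.delete) | a.add  # apply delete/add lists
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [a.name]))
    return None  # goal unreachable

actions = [
    Action("get_bowl", pre=[], add=["have_bowl"]),
    Action("get_milk", pre=[], add=["have_milk"]),
    Action("get_cereal", pre=[], add=["have_cereal"]),
    Action("get_leash", pre=[], add=["have_leash"]),
    Action("walk_dog", pre=["have_leash"], add=["dog_walked"]),
    Action("eat_breakfast",
           pre=["have_bowl", "have_milk", "have_cereal"], add=["fed"]),
]

print(plan(initial=[], goal=["fed"], actions=actions))
# A shortest plan: the three "get" actions, then eat_breakfast;
# the leash and dog-walking actions never make it into the plan.
```

Blind breadth-first search like this is exactly what the heuristics, landmarks, and helpful-actions research tries to improve on: the state graph blows up quickly, and you want to prune branches rather than expand all of them.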
The last one I'll put down there is offline planning versus replanning and continuous planning. In part of the planning field you come up with a complete plan up front and then just go execute A, B, C, D, and E and keep going. In many applications you find you're not going to come up with a full plan from the get-go; you're actually going to execute a few actions, reevaluate, execute some more actions, then reevaluate some more. And then I lose my slides — green's a good color, so that's illegible — but planning in security has an old-school history too. It really traces back to attack graphs. You can't see the graph, but the basic idea is that we can use planning to figure out how to chain vulnerabilities together and move about your network. It's relatively straightforward and it's older: there are a lot of logical models, a lot of research from the late 90s through the early-to-mid 2000s on how you can construct attack graphs and how you can optimize them. They tend to be exploit heavy, and there's a lot of utility there, even if it's hard to see when you're just looking at that slide. A more interesting approach to using planning in security is automated pen testing using POMDPs. There's a line of papers that takes this approach, and it's pretty interesting. The reason is that assuming an attacker has full network knowledge is unreasonable: if I'm an attacker, or even a red teamer, going at a network, I'm not always going to know exactly what the network map looks like, what vulnerabilities exist, who's admin where, or the whole AD structure. So instead they take this POMDP approach, which is relatively straightforward,
where they have sensing actions and acting actions: a sensing action is something like running nmap and scanning for vulnerabilities, and an acting action is something like launching an exploit. Relatively straightforward, but this has resulted in much more robust solutions that are actually able to work in the presence of uncertainty, where attack graph approaches weren't able to shine. You probably can't read this either; it's a taxonomy from a researcher in the planning community who's been working on using planning for pen testing, and he organizes it by what the action model is. Do you even have actions, or are you just drawing a network graph? Do you have monotonic actions, where each action only increases your knowledge base (delete-free), or general actions, which is the normal planning setting? Then you balance that against uncertainty: no uncertainty; uncertainty in your action outcomes, whether or not something might succeed; and uncertainty regarding your state distribution, what state you're even in. Depending on the specific model you're going for, there are approaches in the literature using just a graph, or classical planning, or an MDP, or a POMDP. It's a great paper and I'd highly recommend it if you're interested in this area; it's called, if you can't read it, "Simulated Penetration Testing: From Dijkstra to Turing Test++", and Jörg Hoffmann is the author. All right, so I'm going to pivot back now, talking less about planning and more about adversary emulation. Here's an example to motivate it. Let's say you've got host 1 and you have a foothold there, you've seen host 2, and you want to copy a file from host 1 over to host 2. You might ask, what do I need to do to copy a RAT file over? You need a working RAT on the
source host, host 1; a file share on the target mounted from the source; and write access to that file share. These look a lot like preconditions. Then you can ask what happens after copying a RAT file over: there will be a new file on the target host, and that file will contain the RAT. Those are the consequences, or post-conditions. What you can do is say: if I want to make a plan to copy a file, I need to get that file onto the target. To run that copy-file action I need a mounted share, so I'm going to run the mount-share command and get the mounted share. But for that I need credentials, so I'm going to dump credentials and get those credentials. As long as I start with an elevated RAT, I can go about and do all of this, and that looks a lot like planning — and I get a fun graphic too. To give you a few examples: suppose you want to go from host 1 to host 2 and exfiltrate data from host 2. One potential plan is to dump credentials, mount the share, copy a file, remotely execute that file, and exfiltrate the data. Another plan might be exploiting a vulnerability and then exfiltrating the data. Another might be dumping credentials, using RDP, and then exfiltrating the data. Relatively simple examples, but again, you can see how we chain these actions together. One of the questions we might want to ask is: suppose we're doing this chaining and coming up with these plans, how do I select the right plan? If you have an explicit goal you've enumerated beforehand, that's straightforward: you enumerate the plans and go toward the goal. But if you're doing adversary emulation and trying to exhibit these repeated behaviors, what's the right plan to execute? And even if you have a goal and multiple plans that reach it, how do you choose the right one?
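The copy-file example above can be written down as a small action model. Here's a hedged sketch in Python — the fact and action names are my own illustration, not Caldera's actual encoding — where a toy backward-chainer works from the goal (a RAT file on host 2) back through mounting the share and dumping credentials:

```python
# Toy action model for the talk's copy-file example. Facts and action names
# are illustrative stand-ins, not Caldera's real model.
ACTIONS = {
    "dump_creds":  {"pre": {"elevated_rat(host1)"},
                    "post": {"creds(admin)"}},
    "mount_share": {"pre": {"creds(admin)"},
                    "post": {"share_mounted(host1,host2)"}},
    "copy_file":   {"pre": {"rat_on(host1)", "share_mounted(host1,host2)"},
                    "post": {"rat_file_on(host2)"}},
}

def chain(goal, facts, plan=None):
    """Naive backward chaining: to achieve a goal fact, find an action whose
    post-conditions supply it, then recursively satisfy that action's
    preconditions first. (No cycle detection -- a toy, not a real planner.)"""
    plan = [] if plan is None else plan
    if goal in facts:
        return plan
    for name, act in ACTIONS.items():
        if goal in act["post"]:
            for pre in sorted(act["pre"]):
                sub = chain(pre, facts, plan)
                if sub is None:
                    return None
                plan = sub
            return plan + [name]
    return None  # no action can produce this fact

facts = {"elevated_rat(host1)", "rat_on(host1)"}
print(chain("rat_file_on(host2)", facts))
# → ['dump_creds', 'mount_share', 'copy_file']
```

Starting from only an elevated RAT on host 1, the chainer recovers exactly the dump-credentials, mount-share, copy-file sequence described above.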
The line of research we've taken is to assign each plan a score based on a heuristic function — basically a decreasing weighted sum over each action. It's very straightforward: we assign each action an individual reward, then sum the rewards with weights that decrease the further into the plan an action executes, and we execute the best-scoring plan. I'm not going through the example in detail, but if you follow what's in the lower right-hand corner, plan number three is the best plan. So that seems really easy — and it's not that easy. There's a lot of uncertainty that comes up when you're doing adversary emulation. To walk through that, let's consider two plans: the first is the dump-credentials plan, and the second is exploiting a vulnerability. The uncertainty for exploiting a vulnerability is relatively straightforward. First, is the target susceptible to the exploit? That's binary, yes or no, and something you can scan for beforehand. Second, did the exploit technique execute successfully? That's also binary, yes or no, and you can check for it after the fact; it's relatively straightforward how you do that. This is straightforward, and it still leads to very interesting scenarios for planners, but it's straightforward from our perspective. Now consider dumping credentials. This is really easy to describe: when I dump credentials, I need a RAT with elevated access — that's straightforward — and after dumping credentials, I'm going to get credentials for all accounts with active sessions on the host I just dumped on. That's also straightforward. However, in practice, while the description is easy, what we're going to get back is messy: the post-conditions for this action are complex.
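A decreasing weighted sum like the one just described could look like the following sketch. The individual rewards and the discount factor here are made-up numbers for illustration, not the values from the slide:

```python
# Hypothetical per-action rewards; the talk assigns each action an
# individual reward, and these particular values are invented here.
REWARD = {"dump_creds": 3, "mount_share": 1, "copy_file": 2,
          "remote_exec": 4, "exploit_vuln": 5, "rdp": 2, "exfiltrate": 10}

def plan_score(plan, discount=0.9):
    """Decreasing weighted sum: an action's reward counts for less the
    later in the plan it executes."""
    return sum(REWARD[a] * discount ** i for i, a in enumerate(plan))

# The three candidate plans from the host-1-to-host-2 example.
plans = [
    ["dump_creds", "mount_share", "copy_file", "remote_exec", "exfiltrate"],
    ["exploit_vuln", "exfiltrate"],
    ["dump_creds", "rdp", "exfiltrate"],
]
best = max(plans, key=plan_score)  # the planner executes the best-scoring plan
```

Which plan wins depends entirely on the rewards and the discount you pick; tuning those is how you bias the planner toward, say, short exploit-driven paths versus longer living-off-the-land chains.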
Realistically, dumping credentials might fail entirely; that's one way it can go. I might not get any credentials; that's certainly possible. I might get credentials for accounts I've never even heard of, so I can't do any reasoning over them. I might get credentials for accounts I've heard of but just can't use. And if I'm lucky, I'm going to get credentials that I can actually use. The problem here is that while these scenarios are enumerable to an extent, they lead to non-deterministic outcomes whose difficulty explodes as we try to chain forward and plan for the future. And now this is totally illegible, but: a lot of the planning problems we see in the literature have bounded uncertainty, where — at the biggest oversimplification possible — they explore all possible states for each unknown, heuristically and with optimizations. This is a snapshot of a paper that uses partially observable contingent planning as an alternative to solving a fully described POMDP, and in their problem description they actually enumerate the possibilities: they say, host zero can either be WinNT Server edition or WinNT Enterprise edition. If I'm enumerating that beforehand, I can use all these techniques; if I don't have that enumeration beforehand, it becomes a lot more difficult. In our case — using ATT&CK as the threat model and trying to do full adversary emulation — the actions have unbounded uncertainty, and it's really not possible to plan over all the states; we just can't enumerate them. Instead, this has driven us to the conclusion that continuous automated offensive testing, adversary emulation, is not just planning; there's
also a big acting component. So one of the things we did was come up with a planning algorithm. It's relatively straightforward: update the world state; figure out which actions are valid to execute right now, which is just checking my preconditions; construct all plans that lead off with those actions, chaining actions together by leveraging the model; run the heuristic over those plans; execute the first action in the best plan; and repeat. Relatively straightforward, but the problem is: how do I construct plans if I can't enumerate all the outcomes of the actions I have in my model? We've taken a few approaches. One of them, from a more research-oriented perspective, we ran in a little simulation environment: we basically guessed what the world looked like and used deterministic techniques to identify plans. Very straightforward, and it worked well in the simulation space, but even there it was computationally sluggish and very heuristic heavy. We have a paper there too, "Intelligent, Automated Red Team Emulation", that walks through some of what we did. We extended that with a small-world extension idea — I call it the light planner — where we would guess small extensions to the unknowns in our environment. If I see a host and I've never probed the admins on it, I might guess that this account is an admin and that account isn't, and we can use a rule-based approach to tune the probabilities of what we guess: say, that there's an admin there I have creds for, or an admin there I don't have creds for. This also works well in simulation, but it's hard to implement in practice, and it's hard to come up with a good rule set that you'd actually want to leverage in a real environment.
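The loop just described — update the state, find valid actions, build and score plans, execute only the first action of the best plan, then replan — can be sketched like this. Everything here (the fact names, the trivial one-step plan builder, the optimistic execute function) is a simplified stand-in of my own, not Caldera's implementation:

```python
def planner_loop(state, actions, build_plans, score, execute, max_steps=20):
    """Act-then-replan: commit to only the first action of the best plan,
    observe the real outcome, and replan from the updated world state."""
    trace = []
    for _ in range(max_steps):
        # Valid actions: preconditions hold and post-conditions aren't
        # already true (so we don't repeat finished work forever).
        valid = [a for a in actions
                 if a["pre"] <= state and not a["post"] <= state]
        plans = build_plans(valid, state)
        if not plans:
            break                        # can't do anything further: stop
        best = max(plans, key=score)
        state = execute(best[0], state)  # act, observe, then replan
        trace.append(best[0]["name"])
    return trace, state

# A tiny demo world: two chained actions.
actions = [
    {"name": "dump_creds",   "pre": {"rat"},   "post": {"creds"}},
    {"name": "lateral_move", "pre": {"creds"}, "post": {"new_host"}},
]

def one_step(valid, state):
    return [[a] for a in valid]          # trivial single-action plans

def length_score(plan):
    return sum(0.9 ** i for i in range(len(plan)))

def optimistic(action, state):
    return state | action["post"]        # pretend the best case happened

trace, final = planner_loop({"rat"}, actions, one_step,
                            length_score, optimistic)
print(trace)  # → ['dump_creds', 'lateral_move']
```

Because the loop only ever commits to one action before replanning, surprises from an action — like a credential dump returning accounts you've never heard of — get folded into the world state before the next planning pass instead of invalidating a long pre-computed plan.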
What we ended up doing is something I'm calling optimistic best guess, and it's basically just guessing. It's a simple approach: for each action we have in our model, we encode specific hints about what the best-case scenario is. If I'm dumping credentials, my hint is that I'm going to get credentials for an account that's an admin somewhere. This works reasonably well in practice. There's a lot of room for improvement here, and it's something we're working on — I want to invite everybody to work on it, Caldera's open source — but it works well enough in practice. So let me talk a little bit about Caldera. It's written in Python 3, the RAT is in C#, we use MongoDB, all sorts of fun stuff. There are three main components. We have an admin web UI that you can use to control operations. We have the server, which controls everything: the human interfaces with the HTTP server, and the server has a database of all the facts that are true, an execution engine that handles how you actually do things, and the planner, which leverages the attacker model and the world state to chain forward, using the algorithms I've been talking about. For each host you're testing in the environment, you need a shim agent; this just facilitates communication between the HTTP server and the host. There's also a RAT component that actually moves around the network — so there is a real RAT, really executing things; the agent is there to delineate which hosts are in scope or out of scope. Conducting an assessment is very straightforward and easy, and this is one of the nice things about using planning: you load the Caldera shim onto the network hosts, you create an
adversary, you identify which hosts are actually in scope for your assessment, and then you just launch the operation. This is really nice because it's very low overhead for an operator or someone wanting to run an exercise. During the operation, Caldera will report its activities and any artifacts it creates — anything it does — and it will automatically stop if it can't do anything further. After the operation, we get a report of everything it did, which you can go through in detail, and then it will automatically reset the infected hosts, removing any artifacts it created; you can control that. This stops it from dropping RATs all over your network, modifying registry keys, and doing all sorts of other stuff. I am not going to try to run the demo, because that's not working well, so I'm going to jump ahead to some closing thoughts. Automated adversary emulation is fun, and there are a lot of cool things you can do with it. Testing analytics and sensors: it can help you see whether your defenses actually work. Say I think I'm going to detect credential dumping; I throw an analytic into my Splunk instance, run Caldera, and see whether that analytic fired correctly or not. Data generation is also good. I don't know if we can produce enough varied data at this point to train models off of, but one of the use cases we want to support is generating data for training — for people and for anything else. Reason number three is red and blue team training: you can tell your red teamers, here's Caldera, go learn its TTPs, go follow what it's doing — it can be a nice first introduction to some potential attack paths — and it's also going to teach your blue team, hey, here's what an adversary did. Continuous testing is now straightforward, and easy exercise structuring and
replication is another big use case. Some notes for the future. Automation is a hard problem. We've encoded some straightforward techniques — we're getting to the point where I think we have 30 or 40 different ATT&CK techniques encoded in Caldera — and managing the complexity gets hard: if you want to use every one of those techniques, you have all these different atomic things you can do, and chaining them together becomes more and more difficult. Some techniques are also just hard to execute. RDP is a common one for red teamers, but finding a way to automate RDP sessions is not particularly straightforward, because we don't have good APIs for that. Keylogging is another one, where now we have to do asynchronous operations, because we're just waiting for someone to enter a password. Right now, most of what we do is something we can just go execute and move on from, but we want to get to the point where we can do that kind of interactive operation. A big question is how we can do this more intelligently. We're working on that stuff on our own, but there's a big machine learning story here — again, Caldera is open source. Optimistic best guess is fun to say, but it's just guessing; there are a lot of cool things you could potentially do, a lot of things we've looked into, a lot of things we've heard — there's a huge and awesome story here. And the last point I want to make is that we aren't automating offense in a vacuum: in Caldera we have a RAT, and Symantec now picks it up. If you want to be involved with Caldera or anything we're doing, there are a few options. We have the GitHub repo and we accept pull requests — anything you want to do. My favorite — and I pulled some of these slides out because I was going to do the demo — is that you can actually create your own techniques and put them
into Caldera, aligned to ATT&CK. There were actually some guys who wrote a blog post about how they wrote their own techniques: they put in the logic, they put in the execution, they threw it into Caldera, and it was able to string them together as part of a larger attack path — that was really cool. Anyway, you can get involved there; we have a Slack channel, so please reach out to us if you're interested — we've got people everywhere, and we have emails everywhere too. Then on the AI side, if anyone's interested in automated planning, we were involved in the International Planning Competition: we submitted a Caldera domain that gets rid of some of the uncertainty for the deterministic track, and it standardizes some of the actions in the PDDL format that a lot of automated planners use. Lots of links, lots of context if anyone's interested. I'm Andy. I mentioned ATT&CK; one thing I'll add is that we're running ATT&CKcon, a conference dedicated specifically to ATT&CK. The CFP is open right now and closes on August 15th — submit stuff, it'll be fun: 10-minute talks, 30-minute talks, anything involving ATT&CK, please let us know. We also have something called the Cyber Analytics Repository, which has some analytics, and adversary emulation plans are a new thing we've recently released — an in-depth look at how you can go about emulating a specific adversary. And that's basically it; I'll leave the links up here.