Thanks, everyone, for attending my talk. For those of you who don't know me, I am a PhD candidate at Arizona State University. I am specializing in the use of artificial intelligence and machine learning in the field of cybersecurity. I'm also a security consultant at Bishop Fox. I also authored a book known as software defined virtual network security, dealing with security in software-defined systems. I'm also co-founder of DevilSec, which is the hacking club that we have at ASU. I've worked for BlackBerry and Public Services and Computer Sciences Corporation in the past, and this is my contact information.

So let's dive into the overview of the talk. What is the motivation? What is the overview of our system, ASAP? There are three main modules in the system. Stinger, which is used for the discovery of information, both services and vulnerabilities in the network. Americano, which is used for analysis of the different paths an attacker can take in a network. And Cappuccino, which is our AI-based autonomous attack plan generator. And finally, we do the validation of these attack plans, and we will jump into the demo at the end of the presentation.

So let's see what machine learning is. It's a statistical way of learning from the information present in your network: what kind of network traffic you have, the logs on your system. So when we use pattern recognition to identify attack patterns in a network, we use machine learning techniques for that. And it's already been used successfully in things like spam detection on your email. You don't get those Nigerian prince emails these days because of the amazing job done by spam detection systems. Artificial intelligence, on the other hand, perceives the network traffic and what kind of activities are going on within the environment, and uses that information to take decisions. So it acts on the information provided by different agents in the environment.
You can think of a smart IDS that we can design in a system, which collects information from different parts of the system, updates its beliefs, and then takes some intrusion prevention measure. So that is something where we can take the help of AI in the design.

AI and machine learning have found some useful applications in cyber security, both in industry and research. We use attack patterns for detection, to identify malicious actors within our network. And we try to see if the attacks are very stealthy in nature. There are attacks like advanced persistent threats, which are low-and-slow attacks that are hard to detect. They are carried out over multiple days. A good example is the Sony hack, which was carried out over a period of several months. So identifying valuable patterns from those kinds of attacks is one place where we can definitely utilize machine learning. And there is recent investigation into the use of AI to design deception-based systems. Moving target defense and cyber deception, in general, are two fields of research that explore how to identify the attack patterns in a network and use that information to present a fake view of the network to an attacker and lure him into honeypots, where defenders can do further analysis of his attack intentions.

So why do we need AI and machine learning in cyber security? I did some background research, and it's estimated that there will be 25 billion IoT devices in the US by 2021. And the investment in cyber security will be up to a trillion dollars. With penetration testing, if we look at it, the market size would be 3.2 billion US dollars. So the number of devices is growing, perhaps, at a quadratic scale. But we have a shortage in the cyber security workforce. It's estimated that 65% of organizations feel that their staff is not very well equipped in cyber security.
And 36% of organizations reported that there is a lack of training or skills in the existing cyber security workforce. So that is where we plan to use AI to bridge this gap.

So if you look at a very practical example of the application of artificial intelligence, the DARPA Cyber Grand Challenge was one place where AI was successfully applied. There were seven participating schools, seven or eight, who took part in a hacking competition where each school was trying to target the infrastructure of everybody else while keeping its own infrastructure secure. The important catch was that all of these participants were, not AI exactly, but autonomous systems. There was no human involved in this competition. So Mayhem, which came from a team based out of CMU, automated what white hat hackers could do. They found and exploited the weaknesses present in these systems. What they did was create a mathematical model of the paths that an attacker can take. And then they used two techniques, symbolic execution and fuzzing. Symbolic execution was the way to point out interesting code paths. And fuzzing was hammering through those code paths to exploit the vulnerabilities pointed out by symbolic execution. They won this championship, and they managed to find 14,000 vulnerabilities on a Debian system as well. And 250 of these vulnerabilities were new. So imagine a human attacker trying to do all this; it's difficult. And this shows a successful motivation to use machine learning and AI in the field of cybersecurity.

So if you look at our system, ASAP, there are four main modules. Stinger, where S stands for scanning. We use Stinger for scanning and recon. The information from Stinger is fed into Americano, where A stands for attack analysis. And we use this information from Americano to identify the attack states in the network. Then there is Latte, where L stands for log.
It's a module which analyzes network and host logs to gather threat evidence. And Cappuccino, which is kind of the network controller, takes all this information from Americano and Latte and encodes it in the form of an AI model, a Markov decision process. And based on that model, it identifies some attack plans. If a penetration tester were to test or attack this network, what kind of plan would yield him the maximum output? And eventually, we can execute these attack plans on a cloud or web application and update the risk score and the attack graph, which is Americano. I am addicted to caffeine. That is why I named these modules after different kinds of coffee.

So let's dive deeper into Stinger. Stinger scans the network topology for service information and discovers the vulnerabilities. We have automated the Nessus and OpenVAS APIs to identify this attack information. And this attack information is then fed into Americano, which is an attack graph generation tool.

So let's look at one of the known vulnerabilities. This is the Shellshock vulnerability. And there are different parameters from the Common Vulnerability Scoring System for this particular vulnerability. You need just network access to execute this attack. And it has low access complexity, so you don't need to make a lot of investment as an attacker to exploit this vulnerability. So this is an example of the kind of vulnerability where we can implement some sort of automation once we are able to identify it. And the reason for providing this information is that we will see later how we can use these CVSS parameters, like access complexity and the CVSS score, to encode the information in our AI solver. These are some other parameters: the impact on confidentiality, integrity, and availability is very high. And an attacker can take full control of the system if he were to exploit this particular vulnerability.
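To make the CVSS parameters concrete, here is a minimal sketch of how a Shellshock record could be represented and flagged for automation. The class and function names are hypothetical and not taken from ASAP. The CVE ID and metric values are the published CVSS v2 values for Shellshock (CVE-2014-6271), not numbers quoted in the talk, and the selection rule (network access plus low access complexity) is just one reading of what was described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CvssV2Record:
    """Hypothetical container for the CVSS v2 fields discussed in the talk."""
    cve_id: str
    access_vector: str       # "NETWORK", "ADJACENT_NETWORK", or "LOCAL"
    access_complexity: str   # "LOW", "MEDIUM", or "HIGH"
    confidentiality: str     # "NONE", "PARTIAL", or "COMPLETE"
    integrity: str
    availability: str
    base_score: float        # 0.0 to 10.0


def is_automation_candidate(record: CvssV2Record) -> bool:
    """Assumed rule: remotely reachable and cheap to exploit."""
    return (record.access_vector == "NETWORK"
            and record.access_complexity == "LOW")


# Published CVSS v2 values for Shellshock (CVE-2014-6271).
shellshock = CvssV2Record(
    cve_id="CVE-2014-6271",
    access_vector="NETWORK",
    access_complexity="LOW",
    confidentiality="COMPLETE",
    integrity="COMPLETE",
    availability="COMPLETE",
    base_score=10.0,
)

print(shellshock.cve_id, is_automation_candidate(shellshock))
```

The same two fields, access complexity and base score, are the ones that later feed the transition probabilities and rewards of the Markov decision process.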
So let's take a look at a motivating example where the attacker is located on the internet. His goal is to reach this database server and exfiltrate information out of the database server to his command and control center. And there are some publicly known vulnerabilities on these machines. So the attacker is trying to exfiltrate information from the database server. But there is a firewall in his way, so he cannot directly access this database server. He either needs to go through the web server or wait for an internal user to download some of his malicious code and use that as a pivot to go to the database server. The web server is the only publicly available service in this network. So the attacker can try to exploit the web server using a known vulnerability, or he can place a malicious script on a popular website that the user downloads, and that way he can gain access to the user's workstation. Using this, he can then take advantage of the access control list, which allows any network traffic from the web server to go to the database server, or any workstation traffic to go to the database server. That way, the attacker can exploit the SQL injection vulnerability that is present on the database server and then use it to establish a persistent connection back to his command and control center.

So you will see that there are two attack paths in this small network to achieve the same goal. Now imagine a giant network with tens of thousands of instances, and you are asked to perform a penetration test for that network in a limited period of time. You need some kind of autonomy or automation in that case to be able to have good coverage in your penetration test.

So we saw this example, but what about it? How do we use this information? We can do some initial attack analysis based on this example and see that the attack is multi-stage. The attacker had specific attack vectors for each vulnerability, and he went through multiple hosts.
And he circumvented some of the defenses that were present on the systems to achieve his goal of data exfiltration.

So let's discuss on a philosophical level why AI can be used to hack faster. Imagine you are going home on a particular day and you decide to take a turn at Arizona Avenue and Main Street. You have been taking this route forever to reach your home. But you encounter a traffic jam on the way. You went by your intuition, and this got you into a traffic jam. But if you had a GPS to help you navigate, you could have avoided that jam. Similarly, as penetration testers, when we go after certain vulnerabilities, we have a preset methodology. We will go through authentication issues and authorization issues. We will see if we can abuse the user management in some way. We can try to see if we can get horizontal privilege escalation. We go after data stores. We go after application logic. If it involves code review, we go through the procedure of code review. And we use all of that to see what's the maximum we can get in this penetration test.

But here is the challenge. If you have, say, 20 hours for a particular assessment, in an environment where you have to pen test the application as well as the cloud part of the backend components, it's very challenging to get good coverage. So AI and machine learning can act as a navigator for us on these assessments. We can think of ASAP as an AI-based GPS to navigate the attack surface. It may not work on all kinds of unknown vulnerabilities, like, say, data encryption issues that you identify manually. But it can help us semi-automate some of the tasks that we may miss. The worst thing would be that there was a very low complexity vulnerability present on the system, but you just ran out of time on your pen test, and you couldn't exploit that vulnerability.
And later the client finds out and asks, hey, why did you miss it? Then you are in a tough situation. So that is another motivation to develop this kind of system.

So in Americano, we get the information from Stinger. We pass these vulnerabilities and software configurations to a first-order-logic-based framework. And that framework generates a multi-stage, multi-hop attack graph. An attack graph shows the different paths an attacker can take in a network to reach his desired goal.

So if you look at the definition of an attack graph, we have some nodes and edges, which are properties of a given graph. There are fact nodes, NF. Fact nodes will be something like the existence of a vulnerability or the existence of network connections. Conjunct nodes are denoted by NC. Disjunct nodes are denoted by ND. And the root node, which is the goal of the attack, is denoted by NR. A conjunct node can be something that you achieve based on your initial exploitation of a certain vulnerability. So you have some fact nodes that you combine with the interaction rules that we provide in first-order logic to achieve some conjunct node, like execCode. Suppose there is a buffer overflow vulnerability on the web server, and the attacker can access the web server. If the attacker is located on the internet, then that can lead to execution of code on the web server. And based on that example, the root node, in our case, would be to gain root privilege on the database server.

There are two kinds of edges. Epre denotes a precondition edge and Epost denotes a postcondition edge. A precondition edge combines the fact nodes and a conjunct node to show the next possible state that an attacker can achieve. And Epost means the edges that are triggered if the preconditions are satisfied. And we have some base initial condition nodes in this attack graph that we can denote using NI.
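The attack graph definition above can be sketched as a small data structure. This is only an illustration under assumptions: the class, method, and node names are hypothetical, the enum values reuse the NF, NC, ND, and NR labels from the talk, and the single postcondition edge skips the intermediate steps toward the database server.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    FACT = "NF"       # e.g. a vulnerability or a network connection exists
    CONJUNCT = "NC"   # reached only when all of its preconditions hold
    DISJUNCT = "ND"   # reached when any one incoming path succeeds
    ROOT = "NR"       # the goal of the attack


@dataclass
class AttackGraph:
    nodes: dict = field(default_factory=dict)    # node name -> NodeKind
    e_pre: list = field(default_factory=list)    # (precondition, node) pairs
    e_post: list = field(default_factory=list)   # (node, resulting node) pairs

    def preconditions(self, node):
        return [src for src, dst in self.e_pre if dst == node]

    def enabled(self, node, established):
        """True when every precondition of `node` is already established.

        A node without precondition edges is trivially enabled.
        """
        return all(src in established for src in self.preconditions(node))

    def next_states(self, node):
        return [dst for src, dst in self.e_post if src == node]


graph = AttackGraph()
graph.nodes = {
    "attacker_on_internet": NodeKind.FACT,
    "internet_can_reach_web_server": NodeKind.FACT,
    "buffer_overflow_on_web_server": NodeKind.FACT,
    "exec_code_on_web_server": NodeKind.CONJUNCT,
    "root_on_database_server": NodeKind.ROOT,
}
for fact, kind in graph.nodes.items():
    if kind is NodeKind.FACT:
        graph.e_pre.append((fact, "exec_code_on_web_server"))
# Simplification: intermediate steps toward the database server are omitted.
graph.e_post.append(("exec_code_on_web_server", "root_on_database_server"))

initial = {name for name, kind in graph.nodes.items() if kind is NodeKind.FACT}
print(graph.enabled("exec_code_on_web_server", initial))
print(graph.next_states("exec_code_on_web_server"))
```

Treating a conjunct node as an AND over its precondition edges is what lets a reasoning engine chain single exploits into multi-stage paths.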
So to simplify this, we have some advisories that we identify based on the scanning of the network. We have host configuration information. We have network configuration information. The principals indicate who has ownership on which machine. And we use interaction rules and policies to provide input to this attack-graph-based reasoning engine, which then generates an attack graph for us.

So before going any further, let's look at some of these MulVAL rules. MulVAL is a reasoning system which encodes this information. It's a work from Kansas State University, which we used in our development of the ASAP system. Advisories show what kind of vulnerability exists in the network and the vulnerability's properties. Host configuration shows that the web server has Apache software, it's running on port 80, and this is the daemon. Network configuration is the access control list, which says that from the internet to the web server, there is a TCP connection that can be established on port 80. And principals show that a user has a user account on this PC, and there is another system admin, which is a root-level account on the web server. All this information we can obtain by scanning the network and by obtaining the host configuration information and the network rules. And then it goes through these first-order logic rules.

So this is one of the rules. If there is a vulnerability existing on a host with a vulnerability ID, and the vulnerability has the property that it's a remote exploit, and there is a network service corresponding to this host, and the attacker has network access to this host, and the attacker is malicious, then this will lead to code execution. These are the predicates of this rule. And this is execute code by the attacker on the host. The gain of privilege is the postcondition that is obtained when all these preconditions are satisfied.
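The remote-exploit rule just described can be sketched in plain Python as a conjunction over a set of facts. This is not MulVAL syntax and not its actual rule set: the predicate names, tuple layouts, and the `vuln-1` identifier are hypothetical, chosen only to mirror the spoken description.

```python
def derive_exec_code(facts):
    """Apply the remote-exploit rule once and return the derived facts.

    `facts` is a set of tuples whose first element is a predicate name.
    execCode(attacker, host) is derived only when every precondition holds.
    """
    derived = set()
    for fact in facts:
        if fact[0] != "vulExists":
            continue
        _, host, vuln_id, program = fact
        is_remote = ("vulProperty", vuln_id, "remoteExploit") in facts
        has_service = ("networkService", host, program) in facts
        has_access = ("netAccess", "attacker", host) in facts
        is_malicious = ("malicious", "attacker") in facts
        if is_remote and has_service and has_access and is_malicious:
            derived.add(("execCode", "attacker", host))
    return derived


# Hypothetical fact base for the web server example.
facts = {
    ("vulExists", "webServer", "vuln-1", "httpd"),
    ("vulProperty", "vuln-1", "remoteExploit"),
    ("networkService", "webServer", "httpd"),
    ("netAccess", "attacker", "webServer"),
    ("malicious", "attacker"),
}

print(derive_exec_code(facts))
```

Removing any single fact, for example the network access fact, leaves the rule unsatisfied, which is how a firewall rule change would prune a path from the graph.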
And we also use the policies, which show the user access on different resources in the system, to encode into these interaction rules.

So this will be the logical attack graph of the system that we saw. You can think of the attacker located on the internet as this node 0. Then node 0 interacts with different nodes. These ovals represent the rules, so each of these ovals is an interaction rule. And based on that, the attacker progresses to the next privilege node. We can think of this as a root exploit on, say, the Apache web server. Then the attacker gains some other network-level access, using another postcondition of that attack graph. And eventually he reaches his goal of gaining root access on the database server by exploiting SQL injection.

So the main brain, or AI, in this work is Cappuccino. What Cappuccino does is take the information from the attack graph, the information about different configurations and vulnerabilities from the cve-search database, and the log information from the Latte module to create an MDP graph. The MDP graph can then be used to derive an attack plan that we, as penetration testers, will implement on the network.

So let's see how the states can be extracted from the attack graph. There were fact nodes which show that the attacker was on the internet initially. And the next privilege node that he gained was network access on, say, another machine, FTP. There are two things the attacker can do when he is located on the internet. Either he can take no action or he can exploit this vulnerability. Let's say the CVE ID of this vulnerability is CVE-2013-4124. So these will be the states that we can extract from the attack graph to be used in our Markov decision process.

So let's revisit the attack graph with another example, where there are two paths the attacker can take to exploit this FTP machine. One way he can go about it is that he goes through SSH and then tries to exploit the FTP.
Or another way is that he first exploits this web server and then goes to FTP. So the corresponding attack graph for this network will be that the attacker has SSH access, there is an SSH vulnerability, he exploits SSH, and he gains root on SSH. Or the attacker can go through the exploitation of the web server. He exploits some vulnerability on the web server, and then he reaches the FTP server. So in this graph, there are two paths the attacker can take.

So if we are to obtain a Markov decision process from this, an MDP has some components: state, action, transition, and reward. The states represent the access that an attacker can obtain at any point in the network. He can be a user on SSH. If he exploits the vulnerability, he can be a root user on SSH. If he exploits the web server, he can obtain root on the web server, and eventually he can also obtain root on FTP. There are two actions that we use to simplify this Markov decision process. In each state, the attacker can either choose to take no action or choose to exploit the next vulnerability. And there are probability values associated with these actions. We will explain how we derive meaningful probability values for the MDP. We relate these to the access complexity of the vulnerability. If the access complexity is low, then it's easier to exploit that vulnerability, and there is a higher probability of transitioning to the next state.

And the rewards are the values the attacker obtains by being in a particular state. Say the attacker does not want to exploit that vulnerability. He will have a low reward. The reward is that he is not detected by, say, an intrusion detection system. So that's a reward for him, but it's not a very big reward. By comparison, if he is able to obtain a root account on one of the external services, that's a high positive reward.
So we might put this value at, say, plus five, and we use the CVSS score of the vulnerabilities to derive the reward. So the reward can at any point be between zero and 10. If there is uncertainty in the attacker's actions, this can be considered a partially observable Markov decision process. But in this work, we are using the simple Markov decision process to show how we encode this attack information.

When we solve this Markov decision process using a value iteration solver, we obtain some policies. Policies are the different paths the attacker can take to obtain some of his goals, and the reward for each path. Value iteration tries to maximize the value that an attacker can gain by following a particular policy. So there are two policies: he can either exploit SSH, then exploit FTP, or he can exploit SSH, then go to the web server and then exploit FTP. The reward in the second policy is higher compared to the first policy. But this is for a simplified network. We can hand-encode these values and solve this Markov decision process. But as a penetration tester, if you are dealing with a very large network, you would want to encode this information using one of the MDP solvers.

So as I mentioned, the states are the privilege levels of the attacker. The values in the transition matrix are the probabilities of transitioning from one state to the next state using an action. Suppose S0 is the state of user access on SSH and S1 is the state of root access on SSH, and the access complexity of this vulnerability is low. That means there is a high probability of transitioning to state S1. So we encode the value 0.9 for low access complexity vulnerabilities, 0.6 for medium, and 0.2 for high access complexity, because the vulnerabilities for which access complexity is high are difficult to exploit. So there is a good chance that the attacker will stay in state S0 if he tries to take an action.
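A minimal sketch of this encoding, assuming a three-state chain (user on SSH, root on SSH, root on FTP) and a hand-written value iteration loop rather than the solver used in ASAP. The 0.9, 0.6, and 0.2 probabilities come from the talk. The CVSS scores, the no-action reward, the discount factor, and the assumption that the FTP vulnerability has medium access complexity are illustrative.

```python
# Access complexity -> success probability, as described in the talk.
AC_TO_PROB = {"LOW": 0.9, "MEDIUM": 0.6, "HIGH": 0.2}

STATES = ["S0_user_on_ssh", "S1_root_on_ssh", "S2_root_on_ftp"]
ACTIONS = ["no_action", "exploit"]

# Assumed vulnerability reachable from each state:
# state index -> (access complexity, CVSS score used as reward).
NEXT_VULN = {0: ("LOW", 6.4), 1: ("MEDIUM", 7.5)}
NO_ACTION_REWARD = 0.1   # assumed small reward for staying undetected
GAMMA = 0.9              # assumed discount factor


def outcomes(state, action):
    """Return (probability, next_state, reward) triples for one action."""
    if action == "exploit" and state in NEXT_VULN:
        complexity, score = NEXT_VULN[state]
        p = AC_TO_PROB[complexity]
        # Success moves to the next state; failure leaves the attacker in place.
        return [(p, state + 1, score), (1.0 - p, state, 0.0)]
    if action == "no_action":
        return [(1.0, state, NO_ACTION_REWARD)]
    return [(1.0, state, 0.0)]   # nothing left to exploit in this state


def q_value(state, action, values):
    return sum(p * (r + GAMMA * values[nxt])
               for p, nxt, r in outcomes(state, action))


def value_iteration(epsilon=1e-6):
    values = [0.0] * len(STATES)
    while True:
        new_values = [max(q_value(s, a, values) for a in ACTIONS)
                      for s in range(len(STATES))]
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < epsilon:
            break
    policy = [max(ACTIONS, key=lambda a: q_value(s, a, values))
              for s in range(len(STATES))]
    return values, policy


values, policy = value_iteration()
for name, action in zip(STATES, policy):
    print(name, "->", action)
```

With these numbers the resulting policy is to exploit in the first two states and take no action once root on FTP is reached, which corresponds to the "exploit SSH, then exploit FTP" plan.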
And for the reward value for transitioning to state S1, if the CVSS score of the vulnerability is 6.4, that will be the reward that the attacker gains by transitioning to that state. So as we discussed, the states represent the current privilege level of the attacker, and the actions are what the attacker will take. And the transition probabilities are the values we derive from the access complexity of each vulnerability in CVSS.

Now let's encode this information for the action exploit SSH in the form of our transition matrix. By taking the action exploit SSH, there is a 0.9 probability that the attacker will go from state S0 to S1. So let's consider rows 0, 1, and 2 and columns 0, 1, and 2. The S0, S1 entry, which is row 0 and column 1, shows that there is a 0.9 probability that the attacker will transition from S0 to S1 by taking that action. But that action has no implication on other states. If he was in state S1, then exploit SSH doesn't do a whole lot for going to state S2. So the attacker will remain in state S1 if he takes that action. So there is a high probability of remaining in state S1 or S2 if he takes the action exploit SSH in S1 or S2.

And the rewards, as we discussed, are the values he obtains from the CVSS scores of those vulnerabilities. If he takes action A in state S, it maps to a real number between 0 and 10. As we discussed, taking no action will have a low reward and exploiting SSH will have a positive reward. If the CVSS score for this vulnerability was 6.4, that will be the reward of being in state S1. And similarly, the reward of being in state S2 will correspond to the FTP vulnerability.

So bringing it all together, here is how we arrive at a Markov decision process from the attack graph. This is the algorithm that I designed. In step one, you parse the attack graph to get the nodes which show the privilege of the attacker. In step two, you find the CVE IDs of the vulnerabilities that lead to this exploitation.
In step three, you fetch the CVSS scores of the vulnerabilities in the attack graph. In step four A, you use the CVSS scores to create a reward matrix, which maps the actions and the state transitions to a real value. So it will be, say, a column matrix of the different CVSS scores. In step four B, you use the access complexity to derive the transition matrix. And you provide all this information to the MDP solver: the reward matrix, the transition matrix, the different states, and the actions. We use the PyMDP solver with the value iteration function to generate an attack plan, which we then provide to the pymetasploit module, which executes the attack plan and shows how fast the attack plan that we obtained is.

So for the validation of the attack plan, if the attack plan says to first exploit SSH and then FTP, we have MSFRPC, which is a daemon of Metasploit, running, and it uses Python scripts to execute different plans. It tells us the cost of running different attack plans, that is, how much time each attack plan took. For MSFRPC, you will need to have an MSFRPC session running in one window, and then in another window you execute your attack plan and see how it goes.

So let's look at the demo. We'll first go here. We can perform a port scan to identify the services that are running on the network and the software versions of those services. It will take some time. In the meantime, we will go and check out the vulnerability scan that we have for our network. We have Nessus, and we can see that this scan found a lot of vulnerabilities on the machine that we set up. There were about 71 vulnerabilities, which were a mixture of critical, high, and low vulnerabilities. So our port scan revealed the information that we need from the Stinger module.
We have the cve-search API, which provides us information on different vulnerabilities. If you provide the CVE ID, it tells us the access complexity and CVSS score, which we will later use in Cappuccino. We have the Nessus scanning APIs, which can be used for connecting with the Nessus backend on the provided port to check the state of the network scan, create policies, and see if the scan is running or not. We can export the scan using these APIs. One thing is that in the latest version of Nessus, I think they have disabled some of these APIs, but if you are using an old version of Nessus, you can use them.

The scan that we obtained from Nessus, we need to provide as input to the attack graph, which is our Americano module. The Nessus scan file we obtain is this ASAP underscore something dot nessus. This will serve as input to the attack graph. MulVAL is a tool which uses this information, so you can translate the .nessus file using the Nessus vulnerability translate script, and this will generate a list of interaction rules. Another script, graph_gen.sh, can be used for obtaining the attack graph from this translated information. I already have these files. You will need to set up Nessus and XSB if you want to use the Americano module, and I have the details in the README file.

So we run the MulVAL module, and the attack graph that we obtain has some nodes and edges which show the information about the network services, the different vulnerabilities that are present on this network, and the interaction rules. If we check the visual representation, it will look something like this. The attacker was initially on the internet, then he used some rules of the access control list, which allow access to a particular port, to have direct access to another machine in the network. So he gained network access, then he used some other vulnerability to mount a remote exploit on one of the server programs.
This information is represented using a dot file, and we could use some D3 libraries to improve the visualization, but this is how the attack graph will look.

So now we use this attack graph and provide this information to the MDP solver, which is present in our Cappuccino module. The attack graph parser parses that attack graph, obtains the information about the access complexity from cve-search, checks the predecessor and successor nodes of the network, and obtains the attributes that I described in the state condition. And it encodes this information in the form of a Markov decision process. If you run this code, the edge information will tell us if you are transitioning between two nodes. So let's look at the edge between 1007 and 994. The edge has the label CVE-2011-0411, and we use this information to obtain the access complexity of that particular vulnerability and encode it in the form of a reward matrix. Then the value iteration algorithm will learn the attack policy that is beneficial for the pen tester in this case. I took a subset of the entire graph to run the value iteration algorithm. Zero represents no action and one represents the exploit action, and it tells us which action in a given state will be beneficial for the attacker. It will take some time to run the value iteration solver, and it will give you an attack plan.

Then we use the attack plan validator to validate the plan of attack. We use the attack plan to exploit FTP and another SSH vulnerability, and you can see we obtained two shells in this case. It took 0.35 seconds to complete, and using this attack plan validation we can measure how much time it took for our attack plan to complete.
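The timing side of attack plan validation can be sketched without touching Metasploit at all. In the sketch below each plan step is a plain Python callable that stands in for an exploit launched through the RPC daemon. The function names and the stop-on-first-failure behavior are assumptions for illustration, not the validator's actual logic.

```python
import time


def run_attack_plan(plan):
    """Run plan steps in order and time the whole plan.

    `plan` is a list of (name, callable) pairs; each callable returns True
    when the step succeeded. Execution stops at the first failed step.
    """
    results = []
    start = time.perf_counter()
    for name, step in plan:
        succeeded = bool(step())
        results.append((name, succeeded))
        if not succeeded:
            break
    elapsed = time.perf_counter() - start
    return results, elapsed


# Stand-ins for real exploit calls; they always report success.
def fake_exploit_ftp():
    return True


def fake_exploit_ssh():
    return True


plan = [("exploit_ftp", fake_exploit_ftp), ("exploit_ssh", fake_exploit_ssh)]
results, elapsed = run_attack_plan(plan)
print(results, f"{elapsed:.4f} seconds")
```

Comparing the elapsed time of two candidate plans gives the cost figure mentioned in the talk, with the caveat that real exploit runs vary from one execution to the next.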
For the attack plan validation to work, you should have the MSFRPC daemon running in one shell session, and you need to execute your attack plan in another session.

So in summary, the threat landscape is very complex and ever-growing, and an autonomous pen testing solution like the one that we have built in ASAP can help us navigate through this complex attack surface. It is an effort towards the generation of autonomous attack plans and their validation. Latte is another module that we still have in progress, and in the future our plan is to use the log information to validate these attack plans, and possibly also to work on a report generation tool that helps us generate a report on how the validation was performed for individual exploits.

So thank you, everyone, for attending my talk. The source code is hosted in the GitHub repo at the link you can see. And if you want to contact me, here is my contact information. I thank the organizers of Red Team Village for giving me a chance to speak at this year's Red Team Village. Have fun, everyone.