Hello, everybody. My name is Jonathan Alexander. I am CTO at OpenZeppelin and a co-founder of Forta. In this talk I'll be covering research and development on decentralized threat detection bots. There are a number of contributors to this presentation whose work I'll be citing here; a lot of the information and research has been conducted by these individuals. They include Christian Seifert, researcher in residence at Forta; Mariko Wakabayashi, lead machine learning engineer at OpenZeppelin; and Dario Lobulio, security researcher at OpenZeppelin. We also have a number of independent researchers, and teams at Nethermind and LimeChain who have contributed to some of the techniques we'll be discussing here.

I'm going to start with a little bit of background. We are starting to see increasing acceptance in the space that security needs to be end-to-end, and that part of end-to-end security is runtime monitoring and threat detection. Multiple leading security audit firms, including OpenZeppelin, ChainSecurity, Halborn, and MixBytes, are beginning to make recommendations along with their audits for smart contract monitoring that can be done post-deployment. The recommendations include covering protocol assumptions and invariants; critical protocol variables; known protocol risks that have been considered acceptable; the use of privileged functions and the transfer of privileges; on-chain and cross-chain synchronization and oracles, keeping track of the state of data that's meant to stay in sync across those environments; external contracts and protocols that your protocol relies on; and identified attacks that follow already-known patterns.

This monitoring is meant to cover the knowns, things like specific attack patterns we have already seen. It should also cover the known unknowns, the places in the protocol where you know you're not sure what might happen and want to watch, and then, very importantly, the unknown unknowns. We might have things in our protocols that we believe are invariants and will never change, and yet if something unexpected did happen, we would certainly want to know about it.

The challenge with runtime monitoring is that if you take that whole list, it's very difficult to implement the full set of monitoring you might want, especially in the areas that are not the responsibility of the protocol team itself, such as other protocols' implementations, their risks and their updates, as well as the attack patterns that have been seen across the ecosystem or are newly emerging. And we see new attack patterns emerging all the time.

At OpenZeppelin, we started working on post-deployment tools, and specifically monitoring, a few years ago. That work led us to realize these challenges, and we decided to build, and along with a number of partners have now launched, a decentralized network for runtime threat detection called Forta. I'll be citing it in this presentation, and a lot of the research here has been done by people working in and around Forta. Forta, which you can go check out, is a decentralized network of scan nodes; there are thousands of scan nodes running now across a number of mainnets. The way it works is that you build threat detection bots.
They are deployed as containers onto the network, they run on multiple scan nodes, and the results are brought together to detect threats. The idea here, for protocol teams (sorry, am I literally standing right in front of the slides?), going back to the prior slide, is that the challenge of doing sufficient, real end-to-end monitoring and threat detection no longer rests on the team alone. The team doesn't have to do it all themselves; they can take advantage of a community of threat detection, and we can work together toward a genuinely complete solution.

First, I'd like to share some observations from research that's been done on threat detection that you'll hopefully find interesting. The first is that as we look at the history of attacks, we see a very common pattern: a set of stages that an attacker goes through. It starts with obtaining the funds that will be used to carry out the attack. It moves to preparation, where a number of steps are common across different kinds of attacks: deploying a smart contract that will be used to carry out the attack, token impersonation, the transfer of privileges or the use of privileged functions, and various fraud techniques used to prepare for stealing funds (we'll get into some of that a little later). The next stage is exploitation, where the actual attack occurs; here we may see flash loans, private transactions, reentrancy, minting, anomalous transfers, anomalous function calls, large balance changes, and so on. Finally, there is laundering, where the attacker moves the stolen funds to hide the trail and exit.

Another interesting finding is that, looking at a large number of attacks, over 180 over the last three years, specifically in DeFi, more than half have been non-atomic. That means the exploitation phase spans more than one transaction, and typically many transactions. This matters because it means there is a time window over which the exploit occurs, so early detection could enable taking action and possibly mitigating the full effect of the exploit.

Another observation is that attackers often use more than one account. That may be obvious, but as researchers have looked at this (and you may have seen it in other presentations), we can use heuristics to associate attacker accounts, and by grouping those accounts into clusters we can track an attack through the stages I just described, even when many accounts are involved. One example of a heuristic-based approach is a connected-component graph algorithm, which lets us graph these connected accounts together and then track them as a cluster (a short code sketch of this idea follows below).

Another observation: in over 40% of attacks, the attacker deploys a smart contract to execute the exploit, and these attack contracts differ from benign contracts in very clear ways.
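As a brief aside on the clustering observation, here is a minimal sketch, assuming Python with the networkx library, of the connected-component idea: link accounts with whatever heuristics you trust, and treat each connected component as one attacker cluster. The edge heuristics and example addresses are illustrative assumptions, not the exact rules any production bot uses.

```python
# Minimal sketch of connected-component clustering of attacker accounts.
# The heuristics that produce the edges (funding relationships, shared attack-contract
# deployments, etc.) are assumptions for illustration.
import networkx as nx

def cluster_accounts(heuristic_edges):
    """heuristic_edges: iterable of (address_a, address_b) pairs that a heuristic
    linked, e.g. one account funded the other or both touched the same attack contract."""
    graph = nx.Graph()
    graph.add_edges_from(heuristic_edges)
    # Each connected component is one attacker cluster, tracked as a unit through the
    # funding, preparation, exploitation, and laundering stages.
    return [frozenset(component) for component in nx.connected_components(graph)]

# Example: 0xA funded 0xB, and 0xB deployed the same contract as 0xC, so they cluster.
clusters = cluster_accounts([("0xA", "0xB"), ("0xB", "0xC"), ("0xD", "0xE")])
print(clusters)  # two clusters: {0xA, 0xB, 0xC} and {0xD, 0xE}
```

In practice a bot would persist these clusters between blocks and attribute new alerts to an existing cluster whenever any of its member addresses appears again.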
Here, on this slide, I'm citing research in which over 10,000 smart contracts were sourced from Luabase; 155 externally owned accounts tagged on Etherscan as having carried out an exploit were identified, and those accounts were associated with the smart contracts they had deployed. So we had a set of benign contracts and a set of malicious contracts, and a classifier was trained over the top 100 opcodes in each contract. We see very good results in identifying contracts that are benign: with 98% accuracy, the classifier can say that a benign contract is not malicious. And of the 21 contracts in the malicious set, the classifier was able to flag 17 as suspicious or likely malicious. So this shows promise in the ability to identify whether a smart contract falls into the potentially malicious category.

Taking that a step further, when we look not just at which opcodes are used but at the patterns of opcodes in the contracts used for attacks, we see that they follow very similar patterns. The researchers took an even more advanced approach to classification, borrowing techniques from natural language processing: grouping opcodes into ordered collections, n-grams, and analyzing the use of those groups, again with a classifier. In this case they looked at over 12,000 benign contracts and over 100 malicious contracts, fed TF-IDF features over those opcode groups into the classifier, and obtained an even higher level of precision: 88%. That is, when the classifier predicted that a contract was malicious, it had a fairly small rate of error. The researchers also went back to a number of known attacks that were carried out via smart contracts and have been analyzed historically, some of which are cited here: the Wintermute, Audius, and Inverse Finance exploits. Against those, the technique was able to predict in almost 60% of cases that the attacking contract was malicious. So again, this is a promising area of research: we can identify malicious contracts. (An illustrative sketch of the TF-IDF approach follows below.)

The last observation I want to share relates to fraud. Fraud is an attack usually carried out with social engineering or some kind of web2 attack. In the example given here, ice phishing, an attacker tricks users into signing approval transactions that give the attacker control over their tokens. These attacks also produce detectable on-chain patterns, so we can use heuristic-based approaches to look for patterns of accumulated approvals or repeated actions carried out by a single account or, again, by a cluster of connected accounts running a fraud campaign against many users before stealing their tokens. We've seen cases where attackers do all the preparation up front and then carry out the theft all at once, and cases where the attack simply continues over time.
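Going back to that opcode-classification idea for a moment, here is an illustrative sketch of the general approach, assuming Python with scikit-learn: treat each contract's decompiled opcode sequence as a document, build TF-IDF features over opcode n-grams, and train a classifier on labeled examples. The toy dataset, the n-gram range, and the choice of logistic regression are my assumptions; the cited research used its own dataset and model.

```python
# Illustrative sketch: TF-IDF over opcode n-grams feeding a classifier.
# The tiny toy dataset and model choices are assumptions, not the cited study's setup.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each entry is one contract's decompiled opcode sequence, flattened to a string.
benign_opcodes = [
    "PUSH1 PUSH1 MSTORE CALLVALUE DUP1 ISZERO JUMPI REVERT JUMPDEST STOP",
    "PUSH1 PUSH1 MSTORE CALLDATASIZE LT JUMPI CALLDATALOAD RETURN",
]
malicious_opcodes = [
    "PUSH1 PUSH1 MSTORE CALLER PUSH20 EQ JUMPI DELEGATECALL SELFDESTRUCT",
    "PUSH20 CALLER EQ ISZERO JUMPI CALL BALANCE SELFDESTRUCT",
]

contracts = benign_opcodes + malicious_opcodes
labels = [0] * len(benign_opcodes) + [1] * len(malicious_opcodes)

model = make_pipeline(
    # Group opcodes into ordered n-grams (the "collections in order" described above).
    TfidfVectorizer(ngram_range=(1, 3), token_pattern=r"\S+", lowercase=False),
    LogisticRegression(max_iter=1000),
)
model.fit(contracts, labels)

# Classify a newly seen contract's opcode sequence: 0 = benign, 1 = malicious.
prediction = model.predict(["CALLER PUSH20 EQ PUSH1 JUMPI DELEGATECALL SELFDESTRUCT"])[0]
```

A real pipeline would be trained on thousands of decompiled contracts and evaluated on precision and recall, which is where the figures quoted above come from.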
But in this case, using a heuristic-based technique and public data about ice phishing attacks that occurred over a one-week period in September, the approach was able to predict with 95% precision that an ice phishing attack was underway. It made around a dozen predictions that an attack was in progress, with a very low number of false positives, and of the 21 ice phishing attacks that were publicly known to have occurred that week, it correctly predicted 12. So again, these results are promising; this is the kind of research that's going on and some of what we're seeing.

Now what I'd like to do is share some threat detection bot techniques that can be used to detect the various kinds of threats we're seeing. There will actually be a workshop tomorrow on Forta bot development if you're interested; I think it's tomorrow afternoon, though I don't have the exact details. It will be useful to know up front, because I'll reference it in a couple of cases, that Forta bots are built by implementing handlers in JavaScript or Python. You can implement an initialize function, you have your own environment (each bot runs in a Docker container), and then you subscribe to handle transactions or handle blocks, and you can do pretty much whatever you want. You can also look up alerts that have been emitted by other bots. Ultimately, when a bot has a finding, something it wants to alert on, it shares that finding, which then gets published, and other bots or users can take advantage of it. (A minimal sketch of this handler structure appears just after this part.)

I'm going to cover three techniques, and for each I'm giving examples of bots. If you go to the Forta bot documentation you'll find references to these, or you can look up bots by name in the Forta Explorer; most of them are open source, so you can follow the links. Most of what I'm covering here exists as templates provided by the Forta network, Nethermind, or LimeChain.

The first technique is multiple bots working together as a group to track attack stages, and then, with increasing confidence as an attack moves through those stages, to express that an attack is potentially underway, all the way up to complete confidence that an attack has occurred. This is done with a set of bots that individually detect the things that occur during funding, preparation, exploitation, and laundering. You can have individual bots looking for individual signals such as suspicious contract creation, contract spoofing, ice phishing, large transfers, or money laundering. You can also have bots that identify clusters of accounts, and there are example bots available now where you can see the source and how to do that. Finally, once each of these bots is making its individual discovery, you put it all together with pattern matching over the various alerts, so you can declare with some level of confidence what is going on, who the possible attacker is, and eventually which party or parties are under attack.
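Since I reference that handler structure a couple of times, here is a minimal sketch of what a Forta bot can look like in Python (the JavaScript SDK is analogous). The large-transfer threshold and alert id are made-up examples, and the authoritative SDK surface is what the Forta documentation and templates describe, so treat this as a sketch rather than a template.

```python
# Minimal sketch of a Forta detection bot in Python.
# The threshold and alert id are illustrative assumptions, not a real Forta template.
from forta_agent import Finding, FindingSeverity, FindingType

LARGE_TRANSFER_WEI = 1_000 * 10**18  # hypothetical "large native transfer" threshold

def initialize():
    # Runs once when the bot container starts: load models, ABIs, or saved state here.
    pass

def handle_transaction(transaction_event):
    findings = []
    if transaction_event.transaction.value >= LARGE_TRANSFER_WEI:
        findings.append(Finding({
            'name': 'Large Native Transfer',
            'description': f'Transfer of {transaction_event.transaction.value} wei observed',
            'alert_id': 'EXAMPLE-LARGE-TRANSFER-1',
            'severity': FindingSeverity.Medium,
            'type': FindingType.Suspicious,
            'metadata': {
                'from': str(transaction_event.from_),
                'to': str(transaction_event.to),
            },
        }))
    # Returned findings are published as alerts that other bots and users can consume.
    return findings
```

A block handler has the same shape, and a combiner-style bot consumes the alerts published by bots like this one rather than raw transactions.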
There are example bots you can find that do this; the technique has become known in the Forta community as a combiner bot.

The second technique I want to discuss is simulation. Here, a bot can literally fork mainnet while it's executing, with Ganache or something similar, and then run simulations to detect issues. Two approaches have been used. The first: if you have a known protocol, and you know that if certain operations on it were to fail then something must be wrong, perhaps an attack underway, you can execute those operations in simulation after every mined block and alert if the transactions fail. An example is a liquidity tester, a bot written to take a DeFi system, identify the top 10 current liquidity holders, and simply simulate their ability to exit their positions after each block; if any of them cannot exit, it alerts. The second approach is to test contracts. We have a couple of example bots here, one that uses fuzzing and one that doesn't. In this case the simulation takes a deployed contract, forks the network, and then tries to execute every single method, knowing nothing about the contract beyond its code, and watches whether any execution results in, say, a large funds transfer or some other anomalous outcome that would mark the contract as suspicious.

The final technique is the use of machine learning in bots. You can take a machine learning model and serialize it to a file; in the case of Forta, you can deploy the model alongside the bot, use the initialize function to load it, and then execute the model inside your handlers. This has already been used successfully in a number of bots that do anomaly detection and malicious contract detection. For example, there is a change detector bot I've referred to here and a time series analyzer that use machine learning algorithms; one of them uses a library developed by the team at Meta that is very good at detecting outliers and deviations with a limited number of false positives. So it's quite good at saying that a real outlier has occurred, whether that's an on-chain price change or a deviation in balances or other values in your smart contract variables, detecting over a period of time an outlier that is perhaps meaningful and worth looking at. And for machine learning on smart contracts, there are bots that take a deployed smart contract address, decompile it to get the contract's opcodes, and then use machine learning models like the ones I referred to earlier to classify the contract and predict whether it is malicious or benign. These are all techniques already in use, and further research is underway on how to improve these kinds of bots.
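To make that last pattern concrete, here is a hedged sketch of shipping a serialized model with a bot, loading it once in initialize, and running it when classifying newly seen contracts. The model file name, the alert id, and the boundary of the classify_contract helper are assumptions for illustration; the decompilation step that produces the opcode string is left out of scope.

```python
# Sketch of the "machine learning inside a bot" pattern: a model trained and serialized
# offline (for example the TF-IDF pipeline sketched earlier) is packaged into the bot's
# Docker image, loaded once in initialize(), and executed to classify contracts.
# File name, alert id, and the classify_contract boundary are illustrative assumptions.
import joblib
from forta_agent import Finding, FindingSeverity, FindingType

model = None

def initialize():
    global model
    model = joblib.load('malicious_contract_model.joblib')  # hypothetical packaged file

def classify_contract(address, opcodes):
    """address: the deployed contract's address; opcodes: its decompiled opcode string.
    Returns a list containing one Finding if the model flags the contract as malicious."""
    if model.predict([opcodes])[0] != 1:
        return []
    return [Finding({
        'name': 'Suspicious Contract Creation',
        'description': f'Contract {address} classified as likely malicious',
        'alert_id': 'EXAMPLE-ML-MALICIOUS-CONTRACT-1',
        'severity': FindingSeverity.High,
        'type': FindingType.Suspicious,
        'metadata': {'contract': address},
    })]
```

A transaction handler would call classify_contract for each newly deployed contract after fetching and decompiling its bytecode.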
I want to wrap up with a couple of challenges, and you may have been thinking about some of these as I've been speaking; these are challenges we're all dealing with. Atomic attacks are an obvious one: I said that history tells us more than half of attacks are non-atomic, but of course that means 40 to 45% of attacks are atomic, and private transactions can also be used, so this is a challenge for threat detection and monitoring. Another challenge is that some teams want to monitor in secret and would prefer not to monitor in public, and they may have good reasons for that; for example, if you're in the process of dealing with a known vulnerability, you may want to do that privately and not on a public system. And finally we have the challenge of response latency, which is about how quickly we can alert, but also whether we can respond and whether there is anything we're actually able to do once we've detected something. These are things Forta is working on; there are already some options for private monitoring, but these are also areas for future research. Latency on the Forta network, by the way, is about 40 seconds from mined block to alert. Things being looked at include trusted private scan pools for teams who want to monitor in secret, pre-submission or mempool transaction scanning, on-chain alerts which would open the possibility of on-chain reaction, and other systems that can be used for automated action and follow-up. OpenZeppelin is working on these things for the community, and many of the contributors I mentioned are starting to work on them as well. To wrap up: if you're interested, bot developers, security researchers, data scientists, please go to forta.org, where you can learn more and find ways to get involved; there are various grants and incentives available. With that I'm done, and if there's time I'll take questions. Thank you.

Q: Thank you for the talk today. I find the bots incredibly useful for teams, but what I'm curious about is what happens once a bot alerts a team. What does the team have to do to then respond quickly?

A: Yeah, in this presentation I wasn't really getting into that, but it's obviously a big topic of conversation for everybody. We have seen many attacks in the wild where teams have paused their smart contracts, and we even saw a chain pause the whole chain. We're seeing more discussion about this, because I think we're headed down a road where emergency response capabilities may be an acceptable thing to have on your protocol, but they need to be contained and limited. We need some level of delegation to a set of responders so that they can respond quickly but with limited control, right? So we're starting to see protocols implement this and teams talk about it publicly; you may have heard that some of the leading DeFi protocols are starting to discuss it. Pausing, and then having some capability to deal with the attack, is kind of a baseline, but there's a lot to get into there, because depending on the nature of the attack even pausing may not solve the problem. And I think the important thing is that we all have to start thinking about this, because if you go back to the unknown unknowns, it's just impossible that we're going to catch everything up front. So yeah, it's not solved.
That's the honest answer, yeah.

Q: Thanks for your talk, Jonathan. Speaking of things you do after an attack: what is OpenZeppelin doing today with that information? How does OpenZeppelin harness it and turn it into knowledge? That's the first question. And second, is there anything to do besides a smart contract audit before an attack happens, like scanning the mempool or applying heuristics there?

A: There are a couple of parts to your question. The first is about OpenZeppelin's audit practice. When we do an audit now, we typically also make recommendations about what should be monitored, because the auditors have good insight into the protocol; we're training our auditors more and more in how monitoring can help, so they can say to a team: these are the things you should think about monitoring in production, either because they're beyond what an audit can catch or because they're live risks you're going to be exposed to. OpenZeppelin is also working on tooling to automate the creation of monitoring; if you've seen our Contracts Wizard, we're going to start building in features so that if you use OpenZeppelin Contracts, it will auto-generate monitoring templates for you. And we have a platform called Defender where we're working on how to automate the responses we were just talking about, for example how you could at least get to pausing as quickly as possible. So we're doing all of those things, and we're collaborating; Forta has multiple partners, and I mentioned ChainSecurity, Halborn, and MixBytes, who I think are all doing similar things. We're good, thank you.