 of my new talk, so, oh, yeah, exactly, exactly, whoo! The truth is that I just got ready this afternoon in total panic, so everything is as it should be. I'll be talking about how to track log4j on a global scale using collaborative security.
Yeah, well, there isn't much to say other than that there's a fancy picture of me from when I worked at KPMG and had to wear a tie to go to work. And also I had to go to work, which I don't anymore because everything is work from home. Yeah, I don't want to repeat all that.
In order to understand how the data is collected, we need to talk a little bit about how CrowdSec works. It's not going to be a complete product marketing talk, just a very basic overview. But first, I want to address one of the reasons why CrowdSec was built in the first place. As all of us who work with cybersecurity or have an interest in cybersecurity know, there is a problem that is not solved, because everybody gets hacked. Those of us who have been around for a long time know that more and more people get hacked, and companies spend bigger and bigger fortunes trying to prevent that, and it doesn't work. So what you could ask yourself is, are we doing something wrong? And obviously, CrowdSec has another approach to this, which we think makes much more sense.
Basically, it's time for a new approach, because we tried outpowering the bad guys, we tried outsmarting them, and it didn't work. Why don't we try to outnumber them? Because when you think of it, there are more of us than of the bad guys, so why don't we try to take advantage of that? Nobody wants to fight a beehive. A single bee, not so bad, but if a thousand bees come flying at you, you run. If you're sane, you run. And that's because there are a lot of them. So let's try to take that analogy and put it into security.
When I try to explain how CrowdSec works, I usually compare it with Waze and call it the Waze of cybersecurity. If you don't know what Waze is, it's a GPS app on your phone that shares information about your travel: how fast you're going, where there are queues, where there are holes in the road, and stuff like that. It shares that with all the other Waze users so that everybody hopefully avoids all that stuff. And it's sort of the same with CrowdSec. CrowdSec detects the attacks it sees on your server and shares them with everybody else. It also blocks the attacks if you want that, which you typically do. So basically, CrowdSec reads logs, right? Then it detects threats, it mitigates those threats, and it sends those signals back. And then they are assessed and shared with the rest of the community as blocklists, basically.
This is just an example of attacks. I'm not going to talk very much about them, other than to say that CrowdSec can also detect web-based attacks. And this is what we did with this log4j thing: CrowdSec reads the web server log and detects those log4j strings that some of you may remember, which were part of the attack. In general, this is how CrowdSec works. Whatever web-based attack CrowdSec is trying to mitigate, it's just looking at the log.
The status as of yesterday is that we collect around 1.3 million signals each day. We have 3.7 million IPs in our smoke database, which I'll be talking about in a little while, and around 33,000 IPs in our curated blocklist, which is distributed to other users. I'll be talking a little bit more about how big the network is and what we call network strength on a later slide.
And CrowdSec is open source. The license we've chosen for CrowdSec is MIT, which means that it's free forever. Once the source has been opened, it cannot be closed, and that is completely by design.
That's because, being a young company and all that, there is a risk that somebody with a lot of money wants to buy us. So we can have a lot of money, which is good. But what is not good is if that company, driven by money as companies often are, wants to close the source and basically screw with the community and destroy everything we've worked for. Hence, this is MIT, it's open. If a company wants to do that, nothing happens: the code gets forked, and that was a bad decision on their part.
CrowdSec offers a fair deal, we think. You share the information about the attacks you see, and you get all the blocklist stuff back for free. That is sort of the trading agreement. If you don't want to share, then you can always pay and stuff. But CrowdSec is free to use, no strings attached.
A question that I'm often asked is, what about privacy? Well, it's pretty easy to not give out people's information if you don't collect it. Usually, when something is free, it means you're the product. That's how it is with Facebook and everything else. But in CrowdSec, you're not the product. The bad guys' IPs, they're the product. So basically, it's their privacy that we sell, and honestly, who cares? I mean, they asked for it. CrowdSec only collects the timestamp of any given event, the offending IP, and the behavior, which basically means the scenario that's been triggered. A scenario can be, let's say, brute-forcing or any other attack. That is everything we collect. We don't collect anything about how the attack is being performed or whatever. We just know which scenario was triggered, and that's it.
Another important thing with CrowdSec is, how do you deal with poisoning and false positives? There is a little consensus engine that works on the server side of CrowdSec, which assesses all the IPs that have been sent to us in the smoke database. This is the initial database where signals enter. There's a trust rank.
Basically, all agents have a trust rank attached, and they start with trust rank zero. Over time, as they contribute signals and prove trustworthy, the trust rank rises. After six months of being rock-stable and trustworthy, they get a trust rank of 99. That is important because this trust rank is used as a point score whenever an IP is deemed malevolent. In order for an IP to be deemed malevolent, it needs a certain number of points, and those points are based on the trust rank of the agents reporting it. The reason for this is that we want it to be as expensive and hard as possible to poison the database. If you are a bad guy and want to poison the database, it takes a lot of time. That's one thing. And also, in this voting process, an ASN can only give one vote. So if you think you can poison the CrowdSec database by buying 1,000 VPSs at the same cloud provider, you're wrong, because a provider doesn't have many ASNs. You need to have them on a lot of ASNs around the world in order to have a chance to build this trust rank, and you also need a lot of time.
In terms of false positives, we have a small fleet of honeypots where we basically evaluate the signals that come from all the agents that we don't have control over. So the honeypots have a slightly bigger trust rank. There's also a whitelist: there are certain IPs that we don't want to block, like Google DNS, Cloudflare, CDNs, stuff like that. And this looks a little bit pre-crime-ish, but there's also a predictive algorithm saying that if a certain number of IPs from the same netblock have been doing bad things, at some point we just block the entire netblock, on the grounds that nothing good comes out of it. Why not block them in advance? And then, after this consensus process, IPs are sent to the fire database and distributed back.
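To make the consensus idea above a bit more concrete, here is a minimal sketch in Python. It is not CrowdSec's actual implementation. The names (`Report`, `score_ip`, `is_malevolent`), the threshold of 150 points, and the choice to let the most trusted agent cast its ASN's vote are all assumptions made for illustration. The only things taken from the talk are that every agent has a trust rank starting at zero, that an IP needs a certain number of points to be deemed malevolent, and that an ASN only gets one vote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """One agent's report against an offending IP (hypothetical structure)."""
    agent_id: str
    agent_asn: int   # ASN the reporting agent sits in
    trust_rank: int  # 0 for a brand-new agent, rising as it proves trustworthy


def score_ip(reports: list[Report]) -> int:
    """Sum trust-rank points for one IP, counting each reporting ASN only once."""
    best_per_asn: dict[int, int] = {}
    for r in reports:
        # Assumption: the most trusted agent in an ASN casts that ASN's single vote.
        best_per_asn[r.agent_asn] = max(best_per_asn.get(r.agent_asn, 0), r.trust_rank)
    return sum(best_per_asn.values())


def is_malevolent(reports: list[Report], threshold: int = 150) -> bool:
    """The threshold is an arbitrary illustrative value, not a real setting."""
    return score_ip(reports) >= threshold


# 1,000 freshly created VPSs in a single ASN: trust rank 0, one vote in total.
sybil = [Report(f"vps-{i}", 64500, 0) for i in range(1000)]

# Three long-lived agents spread over three different ASNs.
trusted = [Report("a", 64501, 80), Report("b", 64502, 60), Report("c", 64503, 40)]

print(is_malevolent(sybil))    # the poisoning attempt does not reach the threshold
print(is_malevolent(trusted))  # the independent, trusted reports do
```

Under these assumptions, buying many machines at one provider gains an attacker nothing: new agents carry no points, and extra agents in the same ASN do not add votes.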
As I said before, there are around 3.7 million IPs in the smoke database and around 33,000 in the fire database. So it is a very conservative assessment, because we don't want false positives. We just simply don't. And so far it works well.
This is not a detailed talk about CrowdSec, even though you may have felt it like that already. If you want to learn more about which OSes are supported and how CrowdSec works in detail and all that, I'm having some workshops during MCH. I'll talk a little bit about those that are already arranged. If you want to have one in your village or whatever, just feel free to approach me, and I'm sure we can find a way to do that. There are more details on that at the end of my talk. And now it's time for my first break.
Before I talk specifically about log4j, it's important to understand that CrowdSec is not a web application firewall. It's not a WAF. And regarding log4j, this is particularly important, because as you may know (if not, I'll talk about it a little bit later), it only takes one request to get compromised if you're vulnerable. Because CrowdSec reads logs, things need to be in the log before they can be blocked. And if it's already in the log and you're vulnerable, then you're already fucked. So CrowdSec relies very much on this network effect. It relies very much on the community signals. If you have a vulnerable server and you're protecting it with CrowdSec, then you hope that the network has already detected those IPs, because otherwise you're in trouble.
Luckily, this whole reputation thing works pretty well. We did a small study, not scientific at all, where we ran two identical servers, and the whole objective was to find out how big the network effect is. Basically, we set up two Debian servers on AWS. They were identical; the only difference was that one of them had the bouncer part of CrowdSec, which is the IPS part of CrowdSec, blocking traffic.
That matters because once you have blocked an IP in the firewall of the host machine where CrowdSec is installed, its traffic won't be in the log, and then it won't be in the stats. So we left those machines running for three months, and it turned out that 92% of all the bad traffic hitting the server was already blocked by the community. So this is good news if you were vulnerable to log4j back in December and relied on CrowdSec for this. But it's also important to note that CrowdSec, in a way, resembles a honeypot in general, because it's possible to install the scenario and act as a honeypot without being vulnerable. You just need a web server. So, conclusion: community matters.
All right, hopefully this is the part you came for. This is where we are now. If there should be a couple of you who don't know what log4j is, I'll give you the very short version. On December 9th last year, the Apache Foundation released information on a critical bug in the log4j library, exploitable for remote code execution. And then it turned out log4j is used everywhere, and by everywhere, I mean everywhere, right? I saw this tweet from a guy called Kasim Koden, and that's when it dawned on me, because on his iPhone he had set his name to this specific JNDI string, which then exploits the server. And it turned out he got a connection back from Apple's servers, so they were vulnerable. And then I was like, hm, it really is everywhere. So it was everywhere.
So everybody obviously went into total panic. And everybody, at least in the security field, was busy releasing free tools and resources to help out. And so did we. We released the entire list of IPs that were actively abusing the vulnerability so that people could import it into their own firewall.
This is a super fast overview of the timeline. The details of the vulnerability were released on December 9th. On December 10th,
CrowdSec released the first version of the log4j scenario. Then that scenario was pushed out to a number of agents, and the signals started pouring in a little bit, but it didn't peak until two days later. When you think of it, this is a really good example of what is cool about this whole community stuff: when you have a community of CrowdSec users that you can utilize in this way, just by sending out a scenario to them, then you can start harvesting CTI even on new vulnerabilities. And it happened pretty fast.
As the attacks evolved, we also needed to update the scenario, because in the beginning it was just this plain text string, but then obfuscation, shenanigans and stuff happened, of course, as hackers do. So we had to update the scenario as well: on December 12th, on December 13th, on December 16th, and finally on December 20th, when somebody from the community added some Unicode bypass pattern thing. In the end, it turned out there were 34 grok patterns in the final scenario, which is pretty cool, I think.
This is the spike of signals. You can see here on December 10th, not much going on. On December 12th, it goes bananas. But as we started getting signals, there were some signals that were somewhat different. Our data scientists looked into it, and it turned out that there was a German security research company that was doing scanning. Obviously, we didn't want to interfere with their research or anything, and they were not doing any harm. So we filtered them out.
And this is how log4j is looking up until now. You can see a big drop at the end of May. After that, there are a few things going on. In particular, there are two big spikes to the right, which I'll talk a little bit about. That was on June 21st and July 7th. We're not really sure what happened, but we know who did it. It's this guy, 13.89.48.118.
Both on June 22nd and on July 7th, this guy was going crazy with the log4j attacks. We don't really know why. It was interesting, because our data scientist tweeted about it on June 22nd, and then on July 7th there were twice as many attacks as he saw on June 21st. As I said, we don't know why, but we have a theory, at least for June 21st. That theory is based on the fact that on June 23rd there was a report from CISA, the Cybersecurity and Infrastructure Security Agency in the US, warning that malicious cyber actors were trying to exploit log4j in VMware Horizon servers. So one can only guess that that's what they were trying to do. And as I said when I talked about log4j initially, the log4j library is everywhere. So it's highly unlikely that everything is patched by now, and it probably never will be. So we may see more of this, who knows.
It's pretty clear that when you're collecting a lot of data, like we do, there are other fun things to see. Some more fun than others, I'll give you that. But here it goes. Here is the top 10 of ASNs hosting malevolent traffic. Some are more legit than others. I'll talk a little bit about hackers and their need for anonymity in a little while, and why they are hacking like that, and also about how some cloud providers are better than others at taking those bad servers down.
Look at both of the two figures. The top one shows that, over the whole time since CrowdSec started a year and a half ago, of all the IPs that have been seen at least once, the share that was malevolent at some point is around 2.79%. But if you look at how many new IPs CrowdSec verifies in a week, then it's six times as high. That means the IPs that the bad guys are using rotate a lot in the short term. And the reason they do that is because they want to be anonymous. One of the ideas behind CrowdSec is to automatically detect all those attacks and block them right away and by the route if possible.
So this is good news, because we really want to put those bad guys out of business. 2% of those IPs renew every 12 hours, and in a week, 12% of the IPs are seen for the first time. So yes, at least when you look at the stats directly, it looks like the CrowdSec strategy of blocking those IPs right away, automatically, is a good idea, if we get the world domination that we are trying to achieve.
This is another thing, because it turns out that SIP is constantly being hammered. Back in November we talked to a user that is a French voice-over-IP provider. They were seeing a lot of SIP brute-forcing, and they suggested that we write a scenario for it. We have done that, for our honeypots only, and I'll tell you in a little while why that is important. If you look at it per agent, then the SIP scenario is by far the biggest contributor. On one day, we got 3,000 signals just on that single scenario. Bear in mind that it's only installed on honeypots, and honeypots are special in the sense that they don't block attacks, so an attacker can, in theory, keep on hammering. Also, the way the scenario is made is pretty dumb: it reacts to any request to the SIP protocol. But given that it's a honeypot, nobody legitimate would want to talk to it anyway. That's what honeypots are for, right?
It also turns out that SSH is constantly being hammered. These are by far the most important scenarios. As I said, we get around 1.3 million signals every day, and around a million of those are on SSH, either normal brute-forcing or slow brute-forcing. We get signals from 5,000 agents, and 60,000 bad actors are reported on this.
This is an overview of the top threats that we've detected over time. Over time, we have a total of 102 million signals. 60%, the orange and the green, are the ones we saw before: those are SSH brute-forcing. Windows brute-forcing is only around 2% of those.
But the CrowdSec data is most likely, or rather totally guaranteed, to be biased, because CrowdSec is mostly installed on Linux. When you install CrowdSec on Linux, it automatically detects that you have SSH and installs the scenario for that. And Windows is a very new platform for us, so it's not very widely used yet. Oh, sorry.
This is an overview of selected cloud vendors and how good they are at cleaning up when a customer's VPS has been compromised. On the x-axis, there is what we call the malevolent duration, or MD, which is the number of days an IP is reported by the community. So the average MD of all IPs in a provider's ASNs is an indicator of their due diligence in dealing with compromised assets. And on the y-axis, there is the number of ASNs this cloud provider has. There's a big difference between the cloud providers in how good they are at taking down compromised servers. But there's also a huge difference between hosters in terms of how many risky services they host. For instance, a vulnerable PHP CMS, that's like asking for trouble, right? In orange here is the average MD for each provider. AWS was the best of those that we looked at: three days in general, whereas for OVH it was 17 days. That's not to say that AWS are good and OVH are bad. These are indications that we see. And in spite of how good AWS are at cleaning up, they still have a lot of attacks. So I wonder how it would have looked if they were worse at cleaning up. So AWS dominates the space [unclear]. I don't know if that's something to be proud of or not.
And the CrowdSec network strength is basically the top 10 of the countries that have the most CrowdSec agents. The picture is a little bit blurry because France, the US, Germany, and the Netherlands have a lot of cloud providers. So we may know where the IPs are physically, but we don't know where the users are from, of course. And that was it.
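As an illustration of the malevolent duration metric from the cloud vendor slide, here is a rough sketch in Python of how an average MD per provider could be computed. This is not CrowdSec's pipeline. The function names and the sample data are made up, the IPs come from documentation ranges, and it assumes that "the number of days an IP is reported" means the count of distinct calendar days with at least one report.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def malevolent_duration(report_days: list[date]) -> int:
    """Assumed definition: number of distinct days on which the IP was reported."""
    return len(set(report_days))


def average_md_per_provider(reports: list[tuple[str, str, date]]) -> dict[str, float]:
    """Average MD per provider from (provider, ip, day) report tuples."""
    days_by_ip: dict[tuple[str, str], list[date]] = defaultdict(list)
    for provider, ip, day in reports:
        days_by_ip[(provider, ip)].append(day)

    md_by_provider: dict[str, list[int]] = defaultdict(list)
    for (provider, _ip), days in days_by_ip.items():
        md_by_provider[provider].append(malevolent_duration(days))

    return {provider: mean(mds) for provider, mds in md_by_provider.items()}


# Made-up sample reports, for illustration only.
sample = [
    ("provider-a", "192.0.2.1", date(2022, 7, 1)),
    ("provider-a", "192.0.2.1", date(2022, 7, 2)),
    ("provider-a", "192.0.2.2", date(2022, 7, 1)),
    ("provider-b", "198.51.100.7", date(2022, 7, 1)),
    ("provider-b", "198.51.100.7", date(2022, 7, 1)),  # same day, counted once
    ("provider-b", "198.51.100.7", date(2022, 7, 4)),
    ("provider-b", "198.51.100.7", date(2022, 7, 9)),
]

averages = average_md_per_provider(sample)
print(averages)
```

A lower average suggests that a provider takes compromised machines down sooner, with the same caveat as in the talk: it is an indication, not a verdict.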
If you want to try out CrowdSec, I'm doing a couple of workshops here at MCH. I have one scheduled at the Village People Village tomorrow at 16:00, and one on Sunday in the Secura Village. And there may be more to come, I don't know yet. Both villages have limited space, so if you really want to join, come early. Or you can also ask me to come to your village to do one. I'd be happy to do that.
Find us on Twitter at @crowd_security. Crowd security without the underscore, or CrowdStrike, those are different. Those are not us. You can also join our friendly Discord community at discord.gg/crowdsec, or scan the QR code. We also have workshops there. Or you can send me a mail or hit me up here at MCH. I'll be around until Tuesday. All right, that was it.