less sophisticated than that. When I saw the first image generators come online last summer, I kind of had the feeling that AI tools like Midjourney would be used almost as political cartoons. So not necessarily to create a fake image and convince people that this is actually happening, but more to create an image of your opinion of what's happening in the world. A good example of this was when Donald Trump was arrested a couple of months ago. People were creating images showing Donald Trump getting arrested. So they're creating these images as an idealized version of what they want reality to be like. It's less about using these tools to convince people and more about showing your opinion, like a political cartoon, basically, to parody reality. The inverse opinion that I see from a lot of experts that I follow on Twitter is that these tools are just bad. They suck, right? They'll say, look at how many teeth this girl has. How could anybody think that this is a real image? It will never convince anybody. This image was shared by Donald Trump Jr., and what the hell is this guy's face, you know? So there's a very prevalent attitude that these things, maybe not never, but as they are right now, are not good enough to actually cause serious damage. With ChatGPT, I've seen a lot of opinions that it's just a glorified autocomplete that makes a lot of mistakes. And it really does. I asked ChatGPT to write a biography of me and it says I was born in 1980, not true. It says I went to the University of Montreal, not true. It just says this really confidently. So a lot of people look at this and they're like, well, this is not good.
You know, if you get ChatGPT to write articles for you, it's just very low quality and most people would probably be able to see through that. What I think, though, is that it's already good enough to cause damage, because these systems are optimized for spam. When you think about spam, what is it? It's low-quality content that is cheap to produce and that can be sent to thousands, hundreds of thousands, millions of people. Well, these systems take one of those things out of the equation. They make it extremely cheap to create infinite content, and it's pretty much instantaneous. So it's perfect for spammers and people who are doing clickbait or stuff like that on social media. Now, if you wonder who would want to click on crappy articles written by ChatGPT or images that have too many teeth, I invite you to go look at Facebook's most widely viewed content. They produce a quarterly report where they show the most popular posts on Facebook for the past three months. This was the last report, and you can see that the most popular stuff online is basically fish food, right? It's not good content, and so these systems are perfectly adapted to create an infinite flow of content that people will mindlessly click on. We're already seeing ChatGPT being used for all sorts of spam online. So this is a funny situation where a lot of people use ChatGPT and they'll copy the whole thing and forget to leave out the "Regenerate response" button text at the bottom. So if you search for "regenerate response" on Google, on Amazon or stuff like that, you'll see a ton of these. This is a dropshipping type of operation that used ChatGPT to write some ad copy for their soap. This is a real estate listing written by ChatGPT: "regenerate response". So obviously I think that these systems are already here. You might not see that much harm in spam. It's annoying.
It's low-quality crap that everybody says they don't click on, but obviously people do click on it, because it's often some of the most popular content on social media. Honestly, on a personal level, I think that this might one day make large platforms like Facebook basically unusable for people. If you have infinite content, you don't know what is written by humans, you don't know if the pictures of the people are real, if the accounts are real, and you know that they can use ChatGPT to answer questions or direct messages. I can't show you this because we haven't reported it yet, but last week we found a network of 300-something Facebook pages that are running thousands of ads on Facebook, all basically using these systems. They'll create the images with Midjourney, they'll answer comments using ChatGPT almost instantaneously, they'll answer direct messages using ChatGPT. These people don't speak English as a first language, but they're able to produce English-language content. They'll use ChatGPT to create these clickbaity articles. So I can't show you that because we haven't reported it, but it's already here, they're already using this for these purposes. And it's kind of funny, because these clickbaity types of articles had been out of style for a couple of years. Facebook and Google kind of cracked down on this type of content. Google demonetized, or said they wanted to demonetize, clickbait websites and stuff like that. Facebook said that they cracked down on this type of content. We're seeing a resurgence of this now because it's so cheap to produce using these systems. You don't have to pay someone $5 an hour to write articles anymore. You can just generate them automatically and you can have an infinite amount of content. So I think it's gonna be a problem in the upcoming months and years for people who use social media. I don't know what it's gonna do.
It'll be interesting to see how the platforms actually react and try to moderate this new reality, but it's coming fast, that's for sure. But I wanna drill down today on one specific case where I see this already being used, and where I can see future potential uses of these technologies to cause some real damage in the world. So I wanna talk to you about the world's biggest scam. I started covering this scam two years ago. When I first realized how massive this scam is, it completely blew my mind. I had no idea that this existed. It's under our feet. It's a massive fraudulent industry that steals billions of dollars from normal people every year. And not many people are talking about this. Unfortunately, it's really hard to get the word out on this, and unfortunately, world governments aren't really acting either. So the scam is pretty simple on the surface. People end up on a trading website that looks legit. They're often promised that they're gonna be making 200, 300% returns. They're asked to put in a small amount of money to invest, usually $250. When they do invest, when they send money, they're shown a trading platform that looks real and shows their investment. It's all fake. The people running this scam can show the user absolutely anything on this. So the person usually puts in more money, more money. They have an investment agent that calls them basically every day. Some of these victims form relationships with these agents. They feel like they're almost a member of their family. I've seen cases where this goes on for a year. So they'll call the person: how are your kids doing? Okay, cool, yeah, you should invest, because look, you made a lot of money last month and you should put in more money, more money. And these people get absolutely cleaned out. They lose everything. So at Radio-Canada, we talked to Joanne Gantzi, a woman from the Outaouais region. She lost $250,000, basically all her home equity, gone to this scam.
We spoke to Fernan Laroche. He lost a million dollars, his entire life savings completely wiped out, because he thought he was investing in Bitcoin. I get that most people here would probably not fall for this type of scam, and you're probably thinking these people are kind of dumb for clicking on this. This guy's a psychologist, he's not dumb. Maybe he has a bit of trouble understanding how the internet works or how crypto works or stuff like that, but a lot of people are getting wiped out by this. In Canada, just last year, the amount of money lost in these frauds doubled, so it was almost $100 million lost. And this is an understatement. Even the government admits that only 5 or 10% of victims actually report being a victim of this type of scam. The average victim loses $70,000, so that's a considerable chunk of change. If you're interested in this scam, I invite you to watch this documentary from the BBC. It was made with Simona Weinglass, who we see here. She's an absolute beast of an investigative reporter. She works at the Times of Israel, in the country where this fraud basically started. In this documentary, she uncovers one of these criminal gangs. She finds out that it's being run by an ex-politician in Georgia. It's really wild stuff. I really, really recommend watching it. But just to give you a quick idea of what we're talking about here, Simona describes it as the Uber of fraud. And it really is that: it's applying the lessons of big technology to fraud. A lot of people, when they imagine fraudsters doing this type of stuff, imagine a couple of guys in a basement with hoodies on, stealing these people's money. It's not that. It's an industry. We don't know exactly how many, but there are between 10 and 15 of these criminal gangs running these scams, and they have dozens of call centers each. Each call center has like 100, 150 employees. They're salaried people. And this is a business.
And what they've also now started doing is they've basically Uberized this system. So it's kind of like fraud as a service, basically. How it started was in Israel. They used to run these call centers, running what's called a binary options scam. Basically it's a way of betting on the stock market. You bet that the stock goes up or the stock goes down, and you either win or you lose. Most people lose money. What happened is these guys realized that most people lost money, so that's good for them. And if people win money, you can just not give them their money. So that's how this scam started: when people made gains, they would stall. They would say, oh, we need KYC, know your client. You'll need to pay income tax. You need to pay the Israeli income tax on your gain, stuff like that. And if all else failed, they could just close the website and start a new one. It costs nothing to start websites. In 2017, Israel banned binary options, so they had to find something else. They started doing crypto. If you've ever heard of the ICO boom in 2017, 2018, a lot of these guys were into that. But they also developed a system where they sell this fraud as a service. Instead of running the call centers themselves, for a fee and a cut of the profits, they'll help you set up a call center. They'll help you train your employees. They'll help you launder your money, show you how to do it in a way that is safe, or safe-ish. And so in the past five, six years, we've seen a lot of these call centers popping up pretty much everywhere in Europe. We've seen some in Lithuania, Poland, Ukraine. There were a couple last month in Malaysia that got busted. There are even some in South Africa. So it's all these criminal operations run remotely like that. It's basically a massive, massive, massive industry.
On this live stream from OffshoreAlert, Ken Gamble, who's a fraud researcher in Australia, was basically already saying, three or four years ago, that this fraud had escaped most governments' ability to do anything, because the money travels so fast. It goes from bank account to bank account to bank account. It's almost impossible for local police to actually investigate these crimes. One of the really messed up things that these criminals do is this: when you sign up to the platform, they'll ask for KYC, because they act like they're a legitimate financial service. So they'll ask for photo ID and all your personal information, date of birth, social security number. They don't care about doing KYC, obviously, they're criminals. But what they will do is use that information to open a crypto account on Kraken or Crypto.com and then use that to launder money. And once they're done with it, they'll actually sell the personal info on the black market, so they'll make even more money with the info that they stole. It makes it almost impossible for local police officers to do anything. If they manage to track the crypto payments to the crypto exchange, well, the person who cashed out is basically someone whose info was stolen by the same criminal group. So it makes it exceedingly hard, and most countries don't really do anything. The US is one exception: these criminals don't accept US clients. If you go on the websites using a US IP, they're geo-fenced, you can't get in. To sign up, you have to swear that you're not a US citizen. They're extremely afraid of the US. One reason is that a couple of years ago, one of these crime bosses went to the US to visit her family and got picked up. She got 22 years in US prison. So they're very afraid of the US. Germany is doing some pushback on this fraud. There have been a couple of crackdowns on call centers in the past two years in Europe.
There was one a couple of months ago in Lithuania and Serbia, I think. But most countries aren't doing anything. I can tell you that Canada is not doing much of anything about these frauds, unfortunately. The victims we spoke to are talking to a wall. Basically they lost their life savings and they're being told by the police that there's nothing to be done, that they can't do anything. Why am I talking about this? Why did I talk about this for the last 15 minutes when I'm supposed to be talking about AI? It's because this whole sophisticated industry of fraud has one major entry point. I described the back end. The front end looks like this. It's Facebook ads, basically. These Facebook ads usually feature a celebrity. 90% of the ads have Elon Musk, but when they target Canada they also use local politicians to try to get people to click on these ads. They use local celebrities. I saw one two days ago with Seth Rogen. When you click on the ad, you get sent to a fake newspaper article that says, oh, Elon Musk invented a new way to make money, click on this. And then you're sent to the actual fraud. Now, I know, once again, most people here would never fall for this, and you probably think it's ridiculous. This works. This works extremely well. It's really hard to pin down how much money these criminals make, but it's upwards of 10 billion a year, probably more. One of the criminal groups makes one or two billion dollars a year, and it's not the biggest one. So it's extremely massive, and the entryway into these scams is really simple. It's spam, basically. It's really low quality. Most people would read this and realize that it's complete crap, but unfortunately it works. And we often say, well, if you sign up for this, you should have done your own research. You should have tried to at least know what you're investing in.
Here's one of these crime products, called Quantum AI. Let's say you're someone who sees this and you're like, oh, I wanna look at what this is. So you Google Quantum AI, and almost every single result on the first page is an ad bought by these scammers. There's actually one real result on the first Google page. It's for a Google product that is also called Quantum AI. So these people beat Google at their own SEO on their own platform. And how I'm tying this into AI is that these ads are not run by the criminals themselves. They're run by affiliate marketers. These crime syndicates don't run their own ads because they want plausible deniability. So they'll pay affiliates, often people in countries like Bangladesh, Pakistan, Vietnam, who will run these ads on Facebook for them. This is really lucrative for the people running the ads. This was on an affiliate marketing website yesterday. The payout is $1,250. So if you manage to get someone to click on your ad and make a minimum deposit of $250, you get $1,200 US. That's a lot of money in a lot of countries. And if you can get a lot of people to click on these ads, you're gonna make a lot of money. These AI systems like ChatGPT are a boon for these people. Oftentimes they don't speak English. They don't know the local culture or political context, and they want to create ads that will make people click on them. So already they can use systems like the Bing AI or ChatGPT4, which can now browse the internet, to ask, oh, what are the biggest news stories in Canada today, and craft ads with that. They can use, of course, ChatGPT to create the actual ad copy. The only thing holding these scammers back was that it cost money to create these ads, right? They would have to pay people on Fiverr to write these fake news articles. That costs money, that takes time. Now they can just do it automatically with these systems.
They can create images, of course, and we're already starting to see these scammers use the AI systems to run their scams. This is one I saw on Twitter a couple days ago. This is clearly written by ChatGPT. You can just sense it. ChatGPT has a certain style to it. So they're already starting to use this to write these ads, to write these articles. I saw one yesterday. I don't know if it'll play. No sound, okay. So it's a deep fake of Tucker Carlson. They used Eleven Labs, which is a vocal cloner to clone his voice, and he's talking about how Elon Musk created this wonderful new way to make money. Once again, I don't think most people here would fall for this. The lips are kind of off, but these things work. They've been working for years, and this is just another tool that they're gonna be using to ruin people's lives, basically, and basically the last frontier that these scammers had was it costs money and time to create the content to get people to click and sign up to these frauds, and that barrier is now gone. Just to finish up, it's not just using these tools to create content for people to click on. They're already using AI in their ad campaigns, so crypto is kind of down. There's less of a buzz on crypto lately, so they've started, of course, including ChatGPT and OpenAI in their ads because that's the new hotness. People wanna invest in this. So yeah, so basically I'm personally kind of worried about where this is going. It's already a huge problem. It's already a problem that is exploding. What we're hearing is these criminals are targeting Canadians specifically. Canadians are a prime target for these frauds because authorities aren't very aggressive in pursuing the people behind this, and Canadians, Canada is a relatively rich country. And so our citizens are being targeted by these massive criminal gangs, and I just wanna finish on this. 
Often with fraud, we're under the impression that, well, it's too bad, but it's money stolen, and you can always make more money or whatever. But these are criminal gangs. They use this money to finance corruption in certain countries. They're paying off authorities to run their call centers. It's organized crime, so there's always the presence of violence in these circles. So the money that's being stolen from, basically, Canadian retirees is going to these criminal gangs and financing criminal activities overseas. So it's a huge problem, and unfortunately, these systems have basically given them a huge gift to continue their operations and nuke people's lives. So that's all I had to say about that. Thanks.

Good. Okay. Thanks. Thanks for coming out. Oh, yeah. We're going to talk about the excuses. I don't hear it. Just a heads up for anyone who is in here and does not speak French: the next talk and the following two after this are in French. So we're going to start with Marc Dovereaux. Welcome, Marc. Marc Dovereaux has been head of technology at a European internet operator, and CISO and cyber architect in government and private entities in France and in Quebec. He currently works in the military field. Welcome.

Thank you. So we're among ourselves, we're going to be able to speak French. I'm from France. And we're going to be even more among ourselves, because we're going to talk about networks. In general, French plus networks plus the hour, everyone leaves. Thank you for staying. So as was said in the introduction, my name is Marc Dovereaux. You can find me on Twitter, Marc Dovereaux, or by my first and last name. I'm an architect responsible for cyber, I do cyber robotics, and I was lucky enough to work in telecommunications. That's why I'm here to talk to you about networks. And I work in an entity, a structure that builds this kind of little 70,000-ton baby.
There are only a few companies in the world that do that. And you can imagine that there is a lot of digital equipment in a nuclear submarine, in an aircraft carrier, in a frigate. And so we're interested in everything to do with networks, overlays and switches. So I'm going to give you a few definitions, because I asked my daughter the question. For her, a switch looks like that. An overlaid, my builder was very interested. And for her, the network was very much Facebook and TikTok. So we're going to define some terms. The two terms that keep coming back are overlay and underlay. If you've looked at Kubernetes or that kind of thing, we talk about overlays. Well, in fact, it's very simple. It's a network that is superimposed, as its name suggests. And the idea is to have a network that is interconnected over underlying links. You're going to have physical links on a physical infrastructure. So the underlay is the physical infrastructure. The overlay is the logical infrastructure. With VPNs, we were already in the same concept. In short, it's a virtual network that masks the infrastructure of a data center, since it's data centers we're going to be interested in. I'm talking here about a data center that you can find in any company, that you can find in a ship, why not, since I showed them to you. You're going to find it at our friends the GAFA, with much less public documentation, or at national or international telecom operators. In general, national or international operators rely on network equipment and use technologies from the vendors ranked by our friends at Gartner. So here, I took the 2020 one, and you have roughly the names you know, from the big manufacturers. I didn't know much about Canadian operators. On the plane, I looked them up and I picked Bell.
You see that Bell communicated with Cisco about their adoption of VXLAN. So what I'm saying about the European market is true of the North American market too, of course. I'm going to be interested in one particular technology, which is VXLAN. You saw that there are many protocols that allow you to do encapsulation, because in the end that's what we're talking about, and to do network opening. And why am I talking about VXLAN? Because it's a standard, there's an RFC that describes it, and the standard was designed by big players: Arista, Broadcom, Cisco, VMware, Intel, Red Hat. So everyone is doing VXLAN. Most of the manufacturers use this technology. It's based on UDP. And why am I focusing on it? Well, it's a really good target, you'll see. It's a good target also because you're going to run into this technology everywhere. If you have switches and routers, you saw Cisco, etc., there's VXLAN in them. Your network admins run into a very stupid limitation sooner or later: there are 4,096 VLANs, not more. So when you're in a data center, if you hope to have more than 4,096 unique networks, well, VLANs aren't made for that, so you have to go through VXLAN technologies. So in the routers and switches, you have it. In the hypervisors, you have it. If you do VMware, as soon as you put in NSX in its T version, by default it's VXLAN that is used to transport the information, and as you saw, VMware is one of the RFC authors. In containers, that's interesting too. As soon as you do Kubernetes, you have CNI, and in CNI we use networks like Flannel or others. What do they use as an overlay? VXLAN. If you do software-defined networking, Cisco has a great solution for operators. You have a central console, okay? And I can manage 80 or 200 routers, 600 switches, 2,000 switches with one console. It's VXLAN that's behind it. And there are nice publications on it.
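To put numbers on the VLAN limit mentioned above, here is a minimal sketch. It assumes the standard field widths (a 12-bit VLAN ID in the 802.1Q tag and a 24-bit VXLAN Network Identifier as defined in RFC 7348); the constant and function names are illustrative only and do not come from the talk.

```python
# Sketch: compare the 802.1Q VLAN ID space with the VXLAN VNI space.
# Field widths are the standard ones; all names here are illustrative.
VLAN_ID_BITS = 12  # width of the VLAN identifier in an 802.1Q tag
VNI_BITS = 24      # width of the VXLAN Network Identifier (RFC 7348)


def id_space(bits: int) -> int:
    """Return how many distinct identifiers fit in `bits` bits."""
    return 1 << bits


vlan_ids = id_space(VLAN_ID_BITS)  # 4096 possible values
usable_vlans = vlan_ids - 2        # IDs 0 and 4095 are reserved
vni_ids = id_space(VNI_BITS)       # 16777216 possible values

print(f"VLAN IDs: {vlan_ids} ({usable_vlans} usable)")
print(f"VXLAN VNIs: {vni_ids}")
```

The gap between roughly four thousand and roughly sixteen million segments is why large data centers move from plain VLANs to VXLAN.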
If you want to see the first afros of Cisco, you're going to see this presentation on the Sdn Cisco 4CV1 solution presentation. I'm going to show you three interesting little points today. The architecture is quite easy to understand. You have two levels. First, my diagram is a little inverted: the underlay, which is what sits below, is what you have on top here, and the overlay, which I put at the bottom of my screen, is where you're going to see your clients. Those were the VLANs back in the day. So here, I have my green client. Here, I have my blue client. Here, I have what are called VTEPs, VXLAN Tunnel Endpoints. They are virtual termination points, and they're going to do the encapsulation. And in fact, the principle is really simple. Here, I have Ethernet traffic. On this side, the client infrastructure doesn't even need to know TCP/IP. The Ethernet frames are encapsulated on the underlay network. In what? In TCP/IP. Really simple. So here, you have the frame format. It's a little Wireshark capture. You may not see it very well from afar. But what's interesting to see is that it's IP traffic. You have a header that corresponds to a UDP header, in which you have a port that corresponds to VXLAN. And then what do you have? You have the Ethernet frame that corresponds to what's on the client network. Okay? What's interesting is that we can place these VTEPs on switches or on routers, a bit of both, since they deal with both TCP/IP and Ethernet, and all the way down into containers, okay? That is to say, when you do containerization, the network part really goes down to the container. So it shifts the boundaries of responsibility between your network administrators and your VMware administrators. And I see people in the room going yes, yes, yes, yes. Because in general, the VMware people activate that, and the network people are not aware of it.
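The encapsulation described above (outer IP, a UDP header on the VXLAN port, then the original Ethernet frame) can be sketched with the standard library alone. This is a minimal illustration of the 8-byte VXLAN header from RFC 7348, not the speaker's lab code; the function names and the dummy inner frame are assumptions made for the example, and the outer Ethernet, IP and UDP headers that a VTEP would add are left out.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned UDP destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given 24-bit VNI."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) followed by 24 reserved bits.
    # Word 2: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Return the UDP payload a VTEP would send: VXLAN header + inner frame."""
    return pack_vxlan_header(vni) + inner_frame


# Dummy 14-byte inner Ethernet header (destination MAC, source MAC, EtherType).
inner = bytes.fromhex("ffffffffffff" "020000000001" "0800")
payload = encapsulate(100, inner)
print(payload.hex(), unpack_vni(payload))
```

Note that nothing in the header authenticates the sender: the only things a receiver can check are the UDP port and the VNI value.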
And suddenly, we end up with two teams who both manage the network and do jobs on which they have to communicate. So I'm going to set up a little lab to show you all that. You can really do it in your garden, in your garage. I did it on vacation. My wife was happy, we were in the mountains. I took my PC and a few VMs, okay? They represent the components that I showed you on the previous diagram. Know that you can find free software versions from all the major vendors that I listed earlier. And the lab is very simple. I simulated two VNIs, a blue one and a green one. And I have virtual machines that represent the VTEPs. On top of that, there can be a central controller, but it's not necessary to use one. So, the machines that are here represent the different clients, and the virtual machines that are here represent the different VTEPs. And we're going to have fun with that. Level 101. For all those who have done networking, in general, what do we start with? To have fun on the network, at the 101 level, we start by doing all kinds of things. Why? Because it's been 20 years. For 20 years, we've been doing ARP poisoning, okay? For 20 years, we've been doing session hijacking, because once we've done ARP poisoning, we can go and intercept TCP sessions or break TCP sessions. We can do man in the middle. That is to say, with ARP spoofing, I pass myself off as another machine. I get TCP traffic on the left, TCP traffic on the right. I tell the other machines that I'm the machine that was really there, using ARP messages, and it works very well. Rogue DHCP: I pass myself off as a DHCP server. I send DHCP messages to machines. I can push proxies, force proxies, etc. All the broadcast protocols, hello protocols, multicast DNS protocols, etc., all of that works, okay? CAM poisoning: we always have tables in the physical switches, with a limited number of entries, that allow us to associate the port of each machine with the MAC address of each machine. There is a table in the switch.
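The MAC (CAM) table mentioned above, and why its limited size matters, can be illustrated with a toy model. This is a deliberately simplified sketch, not how any particular switch behaves: the class name, the capacity of four entries, and the "do not learn, then flood" behaviour when the table is full are all assumptions made for the example (real switches add aging timers and vendor-specific overflow handling).

```python
class ToyMacTable:
    """Toy model of a switch MAC (CAM) table with a fixed capacity."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries = {}  # maps MAC address (str) to port number (int)

    def learn(self, mac: str, port: int) -> bool:
        """Record mac -> port; return False if the table is already full."""
        if mac in self.entries or len(self.entries) < self.capacity:
            self.entries[mac] = port
            return True
        return False  # assumption: a full table simply stops learning

    def forward(self, dst_mac: str) -> str:
        """Return the forwarding decision for a destination MAC."""
        port = self.entries.get(dst_mac)
        # Unknown destinations are flooded out of every port.
        return f"port {port}" if port is not None else "flood"


table = ToyMacTable(capacity=4)

# Four frames with made-up source addresses arrive on port 1 and fill the table.
for i in range(4):
    table.learn(f"02:00:00:00:00:{i:02x}", port=1)

# A legitimate host on port 2 can no longer be learned...
learned = table.learn("aa:bb:cc:dd:ee:ff", port=2)
# ...so frames addressed to it are flooded, where any port can observe them.
decision = table.forward("aa:bb:cc:dd:ee:ff")
print(learned, decision)
```

In this model, filling the table is enough to turn unicast traffic for new hosts into flooded traffic, which is the effect CAM poisoning relies on.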
All of that works again on a technology that is current, okay? So all the good old recipes work again. But on switches, there are lots of technologies that allow us to prevent that, okay? On a current switch, configured correctly, ARP poisoning doesn't work anymore, because there is ARP inspection that allows us to avoid it. When you buy a solution from the major vendors that I showed you, and I did my tests last year, it's not yet implemented, okay? So all the good old recipes are there for you. Take out your course notes from 20 years ago, it still works. Now, there is a particular concept in what I showed you. Remember the principle of the VTEP: here, I have Ethernet frames, and I'm going to send them over TCP/IP to another VTEP. The question is, which one do I send to when I have several? When there are two, one sends to the other: if it's not here, it's over there. If you have, like Google or a national operator, 1,000, 2,000, 10,000 of them, then when you have a frame and you want to send it to a correspondent, there are several ways to do it. First, the static solution. You can imagine that if I have 2,000, it doesn't work. Second, the multicast solution. For those who still remember their networking, with multicast I have a TCP/IP address, well, UDP, anyway it's an IP address, I send to it, and it reaches several machines at the same time. It works on modest-sized networks. And the third solution is the one that is the most used. Why? Because operators love BGP. You have done some in the CTF, I think, at NorthSec. They love BGP. So they used BGP to solve this problem. I will introduce BGP to you later, and I will introduce you to something else here too. Who knows this one? Yeah, it's Scapy. Scapy is great, I recommend it. It's a French guy who made it. You have it in all the good distributions. It allows you to forge frames, and it's really simple. The language is easy to understand. I send a frame, it's an Ethernet frame.
I can forge a source, I can also forge a destination. So here, it's the Ethernet part, in which I encapsulate the IP, in which I encapsulate the UDP (the slash means encapsulation), in which I encapsulate the VXLAN, in which I encapsulate an Ethernet, an IP and a UDP frame. That's the great classic of a VXLAN packet. Okay? With this little jiu-jitsu, you can see that wherever you are on the backbone, you can send messages to VTEPs. If you are on the right port, which is destination port 4789, the VXLAN port, well, in fact, your VTEPs will accept it. That is to say, no security has been implemented on the underlay part. You forge a source IP and then you can forge frames like that, no problem. You have a very good article here that explains all these techniques, which are also little known. What you have to remember is that when you are in the underlay, there is no security. BGP is not as complicated a protocol as all that. It is TCP between the VTEPs, and they send each other messages: I know this address, you know that address, and there is a keepalive. To keep it simple, it works like that. All the operators base their VXLAN on this. They call it EVPN. So we will try to go a little further than what you already know. What do we start by doing in general? Well, the DoS. The DoS is so easy: here I am a client, so I buy a connection from operator X, I generate my source and destination addresses, and I send them into my network. If it's a good operator, everything is fine. If it's a small operator who has done things less correctly, or who has just read the documentation and said "I followed the doc", the attack works well. In fact, you saturate the BGP process, BGP falls, and when BGP falls, everything falls. You might say to yourself, it's me who made a mistake. So I contacted the vendor, I said, sir, I have a little problem. He said, yes, there is a problem.
There is a problem because BGP should not go down when I forge a frame. If you run BGP on an operator router, there are features that let you avoid that. But if you look at the Cisco documentation, for example, the limit on learned entries has always existed in BGP for IP addresses, but for MAC addresses in EVPN it does not exist. So there is no solution there. What is the solution, then? For the big players who have understood the problem, it's static management: once you assign a MAC, you can no longer move it. That's a bit annoying when you run clusters.
Now I'll show you what bothers me the most. The holy grail back in the day, when I was doing networking and didn't have white hair yet, was VLAN hopping: I'm in one VLAN, and I craft a frame to jump into another VLAN. We tell ourselves: no, VXLAN is recent, that can't work. It would be too serious if I could just forge a frame and end up in another VNI. It's science fiction. Well no, it's not science fiction. We're going to do a bit of magic. Hop, hop, hop, hop.
So I built my frame. I forge it as I showed you with Scapy, and what we do is force one VXLAN network ID, a VNI, to another VNI. The logic would be that the switch carries it to the other members of the same VNI and that's all. However, there is a particular case where this frame ends up on the other side. Let me show you the magic. The magic isn't all that magical. What does Scapy say? It says what I just told you out loud. I send to an IP address I should not have access to, because it ought to be an underlay address. But in the overlay, if a switch or router IP address is reachable, I send it the packet saying: here you go, I'm sending you VXLAN. Except that I'm in the overlay. I'm not in the underlay; I shouldn't be able to do that. You're going to laugh.
On the vendor I tested, it wasn't checked. Eight hundred pages of documentation, I promise you. So I picked up my phone again, and my emails. "I have a second thing to explain to you. Here, I can do this." They analyzed it. "There is a problem." So yes, there is a problem.
At this point you could say: fine, that's enough. Why was an IP address reachable here? We'll put an access list in place, filter all the IPs so the port can't be reached, and it's over. But there is this great thing called IPv6. Everything that doesn't work in IPv4, you switch to IPv6 and it works. It's extraordinary. We have years ahead of us thanks to IPv6. What's great is that on this vendor, according to its documentation, IPv6 is enabled by default, without you having to configure anything, with addresses generated from the MAC addresses. That's standard in IPv6, so they are easy to find. Why is it enabled? Because BGP needs it: there is a great BGP feature called BGP unnumbered, and it uses IPv6. In short, remember that IPv6 was already open by default without any configuration.
So I had one confirmed problem, and I went back a third time. Look, there is a problem. The vendor had said: yes, there is a problem, we'll fix it. Now there's IPv6, and I never enabled IPv6. And the third problem is that along the way I did something funny. Who knows a bit of IPv6? Is there ARP in IPv6? There is no ARP in IPv6. There are discovery protocols, but no ARP. So I did something nice: IPv6 in which I encapsulate an IPv4 frame that does ARP. And what had the vendor's developers assumed? For them, if there is ARP, it's IPv4. So I crashed a process. I didn't do it on purpose, but I crashed a process, and when you look at the doc, it's a process that controls all the routing. So I get a nice core dump, and there is a problem. The developers didn't expect this encapsulation.
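To illustrate why addresses "generated from MAC addresses" are easy to find, here is a small sketch of the modified EUI-64 derivation of an IPv6 link-local address. It assumes the device uses that scheme (RFC 4291, Appendix A), which is what the talk suggests; many operating systems use other schemes by default. The function name is hypothetical.

```python
import ipaddress


def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Derive the fe80::/64 link-local address that modified EUI-64 yields for a MAC."""
    octets = bytearray(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC to get a 64-bit interface ID
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80") + bytes(6)  # fe80::/64
    return ipaddress.IPv6Address(prefix + interface_id)


print(link_local_from_mac("00:11:22:33:44:55"))
```

Under that assumption, knowing a device's MAC address is enough to know its link-local IPv6 address, with no scanning needed.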
On the other hand, as an offensive user, you say: yes, but that's silly, we're blind. I inject the frame, it gets through, it is regenerated properly on the other side, everything is fine. But I'm blind. It would be nice to get the answer. If you're on the offensive side, you'd like that. I'm more of a blue team person; that's why I say "you", because it's not me. Not at all. It would be very nice if we could get the frame back. In fact, we can. What you have to do, still in IPv6, is change the IPv6 source. Let me explain. I say that the Ethernet packet was emitted by a certain IPv6 address. What happens is that the switches, routers and VTEPs respond to the machine that asked the question. So if the IP address I used here happens to exist and be routed on the internet, which is the worst case of bad configuration, I can get the frames back. So, the answers.
So, 15 days of vacation, not wasted, because I had fun. Results: DDoS on BGP, VXLAN hopping over IPv4 and over IPv6, because of poor isolation of the underlay and because IPv6 was enabled without configuration. Okay? Everything is in the process of being fixed. As soon as it's published, I'll give you all the details missing from this presentation, especially the name of the vendor. But given all the press I showed earlier, everyone is jumping on VXLAN right now. And it's so new that, across all the vendors, there are issues everywhere. So go ahead. There is work to do.
Conclusion. Be wary of your hosting provider, but I think you already are. Of course, the very big ones have no problem: they have redeveloped most of the protocols, and so on. But if you have, for example, an intermediary between a very big hosting provider and the applications you use, they'll say: well, am I going to use VXLAN?
I don't know whether I'm using VXLAN, I just ticked a box in VMware to enable the dynamic VLAN feature. In fact, that's VXLAN. Or: I'm just running Kubernetes and I used the default config. That's VXLAN. So everything on the customer side falls into this trap. The vendor documentation is not precise about everything I've told you, so you need experts, and the people on the blue team should look at all of this. I haven't talked about the CVEs. When I did my tests, some of them came back to BGP, because these are old programs written with old technology. It's a project called FRRouting, which was derived from something called Zebra, which is very old and has a lot of memory leaks, that kind of thing. Now the goal is to go and look at what's happening there. With that, I thank you, and I'm listening if you have any questions.
I'm checking whether there are any questions on Slido; sometimes it takes a little time. But if someone in the room wants to ask a question, don't hesitate.
No, they stayed on UDP, because you don't wait for the answer, so it goes faster, and because Ethernet is broadcast-based and didn't need an answer. So for me, that wouldn't be part of the solution. The only thing I see, and I've seen some RFCs along those lines, is people saying that VXLAN is a security hole. But I don't agree, because when you read the VXLAN RFC, there is a security chapter telling you that it is not a secured protocol, like any other network protocol, so they were very clear about it. Some people say we could encapsulate it in protocols that handle security, and then we'd fall back on TLS, IPsec and that kind of thing. On the other hand, operators on this kind of infrastructure are after performance, so everything related to encryption is disabled in order to reach multi-gigabit bandwidth.
Any other questions? Thank you very much.
Thank you very much. We're going to start with the next conference.
It's going to be in French, but I have the bio in English and I'm going to improvise with that. Charlie Bromberg, a.k.a. Shutdown, is a penetration testing team leader in the south of France at Capgemini. He specializes in Active Directory. We will have fun with Active Directory, perhaps. He is the author of The Hacker Recipes, creator of Exegol, and of many other open source projects and tools. I'm very excited to welcome him. You have a minute? All right.
Do you hear me well? Yeah, thanks. Awesome. Hi everyone. I'm going to give this talk in French. It's going to be 25 minutes, so it'll be a sprint. I'll start right away. The title of the talk is "Roses are red, violets are blue, S4U bamboozles me, U2U too". And for those who know Active Directory, I think it bamboozles you as well. Show of hands: who knows Active Directory, who works with it more or less every day? Yeah, that's a good part of the room. We're going to talk about AD, about concepts a bit more advanced than usual, and in particular about a few attacks that came out in 2019, 2021 and 2022. At the bottom right of this slide and the next two you have the link to the slides, so feel free to grab them. It will be easier than taking photos of everything, especially since there are a lot of slides.
So the table of contents is as follows. First, we'll recall a few Active Directory and Kerberos concepts. Then we'll talk about Kerberos delegation. Again, show of hands: who has already dealt with Kerberos delegation? Who was disgusted at having dealt with Kerberos delegation and understood nothing? Welcome to the club. Kerberos delegations rely on a few extensions and internal mechanisms called S4U, Service for User. So we'll talk about those.
Then there is another mechanism, a little less known, called U2U; a little less known and much less understood. After that we'll put things together to build some attacks. We'll cover five attacks. A bypass of Kerberos constrained delegation without protocol transition. Yes, it's a long name; we can thank Microsoft. Then privilege escalation with S4U2self. UnPAC-the-hash. SPN-less RBCD, for those who didn't know that was possible. And the Sapphire Ticket, which is an evolution of the Golden Ticket. OK, let's go. Now is the time to note that down somewhere.
I've already been introduced, so I won't do it again. Basically, I live in the south of France, so I'm a bit under the weather because of jet lag. But don't worry, I'm no longer contagious. Feel free to come and ask questions. For those who would like to follow me on social media, the various links are here.
OK, so first, Active Directory and Kerberos. Active Directory is a set of services. When we talk about Active Directory, we generally mean AD DS, Active Directory Domain Services. AD DS is itself a set of domain services in which we find all sorts of protocols and services that let users, groups and workstations work together, with GPOs, rights, fine-grained permissions, device configuration, and so on. It is a set of services found almost everywhere in companies around the world, and those that don't use it are generally either companies so small that they don't need it, or companies that chose to go open source and bitterly regret that choice today.
We then find other services that integrate rather well with AD DS, notably AD CS, Certificate Services, which is Microsoft's PKI implementation, Federation Services, other services, Azure AD, and so on. So overall, Active Directory as a whole is a set of services that is ubiquitous; it really is the leader in enterprise network management. For authentication within AD DS, there are two major protocols: NTLM and Kerberos. Basically, what to remember is that NTLM is garbage, straight in the bin. Kerberos is better, but out of the box it isn't perfect either, so you have to be careful. Plenty of people think Kerberos is better and that out of the box it's perfect. It isn't. There are plenty of attacks on Kerberos too. Many more than on NTLM, in fact, in terms of quantity. That doesn't make it more dangerous, but Kerberos is so much less understood than NTLM that it is also a dangerous protocol, like any authentication protocol that acts as a major pillar of an Active Directory. The major difference between NTLM and Kerberos is that NTLM works with keys, with password hashes, while Kerberos works with tickets. For those who know Kerberos a bit better: at the very beginning of Kerberos there is a step called pre-authentication, and that relies on the password hash too. Just saying. When I mentioned that attacks on Kerberos outnumber those on NTLM, here is a fairly exhaustive list, I think. On NTLM, we have NTLM hash capture attacks, if we can call them that, NTLM relay, and pass-the-hash. And on Kerberos, we have a whole bunch of them.
For these attacks I made a little diagram on the right, which you probably can't read; if I can't read it myself, you won't be able to either. It's available on The Hacker Recipes, though, so feel free to look: it's a small mind map that helps you see where the various NTLM and Kerberos attacks sit. We are going to talk about some of these attacks, notably delegations and S4U abuse, which sit towards the end of the Kerberos flow, UnPAC-the-hash, which is a bit of a crossover starting from the top of the Kerberos flow, and others.
Now let's talk about Kerberos delegations. The day I decided to dive into Kerberos delegations, I bitterly regretted it, because quite honestly it's a bit complicated: some things come from base MIT Kerberos, and some things were added by Microsoft to make it all even more interesting and exciting. I'm talking about them today not because I particularly like Kerberos delegations, but because I've worked on them a lot, so I think I know them a little. And I think it's important that you can save yourselves the ten or twenty hours I spent reading blog posts I never understood, and retain a few key facts about Kerberos delegations.
There are three main types: unconstrained, constrained, and resource-based constrained (in French: sans contrainte, contrainte, and contrainte basée sur la ressource). And within constrained there are two variants, because three wasn't enough. One is with protocol transition, which the Active Directory tools call "Use any authentication protocol". The other is without protocol transition, which the Active Directory tools call "Use Kerberos only".
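The delegation types listed above can be told apart from a few Active Directory attributes. The sketch below is an illustration and not something shown in the talk: the `userAccountControl` bit values and the attribute names in the comments come from Microsoft's documentation, while the function name and its parameters are hypothetical, and the attribute values are assumed to have been read from the directory beforehand.

```python
# userAccountControl bits, as documented by Microsoft
TRUSTED_FOR_DELEGATION = 0x00080000          # unconstrained delegation (KUD)
NOT_DELEGATED = 0x00100000                   # account is sensitive and cannot be delegated
TRUSTED_TO_AUTH_FOR_DELEGATION = 0x01000000  # protocol transition is allowed


def delegation_kinds(uac, allowed_to_delegate_to=(), has_rbcd_descriptor=False):
    """Classify an account from attribute values already read from the directory.

    uac                    -- integer value of userAccountControl
    allowed_to_delegate_to -- values of msDS-AllowedToDelegateTo (list of SPNs)
    has_rbcd_descriptor    -- True if msDS-AllowedToActOnBehalfOfOtherIdentity is set
    """
    kinds = []
    if uac & TRUSTED_FOR_DELEGATION:
        kinds.append("unconstrained")
    if allowed_to_delegate_to:
        if uac & TRUSTED_TO_AUTH_FOR_DELEGATION:
            kinds.append("constrained, with protocol transition")
        else:
            kinds.append("constrained, without protocol transition")
    if has_rbcd_descriptor:
        kinds.append("resource-based constrained (configured on the target)")
    return kinds


def can_be_delegated(uac):
    """False if the account carries the 'sensitive for delegation' flag."""
    return not uac & NOT_DELEGATED
```

Note the asymmetry the talk goes on to describe: the first two kinds are read on the account that delegates, while the resource-based kind is read on the account that is the target of the delegation.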
So overall we have several kinds of delegation that work differently and behave differently. Our interest is to understand them so that we can attack them and abuse their properties when we do an Active Directory pentest.
So if tomorrow we're on an Active Directory pentest and we come across unconstrained delegation (it's the first third of the slide; can you see my mouse? about half of it): unconstrained delegation allows a service configured for KUD, Kerberos Unconstrained Delegation, to impersonate any account on any machine. That's how it works by design; it's a feature. Then we have constrained delegation, KCD. It's roughly the same thing: a service configured for this type of delegation can impersonate whoever it wants on a set of services, but not on all services. For example, if tomorrow on an Active Directory pentest we compromise a service called App01 that can do unconstrained delegation, we can, for instance, impersonate a domain administrator on a domain controller, and then we've won. If it's constrained delegation, we can impersonate a domain admin on any machine in the list of machines the service is allowed to delegate to. And then we have resource-based constrained delegation, RBCD, which works the other way around. It is also constrained, but this time the delegation is not configured on the source machine of the delegation but on the target machine. You can see it on the right, or at least I imagine you can see it: we have a set of services, web, SQL, whatever, that may or may not delegate to a target service, and it's the target that is configured. What to remember is that configuring unconstrained delegation and constrained delegation, KUD and KCD, basically requires domain administrator rights, whereas for RBCD you just need rights on the target machine of the delegation. Those are the three major types of Kerberos delegation you will come across.
Unconstrained delegation works roughly like this. I won't go into too much detail; if you have questions we can come back to it later. I made a little diagram that explains roughly how it works. What to remember is that we can impersonate anyone, or almost, on any service. I say "or almost" because there is a protection, the Protected Users group, and an attribute called "sensitive for delegation". If you set those, a user placed in that group or carrying that flag cannot be delegated. It's a small protection. And if you put Administrator, you know, the native built-in domain admin account, in Protected Users, it can still be delegated.
For constrained delegation I have one slide with protocol transition and one slide without protocol transition. You need to understand the difference well, because we're going to abuse the properties of both. For constrained delegation to work, it relies on two Service for User mechanisms: one is S4U2self and the other is S4U2proxy. S4U2self lets you obtain a service ticket on behalf of someone else, any user, to yourself. For example, I am the service App01: I can send an S4U2self request to the KDC, the Key Distribution Center, and I'll obtain a service ticket on behalf of, I don't know, a domain admin, to myself. On its own that isn't much use, but this ticket serves as evidence to make S4U2proxy work. S4U2proxy is basically the same thing: we take the ticket obtained through S4U2self, issued to us as domain admin, we use it as evidence, and we ask for a roughly similar ticket, but this time instead of being addressed to us it is addressed to another service, a service we are allowed to delegate to. That's how delegation is done: we get a Kerberos ticket as another user, say a domain admin, for another service. And that's a feature.
With protocol transition, the key property to remember is that S4U2self lets us impersonate a user out of nothing. There's no need to know their password or password hash, nothing at all. The service App01 simply has to be configured for constrained delegation with protocol transition, and only a domain admin, or nearly, can configure that property. Without protocol transition it's basically the same, except that we can't do S4U2self. Well, we can, but it yields a ticket that isn't forwardable, so S4U2proxy won't work. So to do constrained delegation without protocol transition, we first have to receive a ticket from the user we want to impersonate. That's roughly how it works. As for resource-based constrained delegation, the mechanisms are about the same as for constrained delegation, with S4U2self and S4U2proxy, except that it's the destination service that receives the delegation configuration. So it's not App01 that we configure for delegation; it's App02, the target of the delegation.
I'm going to speed up because I'm running a bit late. The S4U extensions: S4U2self works as shown here. I'll let you look at the slide in more detail later, but once again, what to remember is that with S4U2self we obtain a service ticket on behalf of another user to ourselves. With S4U2proxy we present that ticket and obtain a service ticket, still on behalf of someone else, but this time to the service we can delegate to. A small aside: unconstrained delegation doesn't work like this at all. So for those thinking "ah, maybe there's something to be done there", let me stop you: it's not S4U2self, it's not S4U2proxy. Those extensions are specific to constrained delegation and RBCD. If you want to know more than what's on this slide, there's an article by Elad Shamir that is really great, very long, and explains a great deal about Kerberos delegations. I strongly encourage those who are interested to read it.
Next, U2U authentication. Let's do an exercise. Show of hands, who knows U2U? Yeah, that's normal. It's something that is very poorly documented and little used, I believe, but we're going to use it to death, so we first have to understand it in order to exploit it. Understanding it isn't very complicated in itself. We all agree that to obtain a Kerberos ticket, the target service of the ticket must have an SPN. Agreed? No, we don't agree. We can obtain a service ticket for a UPN, a User Principal Name, rather than a Service Principal Name. That means we can obtain a service ticket for a user and not for a service. Yes, it's possible. It has a few basic requirements. For example, the user's TGT must be included in the request, which at least proves knowledge of a secret. The sname of the request must also contain the UPN of the target user. And in the end, what do we receive? A slightly special ticket. Normally a Kerberos ticket is encrypted with the hash, the long-term key, of the user... no, sorry: for a TGT, it's the long-term key of krbtgt. Not with U2U: U2U returns a ticket protected with the session key of the TGT that was supplied in the request. That's roughly how U2U works. If some things escape you, no worries, we'll come back to them.
S4U2self plus U2U. It looks scary. Hang on, it's going to be fine. Breathe, here we go. We're going to mix the properties of S4U2self and U2U. They are combinable, they're compatible, we can mix the two. Reminder: with S4U2self we obtain a ticket on behalf of someone else to ourselves. But what happens if "ourselves" has no SPN, if we are a user? We use U2U. We get a ticket from another user to ourselves, ourselves being a user. That's roughly how it works, and it enables a whole series of very interesting attacks, some of which you may have seen go by on Twitter in recent years without being able to understand them. We'll look at them now, and if you did understand them, congratulations: it took me ages. I put a diagram on the right that explains roughly how it works. One, we take our own TGT. Since we're doing an S4U2self, we take our TGT, include it in the S4U2self plus U2U request, and we get back, in the end, a service ticket that is encrypted with the session key of the TGT, shown in yellow.
Now we'll talk about the attacks that abuse the mechanisms we've just covered: S4U2self, S4U2proxy, U2U. We'll see five of them: the bypass of constrained delegation without protocol transition, S4U2self abuse, UnPAC-the-hash, SPN-less RBCD, and the Sapphire Ticket. Of these five attacks, has anyone already seen at least one or two? Ah, not many. I hope you'll leave a little wiser; that's not an insult. I'm still a bit behind, so I'll pick up the pace. Then again, 25 minutes, 30, it's tight. Otherwise I'll be available afterwards for questions, if there are any.
Bypass of constrained delegation without protocol transition. Reminder: without protocol transition, the initial S4U2self, which would let us impersonate a user out of thin air, won't work; we are forced to receive a ticket from the other user. To bypass this restriction, we thank Elad Shamir, whose 2019 article "Wagging the Dog" explains a technique for it. The technique relies on the fact that S4U2proxy, you know, the second extension used in Kerberos delegations, produces a ticket that is always forwardable, and forwardable is exactly the requirement for S4U2proxy. So if we manage to do an S4U2proxy that produces a ticket resembling the output of an S4U2self, we can make the next S4U2proxy work. That may sound a bit weird. It's fine, you'll see, it will go well. The technique is called the RBCD trick; I don't know whether he calls it that or whether it's just me. Basically, App02 is configured for constrained delegation without protocol transition. We have compromised App02, and App02 can delegate to App03. So we want to compromise App03. However, we can't impersonate a domain admin without receiving a ticket from them; that's constrained delegation without protocol transition. What we can do is create or compromise another service, App01, and configure App02 for RBCD from App01. App01 can delegate to App02, App02 can delegate to App03, we control App01 and App02, and we want to compromise App03. Well, the RBCD step is a real S4U2proxy that yields a ticket as domain admin to App02. That ticket is forwardable, and it is exactly what a real S4U2self by App02 would produce: a ticket on behalf of someone else to itself, to App02. So by doing an RBCD first, which produces a ticket nearly identical to what a real S4U2self would produce, we satisfy the requirement.
That's a lot to take in, but I invite you to come back to this slide: normally all the information is on it. We can set up an RBCD, and we have the rights to do so. First we do the RBCD, and that RBCD unlocks the requirement that normally restricts KCD without protocol transition. So it lets us perform the delegation and, in the end, compromise App03. I've included a few exploitation slides for those who want to try it. Personally, I don't do Windows, so everything you see here is on Linux. If you want to know how to do it on Windows, I'm leaving on Sunday, don't talk to me. You have all the information here. I have no doubt you can read every detail from your seats. More simply, I invite you to get the information from the link I gave you earlier.
Attack number 2. I'm clearly not on schedule. S4U2self LPE: we're going to do privilege escalation with S4U2self. Imagine you have compromised a service on App03, but not as NT AUTHORITY\SYSTEM. You have compromised it as a Microsoft virtual account, for example an IIS AppPool or MSSQL service account; these accounts are a little different. Or as NT AUTHORITY\NETWORK SERVICE. These accounts can, and I won't go into detail, do what is called the TGT delegation trick: we can obtain a TGT, how to put it, by abusing the rights of these accounts. We still don't have full compromise of the machine, but we can use that TGT to perform an S4U2self. This S4U2self lets us impersonate whoever we want on App03. From there, you have compromised App03.
Normally you'd say: yes, but no, because if the domain admin is in Protected Users, that account can't be delegated. Well, yes it can, because S4U2self isn't limited by that kind of thing. Even if the domain admin is in Protected Users or flagged as sensitive for delegation, S4U2self doesn't care: the ticket won't be forwardable, but it will still be valid. So you can authenticate as this normally protected user on App03. And you get a systematic, non-patchable privilege escalation, since it's a feature, on services you have compromised as a Microsoft virtual account or Network Service. Likewise, I've included exploitation slides here. So if you want to test it, feel free; you can do it with the Impacket scripts. Don't hesitate.
Third attack, UnPAC-the-hash. Who saw UnPAC-the-hash on social media, in 2021 I think? Okay, keep your hands up. Those who saw the technique and didn't understand it, please lower your hand. So in the end that leaves few people, and that's normal, because this technique rests on quite a few mechanisms. So let's go through it together now, in a stress-free, pressure-free environment, and we will understand it. UnPAC-the-hash basically lets you recover the NT or LM hash of a given user through Kerberos authentication mechanisms. It's a feature. For it to work, you first need to do PKINIT. PKINIT is Kerberos's asymmetric pre-authentication mechanism: it lets you validate pre-authentication not with a long-term key, that is a password hash, but with the user's certificate. Once you have validated pre-authentication with PKINIT, the domain controller will generally include a blob called PAC_CREDENTIAL_INFO in the PAC. And in this PAC_CREDENTIAL_INFO you'll find the NT and LM keys, the NT and LM hashes, if we can put it like that.
However, when you receive this in your TGT, will you be able to decrypt the content? No, because the TGT is protected by the krbtgt key. And unless you are a domain admin, you don't know the krbtgt key; you can't have it. So the information we want is in this PAC, but the PAC is protected by the krbtgt key, which we don't know, so we can't decrypt it. What do we do, then? We request a ticket, via S4U2self, to ourselves, ourselves being a user. No need for an SPN and, in the end, no need to know our password, since we're doing U2U, and with U2U the ticket we receive is protected by the session key of the TGT we obtained just before. So how do we do it? First, we obtain a TGT with asymmetric pre-authentication, PKINIT. Second, we include that TGT in an S4U2self plus U2U request, and we receive a service ticket that contains a PAC. This PAC is encrypted with the session key of the TGT, and in it we find the PAC_CREDENTIAL_INFO blob, which contains our NTLM key. So you have a mechanism that takes you from an asymmetric certificate to the user's NTLM key, and you can see it as a kind of PKINIT backport to NTLM. I've included slides with a DACL abuse scenario: you are able to modify your permissions on a user, you add a certificate, and you run the mechanism I just described.
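The crux of the step described above is which key protects which ticket. The following is a deliberately simplified toy model of that one idea, as the talk presents it. It is not Kerberos: the "cipher" is a home-made keystream with an HMAC tag, the PAC is a placeholder string, and the real exchange involves more steps than shown. All names are hypothetical.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream derived from the key (NOT a Kerberos encryption type)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_seal(key: bytes, plaintext: bytes) -> bytes:
    """'Encrypt' and tag a blob with the given key."""
    body = bytes(a ^ b for a, b in zip(plaintext, _keystream(key, len(plaintext))))
    return hmac.new(key, body, hashlib.sha256).digest() + body


def toy_open(key: bytes, blob: bytes) -> bytes:
    """Open a sealed blob; raises ValueError if the key is not the right one."""
    tag, body = blob[:32], blob[32:]
    if not hmac.compare_digest(tag, hmac.new(key, body, hashlib.sha256).digest()):
        raise ValueError("wrong key")
    return bytes(a ^ b for a, b in zip(body, _keystream(key, len(body))))


krbtgt_key = os.urandom(32)       # long-term key of krbtgt: the client never has it
tgt_session_key = os.urandom(32)  # session key of the TGT: the client does have it
pac = b"PAC with PAC_CREDENTIAL_INFO (placeholder)"

tgt = toy_seal(krbtgt_key, pac)              # step 1: TGT obtained through PKINIT
u2u_ticket = toy_seal(tgt_session_key, pac)  # step 2: S4U2self + U2U service ticket


def client_can_read(ticket: bytes) -> bool:
    """The client only holds the TGT session key."""
    try:
        toy_open(tgt_session_key, ticket)
        return True
    except ValueError:
        return False


print(client_can_read(tgt), client_can_read(u2u_ticket))
```

In this model the same PAC is unreadable in the TGT and readable in the U2U ticket, purely because of the key used to protect each one.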
Fourth attack: SPN-less RBCD. For RBCD to work, in general, for those who have already done it, you need an SPN. Not necessarily, because one day James Forshaw looked at it, and on Twitter he said: wait, no, you don't necessarily need an SPN. Same mechanism: S4U2self plus U2U. With U2U, remember, there is no need for an SPN, since we can get a service ticket for a given user. I put all the information here, and I tried to popularize, or reshape, what was in James Forshaw's article, which is great and which I invite you to read; here you have a bit of a translation for us mortals. I've put some screenshots, but overall the mechanism is the following. One, we obtain a TGT. Two, we extract the session key from this TGT. Three, we request S4U2self plus U2U. A small peculiarity is that before doing S4U2proxy, we change the user's hash to the session key of the TGT obtained just before. Why? Because behind the scenes, when the KDC does S4U2proxy, it will want to decrypt the result of S4U2self plus U2U. That result is encrypted with the session key, but the KDC does not know that; it thinks it is encrypted with the user's key. So the two have to match.

And finally, the last attack: the Sapphire Ticket. Who knows the Golden Ticket? That's a lot, so let's do something with the hands. Hands up if you know the Golden Ticket, if you've already seen it or used it. Lower your hand if you don't know the Silver Ticket. Lower your hand if you don't know the Diamond Ticket; this is a newer thing that came out a few years ago. And lower your hand if you don't know the Sapphire Ticket. There are still some. The idea of the Sapphire Ticket is to race against the blue team and take the lead by making a ticket that is almost undetectable. The problem with the Golden Ticket in general is that the PAC is poorly formed and the ticket was issued without any request, so overall there are many indicators that usually let us detect a Golden
Ticket. It can generate false positives, but in general we can detect them. With the Sapphire Ticket, we use the same mechanisms as always, S4U2self and U2U, to retrieve the PAC, all the detailed information, of a user. This PAC will be perfect. We take this PAC and insert it into a ticket that is also legitimate, so we craft a ticket that is legitimate, and we greatly reduce the chance of detecting the Sapphire Ticket. This is a new thing from 2022 that I implemented in Impacket but not on Windows, so I'm forcing you to use Linux. I think it was implemented in Rubeus; I'm not sure. I've put the details here, but what we need to remember is that with S4U2self, S4U2proxy, and U2U, mechanisms and authentication protocols that are generally little known, we can produce five attacks, five attacks that were discovered and published in 2019, 2021, and 2022. 2022, I remind you, was five or six months ago, so in 2023 there will be more. So I invite you to learn these protocols and extensions, and to understand their ins and outs, so that you can understand the new things that will come out on Twitter next year or in the next five years without necessarily having to reread blog posts that take ten hours to read and that we never understand because the guys who wrote them are too intelligent for us.

In general I have an appendix at the end of my talks that anticipates the questions you will ask me. It lets me save time and never be surprised. But here we don't have time anyway, so if you want to surprise me, I won't answer; I'll try to catch up afterwards. Thank you to James Forshaw, snovvcrash, Pixis, and six others, Charlie Clark, Andrew Schwartz, Alberto Solino, Elad Shamir, Will Schroeder, and Dirk-jan Mollema, who have written exceptional, high-level research articles on which the work I presented to you is based. I have almost nothing to add, so go and venerate these people. For resources, I've put a few links, not necessarily all of them, of course. I will
avoid listing them all: go and consult the Holy Grail, The Hacker Recipes, or Exegol, which is the slightly more professional and secure alternative to Kali Linux. You also have the glossary there. Pay attention to the glossary: when we talk about a technical domain and technical terms, you have to be a little careful, or maybe that's just a quirk of mine. In the same way, try to say NTLM challenge-response, or NetNTLMv1. I invite you to look at it and try to build a culture that is a little more technical, a little more specific. It will avoid, for example, the "crypt" versus "encrypt" problems that come up regularly; it's about the same thing, but on Active Directory. Thank you. If you have any questions, do not hesitate.

We have one minute, so we can take a few questions. You can ask Charlie directly. Let's go. It's super clear? If not, you can talk to Charlie. All right, thank you very much.

OK, we will start in one minute. For anyone just coming in, note that the next talk is in French. OK, let's go. Once again, the talk is in French, but the bio is in English, because stuff happens. Mathieu Saulnier is a "security enthusiast", in quotation marks. He's also VP of training at NorthSec, so he's a veteran here, and he's a core mentor at DEF CON's Blue Team Village. He is currently director of threat research at Sumo Logic, where he focuses on research, threat hunting, and adversary detection, in addition to his ongoing work at NorthSec, the Blue Team Village mentor program, and many other impressive things. I'll let him take it away. Thank you.

Can anyone tell me how to switch to the speaker notes? Excuse me. It's moving. No? I don't know what happened; it was the English version. OK, let's start. Today I'm going to talk to you about three groups. First there's YOLO Corp, a company that follows security and compliance rules but has no security team on site. CoolSec, a company with security pros who have understood that
yes, compliance is important, but it sometimes takes a bit of judgment. And finally TREAT 4, a group of threat actors who don't need to be compliant with anything. My name is Mathieu Saulnier, director of threat labs at Sumo Logic. I've been in computer security for more than 20 years, so sometimes I feel a bit like this. While I was preparing the presentation, there was a Black Hills Infosec webinar that gave the ten techniques they used most when testing their clients. If we zoom in a bit, we can see that credentials are number one: it's what they used most to get into companies. The solution they give for that is longer passwords. When we ask Google for the most frequent causes of breaches, once again it's passwords: weak passwords and password reuse. Now that we've set the table, let's meet the protagonists of our story. Here we have our friends from YOLO Corp. We can see that they look quite carefree, and we wonder what could happen to them. As we mentioned earlier, YOLO Corp is PCI compliant and GDPR compliant; we'll see a bit later what impact that has on passwords. Here we have the CoolSec people. We can see that CoolSec are a bit more serious. They are also PCI and GDPR compliant, but they also follow the NIST recommendations. And finally our friends the Evil Cats. As for them and their compliance, how to put it: you may have already seen the Kiwicon poster that says "Hackers don't care about" followed by a list of 200 things. The Evil Cats are exactly that, plus the unpleasant attitude of cats. So our story begins with Evil Cat 1, who is going to carry out a very well-known attack called password spraying. This attack reaches anything on the internet in less than 24 hours; I personally saw a machine hit by password attempts 7 minutes after it was put online. When we do a password spray attack, we generally research the companies we want to attack. In this case, Evil Cat 1 is using a LinkedIn scraper. The presentation
is in French, but there are going to be a lot of English words. After that, we build a list of passwords that are likely to work at the company we want to attack. Generally we take the company's name and add 1, 2, 3, an exclamation mark, things like that. So our Evil Cat 1 did his research and launched his attack. The first attempts are Kirk with the YOLO Corp password, then Worth with Soleil123!, and he finds a combination that works. Once that attack is over, he moves on to CoolSec, same type of attack. First he tries Two Barker with CoolSec!, and finally Bob Afech with Welcome2023!, and he gets a match. So now he moves to the second step of the attack: he has found some credentials that work, and it's time to try them. When he arrives at CoolSec, no, at YOLO Corp, sorry, he finds an RDP service exposed to the internet and uses his credentials, and he gets this screen, which you're probably all very familiar with. Some of you have a very good idea of what happens after that. When our Evil Cat tries the same attack on CoolSec, here's the screen he gets. Yes, the password was guessed, but we have MFA, multi-factor authentication, so much less success with the same attack. According to Microsoft, using MFA reduces by 99.9% the chances that an attack gets past a simple password. I'm going to repeat it: 99.9% less chance that the attack succeeds once you put MFA in place. So if you don't have MFA on everything that's internet facing, it's probably the first thing you should do on Tuesday at the office. As for CoolSec, they also have a very good SIEM, so of course they received alerts. If they have user and entity behavior analytics, they will also get things like first connection, first successful connection from an external IP, or maybe even first connection from a country, if the attacker hasn't done his homework completely in
terms of OPSEC. And finally, of course, when Evil Cat 1 fails the multi-factor authentication, we will have an alert for that too. So what does PCI say about passwords? Does someone want to try? What password length does PCI suggest? Someone over there wants to try: 10? When I prepared the presentation, it was 7 characters, and the password had to be changed every 90 days. I imagine there's no QSA in the room. When I heard that, 7 characters, I don't know if you thought the same thing, but I wondered: how long does it take to crack a password of that length? Once again, does someone want to try an answer? Eric? 7 minutes? We'll see later; we'll rely on a table. But whether it's a few seconds or a few minutes changes absolutely nothing for an attacker. If we come back to PCI, we were talking about 7 characters and 90 days, but let's be honest, it changed at the end of 2022: now we're talking about 12 characters, changed every 90 days. So what's the difference between 7 and 12 characters? We'll go to one of these charts, and it's the one we'll use throughout the presentation. There are different versions, but I had to choose one. If we zoom in a little, we see that 12 characters takes 3,000 years, while at 7 we're talking about 30 seconds, 31 seconds. But that's without counting on how predictable the environment is. Take the two passwords on the slide, one of them Welcome2023!: both have 12 characters, but they will take a few seconds to crack, because they are dictionary words. They're predictable, so they are tried very quickly; that's built into the cracking tools. GDPR: what does GDPR say about passwords? Does anyone want to try while I take a sip of water? We're talking about 8 characters, and it says to avoid dictionary words and to prefer a passphrase over a password: a password is one word, a passphrase is several words. And I put "avoid" here, I
put the emphasis on "avoid", and I would like you to think about why the term "avoid" is used. Well, there is no way for an auditor to know what password you chose, or what passwords all your users chose, so they used the term "avoid". The only way to know is to crack the passwords. I don't know about you, but I don't really want to give my password, let alone the passwords of all my users, to an auditor. As I often say to my friend Eric, who is sitting over there: I trust you with my life, but not with my password. So if I don't trust Eric (I do trust you, sorry), imagine if it's one of your users. If we go back to our table, we see that for 8 characters, whoops, we're going a little fast, we're talking about 39 minutes. So not enough for a real password; we expect an adversary to spend a lot more than a few minutes. NIST: what does NIST say about passwords? I won't ask the question, since no one answered the other two. 15 characters, and they never expire. So first, we have a much longer password with NIST, but in addition, it doesn't expire. When I saw that, this was my reaction: never change my corporate password again, what a joy. If we go back to our table just to see what that looks like, here we're talking about a million years. That seems interesting to me; it seems quite robust. So that brings us to our second attack, by our friend Evil Cat 2. Just look at him. Evil Cat 2 is the post-intrusion specialist. He managed to get his hands on a copy of your NTDS.dit file. For those who don't know it, it's the file that holds the passwords in Windows, on the Active Directory domain controller, which is generally very well protected. In this case, it was in a backup copy, which is generally much less protected than the domain controller. So, to do a cracking attack, we have our NTDS.dit password file, and we need a dictionary. Almost everyone agrees that the best dictionary for cracking passwords is rockyou.txt.
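The table lookups earlier in the talk (7, 8, 12, and 15 characters) come down to one division: the number of possible passwords divided by the attacker's guess rate. Here is a minimal sketch of that arithmetic, assuming pure brute force over 95 printable ASCII characters and a hypothetical guess rate. The function names and the rate are illustrative assumptions, so the output will not match the chart on the slide, which uses its own hardware assumptions.

```python
PRINTABLE_ASCII = 95  # assumption: letters, digits, symbols and space


def keyspace(length: int, alphabet_size: int = PRINTABLE_ASCII) -> int:
    """Number of distinct passwords of exactly `length` characters."""
    return alphabet_size ** length


def brute_force_seconds(length: int, guesses_per_second: float,
                        alphabet_size: int = PRINTABLE_ASCII) -> float:
    """Worst-case time to exhaust the keyspace at a fixed guess rate."""
    return keyspace(length, alphabet_size) / guesses_per_second


def humanize(seconds: float) -> str:
    """Render a duration using the largest unit that fits."""
    units = [("years", 365.25 * 86400), ("days", 86400),
             ("hours", 3600), ("minutes", 60)]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:,.1f} {name}"
    return f"{seconds:.1f} seconds"


if __name__ == "__main__":
    RATE = 1e11  # hypothetical guesses per second, not a measured figure
    for length in (7, 8, 12, 15):
        print(length, humanize(brute_force_seconds(length, RATE)))
```

Each extra character multiplies the work by the alphabet size, which is why going from 7 to 12 characters changes the estimate so dramatically. The estimate only holds for random passwords; a dictionary word with a year on the end is found long before the keyspace is exhausted.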
One of the places to download rockyou.txt is skullsecurity.org, which belongs to Ron, who is currently giving a panel in room number one. So if you see him afterwards, say thank you, because there are several password-cracking resources on his site. On top of that, he's super kind. So, how do we crack a password? We use tools like Hashcat or John the Ripper. In this presentation, we're going to focus on Hashcat. Here we have -m 1000, which says we want to crack NTLM. Then we have -a 0, the dictionary attack mode, and we have the rules at the end. Generally we'll use several different rule files. Typically, when we crack passwords at a company that has never done it before, or doesn't do it regularly, we can crack about 50% of passwords in 24 hours and about 80% in a week. So here we're going to see how our two corporations held up against this attack. What are the percentages? No surprise: at YOLO Corp we crack 80%, and at CoolSec we crack 1%. Let's look at a few passwords from our friends at CoolSec. The first password is Welcome2023!, and the second is password-password with the O's replaced by zeros. If we come back to our table, the two passwords have 12 and 17 characters, so that would be 3,000 years versus 7 trillion years. So which of these two passwords is harder to crack? Well, in fact, it's my friends from Thailand who have the right answer: same same, but different. Both passwords will take a few seconds to crack, because swapping an O for a zero or adding an exclamation mark is exactly what the tools try first. Putting the year at the end of a password is really, really, really bad. Now let's see which passwords are used at YOLO Corp... at CoolSec, sorry.
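To see why Welcome2023! and a doubled "password" with zeros fall in seconds, it helps to look at how few candidates a dictionary-plus-rules attack needs to reach them. The sketch below is a simplified illustration only: the mangling steps (case changes, doubling, O to zero, common suffixes) are hypothetical stand-ins written for this example, not Hashcat's rule syntax and not the contents of any real rule file.

```python
# Hypothetical suffix list: the kind of endings users add to satisfy policy.
SUFFIXES = ["", "!", "123", "123!", "2022", "2023", "2022!", "2023!"]


def candidates(base_words):
    """Yield mangled guesses for each dictionary word."""
    for base in base_words:
        for cased in (base.lower(), base.capitalize(), base.upper()):
            for stem in (cased, cased + cased):  # "duplicate the word" step
                # Leet step: keep the stem as is, or swap every o/O for 0.
                for leet in (stem, stem.replace("o", "0").replace("O", "0")):
                    for suffix in SUFFIXES:
                        yield leet + suffix


if __name__ == "__main__":
    guesses = set(candidates(["welcome", "password"]))
    print(len(guesses), "candidates from 2 dictionary words")
    for target in ("Welcome2023!", "Passw0rdPassw0rd!"):
        print(target, len(target), "chars, found:", target in guesses)
```

Two dictionary words expand into fewer than 200 guesses, and both "long" passwords are among them. Length only helps when the password is not reachable from a dictionary word through a handful of predictable transformations.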
The first is a backtick followed by "the empire barks back", and the second is "patience young padawan" with periods instead of spaces. So, by show of hands: who thinks the first password is more secure? Who thinks the second is more secure? We have slightly more people for the first one. Both passwords are 23 characters. According to passwordmonster.com, the first one would take 161 million years, while the second would only take 29 million years. So here we see that it uses the number of special characters in its algorithm. If we go to security.org, both take 3 septillion years. I don't know what a septillion years is, but it must be extremely long for us to need another word for it. But once again, it's my Thai friends who have the right answer: same same, but different, but still the same. Both are going to take too long to crack for anyone. There's no chance that anyone finds this type of password. So what exactly did CoolSec do to get this result? Going from the usual 80% to less than 1% is really impressive. First, they put in place passwords of 15 characters that never expire. They put in place weekly password cracking. And a cracked password equals a changed password. If we look in a little more detail, the first thing we do is create a service account. We give this account the Replicating Directory Changes All right. Of course, we give it a very good password, because it's equivalent to having DCSync, for those who know the technique. Then we create a GPO that contains the NIST recommendations. We let our users know that they will soon have to change their password, that it must be 15 characters, and the good news: if we don't crack it, you don't change it.
Then, when we deploy the GPO, we force a password change for all accounts in the company. So now we're really going to go into detail. There are different ways to get the NTDS.dit file, or the passwords, out of Active Directory. There's the way I've seen used... in fact, two things. Most of the companies I've seen implement this are financial institutions, so bravo, that's really good. On the other hand, it's not super elegant: they ask the domain admins for a copy of NTDS.dit. But the domain admins are already relatively busy; they have better things to do than send us files every week. So here we're going to use the DSInternals module made by Michael Grafnetter, who lives in Canada, so kudos to a Canadian. After that, of course, we import the module. I put it there just because I often forget it and lose precious minutes. Then, for the account we created previously, we store its credentials in the creds variable. When we run this command, it shows a little pop-up where we can type the username and password. For those at the back, there are seats if you want to sit down instead of standing in the doorway; there are plenty of places to sit. After that, we move on to serious things. We run the DSInternals module: we use the Get-ADReplAccount command with -All against the server, the DC. We point it at the DC and pass as a parameter the credentials we entered in the previous step. We only look for accounts that are enabled, because we don't want to bother cracking passwords of accounts that are disabled, for people who have left and things like that. And we can then write that to a file that Hashcat can read. So very, very simple. After that, we run the command a second time, almost the same, except at the end we use Test-PasswordQuality. And this is what it does.
It looks at all the passwords we've already cracked: are they present in our database? If we cracked them once, chances are we'll crack them a second time. In addition, there's something very cool in this function. Now I'm going to ask you a question. Who has only one account? Whose users use only one account in Active Directory? Two accounts? Three accounts? Four accounts? Not bad. One of the good practices when administering Active Directory is to use tiers. Those who are domain admins should have a domain admin account. Those who manage servers should have a separate server admin account. And finally there's your user account, the one you use every day to read your email. If you have the same password for all these accounts, in good French, it defeats the purpose. There's not really a point if your password hash is stolen from your user account by a tool like Responder and can then be reused on your other accounts. So the function tells you which accounts have identical hashes. Of course, if it's the same user, we ask them to change it. But if two different users have the same hash, it probably means that the password, even if we haven't cracked it yet, is easy to guess. It's predictable, because two people have the same password. So is it related to our company? Is it related to, I don't know, the World Cup starting tomorrow, or whatever? So we can be proactive: prevent rather than defend. Once again, use your judgment. Now, the next steps: we move on to cracking. You're going to see that the command run by, I think it's CoolDog3, is exactly the same one that our Evil Cat 2 ran in the first place: Hashcat with -m 1000 for NTLM, the hash file, the username option, rockyou, and the dive rule.
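The shared-hash check described above can be illustrated without cracking anything: NT hashes are unsalted, so two accounts with the same password have the same hash. This is a minimal sketch of that grouping idea only. It does not call DSInternals; the function name, account names, and hash values are made up for the example.

```python
from collections import defaultdict


def accounts_sharing_hashes(account_hashes: dict) -> list:
    """Group account names by NT hash and keep groups with more than one account."""
    groups = defaultdict(list)
    for account, nt_hash in account_hashes.items():
        groups[nt_hash.lower()].append(account)  # hex case should not matter
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


if __name__ == "__main__":
    # Fake 32-hex-digit values standing in for real NT hashes.
    dump = {
        "alice": "a" * 32,
        "alice-adm": "A" * 32,   # same person, same password on the admin tier
        "bob": "b" * 32,
        "carol": "c" * 32,
        "dave": "c" * 32,        # two different users, same password
    }
    for group in accounts_sharing_hashes(dump):
        print("same password:", ", ".join(group))
```

The first group is the tiering problem from the talk (one person reusing a password across privilege levels); the second is the predictability signal (two unrelated users choosing the same password).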
After that, they run another rule, called d3ad0ne. There's best64, NSC, there are plenty. Plenty, plenty, plenty of rules. Depending on your company, you'll have more or less success with different rules. And finally, when we're done, we fetch the Hashcat pot file, which contains all the passwords we cracked. We put that in the file called WeakH, which we'll be able to reuse next week when we crack again. Next step: we've cracked passwords. We look up the information for this user, or these 16 users, in our Active Directory, with properties like email and so on. We can tell them we found their password, so they're going to have to change it. If it's been three times in three weeks that the user has changed their password, they might call the help desk asking what's going on. That's the time to have a conversation with the user, explaining that if they always take the same word and change the number at the end, we'll find it every week, and every week they'll have to change their password. You'll see that after a certain time you have less and less success cracking your users' passwords, because they will have chosen secure passwords. These secure passwords, they keep; as we said, they don't need to change them. And then it becomes interesting to go one layer further. There's a good talk by Travis Palmer, given at DEF CON, I think at the Adversary Village, on how to crack passwords of 15 characters and more for less than $500. So that's it. If you want more information about DSInternals, this is a presentation given by Michael Grafnetter a year or two ago, where he presents other modules that weren't covered today. Last year at GoSec, there was a presentation on passwords.
One of the questions put to the presenter was: what are the tips for creating a good password? So I figured you'd have the same question, and I put it in the presentation. For me, there are two types of passwords: passwords that you type on a machine, and passwords that you're never going to type, like the ones for your service accounts. For service accounts: 64 characters, randomly generated by a tool and stored in a password manager. For the passwords you type, I call it dressing up the password. What do I mean by that? We take a sentence, here "I have a bad feeling about this", and we add between two and four characters at the beginning and at the end. If we double them, it's still easy to remember: hashtag, hashtag, dollar, dollar, at the beginning and at the end. So first, we add eight characters to our password. And the sentence "I have a bad feeling about this": first of all, it types very quickly on a keyboard. Secondly, it's very hard to forget. So here we have a password of 37 characters. How long do you think it takes to crack a 37-character password? 17 decillion years. I have no idea what 17 decillion years is, but it will take even longer than what we had earlier. Another trick: since you're here in this room, chances are that, apart from Brian, you speak French very well. So use your mother tongue. Password-cracking dictionaries are almost entirely English words. Some languages have dictionaries, but some languages don't have any. I've never seen a Romanian dictionary, for example. There may be one, but I haven't seen it. So here, again, we have the password "la peur mène à la colère", which is a line from Yoda, for those who didn't notice.
And an interesting fact here is that neither the word colère nor the word mène is present in rockyou.txt. Just to see how many years it would take: here we're talking about 900 decillion years. So it's still quite long, I guess. Seeing that neither mène nor colère was in rockyou.txt raised another question for me: how many times do c-cedilla (ç), e-acute (é), the š that we have in Slovak, the German double s (ß), or the inverted exclamation mark (¡) appear in rockyou? Well, I was very surprised, but the most frequent is the c-cedilla, with 800 words containing it. Then comes e-acute, which is much less surprising in my opinion, at 500. The German double s, 65, sorry, 70; the Slovak š, 65; and almost 90 for the inverted exclamation mark, which is somewhat surprising because it's used a lot in Spanish. To compare, the letter A appears 9.5 million times, well, in 9.5 million words. OK. So I leave you with this little sentence: against whom do you want to protect your network? Against the auditors or against the attackers? I have nothing against auditors, no, it's true, but I find that sometimes they make us do things that are a little counterintuitive, or they don't even know why they're asking us for things. So the goal here is to ask: what does the thing you're asking me for really change in my security posture, and above all, is it enough? Are you asking me for enough? So, to repeat the takeaways: MFA, multi-factor authentication, especially on whatever you have facing the internet, do that first thing in the morning; and use your mother tongue to create passwords. I was talking to people from Switzerland who said: yes, and on top of that, try writing it in your local dialect. Go ahead and try to crack that; you're going to have fun, I think.
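The character counts quoted above are easy to reproduce on any wordlist. Here is a minimal sketch, assuming the wordlist is a text file with one password per line; the function names are illustrative, and reading with UTF-8 while ignoring undecodable bytes is an assumption made here because large leaked wordlists often contain invalid byte sequences.

```python
def count_words_containing(words, chars):
    """For each character, count how many words contain it at least once."""
    counts = {c: 0 for c in chars}
    for word in words:
        for c in chars:
            if c in word:
                counts[c] += 1
    return counts


def read_wordlist(path):
    """Yield one entry per line, skipping bytes that are not valid UTF-8."""
    with open(path, encoding="utf-8", errors="ignore") as handle:
        for line in handle:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Tiny in-memory sample; pass read_wordlist("rockyou.txt") for a real run.
    sample = ["garçon", "café", "straße", "colère", "password", "¡hola!"]
    print(count_words_containing(sample, "çéßš¡"))
```

Note that "colère" contains è, not é, so it does not count toward the e-acute total; each accented letter has to be checked separately.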
So, for those who liked the presentation, you can follow me on Twitter or on LinkedIn. And I have one last little request. Oh, I'm sorry, just because of the time. There's a family in my neighborhood who is threatened with deportation. They come from Congo. The little boy goes to the same school as my children. Both parents have jobs. If you could take a minute to sign the petition, it would be really nice. Do you have any questions? I think there are one or two minutes left. Yes? It depends. It takes very good arguments, but when we present the cracking results, it usually goes through. It always depends on the person, but it can happen. And it also depends on who the request comes from. If it comes from an analyst, it doesn't go as well as if it comes from the company's CISO. So when we have the CISO's support for these decisions, it goes much better. The question is what I think about picking random words to build passwords, like the example in XKCD. I think it's much harder to remember those words than a sentence we've chosen ourselves. And often they're harder to type, too. But if it works for you, it's not a problem. I just think it's harder to remember: was it tomato, or tomato, or tomato? If you use it every day, it's not a problem, but for something we use a little less often, I prefer phrases that make sense to me. Another question? Thank you very much for coming. I hope you have a great NorthSec. Don't forget your badge.

My name is Zunera and I work as a research engineer at Synopsys. I come from Belfast, Northern Ireland, and today I'm going to talk about machine learning and how we can leverage it to detect vulnerable code.
So I'd like to kick off with a funny story from just a few days back, when I was traveling from Ireland to Montreal. I had a conversation with the Uber driver, and he asked me, "So what's the purpose of your travel to Canada?" I said, "I'm going to a tech conference, and the theme is cybersecurity." And he goes, "You know, I'm so glad I've met you." I'm like, "Really?" And he goes, "You know, there's a lot of stuff on the internet that's constantly at risk, that's getting hacked. So why don't they just put security on everywhere?" I loved that bit, because it reminded me that this is also the mindset about machine learning. Some people think: OK, there's a tough problem, let's just put some machine learning magic here, and voilà, the problem is solved. Let me break it down: machine learning is not simple magic. It's actually very complex. So today I'll give a short SAST overview, then talk about the challenges we faced collecting the dataset, some of the machine learning approaches we applied, and the motivation for using machine learning in SAST. Later we'll discuss some of the limitations and some of the research that's already been done, and if we have time you can ask me questions as well. I'll try to be on time. So this is one example of a vulnerability. As you can see, it's a weak cryptography problem. It's really easy for me as a human to understand: we just have to look at things like the secret key spec and whether it's correct or not. But the next problem is OS command injection, and it's a bit trickier.
It requires what we call taint flow analysis: we have to know what kind of data is there and what kind of input is coming in, and then we have to track its path. In this code, it's trying to run an OS command on the directory. But imagine the same process where it's not limited to one function but is calling multiple other functions, maybe making network connections and things like that. Interprocedural analysis becomes really difficult, and so it is for machine learning. As the meme explains, one does not simply collect vulnerable code, and trust me, it was really difficult to get the right dataset. For machine learning we need good training examples and a good amount of data so we can train our ML model. One way would be to scrape code from GitHub repositories and scan it with SAST tools, like the Coverity tool we have at Synopsys. The other way is to use pre-built datasets, such as the NIST Juliet dataset, which is really good, with more than 100,000 examples, or the Devign vulnerability dataset, which is based on real-world examples. A little about the Juliet dataset: it has over 100,000 test cases for software security assessment, and it's pretty good because it's labelled, so we don't have to struggle with that; we can just collect the data and run our ML model. Here, though, we focused on one particular kind of vulnerability: detecting OS command injection in code. The data we collected for just OS command injection is not that big; as you can see, it's just 3,000 test cases, labelled, but notice that it's quite imbalanced as well. We took 80% for training, 10% for testing, and 10% for validation. Moving forward
to the Devign vulnerability dataset. This dataset is really great because it's manually labelled. They have 27,318 test cases, and now they have something like five more projects, but I just collected data for two, including FFmpeg. It was created by two security experts, so this is a legit dataset and good for training an ML model. Still, I would say there are some issues: some synthetic vulnerability problems, or some of the data is outdated, or maybe some limitations of the frameworks. Before diving deep and introducing my own model, I thought it would be better to look around and check other research that's already been done, to save time. There is one paper, Automated Vulnerability Detection by Russell, and they claimed that using natural language processing and a convolutional neural network you can get 85% accuracy in determining whether there is a vulnerability in the code. Great. So I was like, okay, let me use their model on my own dataset and see if it actually works the way they claim. A little about the model; I won't dive much deeper because we have other important stuff to discuss. We leveraged their model, and it actually performed pretty well on the training dataset, as they said: I got 86% accuracy. Okay, cool enough. And the loss function is .66. Ideally it should be less than .5, but the dataset is quite small, so I'll ignore that for the moment, and since it's decreasing, it's a good sign. I was pretty happy with that. But then I was like, let me try this model on some unseen data, to see what the model can learn with natural language processing. I was pretty disappointed. I took their model and did the same thing, and on the unseen dataset it was really the opposite. If you look at the table, it was 86% accuracy and .5 for the loss function, and I took the same trained model and ran some experiments with the unseen dataset and the
model went through overfitting, so now we have 50% accuracy. Okay, that's a really sad moment. And then we have the loss function, which is .72. Damn, that's not what we want. So I decided maybe it's because we are just using natural language processing, and that's not enough to detect vulnerabilities in code, because we need to capture the code's semantics and structure. Maybe the approach could be to use the abstract syntax tree. It makes more sense: how the code works, how the flow goes. And the best part about the AST is that it can be leveraged for any programming language, so we don't have to limit ourselves to one programming language like we do for data flow analysis or control flow analysis. So if you look here, this is a piece of code. I should also mention that I'm focused on finding vulnerabilities at the function level. So this is a function; we generate the AST for it using a small library, then we can extract different path contexts, and based on those path contexts we can develop our own neural network, which, using neural attention, should be able to decide which path is more important than another. If you decide to do everything from scratch, it would be very difficult. So there is one model called code2vec. It's pretty good because they use the same kind of approach, but for a different case study, and I tried to fine-tune their model for my own case study, detecting vulnerabilities in code. code2vec is designed so that it takes a snippet of code, if we just focus here: you have a code snippet, then you generate the AST-based structure, then from that you get path contexts, and then you have to vectorize those path contexts to put them into the code2vec input format. Then I fine-tuned this code2vec model for predicting vulnerabilities. So this is a summary of how we decided to parse the dataset
using astminer. We have a code snippet, then we build the AST structure, and then we get the path contexts. If you look at the diagram, we have labelled functions, vulnerable and safe, kept separately, and we extract the path contexts separately for each. Then we vectorize the path contexts for the code2vec model, and then we are basically predicting whether this piece of code is vulnerable or not. It would be good to briefly explain how it works. Basically you have a program, and you generate a bag of contexts using the AST structure: you get path contexts, you tokenize, and then you have vectors, like the orange boxes. This goes into a fully connected layer; the orange boxes are the vector representations. The next thing, which gives us interpretability, is the attention mechanism. Attention in neural networks helps us aggregate all the path contexts into one big code vector, and then the next thing we do is predict whether there is a vulnerability or not. I am using the standard settings: batch size 1024, embedding size 128, and dropout rate 0.25. And a little bit about the metrics I used to judge my model's results and whether it is actually doing what I wanted. Precision here basically measures the accuracy of positive predictions, the share of flagged vulnerabilities that were correctly flagged. Recall, also called sensitivity, measures how many of the actual vulnerabilities we identified. And the F1 score, which I am going to use as the reference, because it balances out precision and recall, so it's a good compromise; we will go for that. And then accuracy, but accuracy alone is not enough when you have to judge whether your model is working correctly or not. So these are some of the results using code2vec on the Devign vulnerability dataset. With the Devign dataset, the best epoch result I got was the fourth
epoch, so I get 65% recall, and precision is 56%. I mean, it is not like 80%, but we know what we are doing, and that makes more sense, right? If we look at the confusion matrix for this case study, we have correctly identified 781 vulnerable functions, then 611, and then the non-vulnerable ones, and then 404, and yes, the number is 404 cases. Well, yeah, we get 60.35% accuracy, and 39.65% are not correct. Okay, a drop. But again I thought, let's try the same model with a different dataset; maybe it will perform differently. With the Juliet dataset the accuracy results are great, maybe because the dataset is small; that's one of the reasons. When you train a machine learning model you have to have a really big dataset, because if the dataset is small it may overfit or underfit, so we don't know, we cannot trust this. But considering that we had 60% accuracy before, and on this limited dataset you get 79%, it's still good enough, because if you collect more data on OS command injection, maybe you will get about 60% accuracy. But that's not the point. The point is that the model performs similarly, it generalizes well on the unseen dataset, which we couldn't see using NLP neural networks or just some LLMs. We really need something that works well on unseen data; it should know the structure of the program, like by getting some tokenization or some path contexts. So you say, let's leverage GPT on it, maybe it will perform well. If you train it on a huge dataset, yes, it will. But what if in the future you have a different kind of problem that your model doesn't know and cannot even learn, and it says it's not vulnerable code when it actually was vulnerable code? So we need a more structured approach, and we need to collect more data. I thought that just saying LLMs are not enough to predict vulnerabilities isn't enough; it would be better to run an experiment and to know it
better how it works, because there is definitely value in LLMs. I'm not against LLMs, I love them; it's just that they are not enough on their own for detecting security vulnerabilities. So this is one of the LLMs, it's called CodeBERT, and it's really good because it's not limited to natural language; it's also trained on programming languages. We will fine-tune this model for our own case study, predicting code vulnerabilities, and see how it works. A little about CodeBERT: it's a transformer-based architecture, built upon the BERT model, with an attention mechanism that helps it focus on what is important and understand the context of the program, the semantics and all. And it's bimodal, so it's trained on both programming languages and natural language. That's great, because it will have a rich representation of code for various tasks. And it's versatile, just like code2vec: it's not limited to one programming language and it's easy to scale to other programming languages. For code2vec I had one experiment using just Java code, and another experiment with C and C++ code, and it performed the same way. Well, with CodeBERT, I fine-tuned the model and trained it on my NIST Juliet dataset, and I get 100% accuracy. Great. Not too great, because it's too good to be true. The reason was the small size. Then I trained the same model on the Devign dataset, which has 27,000-plus test cases, which is good enough for training a machine learning model, and we get 60.36% accuracy. It's kind of better than code2vec, because with code2vec we have 60% accuracy and here we get 61% accuracy. So you see the value in it, it's good. But how about if we merge both of these models together and come up with a different approach, and say we
are using the AST and then we are using natural language to detect vulnerabilities? It will be more legit, a more structured approach, and we will know how it works. So this is just an evaluation of all the models and experiments we had. If you look at this table: with the Juliet data, using a deep learning convolutional neural network, they said, okay, just using natural language processing we got 86% accuracy. No, you didn't really get that; you are only saying yes on the training dataset, where on seen examples the model performs really well. But if we try the same model on a different, unseen dataset, it just overfits, and it's like, okay, just 50% accuracy. No, we don't want that. Then we moved to a more structured approach, which is code2vec, and tried it with different datasets, Devign and Juliet. We get 60% accuracy on Devign and 79% accuracy on Juliet, but I would say if we increase the size, maybe the accuracy will get a little lower. And with CodeBERT we get 61% accuracy, and on Juliet it just overfits, so I don't count that. Okay. So, 'so preoccupied with whether you could, you didn't stop to think if you should.' People have this mindset these days: okay, AI is everywhere, so let's just use machine learning everywhere, because it can help us solve problems and ease things. But I would say, if you think about LLMs, yes, you should use them, but try to modify them a bit. The next takeaway for us at Synopsys will be to use AST plus NLP, neural networks like LLMs, and merge both of these things together to get a new product, which will not only be able to predict vulnerabilities but will definitely be able to generalize well on unseen data. So that is it. Okay, I didn't do an experiment with GPT, because my dataset was very small and I am sure they have already seen those examples, so I don't have to train it; it's just prompt engineering now, but
considering security stuff, I won't recommend GPT. I'm not convinced by it. Yes, you can leverage it in different ways, but in the end you need a structured approach for security-related stuff. I researched a bit about what has already been done. There is a paper on automated detection of source code vulnerabilities using the code2vec model; they tried to do something with SQL injection, but they couldn't get good results. Then there is another one on using distributed representations of source code for the detection of C security vulnerabilities; they used deep learning, and it again has overfitting issues because it doesn't generalize well on unseen data. Then we have different ones, and I tried to reproduce this work, and there are some other models. Okay, so that's all about it. I'm happy to answer any questions. Just refresh Slido, in case someone is shy to speak up. I cannot hear you. Yes, it's the same thing I observed, actually. Basically, with code2vec they were just focused on AST-based structures; they were not looking at NLP stuff. It's a neural network using an attention mechanism based just on the AST structure. What I propose is that the solution lies in using both of these things; you cannot leave one out and focus only on the other. With LLMs, they're just focused on, okay, you get a snippet of code and then you say, here's my code, do some ML magic and get me results. Well, how about the structure and the semantics? You're ignoring that. So I would say in code2vec they're focused just on the AST, and in CodeBERT they're focused just on natural language processing, deep learning and neural networks, but not looking at the structure of the code. What they need to do is mix both of these things together and come up with a new model that looks at the AST and also uses the NLP stuff, merging both things together, so it's
basically getting two code embeddings together and then trying to predict the best result. Tell me if that answers it. No problem. Another one? Interesting. Yes, I tried to do the same thing when it overfitted for the very first model, Automated Vulnerability Detection, because I really trusted it. I had just started this project back in October, and I was like, oh, I got the results, I don't have to bother with some difficult models or come up with my own model, so I was very happy. I tried to change it, but it didn't really help. Same for code2vec: I tried to change their hyperparameters, but again the same thing. So yes, dropout really helps with overfitting issues and with learning better feature representations, but again, the same thing, we need to modify it a bit. Yeah, anyone else? Thank you so much.

I want to present Mike Saunders. He has over 25 years of experience in IT and security. He's worked in the ISP, financial, insurance and agribusiness industries. He's held a variety of roles, including system and network administration, development, and security architecture. Mike has also been performing penetration tests for a decade and is a very experienced speaker, and we're super glad to have him here. Welcome.

All right, thank you, Flo. All right, thank you NorthSec for having me. All right, we got a couple. Awesome, good on you. So welcome. I'm Mike Saunders, and we're going to talk about an intro to AV and EDR evasion. If you want a copy of the slides: this talk is part of a larger talk that I've given, and this link here has all of the slides, including some of the stuff that I'm not going to talk about today. If you have questions afterwards, I'm Mike at RedSiege.com, I'm on Twitter, I'm on LinkedIn; feel free to get a hold of me, I'd be happy to answer questions. A little bit about me. Flo told you most of the important stuff. I'm the principal consultant at Red Siege. I've been around a bit. Sadly, my infosec Twitter these days is mainly pictures of things I do that are not infosec, but I'm into photography, I do a lot of fishing and
kayaking and stuff like that. But enough about me, let's get to what you're here for. I want to tell you that this talk is not about advanced evasion topics. We're not going to be talking about unhooking, and we're not going to be talking about evading runtime checks, for the most part. We're talking about ways you can obfuscate your payload to get it to land on disk or to get it into memory. That's the intro part; the advanced talk is where we talk about the techniques we need to load that code into memory, execute it, and stay hidden. As I said, this is part of a larger talk where we cover more of the things that get us caught and some things we can do about that, and one of the things that can get you caught is entropy. I promise you we will define what entropy is and what that means. Now, this is normally where I give you a whole bunch of other content before this point, but we're going to hop right into entropy. With regard to entropy, I do want to give you a disclaimer: I am not a mathematician, so I may not explain all of these concepts correctly. I'm going to do my best. I will tell you that we are talking about entropy as it relates to information theory, not the heat death of the universe. They're related, but we're talking about information theory specifically here. In addition to entropy, I'm going to talk about a concept called Kolmogorov complexity. So let's get into it. A rough definition of entropy is a measure of the amount of randomness in a given thing that you're observing. If something is said to be higher in entropy, it is more random; if something is lower in entropy, it is less random. Now, I also mentioned Kolmogorov complexity. Kolmogorov complexity comes from this theory that data that is highly random is less compressible than data that is less random. We can test this. You can see on the screenshot I did a simple little test: I read in a million bytes from /dev/urandom, so
if I have a suitably random number generator, and I read in a million bytes, and then I compress that with gzip. Yes, I know you can use ent, and there's a bunch of other tools you can use to measure entropy, but gzip stands in as a good proxy, and it's available on all systems, whereas ent isn't there by default. So we feed those bytes of random data into gzip and try to compress them, and it tells us that it compressed them by 0.0%, meaning it did not reduce the size of the input at all. Because of Kolmogorov complexity, we know that must be highly random data, because we couldn't compress it. So we've got highly random data, which is high in entropy, and we can't compress it. Here's where the problem comes in. We typically use some means of obfuscating our shellcode, usually encryption, and when we use encryption it increases the amount of entropy in our payload. Some AVs and EDRs will use a measure of entropy as a proxy for trustworthiness of a payload. We'll look at that, but I want to take a little diversion. If this were the longer version of the talk that I've given before, I would tell you about some of the things that get us caught, and one of those things is not changing default values. If you've ever heard of the MSBuild technique and you've used MSBuild to get payload execution, you've probably heard of the 3gstudent template. The 3gstudent template is a good template that a lot of people use, but it's well known. It starts out with a task name; this is an XML file that gets loaded by MSBuild, if you're not familiar with the concept. It's got a task name, ClassExample. It can be anything, but 3gstudent uses ClassExample. There's one default. We also have several things we need in order to allocate memory and copy our shellcode into it: it creates a variable called funcaddr, and it has a variable containing our shellcode called shellcode. These three on their own might not be suspicious, but in the context of an XML file being loaded by MSBuild
that may be suspicious. So not changing these variables could be something that gets you detected. So you're like, all right, let's just use some random value for that. Well, that's also a problem, because if random values are truly random, they're going to be high in entropy. So we have these short variable names, but they're high in entropy, and if we replace variable and function names with random strings, that could be bad. I have been busted by EDRs that looked at my MSBuild template and said, there's a high amount of entropy in the function and variable names; as a result, we think this is untrustworthy. Just the randomness of the variable names was what it used to detect it, because there was high entropy in there. So again, high entropy is a sign that we're trying to hide something. A lot of times it means we've encrypted it, or we're otherwise using random things to hide what we're doing, and that can cause a problem. I just use custom word lists, but I don't use a single word, because depending on the language you're developing in, you might run into reserved word collisions. That's especially common in VBA, where there are all kinds of reserved words. Depending on the language, those reserved words could cause a problem if they show up as your variable name, because a reserved word means something specific to that language. So we use a random word pair generator: splendid dragon, or obfuscated diamond, whatever, north sec, we could use that. We have multiple words concatenated together, so we do not have an entropy problem, because we're not using random data, and we've greatly reduced the possibility of a collision in the namespace. That works for things like scripts and code templates, like the MSBuild template or another XML or JSON format or whatever type of template data. But let's get back to talking about compiled code. So we know that languages are not random.
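The point about random names versus word pairs can be illustrated with a small Python sketch that measures Shannon entropy, in bits per character, over a batch of generated identifiers. This is only an illustration of the information-theory idea: the word list, the function names and the sample sizes are hypothetical, and nothing here reflects how any particular product actually scores names.

```python
import math
import random
import string
from collections import Counter

# Hypothetical word list; any custom dictionary would do.
WORDS = ["splendid", "dragon", "obfuscated", "diamond", "north", "sec",
         "quiet", "harbor", "amber", "lantern"]

def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character, from observed character frequencies."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def random_name(rng: random.Random, length: int = 14) -> str:
    """Identifier built from uniformly random letters and digits."""
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))

def word_pair_name(rng: random.Random) -> str:
    """Identifier built from two distinct dictionary words."""
    first, second = rng.sample(WORDS, 2)
    return first + "_" + second

rng = random.Random(0)  # fixed seed so runs are repeatable
random_names = [random_name(rng) for _ in range(200)]
word_names = [word_pair_name(rng) for _ in range(200)]

# Measure over the whole batch; single short names are too small a sample.
print("random names:   ", round(shannon_entropy("".join(random_names)), 2), "bits/char")
print("word-pair names:", round(shannon_entropy("".join(word_names)), 2), "bits/char")
```

The word-pair batch comes out lower because its characters are drawn from a small, uneven set of letters, while the random batch spreads evenly over 62 symbols.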
And, well, this could probably apply to computer languages, programming languages, as well, but I'm specifically referring here to the written and spoken languages that humans use to communicate. So I've done a little test. This first one is dictionary.txt, because I'm a heathen American and I assume that everyone speaks English, so dictionary just means English. I tried to compress it with gzip, and I reduced it by 64.7%. That's a good amount; it compressed quite a bit, which tells me that it's probably not very random. Then I did the same thing with the German dictionary, and I compressed it by 75.9%. So that tells me that German is not only more terrifying than English, it is also less random than English: it is more compressible, it is lower in entropy. This leads to an interesting thought experiment that some people went down; you can check out these papers to find out more about it. What they actually did is take written language, like, you can take Shakespeare and put in a Shakespeare sonnet, and that is your shellcode, and the positions of characters within the words go back through an algorithm that maps them to your shellcode bytes. It's a really smart paper. The guys that wrote this are wicked smart. I'm not that smart, so I didn't implement this, but I encourage you to read it and check it out. I forgot to start my timer. There we go. All right, so why am I telling you about this? As with most stories that I share, it's related to something I experienced on a test. A little over a year ago I was on a test; the client had CrowdStrike on their endpoints, and CrowdStrike had its next-generation machine learning AV component enabled. While that sounds like buzzword bingo, it actually did its job. I took a payload, the same payload I've been using for a long time: a shellcode loader written in C that has some XOR encryption or AES encryption, I guess in this case I was using AES. I dumped it on disk and tried to execute it, and it didn't run. And I don't mean that
it ran and it got terminated; it just didn't run. I double-clicked on it, nothing happened. Went to the command line, typed in the path, nothing happened. So I need to start debugging this, and I do it with a bunch of printf statements, because that's how you debug things. The very first thing this program is supposed to do when it starts up is say, hey, I entered main. I put the new version on disk and ran it, and nothing printed, so that tells me it didn't run and get terminated; it didn't load at all. So why didn't it run? I hopped on the BloodHound Slack, the red team channel. If any of you aren't there, I encourage you to go check it out; it's a great place to learn from other people. I was on there talking with some people, and we all kind of came to the same conclusion: it must be entropy that's causing this. So how can we defeat entropy, or defeat entropy checks, I should say? Of course the entropy was high, because I was using encrypted shellcode. I've got this tiny shellcode loader that does not have many lines of code in it; all it does is take encrypted shellcode, decrypt it, stick it in memory, and execute it. So there's not a lot of code there, and there's a large amount of encrypted data relative to a small amount of code, so the entropy of the shellcode loader is very high. CrowdStrike looked at that and said, I don't know what this is, but I don't trust it, and I'm not going to allow it to run. Interestingly enough, it did not generate any alerts. My client looked and said they didn't see any alerts, but it would not allow it to run. There are a bunch of different ways you can lower the entropy without actually recompiling your payload. You can just cat a bunch of non-random, non-binary data onto the end of a payload. You could cat, it doesn't work with a JPEG, right, it's got to be a PNG I think, cat.png, and redirect it onto the end of a payload. What I did is compile an array containing a bunch of words into the payload, recompiled it, and
dropped it on disk, and it worked. And this was my face, like, literally, what's happening? So this is what it looked like; this is literally what I compiled into my code. I created an array called anti-entropy and filled it with a bunch of words, 7,740 words. There's no significance to that number; it's just how many words were in the dictionary I had. Now, a diversion back to another part of this talk, about compiler optimization. When you're compiling your payloads, the compiler helpfully tries to optimize your payload for you. It will try to make your program use less RAM and run more efficiently, because it looks at your code paths and says, hey, here's some stuff that's never actually getting called, we're going to remove that. That can lead to a problem. We had a shellcode loader we'd been using for years that used just an XOR routine to obfuscate our payload, but with a multi-byte key, so, you know, an 8-character-long word was our key. That worked for quite a long time, until one day Defender started flagging it. What is going on? And again, because I'm an excellent programmer, my way of debugging it was removing chunks of the code. I kept removing chunks of code until the only thing left was this XOR routine, and Defender was still alerting on it. That tells me it's looking at the XOR routine with a multi-byte key and deciding this is suspicious. Use it with a single-byte key, just a single character like an A, and it was like, this is fine; but with a multi-byte key it says, hey, that's suspicious. So how did we get around that? A co-worker said, well, maybe it's compiler optimization. I was like, that can't possibly be it. I did more tests, and then I was like, that's the only thing it can be. So I recompiled with compiler optimization disabled, and sure enough, my XOR routine is no longer compiled down to this highly optimized piece of signatured code, and now it works. So when you're
compiling your payloads, it may be worth disabling compiler optimization. I mention that because with a lot of compilers, if I throw this array in here and I don't use it, I don't do anything with it, all I'm doing is compiling the payload with these words in there, compiler optimization would optimize it out because it's not used anywhere. So I had to disable compiler optimization. Let's take a look at what that looked like. On the top we have my original payload, AES.exe. I can compress it by 18.4%. Because of Kolmogorov complexity, we know that's not a lot of compression, so it probably tells us it's very high in entropy. We take that same payload, where the only thing I've done is compile in an array of words, and I compress it by 97.8%. Again, Kolmogorov complexity: we know that this is very much not random data. We have driven down the entropy simply by compiling these words into the payload. Now, it's not ideal, because you can see that the size of my original payload is 362K and the size of this payload is 15 megs. A few years ago, we red teamers were obsessed with getting the smallest payload we could on disk, but these days internet speeds are pretty fast, everyone can download stuff really fast, and they don't really notice the difference between downloading a 300K and a 300 meg payload most of the time, so I'm not obsessed with being as efficient as possible. I'd like to take another diversion, though, if you haven't heard about this. Back in 2019 a couple of researchers were doing some research with Cylance's next-gen machine learning AV component. As a tip for you: if you want to try testing against some vendor's new AV or EDR component but you can't afford the commercial business version, a lot of times the same engine is in their home product; it just doesn't have reporting capabilities that are as good, or it isn't as configurable. So that's what these guys did: they used Cylance's engine, but they bought the home version and installed it, which was significantly less expensive, and
they did some research. While not related to entropy, it was an interesting study. In their analysis they found this database of words that Cylance seemed to be using as a test: if your payload contains these words, it gets a boost in confidence score, and if it contains multiple words, it gets more of a boost. Well, it turns out there was some very popular game at the time that was getting blocked by Cylance, so Cylance's approach was to take all of the strings in this program, add them to their database, and increase the confidence score if those strings exist: this must be this game, it's good to go. So they figured this out, and they took the top 500 malware samples at the time, shotgunned these strings onto the end of the malware samples, and ran them, and pretty much all of them passed. So, not related to entropy, but again a very interesting research avenue that I encourage you to check out. Another thing we do, not necessarily related to entropy, is increasing the file size. With some EDR and AV engines, if you make the payload too big, they will opt not to scan it, because there's a performance hit from scanning large files. There's a trade-off between usability and speed, and they decide that people want things to be fast, so they set some threshold, say if it's above 200 megs. I tried to find out if there was a standard, but there's really not; it varies by vendor, and in many cases it's configurable, so people may have changed the default. I can tell you that Defender doesn't do it by default; by default it doesn't care how big the file is. But we can make a file large enough that the AV engine says, this is too expensive for me to inspect, I'm going to allow it to run. I promise you I am not picking on CrowdStrike here; I just happen to know it works against CrowdStrike because we did it recently. The reason I'm bringing this up is that there are tools out there, like Optiv's Mangle, which is a great tool, but Mangle pads the payload, I believe,
with null bytes. Other inflators will pad with random data, and some engines will look at files to see if they've been padded to increase the size; if they're padded with null bytes or random data, that is sometimes a signature. To avoid that I came up with DigDug. Throwback: does anyone know the reference between Dig Dug and inflating the size of your payload? Anyone? All right, thank you for making me feel not old. DigDug does the same thing that Mangle does with regard to inflating the size of a file: it artificially increases the size of a payload. However, I have set it up to read data from a dictionary. You have a dictionary full of words, and it appends those words onto your payload until it gets to the size you specify. Now the entropy is lower, so we don't have an entropy problem, and it's big enough that the engine might opt not to scan it anyway. You can check it out; it is on my GitHub, hardwaterhacker/digdug. So let's get into a little more evasion. I'm going to tell you about a tool that I created called Jargon. Jargon is special words or terms that have special meaning to a group. We hackers have our jargon that other people don't understand; popping shells doesn't mean anything to people who aren't hackers. After I did the CrowdStrike bypass, essentially by compiling in those words, I started thinking about it: that was a really big payload, and I still had a payload that had shellcode in it. What if I could have a shellcode loader that didn't use shellcode? If it didn't have shellcode in it, I wouldn't have this entropy problem. So how can I decrease the entropy of my shellcode loader and also get code execution? Well, I know that English words, words in any language actually, will lower my entropy. So if I could encode shellcode as words, I would decrease my entropy and avoid having any shellcode in my payload altogether. So how does that work? I kind of came to this breakthrough, and I will add the disclaimer that I
am not the person that invented this technique. Other people have done this before me, and I have found other people doing it since I wrote this tool. I just haven't found anyone that had released it publicly in this form. So I don't want to state that I am the creator of this; I am just the first tool author that released it publicly, as far as I can tell.
So we typically store our shellcode in hex bytes, right? 0x00. Depending on how we declare our character, we can actually store it as an int; we can store it as a 0, a 1, 2, 3, so on and so forth. Now, how many possible shellcode values are there? 256 possible values. That's it. No matter how much shellcode we have, it is always going to be made up of a maximum of 256 characters from that character set. So we take 256 random words, and they have to be 256 unique words, and we put them into an array, and the position of each word in the array represents its shellcode value. So here, "attending" is 0, "promptly" is 1, "terry" is 2, so on and so forth. So we get 256 words and we put them into a character array. Now each word in this array represents a shellcode character. And it doesn't have to be English words. We could take all the strings from an array for instance; that would be fun for an analyst to look at and be like, what is going on with this payload? So it could be any kind of thing, but we have 256 unique words, so we have a translation table, and as we said, 0 is "attending", 1 is "promptly", so on and so forth.
Now we do a lookup. We look at each of our shellcode bytes and we say, what is the value? Okay, that is a 27. Then we look at what is in the 27th position in our array. Yes, I realize that with zero-based numbering that's not actually true, we've got an offset for that, but just go with it. We look at what's in the 27th position of the array, we grab that word, and we insert it into our translated shellcode array. So this is our payload. In this case it happens to be a Cobalt Strike beacon, encoded, and it's gigantic. Yes, it is gigantic, but we have a bunch of words in this array, and then when we
want to go use it, our loader needs to contain the translation table, the translated shellcode, and a simple for loop. We loop through every word of our translated shellcode table, and for every word we look up its position in the translation table: oh, that's position 0, enter a 0; oh, that's position 256, put 255 into our shellcode array. Once we've done that, we've recreated our shellcode, and we load it into memory and execute it, and it works.
So if we take a look at this again, our original loader was 362K. This one is 1.7 meg, so it's, you know, 10% of the size of the other one that we did. It's not quite as low in entropy, it's 85.8% versus 97 point whatever the other one was, but it's still much better than the loader containing our encrypted shellcode. So it's much lower in entropy, and it doesn't actually contain any shellcode. You know, to someone analyzing it, it's just going to contain a bunch of words. Does it work? Let's double-click our payload. Receive beacon. So it does work.
I've used this with C loaders, C++, C#, and more recently in some Rust loaders. By the way, if you're not writing payloads in Rust, give it a shot. However, don't use things like AES or ARC4 or DES for your encryption routines, because the mere fact of loading one of those crates with Rust is, to some EDRs, a signal of badness. You can write a skeleton shellcode loader that doesn't have any shellcode in it, doesn't actually do anything, but just imports the AES crate, and Defender's like, no, go away, I don't know what you are, but you're using AES, so it's bad. So you could rewrite your own AES crate, that's fine, but Jargon works great with Rust. Honestly, I have a Rust loader that uses Jargon and doesn't do any unhooking, doesn't do any other evasion, and it works against several EDRs. Rust is really a fun language for red teamers for a bit. The EDRs are going to catch up, but it is there. So if you want to try Jargon, you can get it on my GitHub, hardwaterhacker slash Jargon. I encourage you to
check it out. I encourage you to use it in your own tools. If you use ShellcodePack from BallisKit, they've implemented that in there. You can implement it yourself. Great, have fun.
A few other things. As I mentioned, I'm not the only person to come up with this technique, and other people have come up with really cool similar ideas. If you were at DEF CON last year, there was a talk where someone used emojis instead of words. They used emojis for their shellcode, and it works, and that would be a fun payload to analyze. So, shellcode as emojis, check it out. I'm here to tell you that Cobalt Strike is not dead if you know how to use the Artifact Kit. Yes, there are a lot of signatures out there, but you can get around a lot of detections if you modify what's now known as the Arsenal Kit, which includes the Artifact Kit, so definitely make use of it. Couple other things: I am not sponsored by these guys at all, but I think that the Sektor7 malware dev courses are a great intro if you're not super familiar with evasion. They make it very accessible. I would highly encourage you to check them out. It's like 250 bucks per course, very much worth it.
So with that, I think I'm right at the end of my time slot here. Are there any questions that I can answer for you? Do a couple of questions? I think we have something. I don't... I think this is someone asking in the wrong room, so never mind. Well, I won't answer that then. Yeah, let's not do that. Yeah, there's a question, yay.
Sure. So the question was, how complex will it be for AV and EDR to be able to detect these kinds of techniques? And that honestly is something I struggled with a little bit, because I come from a blue team background. I am a blue teamer at heart, and everything I do is about making the blue team better. Like, I don't care if I win; the only thing that matters is that the blue team is better at the end of the engagement. With this, I honestly don't know how you would detect it offhand. What I will tell you is the only benefit this gives you is
bypassing that entropy check and that simple check looking for shellcode bytes that are in there. You still have to do everything else to execute that code in an opsec-safe manner, so there are still plenty of ways to get caught. The only thing that I can think of as a means to detect it is you have to go back to some real cryptography roots and do character frequency analysis. Like, if "attending" shows up a bunch of times, well, we know that's the null byte, that's a zero that shows up a bunch, and you start doing frequency analysis. That would be a way of doing it, but I don't see that being an efficient way for EDR to handle it. Other than that, I'm not sure how they would detect it. You're welcome.
Got time for one more, maybe? Yes, in the back. Would I recommend a sandbox to test in as a red teamer? Yeah, absolutely, if you have access to some type of malware sandbox to analyze your payloads. It's both good and bad. I think that some of the roll-your-own ones, where you can control the telemetry that's being sent, would be better, because if you're using one of the more commercial sandboxes, they're collecting telemetry on your payload, and yes, it might not be detected right now, but I'm sure that they have a lot of your signatures, so it's going to show up eventually. So it's kind of more like, if you can roll your own sandbox where you can control what telemetry is being received, then yeah. But you absolutely need to analyze your payloads before you dump them on the customer environment, because, you know, there's a lot of failure, there's a lot of pain before the wins happen.
So thank you. If you want to find me, I'm in the Red Siege shirt and the Red Siege hat. I'll be around the conference for a bit this afternoon, and I'd be happy to talk more. Otherwise, hardwaterhacker on Twitter, hardwaterhacker on GitHub, mike at red siege dot com. Feel free to reach out. Thank you.
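For readers on the blue team side, the two measurements the talk keeps coming back to, file entropy and the word frequency analysis suggested in the Q&A, can be sketched roughly as follows. This is only an illustration under stated assumptions, not how any EDR product actually works: the function names `shannon_entropy` and `dominant_word_share` are made up for this example, entropy is reported in bits per byte (0 to 8) rather than the percentages quoted in the talk, and the sample words are simply the ones used in the translation table example.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    entropy = 0.0
    for count in Counter(data).values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


def dominant_word_share(words: list[str]) -> tuple[str, float]:
    """Return the most frequent word and the fraction of all words it makes up.

    A single word dominating a long word list is the kind of skew that the
    frequency analysis idea from the Q&A would look for (hypothetical heuristic).
    """
    if not words:
        return ("", 0.0)
    word, count = Counter(words).most_common(1)[0]
    return (word, count / len(words))


if __name__ == "__main__":
    # Random bytes stand in for encrypted content or random padding.
    random_blob = os.urandom(4096)
    # Dictionary words stand in for word-based padding or encoding.
    word_blob = b" ".join([b"attending", b"promptly", b"terry"] * 400)
    print(f"random bytes: {shannon_entropy(random_blob):.2f} bits/byte")
    print(f"word data:    {shannon_entropy(word_blob):.2f} bits/byte")

    # Made-up list of strings, as if extracted from a file under analysis.
    extracted = ["attending"] * 60 + ["promptly"] * 25 + ["terry"] * 15
    print(dominant_word_share(extracted))
```

Random data should land close to 8 bits per byte while word data lands far lower, which is the gap the entropy check relies on. The frequency function only reports the skew; deciding what share counts as suspicious would need tuning against real, benign files.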