Morning, everyone. Thanks for coming out here bright and early to this talk. Hopefully we'll have a little bit of fun here today and try some interesting things.

This talk is about getting initial access through leaked credentials, and it's done in the mindset of an attacker. We're going to go through the various ways an attacker finds credentials and how they use them, and we'll look at some real-world examples. Then at the end we'll look at how we can counter that and defend against it.

A little bit about me first. My name is Mackenzie. I'm from Aotearoa New Zealand, but now I live in the Netherlands and I work for a French company called GitGuardian. You can find me anywhere on my socials at advocatemack.

You're going to hear me say the word "secrets" a lot, so just to get everyone on the same page: what are secrets? Secrets are digital authentication credentials. Typically these are things like API keys. They may be security certificates, database credential pairs, or other credential pairs. It's something that gives you access to third-party services and allows you to ingest data, encrypt data, decrypt data, all of that kind of stuff. What's important to keep in mind is that secrets are made to be used programmatically. They're made to be used by your applications, not by humans. But humans touch them, and this is where the problem lies. So that's what I'm talking about when I refer to secrets.

How do we use secrets today? Let's take a look at the modern application that we have here, which is doing all kinds of fun stuff. We may be focusing on one element, trying to do something unique with our application, but our application needs to do lots of other things as well.
So what do we do? We leverage different services, particularly third-party services, to help us do this quickly. The easiest one to explain is credit card processing. Do you want to write your own credit card processing and deal with the financial complications and the law, or should you just use Stripe? Same with authentication, like Okta here.

So quickly our applications end up being a collection of these different third-party services, and all of these services need secrets to be able to communicate with each other. But then we need to host our application somewhere, host our code somewhere, and test our code somewhere, so our infrastructure becomes a collection of these services as well, and these all leverage secrets too. Then once we've launched our application we've got to monitor it, we've got sales integrations, and we're not even talking about all the microservices or independent services that we've created, perhaps using APIs. All of this leverages secrets. Every single one of these logos is, for me as an attacker, a potential entry point to gain access to something. And this is a simplified version. Your applications can very quickly end up with thousands of these third-party services, and now you have to manage thousands of secrets.

I work for a company called GitGuardian, and each year we publish a report called the State of Secrets Sprawl. It basically looks at the different areas we've monitored to try and find leaked credentials. The number one place we look is GitHub, the largest place where source code is contributed. If you saw my other talk, the next two slides will be familiar.
Then we'll get into new stuff. But first I want to do a very quick demo using GitHub, and I'm going to come back to it in a minute.

I have here some credentials. These are AWS access tokens, so they're really sensitive, but these are what we call honey tokens: basically a trap for attackers. I also have here a public repository that I've just created, called DevCon, and I'm going to commit these secrets into this public repository. By the way, please never do this. This is a terrible idea. But what's going to happen is that in about 20 minutes, near the end of my talk, I'm going to come back and show you how many times an attacker has tried to exploit those keys, just during this presentation. So I've pushed them publicly to GitHub, and now a bunch of bots scanning the GitHub API (I'll show you how that's done) will try to find these credentials and abuse them. We'll talk about that a bit more in a minute.

Moving on. GitHub is a very popular place. Over a billion commits are made to GitHub every single year, and 85 million new repositories were made last year, and these are just the public ones. So there's lots of information and lots of source code in here. Every single one of those billion commits we at GitGuardian scanned for secrets, and we publish how many secrets we actually found. If you know the answer, don't put up your hand (I'm seeing some familiar faces), but who here thinks we found less than a million credentials on GitHub? More than a million? OK, more than a million but under five million? More than five million? More than ten million?

We found ten million. Ten million secrets that we discovered in public GitHub repositories last year. Now you may say, OK, but how do we know that these are actually real credentials?
How do we know that these aren't just test keys, or high-entropy strings that look like secrets but aren't? We get around that by validating them. If we find an AWS credential like the one I just leaked, we check with AWS to see if it's real, and if it's not real we ignore it.

If we look at the progression, we're leaking a lot more secrets than we used to. In 2020 we found three million, and now we've found ten million. Part of this is explained by growth in GitHub: more code, more secrets, right? But it's also because we're using secrets in different ways now. We've got infrastructure as code, which is changing how we use them programmatically, so we're using them in more ways, which is why we're getting more secrets.

We can also look at the types of files that leak these secrets. Python is number one, not because of anything to do with Python, just because it's the most popular language. But we find them in lots of places. JSON files and .env files are really big ones. I leaked mine in a .env file to make it a little bit easier for attackers. We really find them in all sorts of places, and we have a very long list of other extensions as well.

Then we look at the types of secrets that we most commonly find. Data storage is number one, so that's databases, but cloud providers are 20%. That's 2 million cloud provider keys that we found last year in public repositories, and remember that these are only the valid cloud provider keys. You can do a lot with two million keys. If we never wanted to pay for cloud hosting ever again, we could easily do that, but we're not that malicious. There are lots of other interesting things too. Version control platform keys always amuse me, because this is your GitHub credential to your private repository that you've somehow put in a public repository. So a bit
weird, but it happens. Messaging systems are another big one. I love these as an attacker because it means I can launch internal phishing campaigns using your own messaging systems. With a Slack webhook, for example, I can post in there.

If we want to look at specific secrets, there are thousands of types that we look for, so it's a very long list, but Google API keys are really the number one, then Google Cloud keys going down, and we find lots of other interesting things as well. Google OAuth tokens are very sensitive, and we're finding lots and lots of these. So there are lots of different secrets that we're finding out there on GitHub.

All right, so how do attackers find these? There are a couple of ways. This is the first way, and it's the least interesting in my opinion, but I'll talk about it because it's the easiest. It's just using the GitHub search feature to try and find credentials. Here I'm looking for a file named credentials, and I'm looking for an AWS access key ID inside it. The syntax has changed a little since this slide, but it's the same idea. The reason this isn't that great is that most of the secrets on GitHub are buried in commit history. When you do something in version control, in Git, a record of it is kept forever in your Git history unless you rewrite that history, which is a whole nightmare. Search only looks at the top level, so it misses most of the secrets you could find. It's also going to give you a lot of false positives. There's a lot of what we call GitHub dorking, and if you have enough time you will find some things this way.

But there's a much easier way to do malicious things on GitHub, and that's abusing the GitHub API. GitHub has an API endpoint, api.github.com/events. You don't need authentication to look at it; anyone can. There's a bunch of event types on there, and there are two that we're interested in: the public event, when a private
repository is turned public, and the push event, when we push code. I can show you what this looks like. This is it here, and you get information like the email addresses of users. This is all public; you don't need authentication. So if I wanted to target a specific organization, let's say I wanted to scan only the commits made from Twilio email addresses because I want to target Twilio, I can filter for that.

The credential I leaked is in this ledger. It's been published on here, and this is how the attackers are finding it, because they're monitoring this and scanning it. It's very easy to do. So when we say public repositories, it's easy to think, OK, it's public, therefore if someone knows it exists they can view it. But we also have to understand that it's broadcast. It's not public in the sense that someone needs to know you exist to find it. It's on a ledger, and they don't need to have any information about you. If you leak it, someone is going to find it.

Here's a quick example of a real-life attack, which happened with Toyota. Toyota contractors, not Toyota themselves, leaked database credentials belonging to a mobile application called T-Connect in a public repository, and adversaries were able to find them. Why I like this example is that it wasn't even Toyota that did it. It was someone working with Toyota.

We also have lots of examples of source code being leaked, or what I call involuntarily open-sourced. There are lots of examples here of source code being publicly leaked. Samsung is one, but let's take Twitch. About 6,000 repositories were leaked from Twitch due to a misconfiguration, and when we scanned them we found over 6,000 secrets, including 194 AWS credentials. This is pretty typical.
This isn't because Twitch was terrible. It's just a lot of data, and secrets are in source code.

There's another way we can find private information, and that's wide-scale scanning for .git directories. When you run git init, it creates a folder called .git. Inside that is all your metadata and the history of your project. It regularly happens that these .git folders end up on your servers, and if your website is public, your .git directory is public too. That means I can find not only your source code but all of your source code history. Cybernews did some large-scale scanning and found 2 million accidentally exposed .git directories. That's a problem, because if you're thinking that your source code is private, it's much less private than you think. Even without all this, it's cloned onto your developers' machines and it's backed up into wikis. So as an attacker I know I'm going to find secrets in your source code, and there's lots of opportunity for me to get to it.

Why do secrets end up inside source code? Why is this such a problem? I'm sure no one here would hard-code credentials, so why does it happen? I'll give you the most common example. Here we have a very simple Git history: you have your main branch, and you also have some development features branching off it. Let's say I say to you, "Hey, I want you to create an integration with Algolia. Here's a key. Please create this new feature." So you go off on a feature branch, and the first thing you do, just because you want to test this, is add in your secret. That's the green dot there. It's just to quickly test it. You're on your own branch, no one's going to see it, it's fine. You test it, and it works.
Now you're going to remove it. You remove those secrets and put them in an environment variable, or however you handle them. Later on it comes to code review. Your reviewer is not going to look at all your history. At least I haven't met a reviewer that does. Maybe you do, and that's fine, but it's a lot of work. They're going to compare the latest version, which has no secrets in it, with what's in the main branch and make sure that's going to work well. They're not going to go through the history, but that's where you have a secret. This is why we have so many secrets in our Git repositories and our source code that we don't know exist, and this is why attackers are after them so much.

We also find secrets in logs or auto-generated files. Let's say you're debugging a problem, so you dump your environment into a debug log, and your environment has environment variables which are secrets. We find them when there's no .gitignore, which is a very simple way of preventing certain files from entering your Git repositories. If there's no .gitignore file, then obviously those files are going to get in. We find lots of weird things from wildcard commands like git add --all: if you've got a secrets.txt or whatever file in there and you add everything, that gets captured. We find them in templates. I don't know if there are any Django developers here, but when you create a Django project it automatically generates a key and puts it in there. Unless you know it's there, it can end up in your repository, and even if you remove it later, it's still there. And the other main one is that people just find it convenient to share secrets in Git. They put them in a .env file because they think they're protected by authentication. But as I've hopefully just shown, source code is not as private or as secure as you expect.

All right, so I want to move away from
just source code and start looking at some other technologies that we can find secrets in. Hopefully everyone here is familiar with Docker. If not, it's like a mini virtual machine that you can package your application and its dependencies in. There's a place called Docker Hub which hosts most of the Docker images. There are more than 10 million publicly available images on Docker Hub, so we wanted to have a look at how many secrets were in there.

In Docker, like some of the other places, we find huge amounts of secrets. Almost five percent of the images on Docker Hub contain at least one plain-text secret that can be used. It may be for a package manager; it may be for your application. They're typically different types of secrets than we see in source code, because they're usually more related to the infrastructure than to your services, since hopefully you've removed all your API keys from there. But that's still a huge number of Docker images. I don't have time today, but sometimes I like to do a demo of actually breaking apart a Docker image and looking into it, because a lot of people think that because something's not human-readable, it's secure. That isn't the case. You can break a Docker image apart and look at all the layers that were used to build it. If you're interested, a cool tool to do that is called dive.

So let's have a look at an attack that happened because of leaked credentials in Docker and that also involved code repositories. Codecov is a code coverage tool. It tests how much of your application is being tested. It sits in your CI/CD pipeline. It does a small job.
It's not that critical, but it's important. When you use Codecov, you run their application in your CI/CD pipeline using their Docker image. In their official Docker image, which was publicly available and which people were using, they had a hard-coded credential. That credential gave access, I think, to a Google storage bucket which contained a Bash uploader file. Attackers were then able to edit that Bash uploader file to turn Codecov malicious. They did something very clever. They added one line of code that said: every time Codecov is run, dump all the environment variables and send them to me, the attacker.

When we're testing our application, we need to build it, and we need these secrets in our environment to be able to connect to everything and make sure it's working. So all those secrets are in our environment, and when the environment gets dumped, the attacker gets all of them. Now, if you're smart, you're using different credentials for testing than for production. But there are some credentials you can't avoid using, namely the ones the attackers were after: your GitHub or other version control system authentication credentials. This gave the attackers access to 20,000 of their customers' private code repositories. Twilio, Monday.com, Rapid7, and HashiCorp all had their private source code exposed because of this. So again, you think your source code is private; here's a supply chain attack that gained access to it.

I don't pick on companies too much, but this is one that I will, because I think it illustrates a good point. HashiCorp makes a secrets manager, probably the best secrets manager on the market, called Vault. HashiCorp is a great company with an amazing security posture. Their whole pitch behind Vault is that Vault
reduces the need to ever touch credentials, and therefore you won't have secrets inside your source code. If you use Vault, you won't have secrets inside your source code. Because of the Codecov incident, HashiCorp had their private source code accessed, and guess what? They had to report that they had secrets inside their source code. So if HashiCorp has secrets in their source code, no one else has any chance of being able to solve this problem.

All right, moving away from Docker images, I now want to talk about another thing, and that's mobile applications. Again, what is a mobile application? You go onto the Play Store and look at your applications, and what are they? Similar to a Docker image, you assume they're not human-readable, that they're packaged up in some black box and therefore they're secure, right? Definitely not the case. Mobile applications are glorified zip folders. In the case of Apple it's literally just a zip folder. The extension they're packaged as is .ipa for Apple and .apk for Android, and these are easily reversible.

So how can we reverse an Android application? Very simple. We download it to our computer using a simple Google Play downloader tool, we decompile it with a tool called JADX, and then we scan it with a secrets scanner. In this case I'm using ggshield. That's the workflow for finding secrets inside an Android application, and literally anyone can do this. It's very, very simple, you don't need any special skills, and all the tools are available. I have a quick demo.
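As a rough illustration of the "glorified zip folder" point, here is a minimal Python sketch that opens a package as an ordinary zip archive and searches each entry's raw bytes for two well-known credential shapes. It is not the download, decompile, and scan workflow from the talk, and it is much cruder than a real scanner. The function names `scan_archive` and `build_demo_archive`, the two regular expressions, and the in-memory demo archive are all assumptions made for this example; the planted key is the placeholder value AWS uses in its own documentation.

```python
import io
import re
import zipfile

# Illustrative patterns only (assumed for this sketch). Real scanners use
# far more detectors and then validate what they find.
PATTERNS = {
    "aws_access_key_id": re.compile(rb"AKIA[0-9A-Z]{16}"),
    "slack_webhook": re.compile(rb"https://hooks\.slack\.com/services/[A-Za-z0-9/]+"),
}


def scan_archive(archive):
    """Return (entry name, detector name, match) for every hit in a zip archive.

    `archive` may be a path or a file-like object. Because .apk and .ipa
    packages are zip containers, zipfile can open them directly.
    """
    findings = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            data = zf.read(info)  # raw bytes of the entry, no decompilation
            for name, pattern in PATTERNS.items():
                for match in pattern.finditer(data):
                    findings.append((info.filename, name, match.group().decode("ascii")))
    return findings


def build_demo_archive():
    """Build an in-memory stand-in for an app package with one planted fake key."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("assets/config.json", '{"key": "AKIAIOSFODNN7EXAMPLE"}')
        zf.writestr("res/strings.xml", "<string name='app'>Demo</string>")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for entry, detector, value in scan_archive(build_demo_archive()):
        print(entry, detector, value)
```

Scanning raw bytes like this only catches strings stored uncompiled in assets or config files. Decompiling first, as described above, exposes far more of the application code to the scanner.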
I probably don't have time to show all of it, but it shows just how simple this is. I took a random mobile application from the Play Store and I broke it apart. We'll skip forward. Once I had decompiled it, I scanned it for secrets, and if we skip forward again you'll see that we find a lot of secrets in here, including Google API keys. If we go to the top, there are a lot. These are all secrets that we found in this mobile application. This was a real application, and it's not a particularly bad one. This is just how it is. You'll see that we have valid Slack webhooks, so potentially I could post some internal messages and try to trick your users. We have valid Google API keys in here as well. We don't need to go on too much; that's just to illustrate how simple it is to decompile mobile applications and scan them for secrets.

Apple is even easier. Again there's a tool to download the app, and when I say they're glorified zip folders, I mean it: to extract an .ipa you just change the extension to .zip and extract it. Then you can scan that for secrets.

So how many of these secrets do we typically find? Well, first let me talk about a real-life example. This is from my friend Jason Haddix, who's an ethical hacker, and this was an exploit he found for a bug bounty. There was a bank, and we're not allowed to say which one, but it was an American bank, one of the top five, and it had a mobile application. One of the features was that you could take a picture of a check with the app and then cash that check. By decompiling the app and looking at the code, what he found is that these images weren't being encrypted.
They were being stored in the phone's memory. He then found that they were being sent to an Amazon S3 bucket. He found the keys to that S3 bucket hard-coded in the mobile application, and then, whammo, he found 10,000 images of checks in plain form in an Amazon S3 bucket that he had access to. This is a bank, right? You wouldn't expect a bank to have hard-coded credentials for something as sensitive as this. But this is the state of the world that we're in, and in fact huge numbers of mobile applications have secrets. How many? My friends at Cybernews did a full study on this. Did I remove that slide? OK, I did. They found that about half of the mobile applications on the Play Store contained secrets. So it's a huge problem that everyone is facing, and as an attacker I have lots of opportunities to try and gain access to these credentials in different ways.

I want to go quickly back to the demo that I did, and let's hope that we have lots of activity. I have a Slack channel here. Every time someone has tried to abuse my credentials since I've been talking, it has posted an alert here that someone's tried to use those credentials, and given me their IP address.
So if we look, this first one was me testing it, but here we already have the IP addresses, and there are a few different ones. It groups them in five-minute periods, so in this period we've already had about seven attempts, and since then we've had another two and another two. So about ten bots have tried to exploit the AWS credentials that I leaked on public GitHub in the last 20 minutes. This is how big a problem it is: if credentials get leaked on GitHub, they're going to be found.

So what's going to happen with my credentials through the rest of the day? I'm going to get a lot of activity. You'll see here that every time someone uses them, it gives me their IP address, and it also lets me know what call they made. Here it's a call to get the caller identity, which is basically checking that the credentials are valid. It's going to come back that they are valid, because honey tokens are marked as valid. What I'll usually see after that is that I stop getting activity after a couple of days, and then two weeks or two months later I get another spike. What's happened in between, because we can track it, is that these credentials get bundled up and sold on dark web forums. So what's actually happening is that a first group of attackers is really good at discovering credentials, right?
But they're not very good at doing stuff with them. So they sell them to a group of attackers who know what they want to do with them. Perhaps they want to mine crypto. Cloud keys like these are often what gets used for DDoS attacks: they gather lots of valid credentials and then use them to do malicious things. Or they might be looking for specific companies. My email address on my GitHub is at GitGuardian, so if they wanted to attack GitGuardian, perhaps they could bundle them as "here are all the credentials from people who work at Slack, at GitGuardian, at Twilio." That's what an attacker is going to do.

So how do we prevent this? First of all, we've got to stop hard-coding credentials. This is really the easiest thing we can do. Up here you have an example where the API key is right in the code. Even if this is just a test, even if we just want to check that the key works, we should never do this, because it's going to be in our history. And if you've ever had the experience of trying to rewrite history in a group project, you'll know the unbelievable pain that comes with that. So once they're in there, they're pretty much in there.

We need to use secrets managers, and the best one is not always the best fit. The best secrets manager, as I've said, is probably HashiCorp Vault. Maybe there are others you could argue are just as good or better, but it's really at the top. The problem is that it's very heavy. If you've got a group of five people working on a project and you want to use Vault, you basically need one of those five people just to manage Vault and run a secrets server. It's heavy and it's complicated, and what's going to happen is that you get sick of using it. So then you're going to store secrets.txt on your home page.
So you don't have to deal with it. So maybe Vault's not the best solution. Then you can go to SaaS alternatives to Vault. One is Doppler, which isn't up here. You've got Akeyless. 1Password has a great secrets manager for developers as well, with cool stuff like VS Code integrations. If that's too heavy and you don't want a dedicated secrets manager, and you're hosting in the cloud, there are secrets managers in there too. These lack a lot of the features, but at least if you start using them you'll get the idea. It does make it difficult to share secrets.

And then the last one. Every security person will tell you that this is a bad idea, that it's terrible. I'm one of the few who will say it's OK. This is encrypting your secrets and storing them in an encrypted file in Git. It's a terrible idea for a few reasons. It gives you a single point of failure, and that encrypted file is going to sprawl along with your source code, so if it does get cracked, you have a problem. However, why I say it's OK is that if this is what it takes to get you away from hard-coding your credentials, if this is the lowest bar that I can get you to actually clear, then do it. I'll work on the rest later, but if you can just encrypt them to start with, that's really where we need to start.

Next, use automated secrets detection. At this point there are a lot of secrets detection tools that are really good. A lot of them are open source.
I work for a company called GitGuardian, so I'm totally biased in anything I say about them. We have commercial tools available with dashboards and so on, and we also have some open-source tools, but there are lots of other open-source tools as well, so it really depends on what you want to do. TruffleHog and Gitleaks can both detect secrets, and if that's where you start, just using some tools, that's good. All of these can be used to create things like Git hooks that prevent you from committing secrets, and they can also be used to scan your directories. I used ggshield to scan the mobile applications and other things like that.

And just some final thoughts. Rotate your secrets regularly, and don't use long-lived secrets. The added benefit of rotating regularly is that you know how to do it. When a secret gets leaked, you might be alerted, there might be traffic that you're unfamiliar with, but often no one knows what that secret does, who's in control of it, or what happens if it's rotated. If you have a rotation policy, you're going to be good at it, so if you do have a breach, it's going to go better. Limit your privileges. Stop creating admin tokens: if all you need to do is read information, make sure that's all you can do with that key. Whitelist your services: if you know that this service is only meant to be talking to that service, make it so that only that can happen. And that's pretty much it.

Here are some QR codes. The State of Secrets Sprawl is the report that all this information is in, so you can download that if you want, and here we have a white paper on how to manage your secrets, which gives you a benchmark against other people. But thank you all for coming out early and watching me, and if you have any questions, I'll gladly take them now. So thanks, guys. Any questions? Yes?

Yeah, so one of them definitely is. Oh yes, sorry. Thanks.
Yeah, so the question was: in the information we have in the Slack channel, could some of those hits be good rather than all of them malicious? I don't know about all of them. I know that some of them are malicious, because we can monitor it, but some of them are good too. One of these IP addresses is going to be from Amazon themselves. Amazon is actually one of the companies doing the most to prevent secrets leaks. They look on GitHub themselves, and if they find a key they will try to alert you that your key has leaked. So they're definitely going to be some of these hits. There are thousands of types of credentials, and this is one case where Amazon is doing particularly well. There also used to be some other services, shhgit or something like that, doing this. That was kind of a gray service, where it wouldn't alert you, but you could see all the secrets. So yeah, some of them are good, and you'll probably notice that some of the IP addresses are the same, like these ones. It's hard to know, but there's definitely some malicious activity and definitely some good activity as well.

Any other questions? No? No problem. If you want to learn how to make honey tokens like this, it's incredibly easy. In 10 minutes I'm doing a workshop on how to do it, where I run through it using just open-source tooling, and it's a lot of fun. So if you want to know how to make these, I'm running a workshop in about 10 minutes in A218, I think. But yeah, thanks everyone for paying attention, and I hope to see you again soon.