Good morning. Thanks for coming out to my talk, and thanks for having me here. Today I'm going to be speaking about microservices and functions as a service for offensive security. All right, who am I? My name is Ryan. I work as a pen tester at Centurion in Singapore. My obsession with functions as a service started in January 2015 with AWS Lambda, when it became available to the general public. Essentially, it lets you upload your code, whether that's a Python script, Node.js, or whatever you want, and AWS takes care of all the scaling and running of that code from the server on up. They give you a million executions for free every month. When Lambda was first released, it only supported Node.js. But then I found lambdash by Eric Hammond, which gave me a kind of shell access to the temporary Lambda environment. You could use Lambda to create a function, type in a shell command, and run the function: it would execute the shell command, capture the output, and report it back to you. So you could start exploring the temporary environment that Lambda runs in. And this leads to the idea of serverless. The whole concept of serverless, which I think is a terrible name, is that there are no servers, right? That's the perception it gives you. But the real idea is that you don't have to worry about the servers: you don't need to run and manage servers yourself, or be a sysadmin who knows how to scale up very quickly. If we look at the stack, with functions as a service you just run your code, and the cloud service provider takes care of the interpreter, whether that's a Python interpreter or something else, and everything below it in the stack. A good example of functions as a service with Lambda: maybe you take a photo and it gets uploaded to an S3 bucket.
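The lambdash-style trick described above is simple to sketch: the Lambda event carries a shell command, the function runs it inside the ephemeral container, and the output comes back in the response. The event key and return shape here are illustrative, not the actual lambdash API.

```python
import subprocess

def handler(event, context=None):
    # Run whatever shell command the event carries inside the
    # ephemeral Lambda container and capture its output.
    completed = subprocess.run(
        event.get("command", "id"),
        shell=True,
        capture_output=True,
        text=True,
    )
    return {
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "returncode": completed.returncode,
    }
```

Invoking this repeatedly with commands like `ls /`, `env`, or `cat /proc/cpuinfo` is exactly the kind of exploration of the temporary environment the talk describes.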
That triggers a Lambda worker, which takes that photo, applies some kind of filter to it, and then returns the image back to the user or stores it somewhere else in S3. There have been a few security use cases for functions as a service. Airbnb had StreamAlert, which is essentially a good way to scale your logging infrastructure: if you have a very scalable server infrastructure, you also need scalable log processing and monitoring. So you can have a Lambda function that ingests all these logs, applies some rules, and then triggers another Lambda function to call, say, PagerDuty or Slack and send out alerts. Some other examples of using Lambda: an API endpoint that a developer can call to change the firewall rule on an EC2 instance so they can SSH in, or monitoring Cloudflare's public IP addresses and updating your firewall rules so that only Cloudflare's web servers can talk to yours. AWS also has a web application firewall, and they use AWS Lambda for the same kinds of concepts, like log monitoring and reacting to alerts from the web application firewall. All right, now I'm going to run through a very quick Hello World in Lambda. You basically just go and create a function. There are a lot of triggers you can use, so you can integrate it with different AWS services that will trigger your code to run. We're just going to do a very simple three- or four-line Python script: use urllib2 to call out to OpenDNS, get the current public IP address, and print it to the screen. We set some basic limits: we only need 128 megabytes of memory, and we'll time out after one minute. You run the code and you see what IP address you're running from. And you can see that this script ran for 224 milliseconds, and AWS is going to bill us for 300 milliseconds.
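The Hello World just described looks roughly like this. The talk used Python 2's urllib2; this sketch uses Python 3's urllib.request, and the exact "what is my IP" endpoint is an assumption standing in for whichever OpenDNS URL was used on stage.

```python
import urllib.request

def handler(event=None, context=None,
            url="https://diagnostic.opendns.com/myip"):
    # Fetch this invocation's public egress IP from a what-is-my-IP
    # service and print it, so you can see which ephemeral worker
    # the code happened to land on.
    with urllib.request.urlopen(url, timeout=10) as resp:
        ip = resp.read().decode().strip()
    print(ip)
    return ip
```

Running it a few times tends to show different source addresses, which is the "unspecified source IP" property discussed later in the talk.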
There are also other platforms, like Play with Docker, which lets you experiment with Docker Swarm. You just go to play-with-docker.com, click through a captcha, and you get four hours to explore around in this temporary environment. They're running on top of AWS, and they have a Python interpreter there. What's interesting about this is that it's anonymous: you don't need an account, you don't need to sign up, you don't need to identify yourself in any way. But you only have a time limit of four hours, and there's a captcha, so it's difficult to automate getting that temporary shell. Now let's talk a little bit about cost. If you go to serverlesscalc.com, you can see a good comparison and overview of the costs of the different cloud service providers. Think back to that simple Hello World example that ran for 300 milliseconds: you could run it 10 million times for only $1.80 every month. So it's very cost effective. Different cloud service providers are at different levels of maturity in how they support functions as a service. AWS has 14 regions and Azure has 23, so they're the most committed to functions as a service. Google gives you a native IPv6 address, and IBM supports Docker. If functions as a service is just uploading your code, then being able to supply a Docker image instead gives you much greater control over the environment your code runs in and what else is in there. I think Azure is probably the most mature offering in terms of functions as a service: they support the most scripting languages, but they only run on Windows, so there are some limitations depending on what you want to do.
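The $1.80 figure can be sanity-checked with a back-of-envelope calculation, assuming AWS Lambda's published pricing at the time ($0.20 per million requests, $0.00001667 per GB-second, with 1 million requests and 400,000 GB-seconds free each month) and the 128 MB / 300 ms Hello World above.

```python
def monthly_cost(invocations, billed_seconds, memory_gb):
    # Compute charge: GB-seconds consumed beyond the monthly free tier.
    gb_seconds = invocations * billed_seconds * memory_gb
    compute = max(gb_seconds - 400_000, 0) * 0.00001667
    # Request charge: invocations beyond the first free million.
    requests = max(invocations - 1_000_000, 0) / 1_000_000 * 0.20
    return compute + requests

print(round(monthly_cost(10_000_000, 0.3, 0.125), 2))  # prints 1.8
```

Notably, at 128 MB the 10 million runs only use 375,000 GB-seconds, which stays inside the compute free tier; the whole $1.80 is request charges on the 9 million invocations past the free million.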
So in summary, I think there are three main advantages to functions as a service. It's very low cost, or effectively free in most cases: when you sign up they give you a sign-up credit, and given how cheap most of these services are, it can be difficult to use all of that credit. You get an unspecified source IP address: what these cloud service providers do is take your code, inject it into a more or less random server they have, run it there, then take it out and run it on a different server. So you're running your code from a different environment almost every time. And they have global data centers, including data centers in China, which you can use to your advantage.

This led me to start a small project of mine called Project Thunderstruck. The goal is to find use cases for functions as a service in offensive security, exploring different cloud service providers. I wanted to get supercomputer resources without paying supercomputer prices. Earlier this week I spoke at BSides about scanning IPv6, and today I'm going to talk about distributed denial of service without servers, and brute forcing SMS OTPs. All right. So we had this client that purchased an anti-DDoS service, and they were concerned about whether it would actually work. They wanted to know: is it going to work at 2am? Is someone monitoring a console and manually doing something, or is it automated, and does it really work? So I came up with this plan: find a very simple HTTP DoS tool written in Python, something really script-kiddie, upload it to a cloud service provider, trigger it, and then monitor the target and wait for results. What I found was GoldenEye. GoldenEye is pretty cool; it has some good ASCII art.
So I just modified it to hard-code the target, along with the command-line parameters; I hard-coded everything. And I had it time out after a minute, because I only want to DDoS for a minute and then stop. These are the modifications I made: look at line 567, remove everything from there down, and hard-code in all the parameters. Then I set up my test server, ran the function, and tailed my Apache logs. I started to see all these requests coming in: POST requests with large amounts of data in the URL and the POST body. So I could see it was working. Then it was time for the real thing. I triggered the code to start, and I just waited for the abuse email to come in from the cloud service provider, and from the client. But the site was still up; something strange was happening. So I took a look using curl, and I realized the site was responding with a Location header. This Location header is part of the anti-DDoS solution they purchased; I guess they're trying to check that it's a real user. If it's a web browser, the browser will just handle the redirect, but if you're using some tool, the tool obviously won't follow the redirection. So I went back to GoldenEye, to line 336, and modified it a little bit to read the response, get the Location header, and send the request over there. Tried again, and it worked. I was using AWS Route 53 health checks, which are essentially like doing a curl to the web server, looking for a certain response, and determining whether or not it's up and working. If it doesn't get a response, it assumes the site is down. It has a very nice graph, where you can see that we started the attack and the site went down, and then we stopped the attack and it immediately came back up.
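The redirect-following tweak just described can be sketched like this: if the anti-DDoS front end answers with a 3xx instead of the page, read the Location header and re-send the request there so the traffic reaches the real handler. The function name and structure are illustrative, not the actual GoldenEye code around line 336.

```python
import http.client
from urllib.parse import urlparse

def request_following_redirect(url, max_hops=3):
    """GET url; if the front end replies with a redirect, chase the
    Location header (up to max_hops times) and return the final
    status code and URL."""
    status = None
    for _ in range(max_hops + 1):
        parts = urlparse(url)
        conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                    else http.client.HTTPConnection)
        conn = conn_cls(parts.netloc, timeout=10)
        conn.request("GET", parts.path or "/")
        resp = conn.getresponse()
        status, location = resp.status, resp.getheader("Location")
        resp.read()
        conn.close()
        if status in (301, 302, 303, 307, 308) and location:
            # The protection layer pointed us elsewhere; follow it.
            url = location
            continue
        break
    return status, url
```

A browser does this transparently, which is exactly why the check filters out naive tools, and why one extra request per hit is enough to defeat it.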
Another good thing about AWS Route 53 health checks is that they check from different regions, so you get a perspective on the server from Tokyo, Singapore, Sydney, Ireland, and other locations around the world. You can make sure it's unavailable from different locations, rather than just down from where you happen to be checking. So here are the results. I managed to generate about 30 Mbps of DDoS traffic, and that was using only one region and one zone from one cloud service provider; I still got pretty good bandwidth out of it. If I were to maximize this across multiple regions and multiple service providers, it would grow to quite a lot of bandwidth. And the best part is that the abuse was not detected by the cloud service provider: our account is still active. In summary, I think anyone who knows how to copy and paste a Python script can become a DDoS king, with access to really high bandwidth, almost for free.

Okay, so now I'm going to talk about brute forcing SMS OTPs. Essentially, when you make a credit card purchase online, your bank sends you an SMS with a six-digit OTP that expires within 100 seconds. This is a Verified by Visa kind of setup, which is what I looked at. If we look at the architecture diagram, there are two main components: the access control server (ACS) and the merchant plugin. The access control server is the key component responsible for OTP verification. It checks that the cardholder is registered and enrolled, sends the SMS, and then parses the submitted OTP and decides whether it's correct or incorrect. So it's up to the ACS component to detect a brute force. Most of the time these components are created and provided by third-party providers, and they can either be hosted internally, integrated into the banking system, or run as an externally hosted service for the bank.
Visa does some basic compliance testing of the ACS and merchant plugin, but it seems to be more about interoperability, making sure the component functions and meets the spec. They clearly say there's no endorsement or warranty as to the security of the system. So it's up to the ACS to check that the OTP entered is correct, and to implement whatever security controls are necessary for this component of the system. So I came up with this plan. I need to guess a six-digit SMS OTP value: there are a million possible values, and I have a hundred seconds to do it. The plan: start a simulated online purchase, load the SMS OTP page, submit one OTP, capture the HTTP request, load that into Thunderstruck, and start all the workers guessing the correct value. When they find it, they report back; I take that request, put it back in the browser, and continue with the online purchase. And I have to do all of that within a hundred seconds. So it seems like a good use case for scaling with functions as a service. This is the architecture of the script I came up with. Essentially, I have a Python script which creates a random OTP value and clears the guess counter, because I want to keep track of how many guesses I've made and how long it takes to brute force all one million possible values. Then it keeps polling Elasticsearch to wait for the result, and triggers all the Lambda workers. The workers recursively call themselves to help with scaling, and then they all start attacking this Google App Engine server that I set up to simulate the online purchase payment processor. Each Google App Engine instance talks to a memcache server to check whether the OTP is correct, increments the guess counter, and returns a message indicating whether it was the correct OTP or the wrong OTP.
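Before the workers can be triggered, the keyspace has to be carved into per-worker ranges. A minimal sketch of that partitioning, using the numbers from the talk (a zero-padded six-digit keyspace, e.g. 100 guesses per worker); the real Thunderstruck code isn't reproduced here.

```python
def partition_keyspace(digits=6, guesses_per_worker=100):
    """Split the 10**digits OTP keyspace into half-open (start, end)
    ranges, one per Lambda worker invocation."""
    total = 10 ** digits
    batches = []
    for start in range(0, total, guesses_per_worker):
        end = min(start + guesses_per_worker, total)
        # Each worker renders its range as zero-padded OTP strings,
        # e.g. str(n).zfill(digits), when it actually makes the guesses.
        batches.append((start, end))
    return batches

print(len(partition_keyspace()))  # prints 10000
```

At 100 guesses per worker, the full six-digit keyspace needs 10,000 invocations, which is why the trigger script fans out recursively rather than invoking them all directly.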
The AWS Lambda worker looks at that response, and if it's the correct OTP, it reports it into Elasticsearch, where it gets picked up by the Python script that's constantly polling. So I created this Google App Engine app, basically learning how to scale a server to handle about 16,000 requests per second. I used 200 instances and a roughly 50-line Python script. Very simply, it handles setting the OTP and storing it in memcache, reading the OTP that was guessed, checking the value, returning a message, and incrementing the guess counter, while also reporting a bit about how many guesses have been completed out of all the possible OTP values. I used a dedicated memcache backend that can support 20,000 operations a second, and I set a daily spending limit of $10 because I don't want anything to go out of control. Then I just run gcloud app deploy, and my 200 servers are running. On the attacking side, I have a script called trigger_worker_aws. It calls the Google App Engine site and sets the OTP, then polls Elasticsearch, continually searching for the correct value, while I invoke all the Lambda functions and wait for the result. The Lambda function itself is worker.py. It receives a message from trigger_worker_aws saying which OTPs it should try brute forcing, calls itself for each different OTP range, and brute forces against the Google App Engine site. I had a simple test set up: I could call set OTP and set it to 013370, I could see how many OTPs have been guessed, and I could try an OTP like 123456 and see that it's wrong, but that I've now made one guess. So now I have a good test server to try out my theory and see if this will work. I started small, with four digits, and split up the work so that each worker guesses 100 OTPs.
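The core of that ~50-line App Engine endpoint can be sketched as follows: set an OTP, check a guess, and keep a shared guess counter. A plain dict stands in for the dedicated memcache backend so the logic is runnable anywhere, and the function names are illustrative rather than the actual handlers.

```python
# In-memory stand-in for the App Engine memcache backend.
store = {}

def set_otp(value):
    """Store the target OTP and reset the guess counter for a fresh run."""
    store["otp"] = value
    store["guesses"] = 0

def check_otp(guess):
    """Record one guess and report whether it matched the stored OTP."""
    store["guesses"] = store.get("guesses", 0) + 1
    if guess == store.get("otp"):
        # This is the response the Lambda worker watches for; on seeing
        # it, the worker reports the value into Elasticsearch.
        return "CORRECT"
    return "WRONG"
```

The test flow from the talk maps directly onto this: set the OTP to 013370, try 123456, get "WRONG" back, and see the guess counter sitting at one.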
With 100 workers, I can find the OTP in about 12 seconds, and brute force all possible values in 26 seconds. That includes everything from setting the OTP value, to triggering all the workers, to polling Elasticsearch for the response and getting the value back. If I split up the work a bit more, giving fewer OTPs to more workers, I can reduce the time to 11 seconds; splitting even further, I can get it down to seven seconds. Then I scaled up another order of magnitude to five digits. I was able to do it in 100 seconds with 100 OTPs per worker, then 72 seconds, and then 24 seconds as I split the work even more. Then I started on six digits, and with six digits it gets a little bit sketchy. Sometimes it can find the OTP quickly: I found the OTP in 31 seconds, but then it took about three minutes and 43 seconds to brute force the whole million. I split the work up a bit more, but it didn't seem to have much effect on speed. Sometimes I managed to get it in about a minute and 16 seconds, 76 seconds, which is still under the 100-second time window. Then I used some different geographic regions from AWS, trying to get closer to the test server, this Google App Engine server, to deal with some of the latency issues. With some more tests I managed to get it in 68 seconds, then 101 seconds, using AWS regions closer to the Google App Engine region where the code is running. And eventually, yesterday, I did another demo and recorded it, so I'm going to show you a video soon: I managed to get it in 29 seconds. Okay, so now it's demo time. So I ran the script. It generated a random OTP value of 661226 and triggered all these workers across different regions. It took about eight seconds to start all those Lambda workers, and then it polls Elasticsearch in the background.
I'm going to fast forward a little bit. After 29 seconds, it managed to find the OTP value, get it out of Elasticsearch, and complete. So, using a test server on Google App Engine with 200 instances, I was able to make about 500,000 guesses in the first 60 seconds; after that, the remaining requests time out or take a really long time to process. There are some requirements for this attack. You need to be able to keep guessing the OTP without triggering account lockout. You need a target server that can handle, in theory, 16 or 17,000 requests per second, so there's a risk of causing a denial of service. You should run the attack from somewhere geographically close to the target. And you need a little bit of luck. Here is the graph from Google App Engine: you can see it's handling maybe eight to ten thousand requests per second. I'm going to be posting my code and my slides on GitHub. If you look at the merchant server plugin implementation guide that Visa released, they say to expect about a five-minute timeout for handling the OTP and that transaction. Going further, some banks have introduced eight-digit OTPs, but they've also increased the time limit to three minutes; to attack that, I'd probably need a more scalable test server. I think this is interesting because there are probably further applications of this style of attack, maybe on password reset URLs, or on account signup and registration, where there's no account lockout and you can brute force over a longer period. Okay, I hope you found this talk interesting. If you like this topic, I'd definitely recommend you check out the talk "Gone in 60 Milliseconds" from last December by Rich Jones at CCC. There was also a talk last year at Black Hat, and there are a few this year at BSides, Black Hat, and DEF CON.
And if you find this interesting, here are some pointers that might get you started doing your own work in the functions-as-a-service space. You can look at AWS Lambda: they give you an instance with up to 1.5 gigs of memory, and the free tier works out to around 266,000 seconds of runtime at that size every month. There's also Alibaba Cloud, based out of China, but you need a +86 mobile number to register. Quite interestingly, with IBM OpenWhisk you can supply a Docker image, so you get more control over the environment and can maybe do some more interesting stuff there. If you don't want to use a service provider, you can also set up your own: with Docker Swarm and OpenFaaS you can build a similar environment of your own for running functions, scaling them, and monitoring what's going on. And that's the end of my talk. I'm going to be posting the slides and the demo and all that on my GitHub. Thank you.