So, before we get started, I just want to say thank you first to our volunteers. For the first session we were doing a little bit of outreach to find some people to turn up and speak. We didn't really want it to be all about AWS; we wanted the community to speak, so thank you very much for stepping up and doing the first session for us. I really appreciate it. Even though it's supposed to be a bit more general, this is an AWS event, so I thought it would be better to keep a bit more to the subject before they throw me and Paul out, just to be safe. So today's topic is developing on the cloud: developing securely with AWS. I'm from Horangi; we are a cyber security firm. One of our products is a software as a service, an online vulnerability scanner, and the application is hosted on AWS. So this presentation will be about our experience developing on AWS, the issues we encountered, and how various AWS services helped us solve the problems we faced. A brief introduction of myself: I'm a cyber security consultant, and I'm a developer by training. Along the way, I started two startups before I joined PricewaterhouseCoopers: a Wi-Fi monitoring startup and an e-commerce startup. As you can tell, I was a co-founder in those startups, and security was particularly important there. That is how I got into security. Then at PricewaterhouseCoopers I became a technology consultant focusing on security. At that point in time, I started hating the developer part of me, because whenever I coded something, I would realize that there are a lot of issues with the code, security loopholes and so on. So I enjoy making and breaking stuff, but now I'm mostly breaking stuff. Okay. So there are so many security certifications out there, like CEH, the Certified Ethical Hacker; I can't remember what all of them stand for. There are so many of them.
I was exposed to this while I was with PricewaterhouseCoopers. I found OSCP the most interesting because it's a hands-on process: they give you access to a VPN with a couple of machines that can be compromised, you go in and compromise these machines, and that's also what you do in the exam. After that, CREST, which started in the UK, I think, started making its presence known in Singapore, Australia and a couple of other countries. Now a lot of government organizations, when they engage you for security work and so on, will ask for this certification. So that's what I did; I've got OSCP and CREST. They are basically just certifications to prove that you can do something, that you can do security and break stuff. The good thing about the CREST exam, for those of you who don't know, is that it's quite difficult to get and it really raises the bar. So if you see anyone that's CREST certified, kudos. Both of us are. Disclaimer: I need CREST because I'm a scrawny Asian dude, and when I say I can break your stuff, you don't believe me. So the agenda today is really simple: how we started out, how Horangi's technology started out, where Horangi is right now, and what we can improve in the future with respect to our infrastructure and what AWS provides. What we started out with: time to launch was three months for our product. It's a software as a service hosted on AWS infrastructure; it's a vulnerability scanner. So let's start off with the prehistory of Horangi's product. (Sorry, the microphone isn't working; let me check the panel. Let's carry on first. Can everyone hear me at the back?) I'm going to start from version 2 of our architecture, which is what Horangi was like three months before launch. There's no point starting from the very beginning.
Right at the start, our development was a bit of a mess, because it was mostly local development. Everyone brought it together on a single server; there was no security in place or anything like that. We started our initial sprint like any other startup, focused on functionality, focusing on the minimum viable product. Then along the way we found that we had to refactor the code base and overhaul our entire environment, so we brought it onto AWS. This is our original architecture. I was made aware during the pizza session that there are a lot of Solutions Architect certified people here, so I think a lot of eyebrows are twitching right now. This is a really bad design. We did not take into account any VPC, any subnetting, any IP whitelisting or blacklisting, any ports: every port was allowed in and out. It was a very bad design, but that's what we started out with. As you can see, we did put some measures in place. We thought about load balancing, auto scaling and so on: what if a lot of people try to access our application at the same time? We made use of the AWS auto scaler on our main application server. We used the Elasticsearch service, which is also on AWS, to help with our data analytics, and S3 for data storage. On the left side of the slide, you can see everything that was hosted on AWS. On the right side, you can see the things that are outside of AWS, which is our scan server. We do vulnerability scanning out of the scan server, so there's a lot of traffic coming out of it, and AWS has some conditions that won't allow us to do that freely, so we moved that out. But other than that, everything's on AWS. So the priorities then: basically making sure no one can break our application, because we are a vulnerability scanning company, and it's a bit of a reputational issue if our application gets broken into. Security for data at rest and in transit: basically encrypting your data properly, making sure you've got proper SSH connections, you're using SSL and so on.
And doing encryption for important data. Ability to scale and handle unexpected loads: this explains the auto scaler and our load balancer. Redundancy and resilience: for this, we use ECS with Docker. I'm not sure if any of you have heard about it, but check it out; it's really, really handy for that. (Just a quick question on the application security stuff. What sort of tools are you using? What methodologies? Any lessons learned for the guys?) Okay, so for application-level security, for web application security in general, we like to follow the OWASP testing guide. That's a pretty comprehensive guide you can find online; just Google "OWASP testing guide" and the PDF will pop up. What it covers is the various categories such as infrastructure security, configuration of your server, input and output validation, authentication, many things. It's a pretty comprehensive list. So those were our priorities then. Then there were the problems that cropped up from the architecture we had. Along the way, in the three months leading up to deployment, and anyone who has worked with a development team will have experienced this, a lot of problems cropped up: functionality was added, and last-minute changes were made to existing functionality. So at the code level, a lot of changes were being made, and development was paused for us to clean up our environment and our code. We did not have a VPC at the time. It came to our attention that this would be an issue, because anyone could just probe port 22 and find that SSH on our server was open. Someone might try something funny, or a new vulnerability might crop up the next day and we might not react in time. It's unfortunate, but at the time we couldn't just stop development and put in a VPC, so we carried on and tried to mitigate all this risk partially with pen testing and network audits. So those were some of the concerns.
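The exposure described above, anyone on the internet probing port 22 and finding SSH open, is easy to check for yourself. A minimal sketch of such a probe in Python; the host and port are whatever you want to test:

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds, i.e. the
    service is reachable from here. This is exactly the kind of exposure
    that moving a server into a VPC private subnet removes."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False

# Usage: port_is_open("your-server.example.com", 22) tells you whether
# SSH on that server is reachable from the open internet.
```

This is the simplest possible check; a real network audit would sweep port ranges and fingerprint services, but even this one-liner catches the "port 22 open to the world" mistake.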
The major issues we had: because we were using Docker, whenever a Docker instance resets, we lose all the logs in that instance. As a developer, as a DevOps person, that is a huge problem, because whenever your logs reset you lose everything. You don't know what was happening on the server before it crashed, or whether someone tried to do something funny to your server, any suspicious traffic and so on; you wouldn't be able to tell. And there was no centralized logging system. So if you wanted to figure out what was going on in the application, you had to go to the application server itself; if you wanted to figure out what was going on elsewhere, you had to go down to that server and look at its logs, and then you had to check the logs on Amazon to see what API calls were being made and so on. It was an insecure environment. There was no separation of production and development networks, and everyone had full access permissions. So on Amazon's really, really useful IAM, the identity and access management service, everyone was allowed to do anything. We had a normal tester, who was just supposed to test use cases on our application, and at one point in time that person could actually create an admin account and kick other people out. We changed that, of course. (Just to stop on the IAM stuff: for those of you who don't know, when you create an account in AWS, we give you the flexibility to create many users, and you can be really specific about what they should be able to do. Guess what: as you just mentioned, it's a bit easy to just give everyone permission to do everything, and that's how it starts. It's real easy to stand up a stack in an afternoon. But again, shared security model: it's your responsibility to say who does what in your environment. Is it a storage engineer? Is it a security person? And so forth.)
They should only have the permissions they need, just like in normal systems: the principle of least privilege. You can use IAM to do that. I'll go into more detail about it later, but basically what we did to solve this was to define user roles, as with any application. You've got your engineers, your infrastructure people, your testers and so on. We defined those roles and gave them only the permissions that they would need and nothing more. So that's IAM for you. As for what we did to resolve the urgent issues with the help of Amazon services: for logging, we manually added server logging at the application level. This helped us streamline the logs through one location, one stream, and this was done using Kinesis. With Amazon Kinesis we created basically a pipeline for all this logging data, and all of it was fed to S3. S3 is just very cheap storage, basically; all the logs went there. I'm not going to go into detail, but this is what part of it looks like: we created a delivery stream, all the logs go through it, and then they are delivered to our S3 bucket for storage. (Does that fuzzy mark say encrypted or unencrypted? It does say encrypted. My boss is here. It looks like too many letters. It is encrypted, for the record. Had you configured CloudWatch at that time? Sorry? We hadn't configured CloudWatch at that time.) CloudWatch is also a really handy service that we could have used, but we started from developers who were not used to cloud-hosted services at this scale. We did use things like DigitalOcean and so on, but not really any provider with the scale of Amazon's services. Having Elasticsearch hosted as a service was pretty new to us; having a relational database, having CloudWatch, all these utilities that come along with the EC2 hosting service were quite new to us. So it took us some time to learn about all that and pick it up.
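The Kinesis pipeline described above can be sketched in a few lines of boto3. This is a minimal sketch, assuming a Kinesis Firehose delivery stream already configured to land in an S3 bucket; the stream name and record fields below are hypothetical.

```python
import json
from datetime import datetime, timezone

def make_log_record(service: str, level: str, message: str) -> bytes:
    """Serialize one application log line as newline-delimited JSON,
    a common shape for Firehose-to-S3 delivery."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "service": service,
        "level": level,
        "message": message,
    }
    return (json.dumps(entry) + "\n").encode("utf-8")

def ship_record(stream_name: str, record: bytes) -> None:
    """Push one record into the delivery stream; Firehose batches
    records and delivers them to the S3 bucket configured on it."""
    import boto3  # requires AWS credentials in the environment
    firehose = boto3.client("firehose")
    firehose.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": record},
    )

# Usage (hypothetical stream name):
#   ship_record("app-logs-to-s3", make_log_record("scanner", "INFO", "scan started"))
```

Because every service writes into the same stream, the logs end up in one place regardless of which Docker instance produced them, which is precisely the loss-on-restart problem this solved.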
So we did logging the old way. (That's a good question. Have you used CloudWatch since those days? We have used CloudWatch since; we needed to, for instance for our web server logs as well. Just a question: when you've got the CloudWatch Logs agent, where do the logs get shipped? It automatically goes to CloudWatch Logs. The issue that we have now is that the application produces a lot of logs, and the operations team takes a very long time to query for specific logs. We query it via the API, and there are open source tools that make the logs easier to query and more human-readable, but it's still very slow. That's the only concern we have now.) Then, for the insecure environment, we did segregation via VPCs. VPCs are virtual private clouds; essentially they allow you to create something like a subnet within your Amazon machines, and you can define security groups and assign them. What we did was define security groups, and for particular machines in the VPC that did not require external access, for example database servers and so on, we only assigned internal IPs and no external addresses. If you go to the hostname assigned to our database from the internet, you will get nothing, just an error; it is only accessible via our VPN server. Another thing that came in helpful was OpenVPN, which is the VPN service we are using. It's available on the AWS Marketplace as a deployable instance of OpenVPN. All you do is go in, click deploy, choose your instance size, configure OpenVPN, and it's up.
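The security-group scheme just described (public group: web ports only; private group: SSH only, and only from inside the VPN) can be expressed with boto3. A sketch under stated assumptions: the group ids and the VPN CIDR below are hypothetical, but the rule shapes match the EC2 `authorize_security_group_ingress` API.

```python
def web_ingress_rules() -> list:
    """Public security group: HTTP and HTTPS only, open to the world."""
    return [
        {"IpProtocol": "tcp", "FromPort": port, "ToPort": port,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
        for port in (80, 443)
    ]

def admin_ingress_rules(vpn_cidr: str) -> list:
    """Private security group: SSH only, and only from the VPN subnet."""
    return [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": vpn_cidr}]}
    ]

def apply_rules(group_id: str, rules: list) -> None:
    """Attach the ingress rules to an existing security group."""
    import boto3  # requires AWS credentials in the environment
    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(GroupId=group_id, IpPermissions=rules)

# Usage (hypothetical ids and CIDR):
#   apply_rules("sg-0123public", web_ingress_rules())
#   apply_rules("sg-0456private", admin_ingress_rules("10.8.0.0/24"))
```

Everything not covered by these rules is denied by default, which is how the attack surface ends up being just ports 80, 443 and VPN-gated 22.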
In the past, when we did this on, say, DigitalOcean servers, it took a bit longer; this makes it more convenient. But whatever worked for us at that time, whatever was fastest: this was the fastest way. And this is the architecture we had after a bit of modification. This was done within three months of launch; we tried to put everything together properly and make it a bit more secure. You can see that we put everything in a VPC, we defined security groups for public and private access, and we also made sure that the allowed ports were very strictly controlled. For example, for the public security group, only ports 80 and 443; those are your web ports, so only HTTP and HTTPS were allowed, and any traffic going out goes through the NAT that we have. Same thing for the private security group: only SSH and the VPN were allowed through, so that's basically port 22, and because we used OpenVPN, which has a web interface, we also allowed 443 and port 80. But we tried to limit access as much as possible. Everything else that didn't require external access, our development server, Jenkins, control servers, our reverse DNS, was refused access to anything outside of the VPC and outside of VPN control. (I've got a question. In your experience now, the VPC and the security groups, do you think that was easy or difficult? It wasn't too painful, but there were a lot of sleepless nights at the time. I was going to say it was easy. Let me ask another question then. I don't know if these servers are Linux or Windows, but let's assume they were Windows: with the security configuration that you've got for these security groups, do you think you would be protected from the worms going around the world right now? Yes, especially with the port blocking: by allowing only ports 80 and 443 we are reducing our attack surface by a lot.) So the hard part was learning how to use these services, because there are so many of them. You've just got to look
through the documentation; Amazon actually has pretty good documentation. The mistake we made was that we didn't read it. We just tried to hammer everything out as fast as possible and made a lot of mistakes along the way, so we learned it the hard way, and that's why it was tough for us. But there is pretty good documentation; look through that and you are quite well covered. For us, that's the journey we took, but it could have been easier. So, for the insecure environment, in addition to the architecture changes in the diagram I showed you just now, and as I mentioned previously for the IAM roles, we did a review of all our roles and authorization permissions. For example, not everyone can access the Marketplace anymore, and not everyone can access particular services or APIs that Amazon provides, thanks to the IAM service. We restricted permissions on the individual services as well. For example, for S3 and our relational databases, we only allow particular whitelisted IPs through. On these individual services you can actually define how you want to restrict access, either by certificate, if I remember correctly, or by IP whitelisting. We chose IP whitelisting because we are using internal IPs within the VPC. And lastly, this was a pain for everyone, but we activated 2FA on anything we could get our hands on. What we used was the Google Authenticator application: everyone could just install Google Authenticator on their phone, and we synced that application to the 2FA services. IAM has it, OpenVPN has it, and those are the two largest entry points to our infrastructure, so both were protected by 2FA. This screenshot is a bit small, but you can actually see in the IAM policies there's a policy that says force MFA, so force multi-factor authentication. We activated that, so anyone who didn't have MFA would log into the Amazon console and find that they didn't have access to anything. There were a few people complaining, but everyone activated 2FA pretty quickly. (Alright, is
that just for developers, or administrators, or who?) Anyone who has access to the console needs 2FA; that's basically what we did, except for the administrator who had to set it up first. We were trying to be on the safe side, so that's why we did this. (Just to hammer home why that's super important: on the AWS platform you could have your security set up really well, a really good setup. But if you're sitting at lunchtime in Horangi headquarters, someone sends you an email, you don't check where it's from, and you click on that piece of malware, it can keylog your password. Then someone on the other side of the internet, maybe in Brazil, can just log into your service and start deleting things. So with multi-factor authentication, even if they steal your password, they would physically have to rob you and steal your phone to get the second factor to actually log into AWS. It's super important to enable it, and as John said, not just for your admins but for everyone.) So in addition to something you know, in terms of security, in addition to your username and password, something you know, we also included another factor, which is something you have: your phone, with the Google Authenticator application installed on it. That's multi-factor authentication, and that's the end of the changes. This is roughly the architecture we had when we launched our service, but of course there are other things we can do and other things we're working on along the way to improve security. Next slide: one last thing. Anyone who's seen an Apple launch knows "one last thing" is always the best part. I'm a bit of an Apple traitor; this looks like a MacBook, but it's an Apple sticker with a Dell behind it. So, one last thing. There are still many things we can work on to improve our development environment and our architecture. The first thing that comes to mind would be high availability VPNs, because it's so easy to launch OpenVPN
instances, and we have people working everywhere. Horangi currently has a presence in five different countries, and our developers, infrastructure people and testers can get pretty spread out sometimes, yet everyone connects to the Singapore VPN. You can imagine that can be a bit of an issue, especially if you're working away in, say, a pretty isolated part of Manila (well, you can't really get isolated in Manila, but something like that): you'll find that it's very slow, especially since you have to go through the VPN with the security measures in place. A way to mitigate that would be high availability VPNs in multiple countries, routing back to our VPC, our internal network. Another thing we want to do is define the environment in CloudFormation. This would allow us to deploy more easily and redeploy things repeatedly, which gives us a bit more control and ensures that security is in place. And lastly, CloudTrail, so we are alerted to any changes made and can react instantly. This is one of the more important things that we failed to do initially: implement CloudTrail to look at the logs and see whatever is going on. As an example of how CloudTrail and CloudFormation can work together, here is one scenario we have thought of that can be solved easily by this: say your ACLs have been changed. CloudTrail detects this, and then a Lambda triggers to redeploy everything via CloudFormation. That pretty much automates your security remediation, in a way. It's one thing we are trying to do in the future. I'm going to keep this short because it's getting late, and I've only had one slice of pizza and I'm hungry. (There isn't any more food out there. I'll take the blame for that; sorry, we thought twenty pizzas was enough. It's not. Okay, question: since your company is a new startup, have any enterprise customers already subscribed to your service? We are in discussions with several enterprise customers; I can't tell you any names. In terms
of numbers, actually, yes, we have a few enterprise customers that are already with us. In Singapore our presence is not very strong yet; we are still building on that. But there are enterprise customers in the Philippines, Indonesia and Thailand that are using it; in Singapore only a handful. (So with your journey on AWS, what was your experience joining different services together? Internally we refer to them like pieces that you put on top of each other to build something else. Was it a very difficult thing, a sharp learning curve, or do these things work together?) I think it's a very natural progression with the majority of cloud provider services. If you look at programming maybe 30 years ago, everyone had the mindset that everything had to be built yourself, and then people started getting libraries all over the place: you just download a library from GitHub and you get a whole bunch of functions already in place. The way I think of it is that Amazon's services are something like those individual libraries. Everything is being made easier to do because it's just there; it's pre-created, it's easy to put together, like you said, a jigsaw. So from a developer's background, unofficial message: we like it, it's cool. (Now, I'm really glad to hear that. I've been through the same process myself, trying to bring multiple AWS services together and tie them into one delivery, one application.) It's definitely a very welcome progression. In the past, if I wanted an encryption library in C, I might have had to code it myself; now I just download the Bouncy Castle library. So I think AWS services are something like that, but for the cloud. Okay, any other questions? (Yeah: I'm a security consultant, and a manager asks what kind of risks we have if we don't patch our software. Some of our servers are open to the internet on AWS; we run maybe a thousand trials on them. We didn't do any patching because
there are some dependencies in the application that may be impacted, and we've had a lot of attacks on the Windows-facing systems from the internet. What kind of risks, maybe, can you share with us? What kind of what? Risks.) I mean, the risk will always be out there. As the operating systems and the services you run on them age, there will definitely be more and more issues that come up, and they'll be patched less and less often. So if you say that because of the application you are unable to update the OS itself, we can look at defence in depth as a mitigating factor. Maybe you restrict that particular service or vulnerable machine to only specific IP addresses, for example, or you restrict it within a VPN, so you've got to connect to the VPN before you can access it. In your particular case I would say think about it as security in depth instead of trying to solve the problem directly, because you find you can't do that. But of course, at the end of the day, ideally what we want is to nip the problem in the bud: upgrade to a later version of the operating system that does not have that problem. (I would actually challenge you to try as hard as you can to work out how you can treat your servers as devices that you don't really care that much about. The important thing is there are four or five of them hosting your production application; if you need to kill one in order to replace it with the server you want, that shouldn't matter. Try to get to the point where you can literally say a server is just an unimportant block of processing capacity. Then you can patch things at three o'clock in the afternoon on your busiest day: you can start taking down a few servers, patch them, bring them up, make sure they're okay and the service is okay, and just keep rolling. Yeah, we have parts of the application that are elastic and easy to upgrade, but the application has a big batch-processing backend. I completely sympathize with you, more
than you know. Some of my customers now are running 10,000 Windows 2003 servers, which went end-of-support 18 months, two years ago. How do we deal with these problems? First, are they internet facing? No? Okay, that's one thing we can knock them down on, but we don't want to sit there for long.) So certainly carry out some risk analysis, look at what the potential attack vectors are, and then try to shut them off. What you're doing is actually defence in depth: work out what is vulnerable, which part of the surface is vulnerable, and try to limit it. But like he said, at the end of the day you try to solve the root problem. (Do you leverage a third party's scanning engine to do the vulnerability scanning function? Bit9 or Carbon Black, that kind of anti-malware engine?) We leverage a couple of open source engines, and some of the stuff we develop internally. For the things that you mentioned, for example the Nessus scanner and all that, a lot of them have their own licensing; of course it's their business, right, so the licenses restrict how you can use them. So what we do is develop our own products internally, and we also try to leverage the open source community a bit. There's a lot of data over there and development is pretty constant; for example, for the malware detection rules and so on, some of them are a pretty good source. (Just to add to that: in a previous role, at least six months ago, before I worked for AWS, we used Carbon Black. It's an awesome tool, and you can run it on AWS; it works fine. Oh yes, and you might want to check out the AWS Marketplace: a lot of the tools are there, so you can just deploy them. We have a product; on the marketplace side of the business, is yours there?) Our tool is currently in the process of being integrated with the AWS Marketplace. It's not there yet, but our developers are working on it, because there are some
APIs we have to integrate. In the meantime, what we were aiming at initially was to provide a service for every segment of the market: for your small and medium businesses, enterprise businesses, and companies big enough to require something customized. At every level we have our own set of tools and services to provide. For example, we have something called the Hunter agent, which is deployed on the individual workstations. We use that for incident response: if something happens, we activate the agent, gather information from that particular machine, zip it up and upload it to Amazon S3, and then we pull it from Amazon S3 onto our analysis machines. Some of these services, of course, as a startup or a small or medium business you're not interested in, so we try to provide something that's cost effective that every segment of the business can use. (I'll add to that as well: this is a local company based on the ground here in Singapore. If you reach out to some of these other companies, you're not going to get a response; you have to wait until the sales rep calls you back. These guys are here locally, they've got some local customers, and they're flexible with their customers, so it's not set in stone. Last question: what's the venture capital injection, the funding, for this company? I'm afraid I can't reveal that. If there are no other questions, I have a question. You had stories about migrating from plain storage; what kind of lessons learned, pain points or maybe setbacks did you have when you did that migration? And the second question: is there any simple calculator for us to estimate the cost? We have a few things like EBS storage; we have so many things. There is actually a cost calculator, but I think you're the better guy for this one.) You don't have to
worry, I'll give you a rundown. So there is a calculator, you're absolutely right. It's called the Simple Monthly Calculator, and it basically lists all our services; you put in how much you want to use them and it spits out the cost that's going to be associated with that. In terms of encryption, the relevant service is KMS, the Key Management Service. To give you an example of the cost of KMS: it's basically AES-256 encryption, it encrypts your server volumes at rest, and you can also use it to perform ad hoc encryption of files; you push in plaintext and get out ciphertext. A key is charged at $1 per month, so $12 for a year if you use one key, and then there's a very small charge for API calls. So you end up with something like $15 if you use it in anger with one key over the year. $15: very, very cheap. By the way, what you're paying for is the key management and the key protection; that's the dollar per month. The actual encryption and decryption is completely free. We want to encourage customers to encrypt their data, so we're not going to charge people to encrypt their storage. Another angle on that: for things like EC2 EBS volumes, the encryption is all transparent, so you can encrypt your 100 terabyte volume. But when you go to things like Redshift, which is our data warehouse sort of product, we're absolutely transparent about the performance penalty that encryption may incur; we say generally, if you're working with Redshift, you're going to incur something like a 5 to 10% performance penalty. (I believe there is an extra cost, because for encrypted storage we need extra volume, right? How many percent extra does that take?)
Let me separate the two things. If you want to use one key for KMS encryption, you're paying around 15 USD a year. You're paying per key, not for the amount of data you encrypt: if you encrypt 10 gigabytes of data, 15 USD; if you're encrypting 10 petabytes of data, 15 USD. (You're saying there's a differential, though, between unencrypted and encrypted on disk, in how much space it takes?) It's a little larger, but it's a tiny fraction; it's not huge, it's barely measurable. (Okay, from the AWS point of view, do you have any validation process to get onto this Marketplace?) Yeah, we do. If you want to be listed in the Marketplace, you have to work with our partner team to attest that you've gone through certain checks and balances: that you're a real company, that your product is fit for purpose. There's a whole process that our partner team runs. And so, if you look at the security products that are listed in the Marketplace, you can either speak to our partner team, like John and Eric who just left, to get advice on how to implement them, because it's still very much a shared model (even if you deploy a partner solution, you need to figure out how it works), or work with the solution architects directly. (So basically we need to do some paperwork to put it on the Marketplace? Basically, yeah.) Any more questions for him? No? Awesome, thank you very much. If anyone wants to flip me a name card, do it out there, not here, because my boss is here. I'm kidding. Alright, so I've just got a couple of slides to run through; I know we're going over a little bit on time.
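The KMS pricing discussed above is easy to sanity-check in code. A sketch using the figures quoted in the talk ($1 per key per month, independent of data volume); the key id in the encrypt call is a hypothetical placeholder.

```python
def kms_key_cost_usd(keys: int, years: float) -> float:
    """Key storage at the quoted $1 per key per month. API request
    charges add a small amount on top, which is how one key used in
    anger for a year comes to roughly $15."""
    return keys * 12.0 * years

def encrypt_blob(key_id: str, plaintext: bytes) -> bytes:
    """Ad hoc encryption with KMS: push in plaintext, get ciphertext.
    The encrypt/decrypt calls themselves carry no per-byte charge."""
    import boto3  # requires AWS credentials and kms:Encrypt on the key
    resp = boto3.client("kms").encrypt(KeyId=key_id, Plaintext=plaintext)
    return resp["CiphertextBlob"]

# Usage (hypothetical key alias):
#   ct = encrypt_blob("alias/app-secrets", b"database password")
```

The arithmetic makes the pricing model explicit: the bill scales with the number of keys you manage, not with whether you encrypted 10 gigabytes or 10 petabytes under them.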