Good afternoon. Welcome to Threat Modeling in the Cloud. I am Paige Cruz, developer advocate over at Chronosphere, and I want you to look at this vending machine and think about what threats are likely for a given vending machine. Maybe somebody hungry who forgot their pocket change and wants a free snack. Maybe a raccoon, if this is outside at a public park; raccoons got those opposable thumbs. Maybe something more mysterious, like credit card skimming. But what if that vending machine was in a hospital? What if your threat model would not be the same in a hospital? So we'll look at this hilarious memo, which, zoomed in, says: "Warning: do not use endoscopy equipment to steal chocolate from this vending machine." I am willing to bet, you know, maybe something small like five dollars, that you did not have this on your threat model as one of your options for threats. What is very important, and it took me a long time to come around to this, is that threat modeling is highly dependent on your environment, whether you're in a hospital or a park or at a school. Just because you have one vending machine somewhere doesn't mean that you can have a copy-and-paste threat model to take everywhere with you.

So what exactly is threat modeling? It has one of those nice parallels outside of computing. We as humans threat model all the time. If you look both ways before you cross the street, you were probably worried you'd be flattened like a pancake, so you took some actions to mitigate that risk. And to not bury the lede, it is literally about assessing your situation and ranking and prioritizing threats by likelihood and impact. A very official, academic definition here is that it's a structured process to identify security requirements, pinpoint threats and potential vulnerabilities, quantify their impact and likelihood, and prioritize remediation methods.
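One way to picture "ranking and prioritizing threats by likelihood and impact" is the small sketch below. The threats are the vending machine examples, scored as if the machine sat in a hospital; the 1-to-5 scales, the specific scores, and the rule of multiplying likelihood by impact are all illustrative assumptions, not something the definition prescribes.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    likelihood: int  # 1 (rare) to 5 (expected); the scale is an assumption
    impact: int      # 1 (minor) to 5 (severe); the scale is an assumption

    @property
    def risk(self) -> int:
        # Simple scoring rule (an assumption): likelihood times impact.
        return self.likelihood * self.impact


def prioritize(threats):
    """Return threats ordered from highest to lowest risk score."""
    return sorted(threats, key=lambda t: t.risk, reverse=True)


# Hypothetical scores for a vending machine in a hospital.
hospital_threats = [
    Threat("Shaking the machine for a free snack", likelihood=3, impact=1),
    Threat("Credit card skimmer installed", likelihood=2, impact=5),
    Threat("Endoscopy equipment used to steal chocolate", likelihood=4, impact=3),
    Threat("Raccoon raid", likelihood=1, impact=1),
]

ranked = prioritize(hospital_threats)
for threat in ranked:
    print(f"{threat.risk:>2}  {threat.name}")
```

Rescoring the same list for a public park would likely push the raccoon up and the endoscopy equipment down, which is the point about environment: the list can stay similar while the ranking changes.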
It is not making a big long list of everything that could go wrong, the kind that stresses you out and keeps you up in the middle of the night, and turning that into a to-do list. It is paring that down to what is likely and what is worth us going after. A more real-world definition that I vibe with a little bit more is that threat modeling is a conversation. It's similar to when we talk about DevOps: you can get into the toolchain, you can get into the practices, but if your first step isn't talking to somebody on a different team or in a different org, you're doing it wrong. So very similarly, you want to be talking to as many people as possible.

When you're starting a threat modeling conversation, especially if you're not in security, you wonder where to start, and there are four guiding questions that come from the Threat Modeling Manifesto. What are we working on? You would be shocked how many teams would come up with different definitions for the work they do or why their service is important to an org. So first things first: what are we working on? What is it that we're doing here? My favorite, especially for those of us that are nervous Nellies: what could go wrong? This is your fun time to list endoscopy equipment, a raccoon stealing the Reese's from the snack machine; list out all of the things. Then take a look at: what are we going to do about it? So if someone's like, "Oh my god, what if a meteor hits this AZ?" you're like, "Okay, what would we be able to do about it? How likely is that? Maybe we want to look at our configuration on S3. That is a little bit more likely to cause a security incident for us than a meteor taking out an AZ." This question really helps you ground the discussion in reality and can avoid some of those rabbit holes that folks like to get into. And finally: did we do a good enough job?
This is really the crux of the threat modeling feedback loop. It's not a conversation you have once, it's a conversation you have many times, and "good enough" is subjective to your org, your customers or end users, your team. It does not imply perfection. It is not "are we 100% secure?" That's as silly as saying we're going to have 100% uptime; that's not going to happen. So defining what is good enough, what is enough risk to accept, and what risk you can mitigate is sort of what puts a bow on this whole threat modeling conversation. And I will say I do recommend threatmodelingmanifesto.org. It is not lost on me that they need to update their SSL cert, and that is very ironic, but it is a very good resource aside from that.

So who is invited to this discussion? You might think, okay, your security team. You might think maybe the software engineers, maybe we bring in a manager or two who's got context. But you would be surprised at how wide you should go with invitations. Your architect is going to have a really good high-level view of the system: not only the technical components and how they interact, but also what teams work well together, and what areas of the org maybe have a little bit of a rougher history and would need to have some meetings before the meeting to make sure that this conversation is productive. Of course your software developers, who are going to bring perspectives from their daily experience contributing to a particular feature or service or set of services.
Customer support is another. They are a group that often gets left out of design discussions and security discussions, but there is no better person than somebody that works directly with customers to know not only how they're currently using the system, but how they abuse the system, or the things that are important to them, whether it be different security implications for SOC 1 or SOC 2. They're the ones who are on the front lines getting all of that information from customers, so you definitely want to invite them. And of course, if you've got QA, quality assurance, please bring them to the table. They are holding on to all of the knowledge of the weak spots, the untested, the unexamined points. They probably already have a list in their head of things that they would like to be addressed and fixed. Bring them to the table. And finally, I don't know if you had tech writers on your list, but they are also a group that you want to bring to the table, because what good is a security fix, feature, or configuration if your users don't know about it and it's not documented? This will come up later, but the tech writer plays a very important role in the system.

So threat modeling is a conversation. While there are artifacts that get produced as the result of that conversation, sort of a report ("these are the risks we see, this is how we think we want to mitigate them"), the really important part is the conversation that happens. The report is great to refer to afterwards and to have action items, but you really want to have the conversation.
Great, so it sounds pretty applicable, right, whether you're cloud or on-prem. What would make threat modeling in the cloud different? Running in the cloud does not mean that you get to offload all of your threat modeling to Google or AWS or whoever your hosting provider is. There's this notion of a shared responsibility model. There's a line at which you need to worry about what's going on inside your VPC, say, and there's what Google, AWS, and so forth have to worry about, but there's that line in the middle where you both meet. It's very important to not just absolve yourself of security concerns because someone else, the Google SREs, can take care of it.

So this is a big long list. We're not going to talk about all of these, but take a second and pull out in your mind what you think the top three threats specifically to cloud-based environments are. This comes from the Cloud Security Alliance. This is what they call the Egregious 11, sort of like the OWASP Top 10 for web app vulnerabilities. All right, we've got data breaches, misconfiguration and inadequate change control, and the one that gets me is the lack of cloud security architecture and strategy. In my mind, that is when an organization says "we're going to the cloud" and they take very much a lift-and-shift approach: we'll take what we have, we'll put it in the cloud, and everything will work, we don't need to change our processes. Nope, that definitely does not work. We'll double-click into each of these.

Data breaches. If you have been a human with an email address and used that email address as a login anywhere, you are definitely in the crowd of people who have had their data breached, their PII exposed. It's hit medical providers, it's hit educational institutions. There's really almost nowhere that hackers haven't tried to get data in or out of. And when I was prepping for this, I discovered that LastPass is actually facing consequences for
two data breaches that they had. Oftentimes, as end users or consumers, we're kind of left out of companies facing the consequences for this kind of stuff. So what happened is one of their competitors tried to capitalize and said, "Ah, never been breached." This is not going to age well. I can already tell you that it's not as easy as saying "we'll just not get breached." Again, why are they facing a class action lawsuit? Why are consequences being wrought for LastPass versus any one of the other providers you can find on haveibeenpwned.com? Well, it was not just that they got hacked, it was the way in which they responded to that. This is kind of a big long excerpt from the lawsuit, but what it says is that instead of waiting months, if they had just disclosed the full extent of the breaches when they knew, then the end users could have had time to switch password providers, to change their passwords, and really limit the scope of the impact. Because they didn't do that, there were months that their users were basically left out in the open, and it is the mishandling of that security incident that is biting them. Because, like we said earlier, you're not going to be 100% hackproof, you're not going to be 100% secure, 100% reliable. So we can't litigate every single case of that, but we can litigate when people are not taking responsibility and notifying people of risks when they know. It's sort of similar to when security researchers find bugs and they say, "All right, I told this big company, they haven't done anything, I haven't seen a patch, I haven't seen a release," and you wait whatever period of time is reasonable for them to have made those changes before you release it to the world.

All right, the number two cloud threat is misconfiguration. This one I think about a lot, because I'm hoping somebody was going to build the rest of the gate, but if not, it very much speaks to how many boxes we have in the cloud, how many
new features and products. I can't even keep up with what AWS and GCP release in a given year, let alone know how to securely configure them. Some popular issues include unsecured data storage, which we'll talk about. Excessive permissions: that's when someone's like, "Hey, I want to spin this thing up in dev," and you're like, "Oh, okay, I don't want to play whack-a-mole with IAM permissions, so I'm just going to give you allow all, star, do whatever you want." Or leaving default credentials as is. Oh my gosh, please change defaults. And the more extreme example is when you go in and actively disable your security controls.

So we're going to talk about an example from 2018, where Level One Robotics, which I learned is an engineering company that works with a lot of big car manufacturers like Volkswagen, Chrysler, Ford, Toyota, GM, Tesla, and the elevator people, ThyssenKrupp. Okay, so they have a pretty impressive customer base. What happened here is they had an rsync server that allowed unauthenticated data transfer to any rsync client. It wasn't Level One's data that was at risk of being taken, it was their customers'. What a breach of trust there. That would be really hard for me to recover from. So wild. This stuff really happens all the time to very big names. It is why we want to take security very seriously.

And our final threat is the lack, just total lack, of cloud security. Whoever came up with the Egregious 11, props on just transparent naming. It does what it says on the tin. Most of the time I work in SRE and reliability work, and there's sort of this idea of "the big one." Until a company has that big incident that floats up to the top of the C-suite, it's really hard to get buy-in for reliability initiatives. Very similarly, until you have a big breach or a big security risk or maybe a near miss, it can be hard to get investments into security, unless you're in a highly regulated industry. But again, I would say there's one year I got, I think, three medical
providers had to send me the "oopsie, your data got breached" emails. So at this point I'm like, what, nothing's secure? Who do we trust? Good questions to ask yourself.

So what's our example here? Well, Accenture, if you've heard of them. In 2017 they confirmed that they had inadvertently left a massive store of private data across four unsecured S3 buckets. It exposed highly sensitive passwords and secret encryption and decryption keys that could have inflicted tons of damage if somebody had been able to access it, store it, and make use of it. The S3 buckets also contained hundreds of gigabytes of data for their enterprise cloud offering, a.k.a. their customers' data, and that offering supports most of the Fortune 100 companies. So at this point we're like, okay, basically all the car people got hacked, the elevator people, and now we're talking about a bunch of the Fortune 100. Nobody is really immune from thinking about this stuff. And the really difficult part here was that the data could be downloaded without a password by anyone who just knew the web server address.

So when you're thinking about this threat modeling conversation and you're like, what is possible, what is likely, what are things we should think about if we're in the cloud or moving to the cloud, a very great place to start is that Egregious 11. It will help you avoid some of the rabbit holes of "oh my god, a meteor" or "our whole security team went on an offsite together," you know, single point of failure. Start with the Egregious 11, work your way through those, and then see how the conversation goes.

But what does a successful threat model report look like? We've kind of been talking about the doom and gloom, so we will turn to a lovely CNCF project called Argo. Back in 2021, and I'm very proud of the CNCF for this, they tasked Trail of Bits to conduct a component-based threat model for Argo CD, Workflows, Rollouts, Events, kind of the whole Argo ecosystem. If you're unfamiliar with what they do, it's really like CI/CD and progressive
deployments, all Kubernetes-native for the cloud. Here's a snippet from that threat model that I found really interesting: while Argo services often support risk or threat prevention methods (yay, so security is there, security configurations are there), they're frequently under-documented and provided on an opt-in basis rather than by default. That goes all the way back to including the tech writer in your conversations. What good are all of these beautiful security features if people can't find the configuration docs, if they don't know it exists? It's unrealistic for us to expect you to keep up with the minutiae of every single release for every project that you're using. So yay for tech writers.

But I get it. I was a developer once upon a time. Writing documentation can be hard. We can get writer's block. We don't know what level of information to provide. Are we trying to help a new person who's not on our team spin up our app? Is this just something internal-only for the platform or SRE team, where you don't need to go into details of how to set up Terraform? I get that it can be difficult. One of the findings from the Argo threat model was this concept of minimal service documentation. What is the minimum amount, so you don't need to get writer's block, you don't need to be frustrated? What's just enough so that any one of those groups we talked about, whether it's your tech writer, your QA, a developer, a PM, whoever, has just enough info to understand the scope of what the service or feature does and be able to participate in the threat modeling convo?

We're not going to read all of this, but what I want to point out is these last two sentences, just very straightforward: after a workflow is completed, a copy of the workflow is made and stored in a SQL database; produced artifacts are stored in services like S3 and MinIO. Simple. It's to the point. It's following the flow of data, which is something that we are
concerned about, and it's not going off into rabbit holes. Beautiful. Something to aim for. At the end of this presentation there is a section of, I think, about five or six questions that you can answer that fulfills minimal service documentation, so you can use that as a template and throw it in a README.

So what about threat modeling and you? These are strategies, sort of a roadmap for you to follow to conduct your own threat modeling conversation. If it's a new practice for you, maybe you don't have a security team or a security engineer yet, start small. Pick maybe the most important feature or workflow, or pick one single service to start with. It should be an important one. Then you want to schedule the meeting and add the four guiding questions to the agenda. If you are not using the agenda field in your meeting invites, this is the time to start. We want people to come to the table prepared, understanding why they're there, what they're there to talk about, their responsibilities, and what the end goal is. So if you give people the four guiding questions up front, it gives them time to think about it in the background, do some more research, and come really well prepared.

Then, assuming that you are also bringing in your tech writers and your folks who don't typically think of themselves as security practitioners or professionals, send them the Threat Modeling Manifesto. It is very short, maybe two scrolls on a web page. Just let them become familiar with what this process and this conversation is going to be, because again, the whole goal is a very fruitful discussion, which will not happen if people don't understand what they're doing and why it matters. Another great thing to have before the meeting is that minimal service documentation. Take a stab at answering those five or six questions that I'll throw up at the end. And, as a visual learner, a picture is worth a thousand words: go ahead and make an architecture diagram, go ahead and make
that request workflow. If you've got distributed tracing, you get that for free; go ahead and take a picture of that service chart.

Now, this is just good meeting habits that I don't see practiced often enough. Assign a scribe to capture notes. That is very important, so that person can clarify, can slow the roll. If a bunch of people start diving in or disagreeing or things get heated, the scribe can say, "Hey, I'm trying to keep a record. Let's slow down, one at a time. Can you repeat that?" Very important role. The second role that you want to have is a facilitator. They keep time, they make sure no rabbit holes are gone down, and they're there to just steward everybody through this process. And finally, sharing is caring. Tell everybody your findings, write them up, and then go to the next feature, the next service, whatever it is. Maybe make this a monthly meeting or a weekly meeting. Figure out a cadence that works for you until you've got a good set and a good understanding of risks and how you want to mitigate them, or whether they're worth mitigating in the first place.

Dow Jones: big company, has a big impact on the world. What happened with Dow Jones and security? In 2019 they experienced a data exposure. What happened is an authorized third-party vendor failed to password-protect an AWS-hosted Elasticsearch instance, and because of that, the database was available to anybody and was very easily found by any of the IoT-style search engines. It was not even security through obscurity; people could find this. And this misconfigured database was not even discovered by Dow Jones. It was discovered by a security researcher who reported it. Props to them.

If we wanted to take a look at how we would write this down in a threat model, or how we think about this type of situation: most of us probably have hosted AWS resources or cloud resources. You don't always operate every single thing yourself or on bare metal. So we would say, all right, an actor, that's what we'll call our
attacker, any actor would be able to access this database with a very simple search and get the data inside. Okay, what is the impact there? Well, it depends on what data is in there. We talked earlier about a data breach that involved passwords and a data breach that involved encryption keys. That scope of impact is huge. Maybe it's an encryption key for something in your staging environment, which is less of an impact than something in production. Still not great, and you should still probably have the same sorts of security configurations all throughout your environments, but that's one way to think about it. Who is this actor who's attacking? And the fun part about security is that it can be an external threat or it can be an internal threat. Lots of scenarios to think through.

So again, the thing to take away is that threat modeling is not a report, it's not a checklist, it's not this passive activity that regulators are forcing you to do. It is a very real conversation between all of the different people who work on a system and are in charge of making it reliable, available, and secure. If you have data that you care about, you should be having these types of conversations.

So the slides are available, with everything that I've referenced here. These are all links. The manifesto. I threw in the OWASP Top 10 for web apps because that's just good to have. If you would like a very fun dive into security vulnerabilities, "Hiding Malware in Docker Desktop" is one of my favorite reads; I go back and read it like once a year. And then the MITRE ATT&CK adversarial knowledge base. They've got a 101 blog post on getting started, but they also have a huge list of different specific attacks and scenarios. So if you're having trouble coming up with what you think they could be for your org or your system, or you just want to ground the conversation in reality, people have done this research. It is available. Definitely make use of it. And again, here is that minimal service documentation, and you want to think
back to those two sentences of that data workflow: simple, high-level. Run it by somebody before you call it good; maybe they say that's too much or too little. A feedback framework I use is ABCD: what's awesome, what's boring, what's cool, and what doesn't need to be there. And again, you're focusing on the data workflow, because that's often what people are trying to get at, at the end of the day. And again, the Egregious 11: this is them in order of egregiosity, I suppose, with our top three that we already talked about.

And let's chat. I will be around the conference. I generally wear exciting things, and I will be staffing the CNCF booth, G3, this Friday afternoon. I'm on Mastodon, you can ping me at my work email if you would like, or you can visit my blog. Most of it's reliability stuff, but I am delving more and more into security these days after hearing this stat. I'm going to get it wrong now because I'm thinking about it on the spot, but it was something like 80 percent of developers don't consider security to be a part of their job, which, wow, shame on us. We need to do better. I'm hoping today you can also start to be that change and have a threat modeling conversation tomorrow when you get back to work, with your family. It's a very good practice to get into. And with that, we've got some time for questions, and if not, thank you very much.

There we go.

Hi, awesome talk, I learned a lot. Practically, when I look at services, oftentimes there are just too many microservices.

What a mistake, yeah.

So how does one do threat modeling when they have hundreds and hundreds of microservices that they deployed without realizing that threat modeling was a thing and they should have done it?

Yeah, in that case, that's where I would go to the workflow approach. All of those microservices, at the end of the day, most of them are working together to fulfill a smaller subset of user needs, so
whether that's login, whether that's checkout, whether that's "I'm going to upload my passport so that United can prebook me" or whatever, very sensitive data. Hopefully they're not on my breach list next year. So I would think through maybe what are the top 10 workflows, or what are the most critical. Whether you work with big companies or small companies, who do you need to protect, what data, and why? And then, as far as the services go, I keep hearing this thing called the SLSA framework. There's a lot of things you can do for the automated platformification of security scans and stuff. I would sprinkle that on your microservices, but save the conversations for those bigger workflows that traverse multiple services. Then you'll be bringing multiple engineering teams to the table and it'll be a richer discussion. It's kind of the same thing with monitoring: with so many microservices and SLOs, go to the workflow, go to the things that your users are actually doing with the system. Otherwise this is overwhelming.

Any more questions?

Any tools to automate the threat modeling, to just grab some of the information from these services to fill in some of it, instead of having a meeting with each team and asking all these questions?

Yeah, so the question, because I think the mic is a little low, is how can we automate this? What if we've got a bunch of teams or services? What can we do to speed this along or do this at a bigger scale?

Yeah, because it might also change.

Yeah, clouds are very dynamic, especially if you've got lots of microservices. So when I hear things like automation, how do we do this at scale, I think: what is machine-readable? What's available? If you use GitHub or GitLab, there are PR templates. Make the minimal service documentation a template. Then from there you could write a parser, whatever,
do your jq, your yq, pull out the data. What I didn't get into, and I don't know if I can pull it up, is that the report that comes out of the Argo threat model is a big table. It says: these are the risks that we see, in order; this is the component it affects; this is the risk or the threat; and then either we're going to patch it, or we're going to redesign it, or we're going to leave it alone because it's not a big deal. So what you could automate is the generation of that report, and then at least have smaller discussions about what you are going to do about it, because the computer can't tell us how to secure itself; otherwise I wouldn't be here giving this talk. So you can automate the drudgery of that, totally, and then scope the conversations to be really targeted: we're only going to talk about the top two on this list. Or, I mean, you could have people submit a form. I'm big on the conversations, just because as an SRE we're left out a lot in the design, and it's always better to just catch something early on. Yeah. Go Seahawks, love the hat.

All right, excited to hear how all of your threat modeling conversations go. We're going to secure the world. No more breaches. Yes. Thank you, thanks.
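The automation idea from that last answer could be sketched roughly like this: take machine-readable threat entries, generate the ordered report table, and keep only the top two for the meeting. The JSON field names, the example entries, the scores, and the likelihood-times-impact ordering are all hypothetical; the decision values simply mirror the patch, redesign, or leave-alone options described above.

```python
import json

# Hypothetical machine-readable entries, e.g. extracted from a PR template
# or README section. Field names and scores are assumptions for illustration.
RAW_ENTRIES = """
[
  {"component": "login", "threat": "Default credentials left unchanged",
   "likelihood": 3, "impact": 4, "decision": "patch"},
  {"component": "uploads", "threat": "Storage bucket readable without a password",
   "likelihood": 4, "impact": 5, "decision": "redesign"},
  {"component": "checkout", "threat": "Meteor takes out an availability zone",
   "likelihood": 1, "impact": 5, "decision": "accept"}
]
"""


def rank_entries(entries):
    """Order entries from highest to lowest risk (likelihood times impact)."""
    return sorted(entries, key=lambda e: e["likelihood"] * e["impact"], reverse=True)


def render_report(entries):
    """Render the ranked entries as a plain-text table, one row per threat."""
    lines = [f"{'risk':>4}  {'component':<10}  {'decision':<9}  threat"]
    for e in entries:
        risk = e["likelihood"] * e["impact"]
        lines.append(f"{risk:>4}  {e['component']:<10}  {e['decision']:<9}  {e['threat']}")
    return "\n".join(lines)


entries = rank_entries(json.loads(RAW_ENTRIES))
report = render_report(entries)
agenda = entries[:2]  # only the top two items go to the live conversation

print(report)
```

The script only handles the drudgery of assembling and ordering the table; deciding whether each row is really a patch, a redesign, or an accepted risk is still the part that needs people in a room.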