Cool. All right guys, let's get started. So today we have a guest lecture by Jon Gjengset. He was a PhD student with us a number of years ago, and he TA'd this class many times, and he's built a bunch of things in this class, including the anonymous question system that you should feel free to use to ask him questions. He also built a bunch of the lab assignments and lectures for us. So he knows a lot about this class, and when I asked him to give a guest lecture, he said one thing we don't actually teach well enough is reasoning about the security of lots of software, and he thought you guys should know about this from his experience working at Amazon. Well, he'll say what he does, but thank you.

Hi folks, my name is Jon. As Nickolai said, I am previously from MIT; I was in PDOS for many years, and now I work basically maintaining all the Rust build tooling and infrastructure at Amazon Web Services. I'm jonhoo on the internet if you ever want to find me anywhere else. Feel free to send me questions after this lecture if you're interested; I'm always happy to talk about anything, really. And use the anonymous question system: if you don't feel comfortable raising your hand, then just use it. I will happily take questions whenever you have any. That's the best way for you to learn, and for me to figure out what I'm not explaining well.

So when Nickolai reached out to me asking, you know, what should I cover?
He basically gave me the open floor of "what do you want to talk about", and I looked through the semester notes, and I was like: there's one thing that I spend a lot of time thinking about at Amazon that is not represented in the curriculum at all, and it is supply chain security. Over the course of the next hour, I'll hopefully be able to convince you that supply chain security is hard and important and that we don't know how to do it well. And hopefully you'll be scared by the end, and then you'll go do your research and figure out how to do it well.

So what does supply chain security really mean? Supply chain security is basically the observation that all of the software that you use matters, with an emphasis here on all. Not just the code of the current thing you're building, but all of the other software that's part of that pipeline: be it the dependencies you take, be it the other software that's present on the host, be it the stuff that you use in order to build your software. All of that. And it's not just whether that code is insecure; it's also whether it could be insecure, whether it could be manipulated. And we'll look at some of the ways in which that both can happen, but also has happened repeatedly in the past.

Where I want to start here is a survey of projects, both open source and private, over the past many years. There's a survey that runs basically every year from a company called Sonatype, and one of the things they found was that there's been almost an 800% year-over-year increase in the number of supply chain attacks over the past three years. So this is clearly a growing cause for concern. And not just that, but six out of every seven vulnerabilities in projects come from their dependencies, not from the project itself. It is not that you wrote a bug in your code; it is that other people wrote bugs in their code and you're using what they built. And what's interesting is it's not just this one study; this pops up everywhere. So this
is from an organization called ENISA, which is the European Union Agency for Cybersecurity (how that ties into that acronym, I'll leave you to find out on your own). They ran a survey this year where they tried to figure out, on behalf of the EU, what the biggest cybersecurity threats for 2030 are, and the number one thing is supply chain compromise of software dependencies. And that's what we're going to talk about today.

Now, interestingly, in the EU we've started to see regulation for this. There's regulation now in the EU that requires companies, and parts of governments as well, to document how their supply chain is secured, both hardware and software. We've seen this elsewhere too. In Japan, there was recently a law passed that mandated that certain government branches, but also key critical infrastructure like the power grid, were required to guard a lot of their supply chain, because that's where they worry about nation-state actors coming in and attacking their infrastructure. And Japan is not the only place. Even here in the US, we got an executive order in 2021 that basically said, among other things: fix your damn supply chains. And then of course there wasn't really anything that came out of that for a while, because figuring out how you fix that is really, really hard.
In 2022, NIST put out this 400-page document on how to even reason about whether your supply chain is secure and what's in it. I don't recommend you go read this document; it is very long, and there are a lot of words where there don't necessarily need to be. But some of the things that we'll talk about in this talk are very brief summaries of what came out of this document and other related surveys that have happened.

In the US there aren't any laws yet that require you to secure your supply chain, or even prove that you know what your supply chain is, but we expect that to actually come pretty soon, probably in the next year or so, and there are a lot of companies that are scrambling to try to get on top of this now, ahead of when there might come regulation. The UK is also giving out supply chain guidance and mapping, trying to tell companies how you even figure out what all your dependencies are and how you should represent them. So this is a worldwide phenomenon that we're seeing, and something the industry is only really starting to realize over the past three to five years.

When you have a moment, I recommend you go to this particular URL. This is from that Sonatype survey that I mentioned earlier, where they go through the past 25 years or so and look at all of the issues and attacks that they've seen that depend on supply chain vulnerabilities, and it is filled with huge problems, and it is a very scary thing to read. And if you look especially over the last few years, you see a lot of them crop up, and we'll talk about some of them a little bit later.

What's particularly worrying about supply chain security is when you talk to engineers versus managers. Engineers all know that this is a problem. They all go: we have all this code here that we don't know how it works; we haven't checked it; no one has looked at it; we don't know what it does. And managers all confidently claim: we have a total overview of everything we use; we know it's all fine.
If there's ever a problem, we can fix it within a day, no problem whatsoever. So this is clearly a problem that engineers know is real, but further up in these companies people are like, eh, it's probably fine. It really is not.

So what really is supply chain security? Supply chain security is basically these three questions: Do you know what you are deploying where? Do you know where it came from? And do you know what's in it? The answer to all three of these had better be yes, but in practice it's almost certainly no, or at best a "sort of". So let's go through these one at a time. Let's start with: what are you deploying where?

This is really all about deployment logging. This is about being able to answer questions like: What software is currently running on a given host? What software was on this host at some given point in time earlier? Why did a deploy happen to a given host? Where were the artifacts of some version of some software deployed, and at what time? When were we no longer using a given version anywhere in our deployments? And what configuration did that software have on that host at that time?

Some of these are for knowing where we might be vulnerable. If you learn there's a vulnerability in some software that you're using, you want to be able to say: well, that's in use on those ten boxes over there, but not in this data center over here. So that's fairly standard. But some of it is for knowing where and when we were vulnerable. Say you discover that there was a vulnerability that first started being seen in the wild in May of 2021. Do you know exactly what window you were vulnerable during, and which of your hosts may have been compromised? This is all in order to figure out, after the fact, where you might now have problems that you need to go investigate. And some of them are for proactive analysis, like: how many different versions are we using at once?
Imagine you pull OpenSSL into your dependency tree for your software. There are a lot of versions of OpenSSL out there; there are a lot of variants of OpenSSL. If you discover that you're using, say, 15 different versions across all of your fleets of hardware, that's probably a problem and you should go clean that up, and this kind of analysis can tell you that.

Note here that I'm saying "artifacts of some software version" and not "software version". This is because, as we'll talk about later, you can get into really weird situations. Say there's a compromise of your C compiler. If your C compiler is compromised, then your C compiler isn't deployed to any of your hosts, but things that you built using that C compiler may be deployed to lots of your hosts. So you want to look at that whole supply chain.

Now, the conclusion from this particular segment is really: you should just log everything. In fact, every time you deploy software anywhere, you should log exactly why and how that happened. How was it initiated? When did it happen? What went into it, and what was it deployed to? And by now, as security-conscious students, you're aware of the fact that this system that does the logging is itself something that needs to be audited and logged, because if an attacker manages to subvert your logging, then they could hide the fact that they did anything bad in the first place by obscuring your logs. So this system needs to be append-only, it needs to be durable, and it needs to be kept long-term: you might not know that there's a vulnerability in the wild until, like, seven years later, in which case you still need to be able to go back and figure out that there was a problem.

This first one is a little weird: why track how it was initiated, and what does that even mean?
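The append-only, tamper-evident log just described can be sketched as a hash chain, where each deployment record commits to the hash of the previous record, so deleting or altering an old entry invalidates everything after it. A minimal illustration in Python; all the field names here are made up for the example, not taken from any real system:

```python
import hashlib
import json
import time


class DeploymentLog:
    """A toy append-only deployment log. Each record is chained to the
    previous one by a SHA-256 hash, so tampering with or removing an
    earlier entry invalidates every entry that follows it."""

    def __init__(self):
        self.entries = []

    def record(self, host, artifact, version, initiated_by, timestamp=None):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {
            "host": host,                   # where it was deployed
            "artifact": artifact,           # what was deployed
            "version": version,
            "initiated_by": initiated_by,   # which credential/pipeline triggered it
            "timestamp": timestamp if timestamp is not None else time.time(),
            "prev": prev_hash,
        }
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        self.entries.append({"body": body, "hash": digest})

    def verify(self):
        """Recompute the hash chain; False if any entry was altered."""
        prev = "0" * 64
        for e in self.entries:
            if e["body"]["prev"] != prev:
                return False
            digest = hashlib.sha256(
                json.dumps(e["body"], sort_keys=True).encode()
            ).hexdigest()
            if digest != e["hash"]:
                return False
            prev = digest
        return True

    def versions_on(self, host):
        """Every version this host ever ran, for 'when were we vulnerable'."""
        return [e["body"]["version"] for e in self.entries if e["body"]["host"] == host]
```

A real system would additionally anchor the chain somewhere the attacker can't reach (a separate, durable store), since an attacker who can rewrite the entire log can rebuild the whole chain.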
This comes from a class of attacks that we've seen a lot over the past, I want to say, three years, which is around leaked credentials from continuous integration and continuous deployment systems like Travis CI, Heroku, GitHub Actions, those kinds of things. Those are pipelines that usually have access to your deployment keys and your deployment secrets, and so if one of those services gets compromised, which happens all the time, then suddenly people have the keys to deploy software on your behalf. And so you really want to track what caused this: which credential was used in order to publish a new version of our software, so that you can then go back in and validate that if necessary.

And this append-only bit is important even if you roll back your software to an earlier version. Say you deploy something, you discover there's a bug, and now you deploy the old version again because it's going to take you some time to fix it. You want to record that there was a period of time where you were using that other version of the software, in case that was the window during which you were vulnerable. So everything has to be logged for a long period of time.

And every host matters here. We're not just talking about physical servers that you put software on. If you put things on developer environments, you need to track that, because those might be compromised. If you deploy to beta environments or testing environments or benchmarking environments, you need to track those. If you deploy to embedded devices like robots or drones, you certainly want to know what every single one of them is running at any given point in time. Customer devices are even worse: if you deploy software to Android phones or to pacemakers or something like that, you really want to know exactly what version of every piece of software went into that, so that if there is a problem, you can notify the people that it might affect and hopefully update them in time. There are other
environments too that aren't even really computers, that are just sort of abstract computing resources, like Lambdas or Cloudflare Workers or whatever it might be. That's also software that you run, where a compromise there might lead to a compromise of your database, and you really want to track what goes in there, too.

Okay, so let's move to the second question here: where it came from. This one is surprisingly difficult, but let's start with a relatively simple question: can you trace every artifact that you deploy back to sources that you trust? This turns out to be not quite a turtles-all-the-way-down kind of problem, but it's pretty close, as we'll see. You want a verified path from only trust anchors to your deployment. And what do I mean by trust anchors? Trust anchors, in security, are sources that you assume are trustworthy, rather than derive to be trustworthy. So you might just say: I have a contract with Microsoft; I'm going to assume that everything signed by their key is indeed from Microsoft and that they have their shit together. You might assume that. It might not be a sane assumption, but you might assume it, and you might assume it because it's okay if it turns out something is wrong there, because you can go sue Microsoft, because you had a contract with them. So these are often soft kinds of things that affect your security assumptions. Sometimes the assumption is, you know, I trust the software that Anish writes. So if it came from Anish's private key, I'm fine deploying it, but anything beyond that has to derive back to Anish's key; otherwise I don't trust it.

And there are sort of two paths here. One is, if you just downloaded an artifact from the internet somewhere and then you deployed it, then you have a couple of questions to ask yourself.
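One way to make a trust anchor concrete is to pin the exact hashes of artifacts you have decided to trust, and reject everything else. A minimal sketch, where the artifact names and the idea of a hand-maintained allow-list are just illustrative assumptions:

```python
import hashlib


def verify_download(name, data, trusted_hashes):
    """Accept downloaded bytes only if their SHA-256 digest matches a
    pinned digest in our hand-maintained set of trust anchors.
    `trusted_hashes` maps artifact name -> expected hex digest."""
    expected = trusted_hashes.get(name)
    if expected is None:
        # No trust anchor for this artifact at all: reject, don't guess.
        return False
    return hashlib.sha256(data).hexdigest() == expected
```

Note that this only moves the problem: you now have to decide how the pinned digests themselves get reviewed and distributed, which is exactly the trust-anchor choice being described here.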
The other is if you built it yourself; then you have to ask a different set of questions. If you downloaded it from the internet, then the question of course is: do you trust the entity that built it, and do you trust the entity that you downloaded it from? This might be a simple yes-or-no question, depending on where you got it from. There's also the question of how you know that that entity actually built it. If you download something from, say, GitHub Releases, GitHub didn't build it; some engineer built it and uploaded it to GitHub. Do you trust the engineer that built it, and do you trust that that's the same person who authored the software in the first place? And even if you think you know what entity published the thing that you downloaded, did that entity verify the other questions, the ones we're about to get to, that you should ask yourself when you build something? And crucially, how do you know? Even if someone puts on their website "we checked all of these things, trust us", and they put a stamp on there and sign their name, it doesn't really matter if you don't have a way to verify that they actually went through the steps, because then that just means that you are trusting them by assumption.

So what are these questions you might ask yourself if you are building from source, which we generally assume to be the safer thing to do, right? You build it yourself. Well: how did you get the source? Where did you get it from? How do you know that this is actually the true source for the software that you're intending to build? Is that source what the author intended to publish? Like, you got it from GitHub, but does that mean that the author intended to publish that code to GitHub, and that there's been no interference in between? Do you trust the tools that you downloaded the source with? Do you have a malicious git installed on your computer?
A malicious curl, a malicious kernel, a malicious router in the middle while you didn't verify your TLS certificates? Do you trust the tools you verified the source with? Let's say you downloaded it and you ran, you know, gpg --verify on the signature, and you have Anish's public key, and it says yes, Anish signed this. Do you trust that you don't have a malicious GPG on your box? Maybe; how do you know? Do you trust the tools you built the artifact with: your GCC, your Clang, your Glasgow Haskell Compiler, whatever it might be? How do you trust that? Do you trust the host you're building the source code on? If you're building this on your laptop, what's the chance that your laptop has some other nefarious software on there that might actually be running in the background and modifying the source code just between when you check the signature and when you run your compiler?

Now, all of these sound like I'm just being paranoid, but in reality this is what we mean by supply chain security. You want to make sure that there's no point in the pipeline, from the author's brain to your deployed artifacts, where someone can sneak in and modify or infect that supply chain in some way. And hopefully, even just from these questions, you realize this is really hard, because there are so many places this can go wrong, and some of these are just fundamentally difficult problems.

As an example of some of this: in Java, there is this website called Maven Central that is a repository of third-party software. Anyone can upload software to Maven Central, and if you're using Java, it's relatively easy to take a dependency on something that someone has published there. Now, in Java, generally what people publish are JARs, or Java archive files, which are essentially just binary blobs; it's a zip file of Java bytecode. So if you get one, you don't really know what the source that went into it is. But Maven Central has thought of this, and they allow you to publish the source code alongside the JAR, right? So
you as an author can upload a JAR and then also upload the source code. The problem is, there's nothing that guarantees that the two actually map in any meaningful sense. The two are separate artifacts in Maven Central: one is the source and the other is the JAR, but there's no guarantee that that JAR came from that source. Whoever uploaded them gets to say that they're the same, but there's no way for you to verify that that is actually true.

Ultimately, you need to choose what you're going to trust here. That might be the authors: you might want to go all the way back there. It might be that you mark particular instances of source code as trusted: this hash of the OpenSSL source code, I, Jon, have looked at, and I know it's okay, and therefore you build from that; you don't need to trust the authors. Maybe you trust tools that you run over that source code to do vulnerability scanning or whatnot; you trust that those are okay. But you have to trust something, and figuring out what those things are and making them explicit in your threat model is really, really important, because it can come back to bite you otherwise.

And tainted sources are real. What I want to do now is give you some examples of all of the ways in which these parts of the supply chain have been exploited over just the past two years. So in 2021 there was a big hubbub, especially in the sort of commercial world, around dependency confusion. There was a researcher who observed that, you know, a lot of software these days takes third-party dependencies. Great. And they wanted to try to hack into PayPal, for unrelated reasons. As part of that, they found some public source code that was part of PayPal; it was in a GitHub repository somewhere. There weren't any vulnerabilities in that code itself, but they found this little bit of code, and it might be hard to read the red text here, but these are just some package names that happen to include PayPal in the name. And
when they looked really carefully, they discovered that all the ones in blue here were on the public registry of Node.js packages, npmjs.com: totally normal packages that someone owned. The ones in red, though, were not present there, so they were presumably some kind of internal, private PayPal dependencies that weren't published anywhere. And the security researcher went: I wonder what happens if I publish one of these? Like, I'm going to take pp-logger, and I'm just going to create it on npmjs.com and stick my own code in there. They did so, and the code they stuck in there didn't do anything nefarious; it would just ping their server, a server that they controlled, to let them know if someone had downloaded and tried to build the package. They did that for these packages, and within about 15 minutes of publishing, they started getting pings to their server from inside of PayPal. And then they started doing this for a couple of other package names that they thought might be in use at different companies, and over the next couple of days they managed to get direct contacts (there were servers at these companies contacting their build server) and could run arbitrary code as part of the build at Tesla, at Uber, at Yelp, at Netflix, at Apple, at Microsoft, at PayPal, and the list goes on. Because these companies didn't have their build infrastructure configured in such a way that it would prefer first-party, private, internal dependencies to third-party ones. The way their builds would work is that when they looked through the dependency file, they would first look on npmjs.com, and if a package was there, it would stop searching and just use whatever was there; only if it wasn't found there would it go to the internal repository. And so all of these companies found themselves scrambling to fix their internal build infrastructure.

Another example of this is the SolarWinds attack from 2020.
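The dependency-confusion failure just described boils down to lookup order. A toy Python sketch of the two resolution policies, with registries modeled as plain dicts and hypothetical package names:

```python
def resolve(package, internal_registry, public_registry):
    """The safe order: an internal package is only ever taken from the
    internal registry, so an attacker publishing the same name on the
    public registry changes nothing."""
    if package in internal_registry:
        return ("internal", internal_registry[package])
    if package in public_registry:
        return ("public", public_registry[package])
    raise KeyError(f"unknown package: {package}")


def resolve_vulnerable(package, internal_registry, public_registry):
    """The dependency-confusion-prone order used by those builds:
    whatever is on the public registry wins if it exists there."""
    if package in public_registry:
        return ("public", public_registry[package])
    if package in internal_registry:
        return ("internal", internal_registry[package])
    raise KeyError(f"unknown package: {package}")
```

With the vulnerable order, the moment an attacker publishes a public package with the same name as an internal one, every build silently switches to the attacker's code.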
That's a little over two years ago, so I apologize for lying to you. Now, SolarWinds you may have heard of, because this was a big news event; it was called the largest cyberattack ever. The subtleties of the attack are a little convoluted, but very briefly: SolarWinds was a company that produced a piece of software called Orion, and Orion was sort of an IT security monitoring software (and I know, oh yes, the irony). This was a piece of software that SolarWinds sold to companies, for companies to install on all of their hosts, and especially any laptops or desktops that they gave to their employees, to monitor things like: do they have all the security updates installed? Is there any nefarious software running on this host? To give some kind of remote management and admin capabilities, logging of things like temperature: just things that IT wants to monitor about the devices that they own.

And SolarWinds Orion had a built-in auto-update feature, like a lot of software does. Seems pretty reasonable. It would download updates from a server that SolarWinds controlled, and those updates were signed by a private key at SolarWinds. Seems pretty reasonable. And then one day there was a new auto-update for SolarWinds Orion, and this auto-update happened to include a DLL, a Windows shared library file, that included a backdoor that would dial out to a server, fetch a payload, and execute it locally. The update was signed, so it came from SolarWinds. No one really knows how it got signed.
We do have an idea of how it managed to get picked up by the auto-update system, which is that SolarWinds was running an FTP server that this software would connect to to download the update, and the password on the FTP server was "solarwinds123". And so of course people could upload files there. We still don't know how they managed to get a valid signature in there, though. But this just demonstrates that it's hard to get these things right: this company was signing their updates, and yet it still hit them. How would you even detect that something like this was happening? If you were running a company and you installed this to monitor the security of the hosts that you manage, how would you detect that there was a backdoor in the thing that was supposed to detect if bad things were happening on the host?

Another example came up in 2021, and this is a different kind of attack again; all of these are different attack vectors, so you need to think about them individually. Some researchers at the University of Minnesota ran a study on the feasibility of stealthily introducing vulnerabilities in open-source software via what they called "hypocrite commits". A hypocrite commit, by their definition, is essentially a commit where the author is introducing a legitimate bug fix or feature into an open-source project, but the code that they submit has an intentional error in it that they will later exploit. The way they chose to do this was by submitting changes to the Linux kernel, and these were legitimate fixes to the Linux kernel: they actually fixed problems or introduced features that people wanted. They went through reviews, and they almost ended up landing on the main branch of the Linux kernel; they only got caught at the very last step, where someone said: hmm, this code doesn't look quite right. And it was like some off-by-one error: it looks completely fine, but the authors knew it was there and were planning an exploit later on, for when it
actually made it into Linux. And this is really worrying, because if it had made it into the Linux kernel, this would be the true source code, endorsed by everyone who reviewed that code. If you downloaded it and checked all the signatures, everything would check out. But the moment you deploy that software, you're now vulnerable to the attacker who introduced those commits in the first place. In this particular case, the Linux kernel banned the entire University of Minnesota from making contributions to the Linux kernel, which we can argue about whether that was the right solution or not, but clearly this is a worrying problem. This attack vector is very real and scary, and we don't have great defenses for it, except: make sure you review code well, right?

We also alluded to this earlier: credential leaks. And these happen constantly. There's a company called GitGuardian that runs a sort of survey every year of just public credential leaks. They scan things like GitHub repositories, and they just look for secrets that people have committed. Like, you get a secret token from AWS, or you get a secret token from GitHub Actions or Travis or Heroku, and you put it in your source tree, and you commit that file and push to GitHub; or you commit your SSH private key and happen to have a public repo. They found that in 2022 there were 10 million new secrets exposed on GitHub. One code author out of ten exposed a secret in one of their public GitHub repos. This happens all the time. Travis CI had a break-in in 2022 where attackers stole a hundred thousand npm logins. This means that the attackers were able to publish new versions as though they were the author, for a hundred thousand npm packages, if not more. This is worrying, because it means that if you take a dependency on anything from npmjs.com, how do you know that the version that you downloaded was actually published by the author that's listed there?
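Scanners like the one GitGuardian runs look for well-known token shapes in committed text. A deliberately simplified sketch; real scanners use hundreds of far more precise patterns plus entropy checks, and these few regexes just illustrate the idea:

```python
import re

# A few illustrative token shapes (simplified for the example):
SECRET_PATTERNS = [
    ("aws-access-key-id", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("github-token", re.compile(r"\bghp_[A-Za-z0-9]{36}\b")),
    ("private-key-block", re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----")),
]


def scan_for_secrets(text):
    """Return the names of all secret patterns found in a blob of text."""
    return [name for name, pattern in SECRET_PATTERNS if pattern.search(text)]
```

Running something like this as a pre-commit hook catches the easy cases, but it is a heuristic: it cannot find tokens whose shape it doesn't know about.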
It looks like it was uploaded by the right person, but it could just be someone who has that credential. This makes it hard to trust that any third-party artifact or code is actually from the author. And you might say: but Jon, just have them sign it! We've talked about how there are problems with that too, but even if you could, many of these repositories, like npmjs.com, like Rust's crates.io, like PyPI, do not support signing. You just cannot sign the packages that you upload in a way that the standard tools will verify. So there isn't even a way to sign; all you have is the token that lets you upload.

And sometimes these registries, like npmjs.com (and I'm picking on npm here just because there are a bunch of examples and they're pretty large, not necessarily because they're doing anything particularly bad), can be compromised themselves. If someone hacked into npmjs.com, they could publish new versions of any dependency that they wanted, or modify the code of existing ones. Or, to go to something else: in 2021, the PHP project's git repository was hacked. PHP, for various reasons, decided to run their own git server rather than use something like GitHub or GitLab, and running your own infrastructure is always hard. In this case, their git server got hacked into, and someone introduced a commit into the PHP git history, authored by one of the lead developers, with the subject line "fixed typo". And the contents of the diff of that commit introduced a remotely executable backdoor into all PHP servers. If a new PHP version had been cut from this commit, then anyone who ran PHP as of that version would have been remotely accessible just by someone passing a particular HTTP header. How do you check for that?
In this particular case, they managed to find out about the break-in a couple of hours later and managed to revert that commit. But again, this is modifying the version control, the thing that you would assume is the ground truth for a project, and it looks like it's from a legitimate author too, because no signing was involved.

We also have rogue maintainers. Sometimes it's not even the process that's the problem; it's the actual author that's the problem. In 2022 we had a bunch of instances of this. So, for example, there was an open-source developer of the faker.js and colors.js JavaScript libraries who was just fed up with companies using their code without paying for it, and decided to just completely corrupt those packages. They published new versions of each of them that got automatically picked up as dependencies, and those corrupted versions would just break your software: they would print some string, like "pay me" or something, ten times, and then just exit your program. This is a legitimate version from the author, but it is malicious.

This top-right one is funny. This was a sort of civil war amongst developers who published packages that helped you write malware. So on npmjs.com there are a bunch of packages that are intended for this.
They're like libraries for malware authors. And there was a feud between some of them, where one of these libraries, if it detected that the other library was present on your machine, would go in and rewrite that other piece of software to do something else. Again, this is the author being malicious.

This bottom-right one, too. This was a maintainer who wanted to protest the invasion of Ukraine, and the way they did this was they modified their JavaScript library to have a background thread that would constantly ping a server and look up your geo-IP range, and if it was Russia or Belarus, it would corrupt all the files on your hard drive and make their contents be just the heart emoji. How do you defend against something like this, if you have a dependency on this thing that was totally fine in the past? This was the node-ipc package. This is difficult, right? Because ultimately we have to choose to trust either particular instances of the source code, or the author. If you choose instances of the source code, getting new updates is really hard, and if you choose the author, they can do things like this.

Back in 2016, just to demonstrate how far back this goes, the Linux Mint distro had someone hack into their download server and modify the disk image that people would download in order to install Linux Mint. They replaced it, and I think it was up for about a month or so. If you downloaded that version of Linux Mint during that time period, you would install Linux, and everything would seem to be fine, but your kernel would have a backdoor in it. They hacked into the website too, so all the hashes checked out, because the hashes are on the website. How would you detect that this happened? Fighting these kinds of tainted sources is really difficult.
There's no doubt about that. There are some things we can do to make it better. Things like Sigstore, which is a protocol for signing artifacts that go up into package repositories. Things like The Update Framework (TUF), a protocol that registries like npm or PyPI or crates.io can use to ensure that new versions cannot be masked from users: releases eventually have to be displayed, and the registry can't start modifying versions of software it previously published, or claim to release new versions without the knowledge of the author. So that helps a little bit. You can mandate two-factor authentication for publishing to any of these registries, to mitigate things like leaked credentials. You can have automated monitoring of known risks: if someone files a CVE for, say, this colors.js library, you want to know immediately, in an automated fashion, that you're now at risk.

Some of that helps, but ultimately you're at the mercy of authors. What this means is that when you take dependencies, you want to think really critically about who you are willing to take them from. Is it a company or an individual? If it's a company, is it one you can have a contractual relationship with? If it's an individual, can you do the same; can you pay them to make it less likely that they do something bad? Or, more generally, how trustworthy are they as someone you think you can rely on over time?
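At its core, that "automated monitoring of known risks" idea is a join between your dependency list and an advisory feed. A toy sketch in Python; the advisory tuples and IDs here are invented for illustration, and real feeds such as OSV or the GitHub Advisory Database express affected version *ranges* rather than the exact-match used here:

```python
# Toy advisory matcher: flag dependencies that appear in a known-risk feed.
# Real advisory databases carry version ranges, severities, and fix versions;
# this sketch only does exact (name, version) matching.

def at_risk(dependencies, advisories):
    """dependencies: {name: version}. advisories: list of (name, version, id).
    Returns the advisories that apply to the given dependency set."""
    hits = []
    for name, version, advisory_id in advisories:
        if dependencies.get(name) == version:
            hits.append((name, version, advisory_id))
    return hits
```

Run on every new advisory as it lands, this is what turns "someone filed a CVE for colors.js" into an alert instead of a surprise.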
It's difficult. And there's a lot more on this list. Think of things like: if I, as the publisher of some source code, have the box I publish from compromised, then I can do all the checks I want and never leak my credentials anywhere, but something else can still tamper with the source code before it gets published. There are tools here, like automated scanners that try to detect nefarious code patterns, and you can use those, but they're ultimately all heuristics. They help a little, but they're not going to solve this problem entirely for you.

This reminds me of the quote from Carl Sagan: "If you wish to make an apple pie from scratch, you must first invent the universe." It feels a little bit like that. If you don't want any vulnerability in your dependency closure, you have to write everything yourself, including your compiler, your kernel, everything. So ultimately you have to choose a point where you say: stop, I'm going to trust this thing, hopefully with some decent reasons for why you think it's trustworthy.

Okay, third question: what's in it? This might seem similar to the previous question; what's the difference between where it came from and what's in it? But what's in it is a little different. This is a list of all the stuff that got pulled into the build, or affected the build, so that later on you have insight into all of the edges in the dependency graph you might need to think about. Ultimately you might have one artifact, a single binary you deploy to your hosts, but you have many inputs to it. We've talked about regular dependencies; that's one thing. But you can also have dependencies from the build host. Imagine that I have OpenSSL installed on my machine. It doesn't get pulled in as part of my npm dependencies or my Rust dependencies, but it is linked into the final binary. Or it might even be a runtime dependency, so only when I run the program does it dynamically link against OpenSSL. These dependencies from the build host need to be captured.

Then there are downloads during builds. There are a bunch of packages in the open-source ecosystem that, when you do the build, will helpfully go: "oh, you don't have libssh installed, so I'll download it for you, build the source code, and include it in the binary that I ship." Very helpful, but it does mean libssh was downloaded as part of that build, and you need to track that you have that dependency, because if there's a vulnerability in libssh, you need to rebuild and update your software; it made its way into that binary.

Same with vendored or inlined sources. Sometimes projects just copy-paste source code from some other project, or the entirety of that project's source, and you really want to track that too, because now there's a dependency edge between you and that project. Sometimes they bundle binary artifacts; I'm looking particularly at the NVIDIA graphics drivers for Linux, where there's just a binary blob that is, "this is the firmware NVIDIA gave us, we have to use this blob," and no one really knows what's in it.
You probably want to track that you pull that into whatever you deploy, because there might be issues with it as well. And of course, any of the above can apply transitively: even if none of your direct dependencies have these problems, your indirect dependencies might.

Finding all of these is really tricky even when you have the source code, and doubly so when you don't. There exists software that tries to automatically find these kinds of semi-hidden dependency edges for you, but it's all heuristics-based, all best effort, and heuristics will only get you so far. We need something that can do better, something that can truthfully represent every piece of software or code that is in the software you actually deploy.

And that brings us to the second main topic of the day: the software bill of materials. A software bill of materials, or bill of materials in general, is essentially an inventory. It's sort of like a recipe, but really more a list of ingredients: a list of all the things that made their way into, or affected, a given software artifact. A lot of the things I talked about earlier, like the EU directives and the US directives, talk about companies having to supply these for every piece of software they have deployed anywhere. So this is something we're seeing increasingly adopted by build systems and companies, and mandated by law.

Now, software bills of materials, as we'll get to in a second, are also a sort of trust exercise, because they're not verified.
They are an assertion by the author: "I pulled these things into my software." They are a contribution from the author that you may or may not trust. The observation, though, is that it's better to have this than to rely only on heuristics. Ideally you do both: a declaration of what went in, plus detection of other things that might have made it in that aren't in the bill of materials. Together, that should give you a more complete view of all the places where you might have vulnerabilities.

Bills of materials have existed elsewhere for ages. They started in car manufacturing and have since been deployed everywhere, from the aviation industry to bike manufacturing to book manufacturing. The basic premise, in car manufacturing for example, is that for an engine you list all of the parts: when they were made, which factory made them, what transport route they took to where they are, who put the engine together, and at what location, so that you have a full inventory. If there's a problem with any of these things, you know this engine was affected.

It turns out that having this inventory helps with a lot of things. It helps designers: whether you're designing software or a car engine, you can say "this part goes here," and "this part" identifies something in a meaningful way. It helps sales: which parts do I order to build one of these engines? In software, this might be something like: which licenses do I need to pay for in order to be able to deploy this software?
Or: which software in my deployment stack includes GPL code, so maybe I need to think twice about whether I'm using it this way?

It helps manufacturing: when you're putting the thing together, which part goes where? I think of it like a Lego instruction manual, where it says piece 3B goes here. It helps repair: which part broke? In software this would be something like debugging, where you might want to go back and say, "our application broke when we bumped libssh from 1.2.3 to 1.2.4; that's when it stopped working," and you know that because your bill of materials says that was the change between those two deployments. And it helps recall: is the affected part present? If someone discovers that a particular engine part had a manufacturing error, they can go back, look at all the places that batch of faulty parts made it into, and recall from those people. In software it's a similar thing: someone publishes something like a CVE saying "this version of this software is known to be bad, don't use it," and hopefully the people on the consuming side know to look for those CVEs.

Now, this kind of provenance, or origin information, is useful for a number of things. It's useful for security: it tells you if something is at risk, say via a CVE, but it can also tell you how it's at risk. For example, say there's a vulnerability in bash. Actually, bash is a terrible example because it affects everything; let's do libssh. Or libz, the compression library. Say there's a problem in libz, and libz is used to build the Haskell compiler, which is used to build the documentation-generation system for a test package of a dependency that you use. If there's a CVE for libz, it's probably fine.
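That "how did libz reach me" question is just path-finding over the dependency graph: once the edges are recorded, answering it takes a few lines. A sketch with made-up package names standing in for the chain described above:

```python
from collections import deque

def path_to(graph, root, target):
    """Breadth-first search for a path from root to target in a dependency
    graph given as {package: [direct dependencies]}. Returns the path as a
    list of package names, or None if target is not in the closure at all."""
    queue = deque([[root]])
    seen = {root}
    while queue:
        path = queue.popleft()
        for dep in graph.get(path[-1], []):
            if dep == target:
                return path + [dep]
            if dep not in seen:
                seen.add(dep)
                queue.append(path + [dep])
    return None
```

A long path through build-time-only tooling suggests low exposure; a `None` result means you can likely ignore the advisory altogether.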
It probably does not affect your stuff. It might, but it's a very extreme path. So by knowing that that's the way you got libz, you know you don't need to care about this problem. Or similarly, if you detect that you're not using libz anywhere, you know you're not at risk. A lot of CVEs get published every day, and not having to deal with all of them is valuable. And of course, it can also tell you when you are at risk, and how, and that you might need to deal with it in that case.

As I mentioned, it can help with license and compliance information. It can also help with supply chain funding, at least in theory: it tells you all the open-source projects you depend on, and maybe you should go pay some money to them so the authors don't turn malicious, or just out of the goodness of your corporate heart.

There are all sorts of ways you can use this inventory to improve your processes. Waste identification, like I mentioned before: if you have 15 versions of OpenSSL, maybe that's a problem. Or imagine you discover you have five different implementations of parsing X.509 crypto keys; maybe you should have just one of those parsers, because one is bad enough, just to limit your exposure both to different authors and to sources of bugs. Quality assessment is another one you can do: go through your dependency graph and look at which of these things are no longer maintained, which are end-of-life from the manufacturer, which have been taken over by a different corporation you no longer trust, which haven't had updates in the past six years and so are presumably abandoned. Having this inventory lets you answer those questions, or at least know where to start.

There's also an argument that often comes up with bills of materials: that this is really a roadmap for the attacker. You're telling the attacker everything that goes into your software, and maybe that's bad. It is true that it's a list of potential weak points. But attackers already have a leg up when it comes to exploiting your software, because they usually have a list of things they know how to exploit, and they really just want to see: are you vulnerable to any of those? They can find that out by probing for the weaknesses directly. They don't need this list, and in some sense they don't care about it, because if the list is missing something they could exploit, they'd still want to exploit it, right? They already have decent heuristics and other channels for figuring out whether you're the kind of target they'd want to attack. So a software bill of materials is more incrementally useful to us as defenders than it is to the attackers.

Now, what goes in an SBOM? I've talked a lot about the abstract reasons they're useful, but what actually is one?
Well, it's a hierarchical list (we'll get back to what a hierarchical list even is): a list of records, each holding the following eight fields. The component name is just the name of the library or piece of software. The version string is the version. The hash is a cryptographic hash over the artifact the record is talking about. So if you brought OpenSSL into your build, you might say: this is OpenSSL, version 1.0.1f, and it has this hash. The reason you want the hash there is so you can detect that you're actually using the same 1.0.1f the author published, and not one that's been manipulated in the meantime. The UID is a unique identifier used to distinguish multiple variants of a given version. It might be that I build version 1.2.3 of my software, but there's a developer edition and a nightly version and a beta version and a production version and a pro version and a home version and an enterprise version. They're all the same version, but different variants, and you track that in this field.

The supplier name and author might seem like a weird pair of fields, but they're there to let people adopt bills of materials incrementally. If I build my software and take a dependency on something Anish has written, I might have a bill of materials for my software while Anish hasn't published one for his. So I might want to populate some rows on Anish's behalf in my bill of materials, just so I track those dependencies as well. The supplier name is who wrote the software: who supplies it.
The author is who wrote this row of the record. So if the supplier name equals the author name, it means the author themselves is asserting "this is what my software includes." If they're different, it means one of the consumers is saying "I have looked at this dependency, and I think it includes these things." One is more authoritative than the other.

The relationship is how this row relates to its parent row in the table. At the root of the hierarchical tree, as we'll see in a second, you might have your application, the thing you're actually building and producing a bill of materials for, and there the relationship will be "self": this record is for the application itself. For rows under that self row, you might have relationships like "is included in" or "was built by": how is this component related to the rest of the tree of software in this list?

The relationship assertion is a little weird, but it's basically a claim about the list of dependencies and how much you know about it. For example, "I know there are no dependencies" is different from "I haven't listed any dependencies," so you need a way to differentiate. Or I might list three dependencies, and in the relationship assertion say whether I know there are only these three, or whether this is a partial list. It's a way to communicate how much you know about the graph.

Yeah, so the idea is that for any version-UID pair there should be one correct hash. The UID is generally going to be human-readable, mostly, but it doesn't have to be; it's just a supplier-provided value. The hash is "this is what I observed." I'll give you an example a little later of what might be going on if you discover that the hash doesn't match what the author gave you. But the hash is something you fill in, as the consumer, recording what you actually ended up with.
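To make those fields concrete, here's a minimal sketch of the record as a data structure, plus the supplier-equals-author check described above. The field set follows the lecture's list; real formats like SPDX carry more fields and stricter schemas:

```python
from dataclasses import dataclass

@dataclass
class SbomRecord:
    component_name: str          # name of the library or application
    version: str                 # version string as published
    hash: str                    # cryptographic hash of the artifact observed
    uid: str                     # supplier-provided variant identifier
    supplier_name: str           # who wrote/supplies the software
    author: str                  # who wrote this row of the SBOM
    relationship: str            # e.g. "self", "included in", "built by"
    relationship_assertion: str  # e.g. "known", "unknown", "root"

    def is_authoritative(self) -> bool:
        # The supplier describing their own software is more authoritative
        # than a consumer filling in a row on the supplier's behalf.
        return self.supplier_name == self.author
```

For example, a row for the buffer supplied by Bingo but authored by Acme (Acme filled it in on Bingo's behalf) would report `is_authoritative() == False`.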
There are multiple data formats for representing bills of materials; the bill of materials is a data model more than a data format. Two of the common ones are the Software Package Data Exchange (SPDX) and the software identification tagging standard, SWID. There are a bunch of others, but these are the two major ones. I'm not going to go into either data format, because they're not that interesting; it's a bunch of XML, and XML schemas are not that interesting to talk about, and there are usually tools to convert between them.

Now, the sort of magical property of these bills of materials is that you can combine them. I can write one for my software, and one of my suppliers can write one for theirs, and I can take their SBOM and concatenate it with my own to get a better view of the totality of my dependencies. That gives me a deeper view into the dependency graph: if I just say "I depend on Anish's library," but Anish's library has a hundred dependencies and I don't declare any of them, I'm missing a view into that part of my application. If I take Anish's SBOM and concatenate it with my own, I end up with a much deeper view of the graph.

And it's okay for SBOMs to be incomplete. In fact, one of the design goals for SBOMs in the first place was that they have to be incrementally useful. If we required everyone to adopt SBOMs all at once before they were useful, no one would do it. Instead, you can get some incremental benefit just by keeping one for your own dependencies, or the subset you know about, and then grow that set over time as more of your suppliers and partners produce their SBOMs too, and they combine in this nice way.

What's interesting, though, is that there's no requirement for SBOMs to be signed. Anyone can write one.
For the existing formats, it's just an XML file, and the supplier name and the author are freeform strings. They don't have to be cryptographically meaningful in any way; they can just be a name, and we all know the problems with using names for things like this: they're not unique. In general, if you want the kinds of guarantees we've talked about for bills of materials, you really want these SBOMs to be signed, so that if I get an SBOM from Nikolai, I know it's not just someone who wrote "Nikolai" in an XML field; it is actually Nikolai publishing the true SBOM for his software.

So what does an SBOM look like? The bottom here is essentially the data-model representation, and the top is just a visualization of it. In this case we have an application; that's the top-level thing being built. You see its relationship is "self," because that's the thing we're building. The application is built by Acme Corporation and supplied by Acme Corporation, so that record is fine; it's authoritative, we know it's true. When we built the application at version 1.1, we got a hash of 123. Great. We also declare that the relationship assertion is "known," meaning we know this application has exactly two direct dependencies, the browser and the buffer. Those are the only two; there are no others. That's the claim we make by setting "known" in the last column.

Look at the buffer first. The buffer is supplied by Bingo, whoever or whatever Bingo is, and Bingo published version 2.2. Bingo did not publish an SBOM, so Acme has written their own record for the buffer here, and what they're saying is: when we built it, we ended up with a hash of 423. We have no idea what's inside the buffer, so the relationship assertion is "unknown." It could be that the buffer has no dependencies; it could be that it has lots of dependencies.
We just genuinely don't know any of them.

For the browser, you see that the supplier is Bob and the author of this record is also Bob. What this generally implies is that Bob wrote, somewhere, that the hash for 2.1 is 223, so Bob has vouched that the record should look like this. But Bob didn't declare anything else about their software. They haven't told us what the dependencies are; they didn't publish a full SBOM; they probably just gave us that hash. And we happen to know that the compression engine needs to be there for Bob's browser to work, for us to build it. This could be something like: if we try to build without the compression engine, we get a linking failure, so we know it has to be there. We don't know if other things get pulled into that source. Maybe we got the browser as just a binary artifact from Bob, and all we recorded is the hash we got.

And for the compression engine, you see the supplier is Carol, but here we, Acme, wrote that record as well. So we didn't get anything from Carol.
We didn't get anything from Bob either; this is just the version of the compression engine that we built and brought in. But we do know that the compression engine has no dependencies: the "root" here means the list of dependencies is empty, and we know it should be empty. How Acme knows that, I have no idea; this is an example from the spec, and it seems weird for Acme to know this, but they happen to. It could be, for example, that they got the source code for the compression engine, built it from source, ended up with an artifact with a hash of 323, and know that to build it they didn't have to link with anything and no other source code was brought in. Maybe the source code for this library is very straightforward.

Now let's imagine what happens if this were concatenated with another SBOM. Say Carol comes along and publishes an SBOM for the compression engine, and Carol's SBOM is indeed just one record: there are no dependencies, and that record has supplier Carol, version 3.1, author Carol, UID 434, relationship "self" (for Carol's own SBOM), and, once we concatenate it into our own, relationship assertion "root." But the hash in Carol's SBOM is 234. What does that mean?
Well, what it means is that Carol is saying: my compression engine, as of this version, should have hash 234 when you build it, and if yours doesn't hash to 234, you're using something different from what I intended to publish. There are a bunch of reasons why that could be. It could be that we used different compiler flags. It could be that we happened to patch the source code in some way. Or it could be a sign of something malicious happening. But at least there's an indication that something is off.

And then we can use this to augment our SBOMs. Here all of the rows are either "self" or "included in," but you can imagine other kinds of relationships, like "was patched with." If we had a row saying this compression engine is Carol's original source, with the hash of that source, plus this patch, with the hash of that patch file, then we might actually be able to validate that we are using Carol's source, just with a patch on top of it, which is why we don't end up with the same binary artifact. The hope is that by growing your SBOMs this way, you can do more and more sophisticated analysis over your dependency chains.

The kinds of dependency edges you can declare here, the kinds of relationships, cover all sorts of things; it's not quite a freeform text field, but it's pretty close. You can have things like "was built by," for whichever version of Clang or GCC or GHC built you. Or whether something was present when built: you can imagine SolarWinds Orion being one of the things where, if it was present at this version when you built, maybe you no longer trust the generated artifacts. "Generated by," if you do source-code generation from other things. "Patched with." "Read data from": this last one you can use for things like configuration files, so you now have a way to capture what configuration a piece of software was running with directly in your SBOM, and you can use that later to figure out all the places this configuration file was used as of this version. You keep adding information over time, and that improves the insight you have into your dependencies. And you can also improve the analysis separately from what's in the SBOMs.

Okay, so where we ended up: these are the three questions. What are you deploying where? Where did it come from? And what's in it? I hope I've convinced you that you should be able to answer all of these, and I hope I've also convinced you that it's really hard to answer any of them. There are a lot of open research problems here, too, if that happens to be the kind of thing you're looking for: figuring out how to verify all of the things I've talked about, how to prevent these kinds of attacks. It is very, very tricky, but hopefully it's something we can solve over the coming years, because we're seeing more and more attacks take the form of exploiting any of these three things.

That's all I had to say. Thank you. If you have any questions, please ask. You can also ask anonymously, and I'll take questions from down here too; that's fine. And I promise I'm not scary. Mostly.

So, one of the things that's interesting with SBOMs is that there's no requirement that you publish them. Usually the way the laws and the guidance around this work is: if you are vending libraries, you should publish SBOMs with those libraries, whereas if you're vending services, the abstraction boundary is more like "you have an API, but that's the extent of it."
You're not actually linking with their software, so the requirement is just that the service provider have an SBOM, so that if there's a problem, they have a way to navigate backwards through their dependency graph. There's no requirement that they publish it to customers, because it shouldn't matter to customers which version of OpenSSL we're using on the server side. Now, you could argue that it's better the more of this we share, and that's true, but my general expectation is that most companies are not going to publish their SBOMs, and Amazon currently does not.

My favorite kind of attack: I really like hash collisions, because there's an assumption baked into a lot of the software we build that hash collisions can't happen. We tend to write software in a way where if the hashes are the same, we assume the artifacts are the same, and not just in supply chain security but everywhere: your build tools, your package managers, cryptographic checks. Everything assumes that if the hash is the same, the message is the same, and if you manage to break that, you break a lot of things by assumption. We saw this happen with MD5.
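A small illustration of the human-comparison failure mode this leads into: if you only eyeball the first and last few characters of a digest, two different values can look identical. The hex strings here are fabricated for the demo, not real collisions:

```python
def looks_same_to_a_human(a: str, b: str, n: int = 8) -> bool:
    """The informal check people actually do when comparing fingerprints:
    match the first and last n characters and skim past the middle."""
    return a[:n] == b[:n] and a[-n:] == b[-n:]

# Two fabricated 64-character hex strings that agree on the first and last
# eight characters but differ everywhere in between.
real     = "deadbeef" + "a" * 48 + "12345678"
attacker = "deadbeef" + "b" * 48 + "12345678"
```

Here `looks_same_to_a_human(real, attacker)` is true even though the strings differ, which is exactly why tooling should compare full digests rather than leaving it to eyes.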
We've seen it happen now with SHA-1 too, although only for subsets of its security guarantees. I don't think we're going to see it for SHA-256 or any of the more modern schemes for a while. But there have been some interesting variants of this attack. One attack currently going around exploits the fact that, while it's really hard to generate a collision where the full hashes are the same, it turns out to be quite easy to generate a hash where the first and last n digits are the same. And when humans compare hashes, which we do a decent amount of the time, we tend to look at the beginning and see that it matches, and look at the end and see that it matches, because in the middle it's really hard to know where things line up. So these attacks rely on cases where users are comparing hashes by eye: is this the fingerprint you expected this file to have, is this the public key you intended to communicate with, is this the certificate you expected the server to present? If you just compare the start and the end, that's not enough anymore, because with these attacks you just generate candidates until you find a collision; there are enough random bits in the middle to play with that you can generate them pretty effectively. It's a really interesting kind of attack, because the target isn't computers; the target is humans. We often shift the burden of checking certain security properties onto users, onto people, and people are bad at cryptographic security. We just don't work that way.

Yeah. Why is it hard to know where things come from; can't we just look at the library imports?
There are a couple of answers to that. The first is that it assumes you have the source code; if you don't, there are no import lines to look at. The second is that you can look at the import lines, but how do you know where they're importing from? When you write `import foo`, what does that mean? Ultimately that's going to bring a source file in from somewhere. In C, it's a literal `#include`; it might be a path, but it might also resolve through any of the system include paths, which get set by the build environment, so who knows what's there. When you write `#include <stdlib.h>` or `<stdio.h>`, what really happens is that the compiler looks through the system include search path for the first file called stdlib.h and brings that file in. But how do you know how that file got there? What if someone modifies your system include search path to put some other directory first, where they control the contents? And if you're taking dependencies through, say, JavaScript modules, you run into things like the PayPal dependency-confusion case: does the thing that resolves dependencies prefer internal or external sources? If it prefers external sources, you might write "import PayPal logger," but get a different PayPal logger than the one you intended.

The question is: how do you make a career in security? Security is this interesting field where there are so many different kinds of security, and I think one useful thing is to figure out which kind is interesting to you. Is it hacking on individual bits of code: figuring out whether they're breakable, whether there's an error there, whether it's exploitable, and chaining ROP gadgets to manage something like a jail escape? That's very low-level, attack-oriented security work.
Or is it working far on the other side, on security policy? That's a very different kind of security work, but often more impactful if you can manage to do it well. And then there are all sorts of positions in between that are more about defensive engineering. If you work on builds at Amazon, for example, you might think about things like: how do we sandbox builds? If we want people to be able to use the public build tools for these languages, if we want them to use npm or yarn, PyPI, cargo and crates.io, then we want to enable those tools, but in such a way that people don't accidentally shoot themselves, and the company, in the foot. Figuring out how to build technological solutions so that developers don't have to worry about this so much is also really interesting; that's more like infrastructure security. And then you have operational security, which is security in the sense of: how do we know these server racks are secure?
How do we know that our network firewall is set up correctly? Knowing roughly which of these directions you want to go in makes a lot of difference in terms of the kinds of things you end up learning and deciding to focus on. The other thing I would say is that it turns out there are a lot of really cool problems in security, and many of them you don't know about until you start following the people who work on them. So I generally recommend that you go follow security-oriented people on whatever social platform is your cup of milk, or whatever we want to use as the expression today. The idea here is that there are security professionals across all sorts of areas that you can follow, and it's not necessarily that the work they do is interesting to you, but they might then loop you into other parts of the security sphere, where people are tackling problems you might not know about. Examples of people you could follow: SwiftOnSecurity is fantastic if you don't already follow them; they're on, I think, all of the different social networks now. Thomas Ptacek, who is, I think, tqbf on Twitter and a bunch of other places. Matthew Green is a security researcher. Off the top of my head this is hard, but you can message me after and I'll send you a whole list. There are a bunch of people it's worthwhile to follow just to see what kinds of security problems they're thinking about.
What kinds of security issues are they seeing on the horizon? Then start digging around there, start asking questions, start reading the things that they link you to. In these slides, which I'm sure we'll find a way to upload, I've put links in the speaker notes to the news stories for many of these backing stories and regulations and whatnot, and some of them have really good other resources too. For example, the dependency confusion attack was surfaced on a website called BleepingComputer (fantastic name for a website), and they publish a lot of interesting new developments in the world of security breaches. So that would be another place to monitor. Just keeping on top of what exists out there is a really good way to figure out what draws your attention, and then you can start learning more and focusing more on that.

Okay, so the question is: I said that Amazon does not publish SBOMs, but do we use SBOMs internally, and how big do they get? Amazon records a lot of this information internally. I don't know whether we specifically record it in the SBOM format, but this kind of provenance data does get very large. I can't tell you how large, but very large, to the point where we need specialized systems to query it, because, as I mentioned, you want to know what you're deploying where. That means every single deployment to every single host gets logged somewhere, and for every deployment we want to track where it came from and what's in it. So we essentially have one SBOM for every deployment to every host, and this accumulates a lot of data constantly. It's hard to run a system like this at scale.

Are supply chain attacks detected a lot faster? Not necessarily; the SolarWinds attack took a while before it was detected.
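As a rough illustration of the per-deployment provenance records just described, here is a minimal Python sketch. The field names, host names, and packages are hypothetical, not Amazon's actual schema or any standard SBOM format.

```python
# Sketch: a minimal per-deployment provenance record of the kind an SBOM
# captures. Every field, host, and package name here is hypothetical.
import hashlib
import json

def deployment_record(host, artifact, dependencies):
    """One record per deployment to one host: what went out, and what's in it."""
    return {
        "host": host,
        "artifact": artifact,
        # Each dependency pinned to an exact version plus a content hash
        # (here a stand-in hash of the name-version string; a real system
        # would hash the artifact bytes), so you can later answer questions
        # like "which hosts are running log4j 2.14.1?"
        "dependencies": [
            {
                "name": name,
                "version": version,
                "sha256": hashlib.sha256(f"{name}-{version}".encode()).hexdigest(),
            }
            for name, version in dependencies
        ],
    }

record = deployment_record(
    "web-host-017",
    "storefront-service-1.4.2",
    [("log4j", "2.14.1"), ("jackson-databind", "2.12.3")],
)
print(json.dumps(record, indent=2))
```

Multiply one such record by every deployment to every host, every day, and you get the scale problem described above: the data is simple, but querying it all is not.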
I think it really depends on how good your infrastructure is. For many of the examples I gave, they were detected quickly because they were often destructive: if you look at the rogue-maintainer or rogue-author examples, all of them broke their packages, and that's a very visible kind of supply chain attack. SolarWinds, much less so. But there you might be able to detect anomalous internet traffic, for example: suddenly a bunch of your hosts are contacting some random IP address, and that's probably a sign something is wrong. But it really depends on how the attacker tries to get in. If you have a targeted attack against a business, you might not even need it to reach outside at all: you might have it sleep in the network, or modify source code over time so that when the deployments eventually happen, you get remote code execution. So I don't think supply chain attacks are inherently more detectable, but it is true that they often have a big fan-out in terms of impact, which is the reason they're attractive, right? If you compromise one npm package, you might compromise ten companies or more. But it all depends on how identifiable the attack itself is.

Is it easy to have it affect only one company? I mean, it's all code, right? So if you can run arbitrary code, you can do whatever checking you want. It depends on how much insider knowledge you have about the company, too.
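As a sketch of that "do whatever checking you want" point: a targeted payload can gate itself on environment fingerprints and stay dormant everywhere else. The hostname prefix here is hypothetical.

```python
# Sketch: a targeted payload gating on an environment fingerprint so it only
# activates inside the intended victim. The prefix "corpdev" is hypothetical.
import socket

def should_activate(hostname=None, target_prefix="corpdev"):
    """Stay dormant except on hosts matching the victim's naming scheme."""
    hostname = hostname or socket.gethostname()
    return hostname.startswith(target_prefix)

# In a compromised dependency, the malicious branch would run only when the
# check passes, and the package would behave normally everywhere else.
print(should_activate("corpdev-laptop-42"))   # True: inside the target
print(should_activate("ci-runner-19"))        # False: stays dormant
```

This is exactly why such attacks are hard to catch in the wild: on almost every machine that runs the code, including a researcher's sandbox, nothing suspicious ever happens.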
If you know that the hostnames for developer machines at this company always start with the following six letters, then you could use that as a proxy for figuring out whether to execute. You could use things like IP ranges, or you could query an external server that tries to identify the victim based on where the request came from. So there are some means you can use here, but it's often hard to target specifically unless you have insider information to go on.

Another audience question: how valuable is a PhD for a career in security? Oh, it's funny. Okay, so the question is how valuable a PhD is for a career in security. We were actually discussing this during lunch earlier today. At least for me, the PhD itself was not useful, and what I mean by that is that the diploma didn't matter. But the six years I spent doing all this learning, getting to explore a lot of these spaces, spending a lot of time educating myself, learning how to dig deeply into problems and really focus my attention on them, getting to (not put on blinders, exactly, but) really immerse myself in some of these fields and just learn whatever was interesting.
That, I think, was extremely valuable for my career later on. My current job is not in security, in the sense that I own the build infrastructure for a programming language, but it turns out there are a lot of security implications to owning the build infrastructure, because that's where all the vulnerabilities come in. And having that security exposure and security understanding has been extremely helpful, not just in making me do my job well, but also in educating the company about the kinds of security risks I see flowing through my role. So that's all to say: I think the degree is not the thing that's valuable, but the ability to spend that amount of time, with that amount of focus, on the things you care about matters enormously. And you get that freedom more in academia than you do if you go into a junior position at a larger commercial company, unless you land something like a junior security engineer position where maybe you get to do a bit more pen testing and the like. But being able to explore this through academia, I do think, is helpful.

Yep: what types of companies are looking for security engineers? So what's interesting is that the companies that aren't looking for security engineers are the ones that most need security engineers. At least in my experience, companies are constantly understaffed when it comes to security engineers. There are so many security problems everywhere, and, as I mentioned at the head of this, the engineers know about many of these security problems, but leadership does not necessarily know that they're there, how critical they are, how severe they are, and how much at risk they put the business: how critical would it be if something went wrong here?
And that leads to this interesting split where everyone who's an engineer at these companies knows that you need more security engineers, but the company doesn't necessarily prioritize hiring security engineers, even though it should. Which doesn't really help with the question, but usually a lot of companies need them, and the question is just whether they're seeking them out. Sometimes you'll find that if you start talking to people at a given company and you ask them about their security posture, or whether you can talk to some of the security people, they might say: yes, we are definitely hiring. But it's not necessarily clear that there'll be an open, easy-to-find job posting that's about security. And again, if you look at something like my role, which is not a security role: if we published the job posting for my job today, there would be nothing about security in there, even though so much of what I spend my time on is security; it's the integrity and security of the build process. So oftentimes you can find other positions where you know that security is important, and it's not nominally a security job, but you go into it with a security mindset and you make the job a lot about influencing the security aspects of the work.

You have a lot of questions today! Useful classes to take or skills to learn for a job in computer security? Well, Missing Semester, of course, is the class everyone should take. I think this class is really useful too, and I mean, I'm biased here, of course, but I think one of the things this class does pretty well is expose you to a broad range of classes of security issues, ways to think about security, and hands-on experience with them.
And that is really, really valuable. In terms of other classes, I don't know that there are ones where I would say you should definitely learn this thing, because it's not so much about the things a class teaches you as it is about getting hands-on experience with some of the problems and problem domains. For example, if you take the distributed systems class, you're going to find a bunch of race conditions in your code, and race conditions are a great place for security bugs to hide. So debugging your distributed systems labs is going to teach you things that are useful for doing security. Would I say the distributed systems class is one you should take in order to get better at security? Probably not; it's not a security class, but it will help you learn some relevant skills. In terms of things to learn on your own to become a more attractive job candidate in the field of security: debugging skills are going to be really useful. Following the security discourse that's happening, like following these people on social media, is one example, but in general, keeping up with security news is actually a really useful way to stay on top of what the security industry is currently thinking about and what the current space of problems is. And when you find that security news, don't stop at the headline. Don't stop at the third reprint of someone's blog post about what a security researcher published; actually go back to the original material. For the PayPal attack, for example, the dependency confusion attack, there's a really long blog post by the security researcher who originally found the problem, giving the full details: how did he discover this, how did he find the GitHub repository initially, how did he go about trying to exploit it, how did he roll it out?
How did he expand it to multiple companies? Going all the way back to those origin stories, and then reading about the techniques these people are using, is, I think, one of the best ways to accumulate skills. And I don't know that there are specific named skills I would point to so much as it's learning a collection of techniques by osmosis. Because I think in security, one of the things that differentiates the best (I don't want to say hackers, because it's sort of the wrong term here) security-minded people is that they're relatively well-rounded in terms of being able to consider multiple different kinds of attacks and how they might build on top of each other. If you get really, really good at buffer overflows, you can do some really impressive attacks, but at the same time you're limiting yourself to a very small subset of how you might be able to achieve a given goal. Remember that attackers, and therefore also defenders, are relatively goal-oriented; they're not mechanism-oriented. The question isn't how can I use a buffer overflow to take down, you know, XYZ, or to steal money from Y. It is: I want to steal money from Y; how do I achieve that in the best possible way? The more strategies you have available to you, the more skills you can lean on, the better you're going to be able to do that. And as a defender, you need to think of all the things the attacker might do so that you can defend against them. Dependency confusion, for example, has basically nothing to do with securing code. It has to do with what a name means, and what the policy is for how we bring in third-party dependencies. If you only focus on buffer overflows, you would miss that entire attack vector. So rounding out your skills there is really valuable, rather than focusing on honing specific skills to the extreme.

I'm just looking at you now in case you have more. I'll also take them directly from you all, if you want to do that instead of via a person here.

Preferred text editor? Any of them? That's the right one. It's not preferred, it's just the best one. I'm glad everyone is in agreement. I know we have some Emacs people here, but... All right, well, we're almost at time. Thank you, everyone.