Hello. Welcome to room 101 — this room is sponsored by SpiderBat; it's narrow and super deep, note the emergency exits, and all that. So you get this little bit of output there: the script writes to the console the set of addresses that connections were established from — it goes over that set and prints out the IP addresses. And I'm going to show you what this looks like; you can probably imagine. A typical small-scale Zeek invocation is just zeek -r to read a pcap. So assume I take that previous script, put it in awesome.zeek, name it on the command line, and run it against a small pcap — I actually have it up top here.

The next thing Zeek is, is a log writer. This is probably the most important aspect of the system for most people, because you are very unlikely to use Zeek without consuming the logs the system generates. Somebody put together a diagram that conveys a little bit of what this looks like: you have a structure where you can cross-link and pivot across the different logs in the system, and there are logs for protocols, for particular notices, for files — a bit of a sneak preview there at the bottom. The way that works is that the script interpreter can say, okay, write an entry for a particular log out to however Zeek has been configured to do that. That usually goes back into the core, and here again we have pluggability: depending on where you want your log entries to go, you configure the system to write to that log ingestion technology. The default is just to write logs to disk, but there are many others — Kafka is a common one. And the important bit is that you get that functionality by default.
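To make the pluggable log-writer idea concrete, here's a minimal Python sketch. This is not Zeek's actual plugin API — the class names and the Kafka stand-in are invented — it just shows the shape: the core emits a finished entry, and every configured backend decides where it goes.

```python
# Hypothetical sketch (not Zeek's plugin API): the core hands finished
# log entries to whichever writer backends were configured.
import json

class LogWriter:
    def write(self, stream: str, entry: dict) -> None:
        raise NotImplementedError

class AsciiDiskWriter(LogWriter):
    """Default-style backend: one JSON line per entry, per log stream."""
    def __init__(self):
        self.files = {}  # stream name -> serialized lines (stand-in for files on disk)
    def write(self, stream, entry):
        self.files.setdefault(stream, []).append(json.dumps(entry))

class KafkaWriter(LogWriter):
    """Stand-in for a Kafka backend; here we just collect (topic, payload)."""
    def __init__(self):
        self.sent = []
    def write(self, stream, entry):
        self.sent.append((f"zeek_{stream}", json.dumps(entry)))

def emit(writers, stream, entry):
    # The core doesn't care where entries go -- each backend handles it.
    for w in writers:
        w.write(stream, entry)

disk, kafka = AsciiDiskWriter(), KafkaWriter()
emit([disk, kafka], "conn", {"uid": "CHhAvVGS1DHFjwGM9", "proto": "tcp"})
```

The same entry lands on disk and in the Kafka stand-in; swapping backends is a configuration change, not a script change.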
So even though I wrote this little Zeek script that was digging into the originators of the connections in this pcap, what I actually get, if I look at what's now in my directory, is a bunch of logs. That's because the DPI engine and the analyzers have dug into the flows and pulled out what was HTTP-specific traffic, what was SMB and so forth, which files were in that pcap, and dumped it all out into logs. And it's getting very small — I hope I warned the folks in the back, but here you go — just to convey a little bit of what that looks like.

So the conn log is our hub for logging. It's the thing that connects to essentially every other log, giving you the central state for every connection Zeek sees. I'm showing the output here in JSON, which is one of the default formats the ASCII logger can write. This might be a little reminiscent of NetFlow, if you've seen that in the past, but you can also already tell that this is richer. You have things in there like the connection state field, which tells you what happened over the lifetime of the connection. And then there's the connection history, near the bottom there, which is really cool: a very short slug that summarizes what happened over the lifetime of a TCP connection. Since I've been in this for a while, I can read it off: there was a SYN here, a SYN-ACK coming back, the handshake completed, and data was exchanged in both directions. We use uppercase for one direction, lowercase for the other, and the a's are pure ACKs. Just from that little string alone you can often identify tapping problems.
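Here's a toy decoder for history strings as just described. The letter meanings below cover only the common subset mentioned in the talk (SYN, SYN-ACK, ACK, data, plus FIN/RST); Zeek's real history field has additional letters and rules.

```python
# Toy decoder for Zeek-style connection history strings: uppercase
# letters are events from the originator's side, lowercase from the
# responder's. Only a common subset of letters is handled here.
MEANINGS = {
    "s": "SYN",
    "h": "SYN-ACK",
    "a": "pure ACK",
    "d": "data",
    "f": "FIN",
    "r": "RST",
}

def decode_history(history: str) -> list[str]:
    out = []
    for ch in history:
        who = "originator" if ch.isupper() else "responder"
        what = MEANINGS.get(ch.lower(), f"unknown '{ch}'")
        out.append(f"{who}: {what}")
    return out

# "ShADad": SYN from the originator, SYN-ACK back, then ACKs and data
# flowing in both directions.
for line in decode_history("ShADad"):
    print(line)
```

A string where every letter is the same case — only one side ever sent anything — is exactly the one-directional-flow symptom that points at a tapping problem.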
When something isn't quite right — say you only see one direction of a flow — you get entries there that tell you right away: this is suspicious. And near the top you have this string, the UID, and that is how we tie logs together. If you also looked at an HTTP log for this connection, if there were one — or in this case SMB — you would see that via the UID you have a means to cross-link those entries.

Okay, so that sits on top; back to this. The next important thing I want to flag is an architectural, almost philosophical aspect of the system: everything in lighter blue here is mechanism. It's implemented in Zeek to do one thing. In the log engine, a plugin just writes out a log entry — there's no complicated algorithm to it; you implement that and you know you're done. An analyzer just parses a protocol; that's a predefined task. Where the script layer shines is policy: whatever you load into the interpreter defines what Zeek does. And this is exactly what sometimes makes it a little hard to explain what Zeek really is, because the answer is: it depends. The default we give you is a lot of logging, like on that previous slide, so you learn a lot about what's been going on in your network — but you can do other things, like detections and so forth. That question guides a lot of how we implement things in the system: is it policy or is it mechanism? We want to keep things that should be left to the user implementable by the user, while providing only the machinery you need to actually do that. I think it's a differentiator from other network monitors. Signature-based intrusion detection systems, for example, are built a little differently, right?
There is an assumption there that what you plug into the system is a detection. That is not the case here, so it's in a way a lot more flexible. Anyway, what we have now is a Zeek process. I've built this up bottom-up, and everything you see on the left here runs in one Zeek process. Why am I pointing this out? Well, earlier I said Zeek is a distributed, event-based system, and if you look at this and know it's one process, you might go: why distributed? That's not a distributed system, it's just an event-based system. That's because in production you usually don't run Zeek as one process. You can probably imagine what's coming: instead of one process, you run a bunch — a cluster. The way you go about this is you take a set of processes — how many is usually defined by the magnitude of traffic you want to parse, how beefy your machine is, how many cores it has, that kind of thing — and you assign roles to your cluster. These are the predefined roles, although you could technically have others. The common ones are: a manager at the top, which is the central coordinating unit and often also the main node that keeps a lot of centralized state. Underneath are these things we call proxies, which have a somewhat misleading name — they're basically a load-balancing mechanism so that, as your scripts start to distribute state across the system, you don't put all your eggs in the manager's basket but actually spread that out a little. And at the bottom you have the workers, and they do what you'd imagine: they get the packets out of your network tap. If you run this on a single machine, that distribution is basically done by the NIC for you; if you run it across multiple machines, you need a network load balancer — something that distributes packets out.
And that then goes up. I skipped the node on the top right — that's a logger. We usually dedicate separate processes just to the task of writing logs, because those can easily become a bottleneck as well, so it's nice to have that separated out. What happens next is where the distributed aspect comes in: the nodes speak to each other via event publish/subscribe. It's hard to visualize in a diagram, but picture that there are topics — per kind of node, or others; again it depends on what you put in the script layer — and different nodes in your cluster can subscribe to them. Whatever happens in your network is then automatically communicated across the cluster so you can react to it. And this, then, is a Zeek cluster.

So I just walked you through the basic node types. I should flag that we built this stuff in-house, except for the message-passing library that underpins it all, which is CAF — the C++ Actor Framework, which you might have heard of, or not. I don't think I said it yet: all of the core is implemented in C++, and the scripting layer is its own thing. If you want to scale, your first go-to is usually to add more workers when you find you can't keep up with the traffic, or you need a little more flexibility in where you place state in the system. As your Zeek installation grows you'll often find that on a really big network you don't get by with one machine — you need multiple, and then you can really start thinking about which parts of your cluster to put where. You often end up with scenarios where several machines just run workers and funnel all the traffic analysis up to the manager, proxies, logging and so forth. That whole aspect is pretty tricky to figure out.
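The publish/subscribe idea is easy to sketch. Below is a minimal Python stand-in — this is not Broker's API, and the topic name and event are made up — showing how an event published to a topic reaches every node that subscribed to it.

```python
# Minimal topic-based publish/subscribe, illustrating how cluster nodes
# communicate. (Zeek uses Broker/CAF for this; everything here is a
# made-up stand-in.)
from collections import defaultdict

class Bus:
    def __init__(self):
        self.subs = defaultdict(list)  # topic -> list of handlers

    def subscribe(self, topic, handler):
        self.subs[topic].append(handler)

    def publish(self, topic, event, *args):
        # Deliver the event to every subscriber of this topic.
        for handler in self.subs[topic]:
            handler(event, *args)

bus = Bus()
seen = []

# The "manager" subscribes to its topic:
bus.subscribe("zeek/cluster/manager", lambda ev, *a: seen.append((ev, a)))

# A "worker" spots something and tells the manager about it:
bus.publish("zeek/cluster/manager", "suspicious_conn", "10.0.0.5")
```

The point of the pattern is that the worker doesn't need to know which nodes care — it just publishes, and subscribers react.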
I wasn't going to talk about this a lot in this presentation — it could easily fill an hour by itself — but really getting to the bottom of packet capture and how to scale clusters is a ton of fun. It'll take you a while, and you'll learn more about your network doing it than from almost anything else you can think of, because it's so essential to the whole thing. It also means — I should mention this too — that our notion of a cluster is very different from how many people think of modern cluster topologies. We're not talking auto-scaling across a thousand nodes or anything like that. The cluster is defined by your hardware's capabilities and what you can do with network capture, because that guides how many workers you bring up. If one of those workers isn't running, you have a little blind spot for the traffic that would have ended up on that worker. So it's a much more rigid structure than the highly dynamic, come-and-go style of cluster, which has certain downsides — less resilience, if you will — but certain upsides too, because we can reason about the cluster in ways that are perhaps a little nicer than in very large, dynamic clusters. We can say: this cluster is now up and running. We had a nice philosophical debate in the team the other day about whether that still makes sense for Zeek clusters, and we came out on: yep, it pretty much does. Which is nice.

Okay. Now, so what is Zeek? Zeek is a mature, extensible, distributed, event-based, scriptable network data logger. That's a mouthful, and you'll be forgiven if it doesn't impress you — it's too complicated, right? There's a much easier analogy for what Zeek does, and it's this:
It's the network flight recorder. Those of you who are old like me might remember that this actually existed in the early 2000s — there was an actual product; if Marcus Ranum were in the audience he would chuckle right now. This is basically version two of that, and then some: way more flexible, extensible and so forth — everything on the previous slide. But if you want one bit in your head for what Zeek gives you, it's this, because it allows you to record network data at scale, for long periods of time, and you control the extent to which you log the details.

Okay, but Zeek is a lot more. That was the main Zeek system, and we've built a bunch of stuff around it as well, which I want to walk you through pretty quickly. The first thing you should know about is what we call zkg, the package manager for Zeek. This has existed for — I should know this better, but I think just under 10 years now. Before that, it was always tricky when somebody had written something meaningful as a Zeek script: how do you distribute it? You had to put it up on the internet somewhere, point people at it, and they had to download it and drop it in a folder. The obvious solution is a package manager, so we built our own. This is part of the outer ecosystem, which we often build in Python, so this is Python as well. Jumping all the way to the bottom of the slide: you can install it via pip, or, depending on how you install Zeek, you get it by default anyway — I'll talk about that later. To convey the flavor: I think this thing is super cool, because it lets you build packages for almost anything you want to plug into or modify about Zeek. So it's pretty large — at the moment we have just over 200 packages; 210.
I think I did a quick grep just yesterday. Scanning through the types of packages we have: there are packet sources — like AF_PACKET, which I mentioned earlier — so I think we have that part of the ecosystem pretty well covered. Then, on the other end of the system, log writers: what you write logs to. A pretty nice setup there as well. If you're using an exotic log ingestion technology, we'd very much like to know about it, because that's what we have to supply so you can use the system to ingest data from us. There are lots of log extensions, where you take the existing logs but tweak them — add something to them. There's one package that gives you HTTP bodies, one that includes VLAN tags, geo-tagging. There's one for this thing called Community ID, which is perhaps on your radar if you're in the space a little: it takes the notion of a connection identifier I mentioned earlier and up-levels it from Zeek to basically any system that taps traffic and identifies connections, so you can connect and pivot across data sets across these technologies. This is all pluggable stuff you can add to your system. There are functional add-ons — I didn't really cover this in the architecture, but the plugin machinery in Zeek lets you tap into more things than I showed in that diagram: you can add implementations for pretty much anything in C++ and expose them in the script layer. What did I mention here? Yeah — storage I/O, if you want to talk to Postgres; or Testimony, which is super cool — a packet distribution mechanism, I think built by Google; or ICAP, this sort of built-in closed-loop solution for making decisions about whether a packet is benign or not, or a flow is benign or not, and so forth.
So it's really pretty powerful stuff. Extensions for frameworks — shifting from individual new things over to the core functionality of Zeek: you can add analyzers. We've had a lot of contributions around industrial control system traffic, which is awesome, because you need a lot of domain expertise in those protocols to be able to write parsers for them. And then detections — there are lots of those. I'll cover it a little later, but picture that you have a lot of freedom in how you express a detection. Once you've been in that space for a while, you start thinking differently about how you build these things: more across flows, across hosts, more about state and state machines and at which point you've actually flagged a detection. You can shove all of that into packages. You can also start from the other end if you just want to dabble: the package manager also lets you create packages. It's templated — no great magic there. You still have to do the deep thinking about what you want to put in there, but you don't have to fuss to get going; the main layout, the metadata files and so forth are all there for you. So that's a great one to be aware of. And then there's a whole bunch more, and this next one, called Spicy, could totally be a talk for next time — maybe, Don? It's a subproject we've had for, I think, about a decade, and it's the brainchild of a good friend and colleague of mine, Robin Sommer. It's basically our take on how network traffic parsing should really work, because if you look at how most projects go about this today, the answer is: ad hoc.
Every system that needs to parse TCP has its own TCP parser; every system that needs to parse HTTP has its own HTTP parser. It doesn't scale, and it's buggy. Look at Wireshark's history — a project we're greatly impressed by, because they implement so much parsing code — of course they've had a ton of vulnerabilities. We're no different; us too, Suricata too, tcpdump too. It's a common truth in the industry that this isn't great. Spicy is a parser generator that builds on experience we've gained over the years with an early prototype we had in Zeek for a long time, called BinPAC. Depending on how you feel about programming languages you might chuckle now, but it compiles a DSL — a domain-specific language that expresses the syntax and semantics of a protocol — into C++. If you're a safe-language diehard you might not find that very safe, but the win is that how that implementation happens is defined in one place: a bug that comes up can be fixed there once, and all parsers written for the thing benefit. And it's designed to work on stream data, which is really important: if you build a system that parses network traffic, you never get the whole thing at once — you get the next chunk, and the next chunk, and the next chunk. That differentiates it from things like YARA, for example, where there's a built-in assumption that you have a complete file you can parse. The final point there is the one that excites me the most: this isn't designed just for Zeek. We happen to have started using it in Zeek — and we're really only starting; it's been a long process.
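The stream-data point is worth a tiny illustration. This Python toy — not Spicy, which compiles a DSL to C++ — parses length-prefixed records and keeps working no matter how the input is chopped into chunks, which is exactly the property a network parser needs.

```python
# Toy incremental parser for "<len>:<bytes>" records. A network parser
# never sees the whole message, only the next chunk, so it must buffer
# partial input and resume where it left off.
class StreamParser:
    def __init__(self):
        self.buf = b""
        self.records = []

    def feed(self, chunk: bytes):
        self.buf += chunk
        while True:
            sep = self.buf.find(b":")
            if sep < 0:
                return  # length prefix not complete yet
            length = int(self.buf[:sep])
            if len(self.buf) < sep + 1 + length:
                return  # record body incomplete; resume on the next chunk
            self.records.append(self.buf[sep + 1 : sep + 1 + length])
            self.buf = self.buf[sep + 1 + length :]

p = StreamParser()
# The same two records arrive however the network fragments them:
for chunk in (b"5:hel", b"lo3:a", b"bc"):
    p.feed(chunk)
```

A file-oriented parser (the YARA-style assumption) gets to skip all of the buffering and resumption logic; on live traffic you don't have that luxury.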
We're working with the Wireshark folks to get it in there too — that's early prototyping — but the vision is basically: wouldn't it be cool if you could go to a repository of prewritten parsers for lots of protocols, and when you want one in your system, you just grab that parser and link in the code that comes out? Anyway, this is, I think, among the most exciting things we're working on these days.

Broker and CAF: this is our middleware; I basically covered it on the cluster slide. It's kind of cool, it's kind of quirky, and it's one of the things where we've invested a lot of time and sometimes wish — how should I put this — that we were using something a little more mainstream, because it is just so idiosyncratic to be using a message broker built in-house on top of an actor framework that not too many people know, when there are lots of other go-to technologies out there. But it has worked extremely well for us. It's mature technology, and we're benefiting from a bunch of functionality, particularly out of CAF, which has really nice telemetry support and, at the Broker level, implements some of our data store functionality and so forth. So this is really the key underpinning for everything we do when you run Zeek as a cluster.

And speaking of clusters: at the bottom there I'm mentioning something called ZeekControl, or zeekctl. This is our original cluster manager — also an interesting one, been around for a long time. It's funny because it dates back to a time when running a Zeek cluster by definition meant running multiple machines, because a machine had one core.
And accordingly, that's how the thing was designed: it's very SSH- and rsync-heavy, because it actually goes into the machines, changes configurations, brings nodes back up and so forth. But like I said, it's very mature, very flexible, very powerful, and it's still the go-to these days: if you run a big Zeek cluster, this is the tool you manage it with. There is a replacement in the pipeline, called Zeek Client, because we're rethinking the way you should be able to interact with a Zeek cluster — bringing it into the modern era a little. Instead of thinking about machines you log into, we're building something much more service-oriented: since you already have the ability to communicate within a Zeek cluster, we figured we could build the cluster management itself in Zeek, so you can talk into the thing and say, bring up a node, and so forth. That has started to appear in the current line of Zeek releases; it's another piece written in Python that you can install separately and start tinkering with.

And then, going even further out into the periphery: we've started to invest a little in developer tooling. We write a lot of Zeek script code, and if you do that, you'd like some convenience with it, so we've built out alternative parsing technology for Zeek scripts that lets us write indenters, linters, that kind of stuff. If you're into that aspect of languages, I think it's pretty cool.
We have BTest, which is an old tool but still underpins essentially all of our testing. It's useful to know about because it isn't about Zeek at all — it can be used with any project that needs system-level tests, where you say: here's a test case, here's its input, here's its output, and I just want to know whether that output was correct. Very useful for baseline testing, and it comes with a bunch of functionality that's handy if you're into test-driven development.

Okay, two last ones — two websites. try.zeek.org is especially awesome if you're relatively new to Zeek, because it lets you tinker with the scripting in a browser, without installing Zeek or anything. You could take the script I showed earlier, paste it in, and run it; you can even select at the bottom which Zeek version to run it with and which pcap to run it on. You can't upload your own, but we provide a handful — that kind of thing. It even comes with a little bit of training, if you want to explore aspects of the language. Great stuff; hopefully around for a long time still. The second one is packages.zeek.org, which is our online package browser for zkg, our package manager. This one is a little dated — there's actually an effort underway right now to revamp it — but it's the place to go to search the package database: is there something for the protocol I care about, or the CVE I care about, and so forth.

And this is a biggie: all of the stuff I just covered is BSD-licensed. That's part of our lineage, our heritage. We feel pretty strongly that this is the way we want things to be, and it should let you use all of this in pretty much any setting you can think of, which I think is pretty great.
That concludes the technical side of things, and I wanted to walk you through a couple of common use cases, because I think it conveys a little more of what the system is about. Part of this I've touched on already, but the main bit I'd like you to keep in mind when you think about Zeek is simply: better network data. Think of the big go-tos and put them on a two-dimensional continuum. Going to the right is fidelity: how close is this data to what actually happened on the network? pcap would be all the way over there — it is literally what happened on the network, because it's every packet. On the data-volume axis it's also all the way at the top, because it's everything that happened on the network, so it's really voluminous. On the other end, if you don't care at all what happened on the network and you just want to know — was there something malicious? — you have alerts. That's what every intrusion detection system of the last 20 years has been about, and it's the exact opposite: you know almost nothing about what happened on the network other than that some system said it's bad, and there's almost no volume, because with good alerts you only get the golden nuggets you care about. Unfortunately, you often have crappy alerts instead, which puts you in the same place in terms of fidelity but at higher volume — not super high, hopefully, but not great.
Then you have things like NetFlow, somewhere down there: you learn more about what's going on on the network, because you get a per-connection summary, so you can say, ah, this host has been talking to something strange — but not a whole lot beyond that. As a plus, the data volume is hopefully also manageable, because you're abstracting all the packets that make up a connection into a one-liner. The point I want to convey is that Zeek covers this whole space, because — again, policy-driven — what you get out of it depends on how you configure it. If you mostly want detections, you can configure it that way; you don't have to keep all of the logs I showed earlier, and if they're of no value to you, you can turn them off — then you land in the bottom left. Or, if you want very rich logging with file extraction, HTTP bodies, SMTP material, you name it, you end up in the top right of that blue region. But it's not predefined, so if you ever come across marketing materials or anything else that tries to place Zeek in one particular spot, you should pause and go: well, it depends — because it depends on how it's configured.

All right, so that's the big picture. Now some typical use cases. By far the biggest that comes to mind is empowering network detection and response workflows: if you build out a storage solution for these logs, you have incredibly rich information about what happened in your network, but at a volume that — depending on your infrastructure — lets you go back days, weeks, months, in some cases years. We have a very good working relationship with the folks at Lawrence Berkeley Lab, and they literally have decades of Zeek logs. It's incredible.
For example, think back to Heartbleed. When that came out it was obviously a huge deal, and there were all these questions: was it exploited before the vulnerability was announced? The folks at LBL dug into their logs, and at least for them the finding was: nope — until the thing was announced, we saw no exploitation of it. You could tell that from Zeek logs that weren't built for Heartbleed at all; the logging was just rich enough that you could pull it out of the data. For threat hunting — for folks who want to proactively look for things that have happened in their network and infrastructure — it's a goldmine for the exact same reason. You don't just have NetFlow, you don't just have alerts; you have all this structure, so you can ask things like: which hosts have spoken the most over a protocol I didn't expect, to a country I don't want to see that happen with? And what I particularly like here is that because Zeek scripts codify analysis, you can push the whole thing to the left: instead of collecting lots of crappy data and shoving it into your data analysis solution, when you have hunches about how you'll usually do your analysis, you can codify them in a Zeek script and deploy it. You drastically reduce your data volume but still get the entry points for your remaining analysis out of it, which I think is really great.
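A hunch like "which hosts spoke the most over a protocol I didn't expect" becomes a few lines once the logs have structure. Here's a sketch over made-up conn.log-style records — the field names mirror Zeek's conn log, but the values and the "unexpected" set are invented.

```python
# Rank hosts by bytes sent over protocols we didn't expect to see,
# using conn.log-shaped records. (Sample data is invented.)
from collections import Counter

conn_log = [
    {"id.orig_h": "10.0.0.5", "service": "ssh",  "orig_bytes": 1200},
    {"id.orig_h": "10.0.0.5", "service": "ssh",  "orig_bytes": 80000},
    {"id.orig_h": "10.0.0.9", "service": "http", "orig_bytes": 4000},
]

unexpected = {"ssh"}  # protocols we don't expect from these hosts

talkers = Counter()
for rec in conn_log:
    if rec["service"] in unexpected:
        talkers[rec["id.orig_h"]] += rec["orig_bytes"]

# Biggest talkers over unexpected protocols first:
for host, total in talkers.most_common():
    print(host, total)
```

With alerts or bare NetFlow you couldn't ask this; with structured per-connection logs it's a filter and a counter.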
Data enrichment is another example application. We're aware of a couple of sites that do this incredibly well: they've built a round-tripping workflow where they use Zeek to learn what's in their network infrastructure, then enrich the data with context about what it means — the cliché examples being, this is the HR department, this is the engineering department — and then iterate. It's a really good way to understand new machines coming onto your network, phones and so forth. And then, finally, detections. I touched on this earlier, but think of the possibilities when you're not limited to a single bit you can set for a particular connection, but have arbitrary tables, state setting, state exploration and so forth — and the kinds of detections that enables. You can go to that website and search for CVEs; since 2020 it's been about two dozen, for example. For many dominant modern exploits we find that with very little scripting code we can build good detectors. Once you're in that space it's a little hard to come back out of it, because you get used to the flexibility it enables.
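To make the stateful-detection point concrete, here's a tiny Python sketch of the pattern — counting across events rather than setting a per-connection bit. The event, the threshold, and the hosts are all invented for illustration.

```python
# Stateful detection sketch: keep arbitrary state (a table of counters)
# across events and flag once a threshold is crossed -- instead of a
# single per-connection bit. Event name and threshold are made up.
from collections import defaultdict

THRESHOLD = 3
failed = defaultdict(int)  # originator host -> failed-login count
alerts = []

def on_failed_login(orig_h: str):
    failed[orig_h] += 1
    if failed[orig_h] == THRESHOLD:
        alerts.append(f"possible brute force from {orig_h}")

# Events arrive one by one, across many connections:
for host in ["10.0.0.5", "10.0.0.5", "10.0.0.9", "10.0.0.5"]:
    on_failed_login(host)
```

A real Zeek script would add expiry on the table so the state ages out, but the shape — state accumulated across flows, a detection fired at a chosen point in the state machine — is the same.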
Okay, switching gears a little. If I've convinced you that this is great, I want to tell you how we do releases, so you know what to install. (I'm sorry, I'm getting over a cold, which is why my nose is so runny.) Our release cadence is as follows. We aim for three releases per year. The x.0 is our long-term-support release — the latest right now is 5.0. As time goes by and we find bugs or security vulnerabilities — we do a lot of fuzzing, for example, to find them — we make point releases; so after the x.0, an x.0.1 would come out. Meanwhile development continues, and the next feature release is the .1, a couple of months later. That is our we-remove-deprecations release, so if you switch to it, be extra careful, because we might have changed things in slightly bigger ways. We have a file in our distribution you can always check — basically a release log — that makes sure to call out things you should be aware of. Once that's out, the same style of development continues: we might do point releases for bug fixes and so forth, and we also backport those to the long-term-support release. So if there's a bug fix in that .1, you'll also get it in a bug-fix release in the .0 line. Development continues again and the .2 comes out — for example, just a few days ago we released 5.2 — and at that point the .1 line of releases is essentially stale: we won't backport fixes to it. If a new release comes out after that, it would be a .2.1, with a corresponding backport to LTS, and the cycle is complete. We don't strictly guarantee three releases per year, but I think we're pretty good at sticking to it. I'm going to take a quick moment to blow my nose — I apologize. Okay. I think that's useful context, because
otherwise you would just see release numbers and wouldn't really know what to get. When in doubt, get an LTS release; if you have the freedom to grab something recent, make it the most recent. Not everybody does that in production, we see a huge spectrum of releases out there, but that's our recommended approach. Here's where we are right now: in the 5.x cycle of releases, I should say, we just started 5.2, and that means 6.0 should come out at some point in the mid to late summer. You can go to our wiki down there if you want to read a little more about this. Ways to get Zeek: if I have you interested now, you have a couple of options, three in particular. If you just want to tinker real quick and run some Zeek stuff, maybe crank through a pcap or dig in a little more, you can grab our Docker images. This is `docker pull zeek/zeek`, and you have a lot of tagging options there: you can grab a specific release, you can grab latest, you can grab the latest LTS release just by tagging, or you can grab a nightly, which is `docker pull zeek/zeek-dev`, basically if you want the latest and greatest but don't want to build it yourself. Just since 5.2 we started to do multi-platform builds; particularly for the folks running Macs this is great, because there is also now an image on ARM. Most convenient for folks who want to take it a little further is probably our binaries, and our setup there is interesting, I think, in that we use SUSE's Open Build Service to build the packages. That's really handy because they basically give you the repo infrastructure and make it pretty easy to build for different distributions, so we do that for the most popular Linux distros. The way it works is basically as usual: you just have to do one extra step, adding their repo, and then installing from there. The thing to keep in mind is that this isn't as seamless as 
it would be if we offered packages that are distro-native, like if you were on Fedora and could just say `dnf install zeek` or something like that. That's basically because we lack cycles, and I'm pretty sad about this; the OBS setup is the closest we can come to it right now. But if there are packaging wizards in the room: the delta from the spec files and so forth that we have built for SUSE OBS to actually making it distro-native is probably small enough that, for somebody well versed in this, it's probably pretty easy to get going. So if you're curious and would like to get involved, that would be a terrific way to start. Then there are a couple of other platforms where people who are not us, third-party external community contributors, have built packages, and we're hugely grateful: this is Homebrew for the Mac, and then FreeBSD and OpenBSD in their ports and packages. On FreeBSD, maybe a couple of words: it used to be the go-to solution for running Zeek, and then Linux basically kept getting better at packet capture and so forth, so I think the defaults have switched. OpenBSD we personally have the least experience with, and we are therefore also most eager to hear feedback in case people are around who use it on that platform, because it's obviously great from a security perspective. And then finally, you can install it from source, and if you think "oh dear God": it's not actually that daunting. The only thing that's daunting is checking out the code, because at this point Zeek is pretty complex; there are lots of submodules and stuff, it's very git-oriented. But basically you can just grab a tag in there and build it, and the build requirements are actually very straightforward, pretty much what you would expect: basically a compiler toolchain, Python, and you're sort of there. Maybe a cool thing to mention is that if you're building from source and you are at 
home on Windows, which is historically probably not the go-to for somebody doing network monitoring, that is starting to become feasible, because we had a really cool contribution from Microsoft, who started to use Zeek internally in their products, specifically Microsoft Defender, and were willing to open source the stuff they did there. There was this mega PR that especially my colleague Tim worked on, very closely with the Microsoft folks, and that is now in the 5.2 release. Just to throw that out there real quick: that's emerging, and if you're in that space we would love your feedback, but it's not at all at the same level yet as running it on Linux or another popular Unix platform. All right, more details in our documentation down there. So that basically concludes all the technical stuff. We never talked about Q&A, but presumably toward the end is better. Yeah, okay, now would be a good time. I think I have five, ten minutes maybe still. Is that okay? Yeah, okay, cool, that's perfect. So, the Zeek project. There's one thing in particular to start with that you need to know. If you go into a checkout of the sources and you look at the tail end of a file called CHANGES, you see this, and if you look a little closer, you see this, namely: 1997. So I'm not here to tell you about anything new. This thing is old, and there's a good reason for that. Well, where to start? There is a mailing list that is called Bro, for reasons I'm about to explain, that goes back all the way to 1998, almost the beginning of this project, and it is a ton of fun to dig in there, in the early days, and look at who showed up and what was discussed and so forth. And the reason I'm mentioning this here is: why is it called Bro? 
Well, because the thing used to be called Bro. In fact, it was just last year that the creator of the system, who I obviously have to mention here, which is Vern Paxson, was awarded a Test of Time Award, and if you know academia a little bit, then those things are a big deal. It's just pretty funny that it was 2022 for a paper (it's very tiny down there) that was written in 1998, right? At that time the system was called Bro, and you can also see the security perspective in there, because right in the title it was about detecting intruders, right? The reason we renamed the project, which was in 2018 I believe, was that in the modern era "bro" is a very loaded term in this community, and we felt that was no longer sustainable, so we changed it. I have overheard marketing conversations like you wouldn't believe: "yeah, so we're commercializing this thing called Bro; no, really, it's called Bro", sort of like that. It's hilarious, so we had to change it. That has now happened, but you will find that some of the Zeek packages still have Bro in them, and if you dig deep enough in our code, Bro is still mentioned in there and so forth. So that is why, okay? 
Another thing this conveys: Vern, back in the day, did a bunch of cool stuff. If you don't know Vern, look him up sometime; he was on the early TCP congestion work, he did SCTP, lots and lots of interesting stuff. What I'm trying to convey here is that all of the early work on Zeek, for a definition of "early" that spans almost 20 years, happened in an academic context, so basically research and research papers. For this slide I'm greatly indebted to Robin again, who put it together, and I really like it because it conveys how much happened with the main focus being academia, and it tail-ends there, to the right of the slide, with Corelight. The other thing this conveys, which I will cover in a second, is that for the first almost 20 years of the project, the way the project was funded was academic funding, the National Science Foundation and so forth. This is a total detail here, but the blue boxes on this slide basically convey funding that was mainly to do hacking (I'm just gonna call it that), basically tech transfer, to put this thing out there, and the others were essentially research: how do you build a system that does X, Y and Z? The reason this matters is that it explains a little bit about our project philosophy, especially the first point there, because in academia you don't get very far if you say "well, what's the off-the-shelf thing we can use to solve problem X?" You instead ask "well, in an ideal world, what's the best possible thing we can build for that?" We have tried to do this, and Spicy is a really good example: this is just how we feel parser technology should work. We could have also gone out there and taken Rust, which is a completely reasonable solution in my opinion, but it is more interesting to build something that really, really, really nails it. The problem with this is that we've almost done it to a fault, because 
over the years, and now you can say decades, we've built a lot of things that we therefore also need to maintain, and that gets harder and harder, because we're a finite team but the code keeps growing. You could say "well, keep growing the team", but that's tricky. So that's a really good one to keep in mind: we tend to build things so that they're the right thing, even if it takes us much longer. From academia we've also inherited, a little bit, the spirit of "we'll release this thing when it's ready", whatever the new thing is we're talking about. I have to caveat this, because our release manager gets grumpy if we push that too far. This is Tim Wojtulowicz, he's our release engineer, and he would very much like us to stick to three releases a year. So, Tim, whenever you see this: I've made your point. The other two are pretty reflective, I think, of any open source project. Listen to the community, because they often have feature requirements that are really, really interesting and that we didn't think of. You might find it interesting to know that actually none of the core developers of Zeek are really Zeek users. That's a little strongly put, but most of us don't have a network where Zeek is a security-critical application that we monitor 24 hours a day. We basically just build this; we have a lot of context, we talk to people in that space a lot, but we have to listen to the community, because otherwise we might just build the wrong thing. And therefore plans change every now and then: sometimes we have embarked on a project and then thought "oh no, this is actually the wrong thing, let's back up a little and build it differently". All right, so I thought that should be out there. Project structure, and now I'm getting into details, a couple of minutes perhaps. I would like you to know that there is a lot of thinking behind the scenes about how this project should operate. We have 
a leadership team that guides the project, we have technical leadership, we have a greater project team, and we have various subgroups that focus on specific aspects. Just to mention the names of the people involved: if you think of the Zeek team, it's essentially these people. Most of them are very much involved; others were particularly involved in the recent past and still contribute as they can. If you spread it out across the roles a little bit, it looks like this, and I'm mentioning this mainly because on the top right there you see the core folks doing development work. That includes me, and right now I technically hold the technical seat on the leadership team, which is why you saw that on the first slide, but that will rotate over the years; before me, Robin Sommer held that role for many years. Vern holds our founder seat, because he created the whole system and we wouldn't be here without him. There's an election process for the leadership team every year, so if people are interested, you're all very welcome to apply and get elected and so forth. And a few words about the Zeek community. We have, I think, a pretty interesting structure in that it is split across very different roles. We obviously have security teams that use it to understand what's going on in the network, but we also have network operators, people who don't necessarily have the security context and who really know their networks, but who, we often find, aren't really developers. It is easier to get a security team to go "okay, I'm gonna write a package now" than it would be for some network operators. Of course it's blurry, right, this is not a clean split, but it's a split we've noticed. Then the newest is perhaps on the bottom right there, what I call observability engineers. I don't really know if that's a thing yet, but my hunch is that it soon will be, and these are basically people who sort 
of try to understand what is going on in their infrastructure. The key bit there is infrastructure, as opposed to network, because we're increasingly thinking of Zeek not so much as a network monitor but as an infrastructure monitor, since it's broadening to host-based stuff: you can get events from your hosts, and not just out of the network, and that makes for a much more complete story about what's going on in your infrastructure. So, that was basically it. You are all very welcome to get involved. I have a long list here of ways you can get involved, but I will not read through it; there are three key bits in there. Tell us your use cases: if you're using this, or you're interested in using it, and you find it does something well, or not well, or not at all, just tell us; it would be really, really good to know. If you want to contribute, cool; we have external contributors all the time. It tends to work best (this is another open source truism) if, when you want to build something big, you get in touch first. We can help you, we can provide a little bit of context; maybe there's an older pull request somewhere that you could base your work on, and so forth. And if you're not a developer, not a problem at all: you don't have to write C++ code, you don't have to write Zeek script. We need a bunch of other folks to help out with testing, documentation, training (a big one right now), all that kind of stuff. That's it, that's all I had. Here are some links: you can find us on Slack, you can find us on Discourse, we'd love to see you there. So thank you very much. Are we doing Q&A? 
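The quick-start paths covered earlier in the talk (Docker images, running against a pcap) can be sketched as a few shell commands. This is a hedged recap, not official install instructions: it assumes the `zeek/zeek` and `zeek/zeek-dev` images on Docker Hub as the speaker described them, and exact tag names may differ, so check Docker Hub before relying on these.

```shell
# Quick tinkering via Docker; tags exist for specific releases, LTS, and latest
docker pull zeek/zeek:latest        # latest release
docker pull zeek/zeek-dev:latest    # nightly, the latest and greatest

# Run Zeek offline against a pcap mounted from the host;
# logs (conn.log, dns.log, http.log, ...) land in the working directory
docker run --rm -v "$PWD:/data" -w /data zeek/zeek:latest zeek -r trace.pcap

# Pull a few columns out of conn.log (inside the container,
# or with a native install that provides zeek-cut)
zeek-cut id.orig_h id.resp_h service < conn.log
```

As the speaker notes, everything up to live monitoring, including the package manager and custom scripts, can be exercised this way against pcaps alone, with no cluster involved.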
Or I'm happy to answer anything; I'm also around for hours still. So, yeah, hi, to the left. [Audience question about "shifting left".] Yeah, so it's essentially just an idea, so picture this: if you engage in threat hunting, it's a data analysis task; you dig into tables and so forth to find certain subsets of entries that you then build more analysis on. The idea was that, since you can codify what happens right in the network, as you're watching packets, you don't have to export a lot of data into your data ingestion solution that mostly will not be valuable, because you will filter it down anyway to something more actionable that you then build further analysis on. You can just write that in Zeek and build your own log for it, for example, or filter the existing logs in a way that gets rid of stuff you don't care about. That's what I meant by shifting left: if you picture the pipeline, where you start from packets, then you have Zeek, then you have logs, then you have data ingestion, and then you do your analysis, you can basically do the meaningful stuff further to the left. Yeah, go. [Audience question about home networks.] So, is "home lab" a thing, or do you mean your home? Okay, cool, yeah, this is great; that's a really good point. This is a great topic, because we would actually very much like to be better at telling you things about your home network. The problem usually is that for most people it is actually hard to tap their home network, so you basically have to go out of your way a little bit to do something. We've all dabbled in this; I don't think we have the one go-to. You can go all the way from having your own switch in there, to, for really small networks, OpenWrt, which is a contender, although I think you have to tinker a little bit to get a build in there; that'd be a good one for us to deploy out of the box. Don, you had a... if you go to Black 
Hills Information Security, they actually did a small version of Zeek off of a Raspberry Pi, yeah, and they tied it in with a very small hub, something you can actually buy through one of the vendors like Newegg or something like that, and then you can do that in your home lab. There are certain hubs you can buy that have a broadcast port that will take everything in there. Yeah, but do you go through their archives? They have a, I forgot what they called it, a Pi something, but it's a Zeek Pi; look for "Zeek Pi" at Black Hills. And even before that, though, before you start tapping and stuff: if you have any way to get a pcap, or download some off the internet, literally, to get started and get a flavor for the whole thing, that's the way to go. Just start with that, right, and then you understand what the data are. You can try everything, you can use the package manager, you can change functionality, you can do it all based on pcaps. I didn't mean to suggest that you have to run a cluster; it's just that if you want to do live monitoring, then you will probably end up there, I think. Yeah, mm-hmm. Right, so the question was: any suggestions for digging into particular kinds of malware, for example C2 families? Right, I see, okay. That's a great question. Some of the packages out there will detect certain kinds of C2, but the more interesting path is probably, if you can think of a particular one, to then map it to how you would detect it in Zeek, and I don't have particularly strong recommendations there. We could go for a beer and there would be lots of fun stories: back in the 2010s we dug way into botnets, as part of research work still, and that was all done with Zeek, but that was because we had the time to dig into how botnets were speaking command and control. There isn't a one-size-fits-all 
solution for sure; it depends on how these guys build their stuff. But what I can tell you is that if it's some combination of, you know, something DNS-based, some connection pattern, you definitely can detect it. There's very little you can't detect; it's another sort of mantra or truism that there's a fingerprint on the network, and you can almost always get it. And it's a great one to discuss further, actually, because it's usually also the reason why whether it's encrypted or not does not matter. This often comes up when people are new to this: isn't everything encrypted these days, why are you bothering? It turns out that for that kind of stuff you can still work just fine. These are basically different topics, and the concern about encryption has been out there for almost as long as Zeek has existed, and it's just fine. Sure, would we love perfect visibility into HTTP today? Sure. And I'm actually very sympathetic to any effort that tries to conceal activity for privacy reasons and so forth, I think that's all great stuff, but for the kinds of analysis you want to do in Zeek, you usually can do them, because despite encryption being present, you still have enough structure to pick up on that you can build C2 detectors and so forth. Yeah, okay, cool: get on Slack, ping me, I can give you my card, it's all easy to discuss. Yeah, cool. Yes, right on. [Audience question about DHCP.] Oh, this is a great one. So I would need to know a little bit more about this one. There is a DHCP analyzer, and we've iterated on that one a lot, because so much gets conveyed by DHCP these days, right? It's really fun; one of our main early contributors, and still a key developer, Seth, totally went off on DHCP for a year or so, because there was just so much in there. I think it depends on the exploit, is the short answer, to what extent you will be able to see it 
in there. If it's literally as easy to spot as an unusual field length, totally. I was chuckling when you mentioned this, because we use a bunch of fuzzing to test our own parsers, and the stuff they throw at you, you wouldn't believe, right? Like, surely you will have an email address that is 10k long, or stuff like that. And then you have all of these discussions: well, what do we do with that, where do we draw the line, do we now just not parse it anymore? And then you do experiments; we checked the other day what's the longest email address you can send into Gmail, and it is long, you know. It's all pretty funny. Some of those things have no very good answer, and it's this old "be liberal in what you accept": we try to parse as much as we can, but sometimes you have to draw a line. [Audience comment about full packet capture.] Yes, yes, this is true, and I would never deny the value of a pcap for that; there's a reason why people want it. It's just that, in our experience, bulk pcap capture virtually never pays off; maybe for a very short time window, but with modern networks even that is hard. On the flip side, a lot of the signature-based intrusion detection systems really like that idea: when you have a signature fire on something, give me, for that flow, that packet and maybe a couple before. And then it usually gets fun, because implementing that is challenging, but it's a nice idea; it makes sense to me. As an analyst I'd also want to know: okay, this weird byte pattern matched, what was actually in the packet? Because maybe it's off a little bit. So that's tricky, yeah, that's right. Maybe this is a place where AI... Testing, testing, can someone in the back... oh no, clearly you can hear me in the room, so I think we're probably okay. Hello, welcome to 
room 101. This room is sponsored by Spyderbat. Spyderbat's platform for cloud-native runtime security shows you exactly what is happening and secures your Linux VMs and Kubernetes clusters throughout your software development life cycle; visit Spyderbat's booth to see it in action. All right, so those of you in the back should hear me all right, excellent. I see people filtering in, so I'm going to give it another couple of seconds. My name, as it says on the slide here, is Greg Brosser. I'm a security engineer at Meta, and I'm on the red team, specifically on the operations group, which is the part where we do end-to-end exercises to test our defensive teams. I'll get into it as I go through the slides, but I'm here today to share the perspectives of an outsider to red teaming. I was recently on what you might call a defensive team, or a blue team (I'll explain a lot of these terms as I go), and I recently switched into red teaming. Having been on both sides of that fence, I'm hoping I can share some lessons, some thoughts, and some things I've reflected on as I've gone through this, that you can use to make sure that your two teams, if you do have two teams, or if you're dealing with the opposite side, can still stay friends. You get the best benefit when your security engineers are working together, not always against each other, and I say that even though I work against most of my security engineer friends. So anyways, what are we going to talk about today? This is briefly what I just said: we're going to talk about what a red team is, the exercises I'm talking about, the objectives, and what we do. I'm going to talk about who I am and who I was, because that's relevant to the conversation. Spoiler alert: I spent many years at Meta being an SRE, or DevOps person, so for those of you in the room without security experience, there are some lessons that you might pick up 
from my experience from that point of view. I'm going to talk about how we do our red teams, and the things that a lot of people probably don't stop and think about when they see the end result of a security exercise. Certainly, if I look back on the conversations I'll show you as I go through my notes here, I didn't think about it, and so I think it's a common perception to just assume that everything is great and rosy when you see the end of the exercise and don't see all the stuff that goes into it. I'm here to talk about some of the lessons I've picked up from that. That goes into the perception part, where I reflect on what prior me thought after seeing the end results of these red team exercises: how can you use the perception that might appear to help keep people friendly at the end of the day? I'm going to tell a story about my first operation on the red team; it was about a year and a half ago now, and it was quite an experience. Coming from an SRE background, I thought these red team people were just hacking things all the time, and it was an interesting awakening. Then I'll go back, recap some of the stuff, wrap up, and talk about what you should probably be doing to take these lessons back to your teams, whether you have a red team, offensive people, or you're dealing with contractors, whatever your situation is. So, for those of you in the room who aren't intimately familiar with what a red team is (those of you who are can probably tune this part out): what is a red team? In security, we often think of the people who do security work inside a company. You're probably familiar with your blue team: they're your incident responders, your people securing your servers, doing that sort of work to keep things safe; they're responding to alerts, making sure things are configured securely. Then we have this 
concept of adversarial testing, using vulnerabilities, penetration tests, anything like that; that's typically what we consider the red team. They're the offensive team; they're coming at you instead of trying to defend your organization. If you don't have an internal red team, or haven't dealt with one before, you're probably familiar with a lot of the sub-concepts here, such as penetration testing: you might have retained a penetration testing firm, or seen a report about the vulnerabilities found in your software; red teaming is quite similar to that. You might be familiar with vulnerability research, where people, especially in the open source community, research a certain application or a certain technology, figure out its weaknesses, file issues, get them fixed, anything like that. When we talk about a red team, and the day-to-day stuff that I do in the operations group, it's a good mix of all of those things. In our team, it's what we like to think of as a test of the full security stack. What that means is we're doing (bless you) penetration testing aspects, breaking into systems, obviously; we're doing a good amount of vulnerability research, trying to figure out the vulnerabilities of the systems we're trying to get into; and we're rolling these all together, leveraging those particular attacks and techniques to get unexpected access to something. And then the key part of what makes us a red team operation, and I'll get into this in a bit, is that we're usually escalating this to the point of triggering a response. For those of you with some background in the room who might have been an SRE, or who have DevOps people, people who run operations and manage servers: when they do failover testing, if they have two databases, an active and a passive, they're going to switch between them to make sure everything works 
properly, and they want to do it in a controlled scenario. They want to find out that their runbook, or their documentation, or the configuration doesn't work in a situation where they can recover quickly, not in the middle of the night at 2 a.m. when the pager goes off, right? The same concepts apply to your security organization. Your security organization has a list of runbooks, a list of documentation for where to find the key logs you need in an incident, or how to disable an account, for example, and you want to make sure those documents are up to date at the point in time when they're needed. You don't want your responders at 2 a.m., when their pager goes off, going "man, how do I stop this attack that's actively going on?" That's what I'm talking about when we get to the response situation here. So why do we test response? I just talked about that: we want to make sure those playbooks are up to date, and everything else. Essentially, you're giving practice to your people who do incident response. In an ideal scenario (and I realize that, saying this, most people are probably not in that ideal scenario) your incident responders are not responding to big incidents all of the time, and so in the situations where they do, you still want them to be sharp and ready. You can think of what we do in the red team, in an ideal situation, kind of like training a prizefighter. Your prizefighter in this case is your defensive team, sorry, responding to incidents, fending off people, configuring your servers, right? If you have a red team, or adversarial exercises, offensive security things, what you're trying to do is run those training exercises. If you come in too light, your incident responders are going to be bored; they're not going to learn anything, they're not going to see the attacks of the outside world. If you come in too strong, if you're 
finding a whole bunch of zero-day vulnerabilities and throwing them all at your defensive team, that's not fair either, and you're going to tire them out. They're again not going to learn anything, they're not going to want to collaborate with you, and they're still going to be unprepared for anything that happens in the real world, because the vast majority of what they're going to deal with is not necessarily going to be that level of attack. So in an ideal situation, in this sort of training exercise, you want to come in the middle. You want to have them prepped, you want something realistic: you want to look like an attacker that would actually do something to your network, and run an attack that is relevant to you, as opposed to one that isn't. If you don't use email on a day-to-day basis, running a phishing test is not actually going to prove anything in that particular situation. The interesting part about that, and this is what's led me to give this talk, is that that sort of attitude can very easily lead to an us-versus-them mentality, and this is certainly true in the slides I'll come to and in what I've experienced in the past. A lot of times, if you are not directly involved in the response, maybe you're just an SRE helping your security teams, for example configuring something, and you're on the sidelines, not responding to an actual incident, you see the end result: you see someone broke into your system and did X, you see someone walking away with a bag of money, and it looks super exciting, but you don't end up knowing the prep time that went into it. So you end up thinking the people who just did that won and you lost, and that's not really a helpful scenario, and it's certainly not what we practice in our red team. We want it to be a fair fight, and we want people to walk away and learn things from it, right? So you don't want that us-versus-them mentality; you're there to help strengthen that relationship and 
make it better. You don't want to be viewed as the enemy, because then they won't want to work with you. How many conversations do I start on a daily basis where people look at my title and go, "wait, is this the red team? I don't want to talk to you, I don't want to tell you about the systems I'm working on, you can't see my documentation"? That's not really fair. You want everyone to work together; you don't want it to be as adversarial as it seems to be.

So I'm about to talk to you from the perspective of the red team. But what if you have none? Don't feel bad; this isn't uncommon, obviously. We have a pretty prepared blue team and a response team that can respond to a bunch of different incidents, and has to all the time. You have to be an organization of a certain size, with a mature enough security organization, for these exercises to make any sense. If you're a smaller organization, you're probably retaining a pen testing firm, doing one-off targeted reports or something of that nature, just because your response team is not as practiced; this is not something they have to think about. If you're in that situation, there are still a lot of lessons you can probably take away from the stories I'm about to tell, because you're still getting results, you're still getting people breaking into networks or something like that.

The main difference between what we'd all consider a red team exercise and some of the other stuff I've talked about, like a penetration test, is that there's a good amount of an element of surprise in a red team operation, and I talked about the response part; that's key. They don't know we're coming. If you retain a pen testing firm and say "can you please look at this application," the engineers who manage that application and the security team who secures it know what's about to happen. They know that if they see something in the logs, oh, that's probably the pen testers
going about their work. In our particular situation, when we're trying to manifest a response, they don't know; they're in the dark, and we're catching them by surprise, and that's key. So that's one of the main differences, but the lessons I'm about to talk about don't necessarily need to rely on that element of surprise. (I keep trying to remind myself to slow down and breathe, and it's not going very well.)

Okay. So at Meta we have these internal profiles, and you can look up any other employee in the company; people are probably familiar with something like this. This is a picture of my internal profile, probably from around 2017. When I joined the company I was in a sysadmin, DevOps, SRE sort of situation (we call them production engineers), and I had been doing that for about 10 years. I'd had an amateur security hat on at the previous companies I worked at; I was doing a little bit of security stuff. When I was recruited and joined Meta, I was recruited as a production engineer, and as I said, they do our SRE and DevOps. The production engineering team is a large organization, but it has individual pods or teams that focus on particular things: you might have a production engineering team whose job it is to maintain our cache servers, or a production engineering team that just keeps the website up, things like that.

I joined the company, and I joined the PE Security team. I'd had the amateur security hat on for a while, and I figured, let's give it a shot while I'm here; there's a bunch of stuff I can pick up and learn. One of our first projects, as an example, was maintaining SSH for hundreds of thousands of servers and tens of thousands of engineers. Sounds kind of trivial, but when you get to that level of scale, you do cool things. One of the projects I worked on, and it's still used to this day, actually, is that we used
SSH certificates rather than keys, because if you have to check 10,000 keys at sign-in, it's going to take quite a long time. It's pretty sweet; I'd never thought about it before, and it was a fun project with a lot of stuff to learn. We maintained it from that SRE point of view, but obviously this is a core security system across the fleet. You're touching SSH on all of our servers, right? This is important stuff.

The reason I mention that... well, actually, it's important to note that the PE Security team, over a couple of years, eventually became responsible for what we call the security posture of all the servers we call production. Every server you're talking to when you're loading Instagram or Facebook, or using your Oculus, or anything of that nature, you're talking to something in the purview of PE Security. It's a bigger team now than when I joined, and it does a whole bunch more stuff. The reason I mention it is that it gives some context for what I was doing in the past. I was doing SRE and DevOps stuff, pretty cool, but if you think about it, I was essentially part of that blue team. And yet I wasn't responding to incidents, I wasn't looking at logs or triaging anything of that nature, I wasn't in the heat of it when the pager goes off in the middle of the night during a security incident, unless it involved the SSH configuration I was rolling out or something like that. So I was part of the blue team, but behind the scenes: helping out, not necessarily part of them, but not part of the red team either. "On the good side," I used to call it, before I switched sides. Not necessarily blue team, but related enough to be involved and plugged into the things going on, more in the sense that when something happened, I'd probably get some of the findings or lessons learned from that incident applied to the stuff we were running.

Eventually, in the fullness of time... this is me
now. This is, well, 2021, when I started these slides: I joined the offensive security group. I'm now a security engineer, and I put "offensive" in brackets on this slide because I think that's an interesting perspective for me. I joined a team of red teamers who had been doing this stuff for many years, and me, coming from a DevOps background, I wasn't an expert on this. I didn't really know what I was doing from an offensive point of view. I'd followed along for a while and knew the general concepts, but I actually considered myself a rank amateur in terms of the level of stuff that happened on this team, and this is relevant for the rest of the slides.

I became a member of the red team operations group. The offensive group I mentioned earlier is the group which holds all of those offensive security concepts: penetration exercises, vulnerability research, all those sorts of things. We do all of those, all within our group. My particular team, the operations group, is where we do the end-to-end operations I talked about earlier, where we use the research, or the penetration test techniques, or whatever we need, to get access and run an end-to-end operation. We'll pick an objective, a goal, a system to target, and we'll do whatever we need to get to that point. And again, remember: I've been surrounded by people who have been doing this for years, so I have a lot of helpers who keep me on track on the offensive security side.

The interesting part that I didn't know when I joined the team (as I mentioned, I was around security enough to see these things happen, but not directly involved) is that I didn't quite understand exactly how much research and preparation goes into a red team, and this is kind of the crux of why I'm giving this presentation. I would argue that if you have a team
that does a similar thing, or if you've read red team reports or vulnerability research or whatever, you're probably just seeing the end result. You're seeing the vulnerabilities they found, their findings, the evidence that it happened; you're seeing the giant vulnerability websites that pop up, and all the response that happens as a result. But you don't know how much research went into making that happen.

So, when we start a red team exercise... a little bit of background here: we're an internal red team, and what that means is we have access to the source code, the wikis, and even the engineers, if they'll talk to us (as I mentioned, it's sometimes a little adversarial). So we can do a lot of research coming in, and know where we're going to land and what we're up against. What I mean by this: if you're from an SRE background, maybe you have a particular service that does a particular job, and maybe it runs out of memory at 11 p.m. every night and crashes, and your band-aid solution is, cool, I'm going to restart it at 10 p.m.
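A band-aid like that usually leaves an artifact behind, often a scheduled job, which is exactly the kind of thing this recon phase turns up. A minimal sketch of what that looks like; the crontab text and the service name here are hypothetical examples, not anything from the talk:

```python
# Minimal sketch: spotting "restart band-aid" cron entries during recon.
# The crontab text and the service name are hypothetical examples.
crontab_text = """\
# m h dom mon dow  command
0 22 * * * systemctl restart leaky-service  # band-aid: OOMs nightly at ~11pm
30 2 * * 0 /usr/local/bin/rotate-logs.sh
"""

def find_restart_bandaids(crontab: str) -> list[str]:
    """Return cron entries that look like scheduled service restarts."""
    hits = []
    for line in crontab.splitlines():
        entry = line.strip()
        # skip blanks and comments; keep anything that restarts a service
        if entry and not entry.startswith("#") and "restart" in entry:
            hits.append(entry)
    return hits

print(find_restart_bandaids(crontab_text))
```

The point is not the code itself but the asymmetry it illustrates: a one-line operational hack is trivially discoverable by anyone who can read the host's configuration.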
We're going to find that, and evidence of it happening in the system; we're going to know that happened. So we know the rough systems you're using, we know there are weaknesses, we might find evidence of that bug in a task or a wiki page or something of that nature, and we're probably going to find out the mitigations you've put in place. So if you're gluing various systems together and there's a security boundary there, we probably know that that's a common weakness people are going to go up against, and we know the mitigations you've put in place to make it not an obvious vulnerability, and we know all that through a good amount of research.

The interesting part is that once we find those mitigations, we're also going to find your detections. We're going to find the places you're watching to see if something is suspicious. If that system does crash, we probably know who's going to get paged, who's going to get the alert; maybe we've found their playbook that tells them what they're going to do to respond. And that helps us plan our exercise so that we're in control of what happens. If we're coming in with a vulnerability in your service and it crashes, who's going to get that alert, and what are they going to do? This is relevant because if we're testing a particular security thing, we don't necessarily want to trigger a deep-dive response at every point. So we're going to structure it in a certain way: we'll touch this system, whereas we know that if this other one ever crashes, that's bad, so we'll save that one for last, because that's going to trigger the response.

This is the level of stuff we know at the end of that research. Again, going back to what I said earlier, a lot of people don't see this. They see the end result, the big-bang thing; they don't know the steps that led up to it, and that's an interesting thing that can lead to that us-versus-them mentality. I mentioned we're an internal red team, which
means I have access to all these things: the source code, the wikis, the systems, whatever. If you're not doing this, if you're a penetration tester or something like that, you probably don't have access to a lot of this anyway; you're just doing more guesswork. We still have to do guesswork too, because everyone knows that writing documentation is not everyone's favorite job in the world; it's probably out of date the moment you put it down, and the same thing applies when we're trying to figure out how a system works. We're not going to get the newest information either. But your penetration testers, your other security people, are doing what they can to learn from OSINT, open source intelligence. They're looking at your systems and seeing what DNS names exist, common ones; they're seeing where your infrastructure sits, your AWS or GCP. They can do all that even though they might not have access to the same things we do. But we are the internal team, and we can see those things.

The interesting part is that after we've gone through this, after we know the weaknesses, the mitigations, all the detections, we know who's going to get paged, we know your security systems, we end up looking like this: we can dangle that operative into the middle of the vault where all the money sits, and we know that if I put my hand up over here I'm going to trigger this laser, and my hand down over here I'm going to trigger this one, and ideally I know who is going to respond in the event that I do those things. We have a plan. We have imperfect knowledge as well; one of the quotes my manager commonly uses is "everyone has a plan until they get punched in the face," going back to that boxing analogy. So we think we know, according to the documentation that existed when we found it, what will happen when we trip that laser, but we don't really know. We have imperfect knowledge. But we think we look like this, and this is going to be awesome
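That external OSINT step, checking which common DNS names actually exist, can be sketched in a few lines. This is an illustrative sketch only; the domain and wordlist are hypothetical examples, and real enumeration tooling does far more (zone data, certificate transparency logs, and so on):

```python
# Illustrative sketch of OSINT-style subdomain probing via public DNS.
# The domain and the wordlist are hypothetical examples.
import socket

def probe_subdomains(domain: str, words: list[str]) -> list[str]:
    """Return candidate hostnames under `domain` that resolve in DNS."""
    found = []
    for word in words:
        host = f"{word}.{domain}"
        try:
            socket.gethostbyname(host)  # raises socket.gaierror if no record
            found.append(host)
        except socket.gaierror:
            pass
    return found

# e.g. probe_subdomains("example.com", ["www", "vpn", "git", "jenkins"])
```

Nothing here requires insider access, which is the speaker's point: an external tester can map a surprising amount of your infrastructure from the outside.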
right? This is what we see at the end.

So, a little bit of context for the stories I'm about to tell. We have an internal group called Red Team Operation Debriefs. It's as boring as it sounds: the red team reports we write and the findings we generate end up going into this group for people on security teams to consume and understand what's going on. This group also serves a second purpose, though. I talked about driving response, the things we do to make sure the security team is actually responding to us and generating an incident, and I talked about how we do that by surprise. So what happens is, just like a person responding to a pager for a system, they get an alert, they go "that's suspicious," but they don't know it's the red team; they don't know what's going on. So this group also exists as a pressure relief valve if things get too serious. If our responders are lost in the woods and don't quite know what's happening, or maybe they're doing too good a job and they're about to call the authorities to actually have us arrested (we don't like getting arrested, obviously; it's not really productive), the way we announce this to the other team is we drop an image into this group to let them know that all of this was us, so they can calm down and relax and we can deal with the rest of the red teaming process.

As the blue-team-but-not-responder I talked about earlier, the one who helped maintain the production servers, I was a member of that group, and I knew roughly when these things were going on. So the next slides come from the context of: oh my goodness, said image just dropped, there's a red team going on, this is exciting!

So this is a screenshot from 2019. I was firmly in my production engineering days, and an image, just like I talked about, had
dropped, and I knew enough of the red teamers as friends to get the crux of what was going on. They wouldn't give me any super secret details, but they would let me know roughly what was happening. So I knew they were doing something, and I saw the image, and I messaged them. I said, well, you know, your op is done. And they said, yeah, well, you know. And I asked, did you get detected? And they sent back a face. And I said, if they haven't caught you yet, that's really hilarious, nice. And they said, no, it's been weeks.

Now I want to talk about the different perspectives, and there's a reason I've highlighted the red and blue in the slides here: the different things you can take away, and how I reconsider this now that I'm on the other side of the fence. 2019 me was going: oh man, the security team didn't see them in our environment for weeks? How is that possibly the case? What's wrong with our security organization that no one noticed this was going on at any given time? The instinct I had was to say, well, there must be a defect somewhere, there must be something wrong for that to have happened. The interesting part, now that I've joined the red team, is that given the research I talked about earlier, when I showed you the picture of us dangling into a vault knowing all the right things, sitting there for weeks without having tripped the alarms yet is roughly to be expected. The reason I call this out is that the red teamer who was talking to me at the time probably had that perspective, and didn't think to say, oh, it's not as bad as it looks, we knew all this stuff. And me, I didn't have any idea that I should ask that sort of thing. I'd only seen the end result; I knew they weren't caught for weeks. So it's interesting to consider the different perspectives from both sides of that conversation, both
equally correct in their own minds, ending up in the wrong spot. The truth is somewhere in the middle. Maybe they should have had detections in all the right places, and they would have seen them, they would have caught them; but then maybe the system runs horribly inefficiently, so what's the point? There's a balance to be struck there, and we knew exactly where to step, or not to step, to avoid being caught. This is something most people don't see unless they stop and think about it, or have been on the other side, so to speak. Both of us walked away from that conversation with entirely different perspectives.

So this is another one, from 2020, actually with the same person. Another image dropped, and I was super excited: another red team happened, I can find out what's going on! What happened this time? I asked, and I get the response. He says: between us, it was fake Flash that dropped VS Code malware. And I look at that and I go, oh my goodness, it's the year 2020; how on earth does fake Flash still trick people into running stuff in this organization? How is that an entry vector that still works?

The part I didn't know, well, there were several parts I didn't know. First of all, that whole fake Flash thing was the smallest, tiniest bit of that operation, the very beginning; it didn't actually matter very much. Those of you who are on a security team probably know that even in a really good organization with super talented engineers, you're still going to have that one percent of people who might be tricked, or might not be operating at their best that day, who are going to engage with that, and they're going to get that sort of infection. So even though this is uncommon, this is rare, this should be harder, it's still going to happen. So it's not unrealistic, and talking about the things you want to practice, this is one of them. The other thing I didn't
know is that we have a technique in red teaming called assisted execution. There are a lot of things that would take a decent amount of time to come to fruition if we didn't do this. What I mean by that is: if you're doing a phishing attack and you're depending on that one percent to click on your lure, maybe you don't know which one percent it is, or whether it's actually going to be relevant to the exercise you're trying to run. So you end up cheating a tiny bit: you find someone who will receive that phishing email and engage with it, just to have a plausible story to go through. That applied in this case. The person who installed that fake Flash didn't do anything wrong; they were asked by the red teamers to do it, to kick off the operation. So when I say this was the most minor detail of the operation, it was super minor.

The other thing I didn't know, and have learned since joining the team, is that the second part of that sentence is actually the key thing. We dropped a VS Code malware thing, where we landed on a machine and hid in the VS Code process. The amount of research involved in setting that up, the whole technique, all the exploitation work: that took the team weeks if not months to perfect, and to have it work reliably enough to be used in that exercise.

So here I am again, talking about the different perspectives. The red teamer has told me what's going on, and I'm focusing on the fake Flash bit, and they're probably agreeing with me: at the end of the day, why is this still a thing we have to deal with in this company? But I don't realize that the VS Code malware is actually the key part, and they don't think to correct me, because they've spent weeks planning and this is just part of the operation they've gone through. And so we both, again, walk away with
entirely different perspectives. I walk away thinking, man, our organization must be so interesting that we still fall victim to these things, when in reality this had been exquisitely crafted specifically to happen this way, and we would have found another person to engage with it if we hadn't. I take it for granted that it all worked 100%. I see the end result; this happens after the image has dropped, so I know something has happened, and it all worked. But it took tons of iteration to get to that point.

This is a quote from JFK, I believe; he was talking about the CIA. What he was talking about isn't super relevant in this case, but it applies equally to your security teams: "your successes are unheralded, your failures are trumpeted." No one talks about the 10,000 hackers your firewall actually kept out, or the attacks that didn't happen because you have defense X in place; we care about the one thing that does actually get through. It's a common saying that defenders think in lists and attackers think in graphs: attackers have to find the one chink in your armor to get in the front door, and then they can move around through different things, while your defenders have to secure all the things, all at once, all the time. It's a hard problem to solve. And this is especially true here: I was not directly involved in the incident response, I'm following along at the end, jumping on the bandwagon of the exciting red team, and I'm kind of trumpeting the failures, celebrating the red team that happened to succeed. That's not really helpful behavior. You should talk about everything it took to get there, but most people don't end up seeing it. So when I saw the image drop in that group, and I saw that an event had happened, and I talked to the red teamer, I don't think like this. I just see this. I see the end
of the operation: I see that red teamer running out with a bag of money, finding the password file, the encrypted database, whatever it's going to be. I don't see the amount of planning that got them to the point where they could do something like this, or even think about it. I only see the end result. (And I see someone taking a picture, but I don't know what slide they're taking a picture of. Okay, cool.)

All right. Before I switched to the team, that's exactly what I saw: these badass hackers, you know, taking all the monies and doing all the things. I'll go join them! Little did I know that I was about to join, at the end of the day, a research factory. So I've talked about the amount of research and preparation we have to do, and I want to give some clear examples of things that most people would not think about when they think about exercises like this, things that go unnoticed and that are significant time sinks now that I've crossed the fence.

What you see here is a common example of something called a kill chain. The details don't matter too much, but these are a number of things you have to do to have a successful operation. The exfiltration at the end would be that bag of money; that's the end result, the exciting thing that happened. But the first six steps of that diagram are equally important: that's the pathway to get to that end result. And me in 2019, when I see that image drop, I don't think of any of those things; I only think of the end result.

Here are some examples. I talked about the VS Code malware and landing on a machine. The interesting part about that is, not only do you have to get a piece of malware into an environment where it works and runs, you typically want something called command and control. Having it run once is useful; maybe you get one shot at grabbing something, great, but that's usually not what you want. What you want is repeated access; you want remote access
into the system, so you can do interactively what you need to: maneuver your way around, get to other systems, anything like that. And we have a number of different ways of doing this. If it went straight up over the network, that'd be pretty obvious, so maybe we use stealthy channels: we can use DNS, or we hide messages in ping, or we do something really sneaky like that. There are a ton of different ways we can do command and control; in fact, we spend an inordinate amount of time on command and control, but that's beside the point. And we have to think of these things. We're going into an environment, we're doing that preparation, we're doing that research, and we have to figure out which channel is going to work, because we're developing the payload, the malware that's going to live there, and we're not usually going to have the opportunity to fix it in post. We have to land it in there with a communication mechanism that's going to work. So when I do an operation, let's say I'm going into a particular environment, I have to figure out whether I need to use the DNS channel or the ICMP channel or TCP or whatever; what's going to work in that environment? The end result is that I find the one that works, and I do enough research to get it up and running, and everything works great. But no one talks about the iterations and the research I did to get to that point. No one says: your security team is good enough to catch the DNS one, so you have to find an even more covert one. There is success baked into that for your security team; getting attackers to the point where they have to be that covert or sneaky is an achievement. You don't get that for free.

Another example: I use the acronym EDR, and I forgot to expand it on this slide. EDR is Endpoint Detection and Response, I believe; it doesn't matter. It's the next evolution of AV that sits on most enterprise workstations. And so
when I talk about running malware, it's not exactly trivial to just run malware.exe on someone's machine. There are typically a bunch of system processes checking to make sure malicious things don't happen; there's a bunch of access-pattern checking behind the scenes, and you have to manipulate the malware you're going to use enough that it hides from all of that and still accomplishes what you're trying to do. But the key point is that no one talks about the iterations it took to get there. You ended up with a final build that did run and did succeed; you got control, great. But the 1,400 other builds that failed in some interesting way don't get seen in the end result. They may end up as a line item in your report, but they don't get called out. And again, that's something your security team is responsible for: they set it up and maintain it, and that's work that doesn't get recognized when you see the end result, at least by default. So it's easy to forget, especially as a red teamer, that outsiders don't see those iterations, and it's easy to forget even as a responder that a ton of work went into making this succeed. In some cases the responders see the end result and go, oh man, what happened? But they don't know there's been tons and tons of research to get to that point.

So I'm going to talk about the first operation I was on, and one of the first lessons I learned when I joined the red team. In my first operation, the production engineer me met the offensive security engineer me. It's January 2021, I've joined the red team, the operations team, and I'm a little bit freaked out about it. I mentioned earlier that I put the "offensive" in brackets: I'm surrounded by people who have been doing offensive security for a while, and I'm just a lowly DevOps guy, like,
what do I know about offensive security? Are they going to find me valuable enough to keep me on the team? All those thoughts are running through my head, and I end up feeling a little bit trepidatious, unsure. And again, this is the first operation as well, so you've got the first-operation jitters.

I should point out that I found out shortly after that we'd be operating in production. I skimmed over it earlier, but production refers to the servers that actually answer your requests at the company, when you're browsing Instagram, Facebook, things of that nature. More importantly, when I was a production engineer, I was operating production as well; I was securing production. So I feel a little better about that. I'm a little more excited, a little more confident that things are going to go well, because I know what I'm up against. That said, I'm now defeating the things I had spent years setting up, so there's a little bit of fear involved too. I'm going, oh man, this is actually going to be hard, because I know we've done X, Y, and Z. But at the end of the day, we're operating in a place I know, and I'm a little more comfortable. Okay: less terrified than I was at the beginning, feeling a bit better.

After staring at some stuff for a while, doing probably a couple weeks of research, I end up finding a configuration management misconfiguration. Those are a lot of big words, but essentially I found a key vulnerability that let us continue. I was doing a bunch of wiki research and reading some code, and I'm like: oh, we can do this and this and this, and we'll have access, and this will be great. So now not only am I feeling good because this is in production, I've shown a little bit of worth, and I know what I'm talking
about. I've done some research, I know how these systems work, and I've come up with something. On top of that, I find out this actually leads to root on all the servers in our environment, because it's a configuration management system; there's a lot of elevated access involved. So I'm feeling absolutely fantastic. I've gone from way back here, my first operation, a little bit worried, through all these things, to feeling on top of the world. I've joined this offensive security team, I'm doing offensive security work, and the things I'm doing are useful, interesting, impactful. I feel like I've just won the jackpot; all these things have come together. I didn't think I'd get the opportunity to do these things my first time around; I thought I'd have to pick some stuff up first, as most people do on a new job. This is great.

One of the other cool things is that I've spent years inside the company doing this sort of work. If you're a DevOps person, you're probably familiar with putting duct tape over the thing that's broken in the system, keeping it together well enough that it doesn't break for at least another couple of days, or trimming the log files to give yourself enough operating room for another couple of weeks. Well, I've spent time doing that in the security space, and more importantly, I know the really great targets across the company when it comes to security-related things, because I was on that PE Security team. I know where the key logging sources are, I know the interesting servers that do specific operations, I know where the secret stores are, all that sort of stuff. And I'm like, this is cool: I'm in production, I found a cool defect, we have high-level access. I can totally use, you know, this skeleton
over here that I hid in the closet years ago to go and find that particular thing. There are all these cool targets I can now attack, now that I'm on this side. This is great. If I think back to 2019 me, when I used to watch these red team exercises happen, this is how I used to think: they find some way in, they get root everywhere, and everything's great — they just bust up everything. And I remember having a conversation with my operation lead at the time, when we were talking about the rough plan — the one you form before you get punched in the face, as I said. My lead looks at me and goes, that's cool, you found this, that's really awesome. I'm like, yeah, we can go after this, we can hit that key server over here, we can do that. And he's like, hold on — we can't do any of that. And I said, well, why? We have access in here, we can do all these cool things. And he said, because you have to think like a threat actor. I paused for a second and said, I'm sorry — I have to think like a threat actor? Can you elaborate a little? What do you mean? And he says: the people who are going to break into your organization, the attackers, they're not going to have a map that looks like this. They're not going to know where all those cool secrets are that you spent years protecting. They're not going to know exactly which cool thing would be impactful for this operation. They're going to be making guesses. And I said, well, that's kind of interesting — and if you look at the faces slide from earlier, it took me down at least another notch or two in terms of our ability to leverage the thing we had just found. We couldn't go after those critical security services, because if you think about it from an attacker's point of view, if you landed in this environment, you wouldn't even know where to start looking. You might have some guesses — maybe you've done your research and you know Meta uses a certain technology to do a certain thing, and you can go look for that. But what if we've changed it over the last year and haven't told anyone? You don't know this stuff for sure, and you don't have the map that tells you these things. You're probably just going to go for the quick win. I think in this particular operation we ended up just searching for some finance-related data or something like that, to generate a response of that nature. But I was sitting there going, wait a second, I want to do all these cool hacking things. And it's like, well, no — there's a story that you have to tell. I talked about the response piece earlier. I talked about having your blue teams be trained by these operations, and I talked about how you have to do this with an element of surprise: if they know it's you, they're not going to respond in the same way. There's not going to be any urgency or worry, and you're not getting the in-depth test of your runbooks and staff that you think you are if it's not secret and covert. The key part is that you have to be able to tell a story. I talked about some of the things an organization will face from a security perspective — phishing, for example, or malware served via a website, things of that nature. You want your security team to prepare to respond to things that actually happen, so you want to use techniques that will actually be practiced in real life. Me breaking in and using this particular vulnerability to get root everywhere? That's cool, but unless the attacker we're facing is someone who's worked at the company for six years, that's not a realistic attack you're going to face in reality. And so this was an interesting message for me, because I
thought, you know, red team — cool, you just go in there and hack all the stuff all the time. But there's a whole element of storytelling involved in planning these operations, and in thinking about the response, that you don't consider by default when you see this cool stuff — you just see the people running away with a bag of money, as I called out. So here we are, back at the slide where I talked about the seven things you think about during an operation, with the first six being invisible. That's the main message I want people to take away: when you're interacting with your defensive teams, or maybe reading a vulnerability report, it looks really bad when someone gets hacked or some incident happens, but there's usually a lot more story behind it. We can do the best security in the world by unplugging our computers and leaving them in our desk drawers, but then no one can use the systems we need, and that's not useful in your day-to-day life. You have to make some trade-offs, and sometimes they're going to bite you. What you should actually be worried about — the things that are genuinely concerning — are patterns. You don't want the same red team to succeed over and over again; you don't want the same pentest to come up with the same results over and over again. So keep these things in mind, keep all that preparation in mind, when you're looking at the end result of these operations; it's easy to forget. Red teams look totally badass, but at the end of the day it's only because of the amount of research we had to do — it's the prep work that gets us to that point. The other thing is that your blue team does hard work, and you should remember that. I talked about the amount of time we have to spend on things like evasion of antivirus or EDR solutions, or command and control, all those things, and about the amount of effort we go through to end up in a situation where our plan is actually going to succeed, and how people don't see that. But the reason we have to do those things is because your blue team has done hard work getting to that point. If I go into an environment and have to use a really slow covert channel to make sure things are going to succeed, it's because they've blocked all the other paths that would work trivially. They've done successful work to get there, and it's easy to forget that. No one talks about the malware that didn't run in that environment, or the things that were blocked, or the hackers stopped by the firewall — people only look at the successes, and that's not necessarily the best part. So next time there's an incident of that nature, dig in and try to figure out what had to happen to get to that point. So, I'm at the end of my slides, and the message I want people to take away is: if you have offensive security teams, treat them nicely, and realize that they're only there to help you protect your environment, just like you are if you're defending against them. At the end of the day, you're doing work to protect systems, and we're just helping you prioritize and think about the things you might need to do in the event of an incident. And yeah, that's my talk — thanks, everybody. I have some time for questions if anyone wants, but I'd prefer to take them at the podium, so come up and chat with me if you have more questions. Thank you for coming. I'll be at the Meta booth, 401 in the expo hall, probably for the rest of the day, if you have questions and want to go somewhere else. But yeah, thanks for listening, everybody, and I hope that you're nice to your blue teams.

Check, check — mic check, one two. This one works too. Yes, check, check, one two, one two.
All right, okay — welcome to room 101. This session is sponsored by Spyderbat. Spyderbat, a platform for cloud native runtime security, shows you exactly what is happening and secures your Linux VMs and Kubernetes clusters throughout your software development life cycle. Visit Spyderbat at booth 112 to see it in action. Thank you, Paul.

Can you guys hear me? Is this all right? Perfect. Yeah, well, thank you as well to SCALE 20x, thank you guys for coming and being here, and thanks to FireTail — the organization I work for — for sponsoring me to come here today. I'll be talking about the OWASP API Top 10: the top risks associated with APIs. We're going to be talking about the 2023 version, actually the release candidate that was announced just a little under a month ago, so only the freshest stuff here. A little bit about me: I'm Paul Mansour-Geoffrion. I graduated from the University of Montreal in 2014 with a bachelor's in international studies, then worked for the 4th Intelligence Company, a Canadian Forces reserves unit, then moved on to a couple of startups in digital innovation and cloud security. I joined a cloud security posture management startup, Cloud Conformity, in 2017, which essentially helped organizations maintain their posture in the cloud. More personally, I'm a fan of gardening, space (I did an International Space University course over the summer), sailing, and house music. So that's enough about me — let's actually get started. But before we do, let's back it up just a little, maybe a couple hundred years, and talk about some threats that were present in the industrial revolution. So, yeah: highwaymen and stagecoach bandits. You can imagine, if you're traveling city to city, you may be at risk of being stopped by some unsavory people and robbed of your belongings, or worse. Another threat: high-seas pirates — actually the US's first overseas conflict, and one the US didn't do that well in, was against pirates in modern-day
Libya, actually — and yeah, some not-so-great stories came out of that. And then the third major threat: nation states, of course. We've had a couple of world wars, more wars before that, and wars up to this day. Fortunately we've mostly moved on from this, at least in the developed Western world — mostly, I say; there are still many, many exceptions. Now we're in the digital revolution, so what are the threats we might encounter here? Well, obviously, physical attacks on data infrastructure: think of an AWS data center that gets attacked — a disruption is necessarily going to happen. Beyond that, there are disruptions to your operations — digital disruptions — and that can include ransomware. You can think of the Colonial Pipeline hack in 2021, where an organization that was part of the oil supply chain was disrupted through ransomware, which led to some very long lines at gas stations and very high gas prices. Third, intellectual property theft — I think that one's pretty self-explanatory — but also asset theft. Think of, for example, satellites that get sent up to space: oftentimes the way those satellites get controlled is through APIs or other digital means, so if that gets hijacked, you can actually end up losing the asset. It could be satellites, drones, rovers — any sort of remote-controlled asset. And then finally, personally identifiable information: that sort of theft leads to real monetary and reputational loss for businesses, so definitely not something to discount. So how did we get here? We've established that the threats to today's industry are different than they were before; they're also constantly changing. Five or six years ago I was at the forefront of getting organizations safely migrated to the cloud, and I saw all kinds of cloud infrastructure issues — so the issues aren't the same
as they are today. Best practice evolves; it's an organic thing. Today, what we're seeing is that APIs are on their way to becoming the number one source of vulnerabilities for organizations running IT infrastructure. And that brings us to the subject of today's presentation: application programming interfaces, the attack surface that connects us all — and more specifically, the OWASP API Top 10. As I mentioned, today we'll be covering the 2023 release candidate, so there are some new things if you were familiar with the 2019 version. Some changes were made, and it's still subject to change because it's a release candidate — so, only fresh stuff.

Number one: broken object level authorization. Developers often incorrectly assume that authentication equals authorization, probably because they're both often just shortened to "auth". Devs will assume that API calls will only be coming from good client software, like a mobile app, and that this will let them control the parameters of API requests, which will then make it impossible for user A to request user B's data. Unfortunately, that's just not how reality works: any predictable pattern in your app's data structure can be exploited by a malicious user. If you're not using random, unpredictable values to identify your data objects, they can be predicted, and so are much easier to exploit. Let's use a ride share app example. Say a malicious user guessed your ride share account user ID because they figured out that the IDs were sequential — ID 101, ID 102, 103, etc. The attacker could then mock a legitimate API call to fetch information about a legitimate user's ride history. That's personally identifiable information — you don't want someone to know your ride history. Essentially, this unauthorized user gained access to a data object they shouldn't have had access to. If the ride share app's back end doesn't check that this user is allowed to access that specific data object, then the API is vulnerable.
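One way to see the fix: give objects random UUIDs instead of sequential IDs, and have every accessor check ownership. This is a minimal illustrative sketch — all names here (`rides`, `get_ride`) are hypothetical, not from any real ride share app:

```python
import uuid

# Hypothetical in-memory "database" of ride records.
rides = {}

def create_ride_record(owner_id, details):
    # Random UUIDs are unguessable, unlike sequential IDs (101, 102, 103...).
    ride_id = str(uuid.uuid4())
    rides[ride_id] = {"owner": owner_id, "details": details}
    return ride_id

def get_ride(requesting_user_id, ride_id):
    ride = rides[ride_id]
    # The object-level authorization check: even a valid, authenticated
    # user may only read ride records they actually own.
    if ride["owner"] != requesting_user_id:
        raise PermissionError("not authorized for this object")
    return ride["details"]
```

Even if user B somehow learns user A's ride ID, the ownership check still refuses the read; the UUID just makes learning the ID harder in the first place.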
The mock request made by our malicious user to the ride share app's back end will only go through if the API call is not properly checked for authorization. If the user making the mock request actually receives the ride history they requested, then of course that's a breach: an authorization check was definitely not made by the application logic, and object level authorization has been broken — that's the vulnerability here. If I haven't been clear, this is really, really bad; this is the sort of breach that security professionals lose their jobs over. User A getting user B's data is really, really terrible. So it's a good idea to use universally unique identifiers (UUIDs) or globally unique identifiers to mitigate this risk, and to ensure that all your functions are actually checking for authorization.

Moving on: broken authentication, the second one on this top 10 list. This one's about user impersonation — can a malicious user trick your application into believing that they're someone else? There are a few common ways this can happen. Credential stuffing is just using pre-made lists of usernames and passwords, and trying every single one until one works. Another is automated brute force attacks against a specific user, where an attacker tries different passwords or MFA tokens until they find the right one. Third, poor in-transit security — lack of encryption, poor or no hashing, putting auth tokens or passwords in the URL — can make your users very vulnerable: malicious actors can intercept that stuff just by sniffing the network for these API requests and responses, which can lead to a man-in-the-middle attack. Fourth: tokens or other credentials that are properly generated by the front end but then never actually checked for validity by the back end, which just assumes everything's okay.
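A sliding-window rate limiter is the usual counter to credential stuffing and brute forcing of passwords or MFA codes. Here is an illustrative in-memory sketch with hypothetical names — a real service would back this with shared storage such as Redis:

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_ATTEMPTS = 5

# username -> timestamps of recent login attempts
_attempts = defaultdict(deque)

def allow_login_attempt(username, now=None):
    """Return False once a user exceeds MAX_ATTEMPTS inside the window."""
    now = time.monotonic() if now is None else now
    window = _attempts[username]
    # Drop attempts that have aged out of the window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_ATTEMPTS:
        return False
    window.append(now)
    return True
```

Five tries per minute makes a six-digit MFA code (a million combinations) impractical to brute force, while barely affecting legitimate users.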
And then finally, exploitation of deep links or other one-click authentication mechanisms. To summarize, broken authentication is essentially a malicious actor gaining access to a real user's ride share account and then ordering for themselves as many rides and food deliveries as they can get away with. Some best practices to limit your app's risk here: use minimum password lengths, rate limit API calls, don't rely on long-lived credentials, and implement CAPTCHAs or other "are you a human" type tests in your authentication services. An easy example of a breach here: Optus, an Australian telecommunications company, was affected by a breach last year because a network configuration change exposed a previously internal API. The API was meant to be internal, and it didn't check for authentication — security through obscurity is just not good enough. Before we move on to the next issue, I want to be clear that these first two vulnerabilities are behind the vast majority of API breaches, at least over the past decade. So watch out at least for these two — the other eight are important, but those first two cause the vast majority of breaches.

Moving on: broken object property level authorization. Again, let's use our ride share example. The marketing team had a great idea: they want to show the world how diverse their ride share app's user demographics are, so they're having the public website show a rotating feed of user first names and the cities they're from. That's fine, right? Nothing too crazy; stuff we've seen before. The problem is in the implementation. They only really need the first names and cities of the users; unfortunately, the back end service that they're using returns a complete user object.
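Server-side cherry-picking of fields can be sketched as an allow-list filter applied before anything leaves the back end. The field names here are hypothetical, purely for illustration:

```python
# Hypothetical full user object as stored in the back end.
user_record = {
    "first_name": "Dana",
    "city": "Montreal",
    "email": "dana@example.com",
    "bookmarked_addresses": ["123 Home St"],
}

# Allow-list of properties each consumer may see, enforced server-side.
PUBLIC_WEBSITE_FIELDS = {"first_name", "city"}

def serialize_for(consumer_fields, record):
    # Cherry-pick on the server, so sensitive properties like
    # bookmarked_addresses never leave the back end at all.
    return {k: v for k, v in record.items() if k in consumer_fields}
```

The key design point is that the filter runs on the server: hiding fields in the client UI leaves them one network inspector away.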
That's much more information than they actually need. Why are they doing this? Well, unfortunately, it's just easier to develop that way: you take all the data and then cherry-pick the few fields you actually need. Unfortunately, all the extra data about the user object is still there — only a few clicks away for someone with very basic technical knowledge. And unfortunately for our example ride share company, the extra user data contains some pretty sensitive info, like a new field for bookmarked addresses. Suddenly it's not just a fun new addition to the website; it's a serious user data breach. So what could have been done to prevent this? A couple of things. One: handle data sanitization on the server side, and only return what's needed to your client side. You've heard the expression "need to know" — since the website doesn't need to know a user's personally identifiable information, just cut it out before it's sent back as an API response. Also, maintain an API contract and make sure it's being enforced. The contract should have stated that user bookmarks are for specific scenarios that don't include the public website, so they should never be sent back to the public website.

Moving on: the fourth vulnerability, unrestricted resource consumption. If your API is being hit with so many API calls that your servers are made unavailable, you've probably done something wrong with the design of your API. Unrestricted resource consumption can lead to denial of service attacks, which have made the news on much more than one occasion. Hacktivist collectives love this because it's very visible and makes a big PR splash, even though it's pretty easy to recover from. If a request can ask for an unlimited number of records from your database, and your server tries to fulfill that request, it's going to affect the performance of your database.
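A minimal guard against that kind of unbounded query is pagination: cap how many records one request can pull and hand back a pointer to the next page. A sketch with hypothetical names (real APIs often prefer opaque cursor tokens, but the idea is the same):

```python
def paginate(items, page, page_size=100):
    """Return one page of results plus the next page number, so a single
    request can never drag the whole table out of the database."""
    if page < 1:
        raise ValueError("pages are 1-indexed")
    start = (page - 1) * page_size
    chunk = items[start:start + page_size]
    next_page = page + 1 if start + page_size < len(items) else None
    return {"items": chunk, "next_page": next_page}
```

Combined with rate limiting, this bounds how much data any one caller can extract per unit time, which also shrinks the blast radius of a breach.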
And all the services that depend on that database will suffer too. Additionally, if a malicious user can send your server a two-terabyte JPEG as their user profile picture, and your server doesn't handle that gracefully, it's again going to affect service availability for legitimate users. This is where rate limiting, pagination, and a transparent API contract come in. Rate limiting prevents users from sending requests at a rate that is too quick for your back end to handle, and pagination slows down access to your database by returning a different set of results with each request — items 1 to 99, items 100 to 199, and so on. A clear API contract means users, including the team responsible for handling the client side, are aware of the limitations of interaction with the API. At the very least, pagination and rate limiting will minimize the scale of any breach. It's difficult to be completely safe from denial of service attacks, and even more so with distributed denial of service: attackers can easily change which IP and which device ID they're pinging your servers from, and that makes it really difficult to differentiate between a legitimate request and a fake one. For now, the more visibility you have into your incoming requests, the easier it will be to catch the ones that intend you harm.

Moving on: broken function level authorization. Each API call made to a server is received and processed by a specific function in the application. Before executing the code, the function evaluates the metadata that's been passed along with the API request — information like who's making the request, what type of request is being made, and where the request should be going. Now, a user who requests a ride is the only one that should be able to interact with the function that handles payment processing
for their account. The driver who's giving this user a ride should not have the ability to mock a user API call to that same payment processing function and change how much they're being paid for the ride, for example. If you're not using the principle of least privilege — where you define a limited scope of user-allowed actions — then you'll be vulnerable to broken function level authorization. To avoid this, devs should use a granular access control system to define who is allowed to do what. As a rider, I can modify how much I leave as a tip for a ride I've taken; as a driver, I can view how much was left for me as a tip for a ride I've given — a very clear scope of permissible actions. Unfortunately, it's a lot easier to just write an application that treats all users the same, without distinguishing between modify, view, or any other permissible action.

This next one, server-side request forgery (SSRF), is actually a new one for 2023. This vulnerability allows attackers lateral movement from a public-facing server to internal-only resources, and it can lead to data exfiltration if the server can be forced to interact with third-party resources. To help mitigate this, make sure that your network topology — meaning how you set up your network architecture — doesn't let the servers you use to fetch external resources also interact with internal-only resources; create clear segmentation between those two types of resources. Also be very explicit about which URL structures and which ports can be used to reach external resources. If you can, it's a really good idea to use an allowlist of permitted IP addresses or DNS names. And, as always, validate and sanitize input data before it reaches your data layer — check outputs too, but at the very least validate and sanitize inputs before they reach your data layer.
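The allowlist idea can be sketched as a check on every outbound URL before the server fetches it. Host names here are hypothetical, and a real SSRF defense also has to worry about redirects and DNS rebinding, which this sketch ignores:

```python
from urllib.parse import urlparse

# Hypothetical allowlist: the only hosts, schemes, and ports our
# fetch-external-resource service is permitted to contact.
ALLOWED_HOSTS = {"images.example-partner.com", "api.example-partner.com"}
ALLOWED_SCHEMES = {"https"}
ALLOWED_PORTS = {443}

def is_safe_outbound_url(url):
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return (
        parsed.scheme in ALLOWED_SCHEMES
        and parsed.hostname in ALLOWED_HOSTS
        and port in ALLOWED_PORTS
    )
```

Note that a classic SSRF target like a cloud metadata address fails the host check, because the check is allow-by-exception rather than deny-by-blocklist.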
Security misconfigurations — all right, this is a rather large category. It can include configurations that are directly about the API and its infrastructure, but also about the application higher up in the tech stack, or the infrastructure lower down in the stack. These misconfigurations are often going to be part of vulnerability chains. The idea of a vulnerability chain is that you use multiple exploits in a row to gain access to the most valuable parts of the infrastructure you're going after. For organizations that maintain APIs, a single vulnerability probably won't cause a catastrophic breach. Unfortunately, hackers will usually spot more than one vulnerability from the OWASP API Security Top 10, and that lets them leverage a chain of vulnerabilities into a breach with much higher impact — great for them, not so good for the security professionals in this room. Let's go back to our ride share app example for a simple two-step vulnerability chain that allows for account takeovers. The first vulnerability in this chain is in the login process, specifically logins through Facebook: a faulty implementation of the login mechanism means that the ride share app's authentication step can be injected with the ID of an account controlled by the attacker, instead of the true ID of the legitimate user being attacked. Despite passing this Facebook login step, there's still two-factor authentication in the way — we've exploited the first vulnerability, but as an attacker we still have MFA to get past. Unfortunately, here we never implemented rate limiting for the MFA step, so a script can be used to try every possible combination of a six-digit verification code until the breach succeeds. So that was vulnerability chains; let's go back more
in-depth into security misconfigurations. There's a very long list of areas where potential misconfigurations can slip through, so let's get into a few of the more common sources. One: insecure transport protocols — use the most up-to-date transport layer security; TLS 1.3 is the latest version, released in 2018. Two: HTTP methods — don't allow unnecessary HTTP methods on your endpoints; there are a lot more of them than you think, so allow only the ones you're expecting. Three: require the latest HTTP security headers — HTTP Strict Transport Security (HSTS), Content Security Policy, and X-Frame-Options. Four: overly descriptive error messages — Windows is notorious for this, to use an example, and an error from a Starbucks .NET server was an important link in a vulnerability chain that led to a PII breach, as reported by a security researcher. So when your endpoints return an error message, make sure not to expose anything about how the underlying application works. Don't send back the full stack trace, for example; keep it short, keep it generic — still help users figure out how to use your API, but don't give them too much information about how your application works. Five: enforce the use of HTTPS — I know this is basic stuff, but it's still a vulnerability we see present on infrastructure. Six: use CORS (cross-origin resource sharing) to restrict data flow to only the pages or domains your application should be communicating with. And finally, infrastructure misconfigurations — this one is also quite large; just follow the Well-Architected Framework, if you're familiar with it: a framework for how to architect well in the cloud, to make sure your infrastructure isn't misconfigured.

Eighth: lack of protection from automated threats, also a new one for 2023.
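Going back to the security-headers point (item three above): applying them is usually a one-place, middleware-style merge. The values below are illustrative defaults for the sketch, not recommendations for any specific application:

```python
# Illustrative defaults; CSP in particular must be tuned per application.
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Content-Security-Policy": "default-src 'self'",
    "X-Frame-Options": "DENY",
}

def add_security_headers(response_headers):
    # Merge the security defaults in without clobbering anything the
    # request handler already set explicitly.
    merged = dict(SECURITY_HEADERS)
    merged.update(response_headers)
    return merged
```

Doing this in one middleware, rather than per endpoint, is what keeps a single forgotten route from becoming the misconfigured one.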
Ticketmaster was famously affected by this recently, during the Taylor Swift concert ticket sales: malicious users were able to abuse the Ticketmaster API to scoop up every single Taylor Swift ticket, at what turned out to be bargain rates. It was all over the press — a fiasco; there was even a Senate investigation, if you remember, and this is pretty recent. You've also all been affected by the sort of automated Facebook or Instagram spam comments we see all the time. The point here is that attackers will go through very normal user flows within your applications and APIs, but they'll go through them in a way that's exploitative. At FireTail we've done research and found that 90 percent of requests to APIs are bots scraping for secrets or credentials — hitting .env and .git config files, GraphQL query endpoints, or other admin endpoints. To mitigate this type of threat, OWASP recommends using mechanisms like device fingerprinting or user flow analysis to identify and block automated threats. Sometimes, when I'm googling at a super frantic pace, I get redirected to a CAPTCHA page before I can get my results back — that's the sort of mitigation strategy that isn't too obstructive to normal users but will most likely vastly reduce the impact of automated threats.

Second to last: improper assets management. APIs routinely use versioning to account for the sometimes conflicting priorities of development velocity on one hand and support for legacy systems on the other. Despite releasing, for example, version two of your API, you may still want to give a grace period to legacy users of the API's version one, so that they at least have time to update their applications to work with the latest version without disruption to their operations. In theory this would be fine; however, in practice, and given real-life time constraints, dev teams will often just decide to implement the latest security best practices on
the latest version of the API, forget about updating previous versions, and just leave them running. So, to at least be aware of what infrastructure you have running and which APIs are allowing connections to your application, many organizations maintain an API host inventory. Let's make sure we understand the distinction between an API contract and an API host inventory. The contract is defined by documentation, often created through an OpenAPI spec, and it describes how to interact with a specific application or service. An API host inventory, on the other hand, encompasses all of the API contracts that are under the scope of the team responsible for the APIs' maintenance. Properly managing new releases also falls under assets management: you should always set up distinct environments for dev, test, staging, and production, and you shouldn't use production data in pre-prod environments. Aside from being a risk for additional data breaches, it exposes your users' data to your internal teams, and that may also be a violation of GDPR or the California equivalent. Improper assets management is often mitigated through operational controls: categorize your public APIs versus the APIs meant for internal or private use, and make sure any API that can interact with PII is tagged as such and given special treatment. If you have a deep understanding of the data flows and the connections between the services that comprise your infrastructure, there's a much better chance you'll be a step ahead of attackers' attempts to exploit your assets.

Unsafe consumption of APIs — the last one, and also new for 2023. We're clear that user inputs into your application are vulnerable to exploits, but what about third-party service inputs? Devs tend to trust — understandably so — other companies' APIs more than their own users' inputs into
their applications. The consequence is that devs are more likely to adopt weaker security standards for input validation and sanitization when it's a connection to a third-party service than for plain user input. A lot of different exploits can be achieved through this vulnerability — anything from data exfiltration to remote code execution to denial of service can be done through a compromised integration with a third-party API. So the message here, I hope, is clear: apply the zero trust security model to integrations as much as you do to user inputs. Imagine, for example, what would happen if Twilio or Stripe were breached — a lot of companies depend on those services, and all of their downstream integrations could potentially be exploited through those breaches. So this is my last slide — I'm actually going to repeat this talk in a few minutes, if some of you arrived late. We're looking to partner with organizations that are building API-reliant, modern web applications, and we'd love to get in touch if you have any specific struggles or concerns. We've built open source API security libraries we'd love to share; we'd love your feedback, and to see how they could be of use to you. So come speak to me, or email me if you want the slides as a reminder of what we've covered. And yeah, thank you for coming. I don't know if there are any questions?

So here we're not talking about what Snyk does — checking that the packages and libraries you're including in your application are safe. I'm talking more about integrations with, for example, Twilio or Stripe, where your application is expecting something specific, and you're doing validation to make sure that what's coming into your application is what's expected.
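That validation step looks just like validating user input. Here is a minimal, hypothetical check on a payload coming back from a third-party payments API before it touches anything internal — the field names and accepted values are invented for the sketch:

```python
def validate_partner_payload(payload):
    """Treat a third-party API response with the same suspicion as user
    input: check types and ranges before it reaches our data layer."""
    if not isinstance(payload, dict):
        raise ValueError("payload must be an object")
    amount = payload.get("amount_cents")
    currency = payload.get("currency")
    if not isinstance(amount, int) or isinstance(amount, bool) or amount < 0:
        raise ValueError("amount_cents must be a non-negative integer")
    if currency not in {"USD", "CAD", "EUR"}:
        raise ValueError("unsupported currency")
    # Return only the validated fields, discarding anything unexpected.
    return {"amount_cents": amount, "currency": currency}
```

If the partner is ever compromised, a malformed or hostile response fails loudly here instead of flowing silently into the database.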
what's what's expected but if you're not doing that step of sanitization and and validation then you are at risk of being breached through that that that surface now I guess this is where we sort of have to in some ways hope that the organizations that we're depending on for services are also applying best practices with when it comes to their API is that they're also enforcing an API contract that is very clear to you as a developer that's integrating with the the service and yeah just don't implicitly trust that because it's a brand name that they're doing things correctly you should have the steps in place to make sure that what's coming into your infrastructure isn't going to affect it negatively it's a funny one this this one this unsafe consumption of of APIs there's actually like a little bit of a debate right now in the OAS community about whether actually this one should be included and the main argument for its inclusion is the fact that there's less just awareness of this being a vulnerability versus just inputs from users inputs from users definitely I'm pretty sure every developer knows that there's many vulnerabilities that come through that but inputs from third parties not as much it's part of the API economy I guess which is like a growing thing yeah well I worked for firetail I would say that's one we have some open source components which is why I'm sort of allowed to talk about it during this this sort of during scale 20x so that's that's definitely one I'll open the room if anybody has any other suggestions for I'm not sure there was another question in the in the back yeah oh okay yeah I'm glad that we've moved on from all the violence from the industrial revolution we can laugh about these sort of breaches that don't have any sort of you know violence associated to it yeah that's great example I love it yeah segmentation definitely an importance an important principle so I don't know it was everybody in the room present when I started the 
presentation? Because I was considering just redoing it; this was a 30-minute presentation, so I have time to do it another time, but if everybody was already present, I don't know if it's actually worth it. I can brush through it if you want; we'll have about 20 minutes to go over it. Yeah? May as well.

All right. So, the subject of today's presentation: top API security risks, the OWASP API Top 10 in a nutshell. A little bit about me: I graduated from the University of Montreal, I was born and raised in Montreal, then served with the 4th Intelligence Company, a Canadian Forces reserve unit, and then moved on to startups and digital innovation; I worked in cloud security. And I have some interesting hobbies as well, or at least I find them interesting.

Before we actually get into the subject of API security, let's talk about what came before: the sorts of threats that existed in the industrial revolution. What we had back then were things like highwaymen, or stagecoach bandits, who would accost you as you traveled from city to city and would unfortunately try to steal your stuff, or sometimes worse. Beyond that, you also had high-seas pirates; the US's first overseas conflict was actually with Barbary Coast pirates in modern-day Libya, and it was not a conflict that went particularly well for the US at the time. That's a long time ago, the early 1800s. And then, of course, nation-states had many, many world wars, more than just the two of the 20th century, even before that.

But we're no longer in the industrial revolution; we've moved on to the digital revolution. So what sorts of threats do we have nowadays? Well, you can get physical attacks directly on physical data infrastructure; think of an AWS data center, for example, being attacked. You can have digital disruption to your operations; think of the effects of the 2021 ransomware attack on the US fuel supply chain, where a company managing an oil pipeline suffered a ransomware attack, and essentially you had very long lines at gas stations and very high gas prices; nothing good. Third, you can have IP theft or actual asset theft. IP theft is generally understood: someone steals your code base and then understands what you've been working on. For asset theft, think of NASA sending a satellite to space and then the only control system for that satellite being taken over by some malicious group; it could be more than satellites, it could be drones, rovers, anything that's remotely controlled. And finally, PII data exfiltration; there's some real reputational and monetary loss that can happen off the back of PII data theft.

So how did we actually get here? We've established that today the threats to industry are different than they were previously, and they're consistently changing. Five or six years ago, I was helping organizations safely migrate to the cloud, and I saw all kinds of cloud infrastructure issues; the issues are not the same as what happens today. Best practice changes, and today what we're seeing is that APIs are becoming a much more important source of vulnerabilities for organizations running IT infrastructure. This brings us to the subject of today's presentation: application programming interfaces and the OWASP API Top 10, the release candidate of 13 February 2023. Very recent stuff; this version of the Top 10 is less than a month old.

Number one: broken object level authorization. If you weren't here the first time I gave this talk: the first two, broken object level authorization and the next one, broken authentication, represent the vast majority of breaches, 70-plus percent of breaches according to our research at FireTail looking at the past 10 years of publicly disclosed breaches. So, for this first one, broken
object level authorization: devs will often incorrectly assume that authentication equals authorization. That's not the case. They'll assume that an API call will only come from known-good client software, like their mobile app, and that this controls the parameters of API requests, which will then make it impossible for user A to request user B's data. Unfortunately, that's not how reality works: any predictable pattern in an application's data structure can be exploited by a malicious user. If you're not using random, unpredictable values to identify your data objects, then they can be predicted, and so are much easier to exploit.

Let's use a rideshare app example. Say a malicious user guesses your rideshare account's user ID, because they figured out that those IDs were sequential: 101, 102, 103, for example. That attacker could then mock a legitimate API call to fetch information about that user's ride history, the legitimate user's ride history. That's sensitive information, a PII data breach: an unauthorized user gained access to a data object they should not have had access to. If the rideshare app's backend doesn't check that this malicious user is allowed to access the specific data object in question, then of course the API is vulnerable. The mock request made by our malicious user will only go through if that mock API call is not properly checked for authorization; if the user making the mock request actually receives the ride history they requested, that's a breach. The authorization check was not made by the application logic, and object level authorization has been broken. If I haven't been clear: this is really, really bad. This is the sort of security breach that security professionals lose their jobs over. User A getting user B's data is really terrible; it's worse than a marketing data breach where emails and first names get released. It's much worse than that.
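To make that concrete, here's a minimal Python sketch of the missing check (the in-memory store, the IDs, and the function name are hypothetical, invented for this example, not from any real rideshare app): the handler refuses to return a ride history unless the authenticated caller owns it, because authentication only tells you who is calling, not what they may see.

```python
# Hypothetical in-memory store: ride histories keyed by user ID.
RIDE_HISTORY = {
    101: ["airport -> downtown"],
    102: ["office -> home"],
}

class Forbidden(Exception):
    """Raised when an authenticated user requests an object they don't own."""

def get_ride_history(authenticated_user_id, requested_user_id):
    # Authentication told us WHO is calling; this object-level check
    # decides WHAT they may see. Skipping it is exactly the BOLA bug.
    if authenticated_user_id != requested_user_id:
        raise Forbidden("users may only read their own ride history")
    return RIDE_HISTORY.get(requested_user_id, [])
```

With this check in place, user 101 can fetch their own history, but a mocked call from user 101 asking for user 102's data raises `Forbidden` even though the caller is logged in.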
So it's a really good idea to use unique user identifiers, or globally unique identifiers, to mitigate this risk, and to ensure that all functions actually check for authorization.

Number two, and again, these first two are by far the most important of this top 10 list. This one is about user impersonation: can a malicious user trick your application into believing that they're someone else? There are a few common ways this can happen. Credential stuffing is one of them: using a pre-made list of usernames and passwords and trying every single one until one works. Another is brute-force attacks against a specific user, where an attacker tries different passwords or MFA tokens until they find the right one and take over the account they're after. Third, think of poor in-transit security, like lack of encryption, poor or no hashing, or putting auth tokens or passwords in your URL; this makes your users very vulnerable to attack, because malicious users can intercept those authentication details by sniffing the network, which leads to a man-in-the-middle attack. Fourth, tokens or other credentials that are properly generated by the frontend but never actually checked for validity by the backend: your frontend is doing everything right; it's your backend that's not taking what the frontend gives it and checking that the authentication is legitimate. And finally, exploitation of deep links or other one-click authentication mechanisms that aren't properly secured.

To summarize, broken authentication is essentially a malicious actor gaining access to a real user's rideshare account and then ordering for themselves as many rides and as much food delivery as they can get away with. Some best practices to implement to limit your app's risk here: use minimum password lengths, rate-limit API calls, don't rely on long-lived credentials, and implement CAPTCHA or other "are you human" tests on your authentication services.
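As a rough illustration of one of those mitigations, here is a minimal lockout tracker for failed login attempts (the thresholds, names, and in-memory storage are invented for this sketch; a real deployment would use your framework's or API gateway's rate limiter):

```python
import time

MAX_ATTEMPTS = 5        # hypothetical policy: 5 failures ...
LOCKOUT_SECONDS = 300   # ... locks the account for 5 minutes

_failures = {}  # username -> list of failure timestamps

def allow_login_attempt(username, now=None):
    """Return False while the account is locked out by recent failures."""
    now = time.time() if now is None else now
    # Keep only failures inside the lockout window.
    recent = [t for t in _failures.get(username, []) if now - t < LOCKOUT_SECONDS]
    _failures[username] = recent
    return len(recent) < MAX_ATTEMPTS

def record_failure(username, now=None):
    """Call this after every failed password or MFA attempt."""
    _failures.setdefault(username, []).append(time.time() if now is None else now)
```

This blunts both credential stuffing and the brute-force MFA scenario: after five failures the account stops accepting attempts until the window expires.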
An easy example of a breach here happened late last year, so not very long ago: Optus, an Australian telecommunications company, was affected by a breach because a network configuration change exposed a previously 100%-internal API, and this API didn't check for authentication. Security through obscurity is not good enough; implement authentication everywhere. Again, these first two are by far the most important of the top 10. Moving on.

Third: broken object property level authorization. Let's use the rideshare example again. The marketing team had a great idea: they want to show the world how diverse the rideshare app's user demographics are, so they're having the public website show a rotating feed of user first names and the city they're from. It's fine, right? Nothing too crazy. They really only need the first names and the cities for the website, and that's not personally identifiable information. Unfortunately, the backend service they're using to power this new website feature gives them the full user object and returns it to the public website. Well, it's just easier to develop that way, right? It's easier to get all the data and then cherry-pick the few fields you need to build your feature. Unfortunately, all the extra data is still there, just a few clicks away for someone with very basic technical knowledge, and unfortunately for our example rideshare company, the extra user data contains some pretty sensitive info, like a new field for bookmarked addresses. Suddenly it's not just a fun addition to the website; it's a user data breach.

So what could have been done to prevent this? A couple of things. You should handle data sanitization on the server side and only return what's actually needed. You've heard the expression "need to know"; well, since the website doesn't need to know about user PII, just don't send it back. Cut it out before it gets included in your API response.
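A minimal sketch of that server-side sanitization, with a hypothetical user object (the field names are invented for the example): the endpoint whitelists the two fields the website feature needs and drops everything else before the response leaves the server.

```python
# Hypothetical full user object as stored in the backend.
user = {
    "first_name": "Dana",
    "city": "Montreal",
    "email": "dana@example.com",               # PII: must not reach the site
    "bookmarked_addresses": ["123 Home St."],  # PII: must not reach the site
}

# The only fields the public website feature actually needs ("need to know").
PUBLIC_FIELDS = {"first_name", "city"}

def to_public(user_obj):
    """Server-side sanitization: whitelist fields instead of sending everything."""
    return {k: v for k, v in user_obj.items() if k in PUBLIC_FIELDS}
```

A whitelist is the safer design choice here: a blocklist of "sensitive" fields silently fails the day someone adds a new sensitive field, which is exactly what happened with the bookmarked addresses in the story above.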
Also, you should maintain an API contract and make sure it's being enforced. The contract should have stated that user bookmarks are for very specific scenarios that don't include the public website; with this in place, that PII would not have been sent back to the public website.

Fourth: unrestricted resource consumption. If your API is being hit by so many calls that your servers are made unavailable, you've probably done something wrong with the design of your API. It can lead to denial-of-service attacks, which make the news very frequently; hacktivist collectives love this technique, it's very visible, it makes a very big PR splash, and it doesn't look good for the organization suffering it. If a request can ask for an unlimited number of records from your database and your server actually tries to fulfill that request, it's going to affect the performance of the database and all the services that depend on it. Furthermore, if a malicious user can send your server a two-terabyte JPEG as their profile picture and your server doesn't handle that gracefully, it's going to have an effect on service availability for other users.

This is where rate limiting, pagination, and a transparent API contract come in. Rate limiting prevents users from sending requests at a rate that is too quick for your backend to handle. Pagination allows users to slow down their access to your databases by returning a different set of results with each request: think items 1 through 99, then items 100 through 199, and so on. A clear API contract means that users, including the team responsible for the client side, are aware of the limitations of your API. At the very least, pagination and rate limiting will minimize the scale of a breach. It's difficult to be completely safe from a denial-of-service attack.
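A tiny sketch of server-enforced pagination (the record store, names, and page size are illustrative): the client asks for a page number, never for "everything", so no single request can drag the whole table out of the database.

```python
# Hypothetical table of records.
RECORDS = [f"ride-{i}" for i in range(1, 1001)]

PAGE_SIZE = 100  # bound enforced server-side, not chosen by the client

def get_page(page):
    """Return one bounded slice per request instead of the whole table."""
    start = (page - 1) * PAGE_SIZE
    return RECORDS[start:start + PAGE_SIZE]
```

Page 1 returns items 1 through 100, page 2 returns items 101 through 200, and a page past the end simply comes back empty; the cost of any single request is capped regardless of what the client asks for.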
Even more so for a distributed denial-of-service attack: hackers can very easily change their IPs, change their device IDs, and that makes it very difficult to differentiate a legitimate request from a fake one. For now, suffice to say: the more visibility you have on incoming requests, the more possible it will be for you to catch the ones that intend you harm.

Fifth: broken function level authorization. Each API call made to a server is received and processed by a specific function in the application. Before executing the code, the function evaluates the metadata passed along with the API request; this data includes things like who made the request, what type of request was made, and where the request should go. Let's go back to the rideshare example, a very easy one: a user who requests a ride should be the only one able to interact with the function that handles payment processing for their account. The driver who gives them the ride, a different user in the application, should not have the ability to mock the user's API call to the same payment-processing function and change how much they're being paid for the fare. You can very easily see how this would be beneficial for the driver, but not very good for the person paying for the ride or for the organization managing the application. If you're not using the principle of least privilege to define a limited scope of allowed user actions, then you'll be vulnerable to broken function level authorization. To avoid this, use very granular access control to define who is allowed to do what in your application: as a rider, I can modify how much I leave as a tip for a ride I've taken; as a driver, I can review how much was left for me as a tip for a ride I've given. Unfortunately, it's a lot easier to write an application that treats all users the same, without making the distinction between modify and view.
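Here's one minimal way to sketch that granular, function-level check in Python (the roles and permission strings are hypothetical, made up for the rideshare example): every handler declares the permission it needs, and the caller's role is checked before the function body runs.

```python
# Hypothetical roles and per-function permissions for the rideshare example.
PERMISSIONS = {
    "rider":  {"tip:modify", "tip:view_given"},
    "driver": {"tip:view_received"},
}

class NotAllowed(Exception):
    """The caller's role does not permit the requested function."""

def require(role, permission):
    # Function-level authorization: checked before the handler executes.
    if permission not in PERMISSIONS.get(role, set()):
        raise NotAllowed(role + " may not perform " + permission)

def set_tip(caller_role, amount):
    """Only riders may modify the tip; drivers can merely view theirs."""
    require(caller_role, "tip:modify")
    return {"tip": amount}
```

A rider can change their tip; a driver calling the same function gets `NotAllowed`, which is exactly the modify-versus-view distinction described above.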
Sixth: server-side request forgery. This is actually a new one for 2023. This is where a vulnerability allows attackers lateral movement from a public-facing server to internal-only resources, and it can lead to data exfiltration if the server can be forced to interact with third-party resources. So make sure that your network topology, meaning how you set up your network architecture, doesn't let the servers you use to fetch external resources also interact with internal-only resources. Be very explicit about what sort of URL structures and which ports can be used by external services, and if you can, it's a really good idea to use an allowlist of permitted IP addresses or DNS names. As always, validate and sanitize all input data before it reaches your data layer, but also check outputs before they're sent back to clients.

Seventh: security misconfiguration. That's a large category: it includes things that pertain directly to the API and its infrastructure, but also things higher up in the application tech stack and further down in the infrastructure stack. These misconfigurations are often part of vulnerability chains, where the idea is to link together multiple exploits in a row to gain a more powerful exploit of the infrastructure. For organizations that maintain APIs, just one vulnerability probably won't cause a catastrophic breach, but if a security researcher or hacker finds one, they're probably going to find more than one, and if they can chain them together, that can lead to a much higher-impact breach. In our rideshare example, let's give a simple two-step vulnerability chain that allows for account takeovers. The first vulnerability in this chain is in the login process, specifically logins through Facebook: a faulty implementation of
the login mechanism means that the rideshare app's auth step can be injected with the Facebook ID of an account controlled by an attacker. This Facebook ID replaces the ID of the legitimate user, which allows the attacker to bypass the Facebook login step. The next step is passing the multi-factor authentication check. Here, unfortunately, in our example app, rate limiting was never implemented on the endpoint checking the 2FA code, meaning the attacker can use a script to try every possible combination of the six-digit verification code, and that lets them pass the MFA step. Those two vulnerabilities sequentially lead to a full account takeover; each one individually is probably fine, you won't gain access to anything, but both together lead to bad things.

So let's go back to the specifics: what kinds of misconfigurations can happen? First, insecure transport protocols: TLS version 1.3 is the latest, released in 2018, and it's the one you should use. Second, HTTP methods: only allow the HTTP methods that are necessary for your application; there are more of them than you think, so just don't allow the rest on your endpoints. Third, require the latest HTTP security headers: HSTS, Content-Security-Policy, X-Frame-Options. These are HTTP security headers that are good to use nowadays; they're still best practice. Fourth, overly descriptive error messages; Windows is notorious for this. To give an example of a breach caused by this: a starbucks.net server was an important link in a vulnerability chain that led to a PII breach, reported by a security researcher, fortunately, rather than found by an actual hacker, but still a bad breach for Starbucks. So make sure your endpoints return error messages that don't expose anything about your application's logic. Be very generic and short; still give some information about what can be done to fix the request that was sent to your endpoint, but don't expose anything about the underlying logic.
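A minimal sketch of attaching those three headers to every outgoing response (the header values shown are common illustrative defaults, not the only valid policies, and the helper is framework-agnostic; real frameworks have hooks for this, such as Flask's `after_request`):

```python
# Illustrative policy values; tune these for your own application.
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",  # HSTS
    "Content-Security-Policy": "default-src 'self'",
    "X-Frame-Options": "DENY",
}

def add_security_headers(response_headers):
    """Merge the recommended security headers into an outgoing response."""
    merged = dict(response_headers)
    merged.update(SECURITY_HEADERS)
    return merged
```

Centralizing this in one helper (or one middleware) means no individual endpoint can forget the headers.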
Fifth: use HTTPS. I know, very basic stuff, but we still see APIs that don't enforce the use of HTTPS. And sixth: enforce a cross-origin resource sharing policy, a CORS policy, that restricts data flows to only the pages or domains that should communicate with each other. Then finally, for everything related to infrastructure misconfigurations, just follow the security pillar of the Well-Architected Framework; that will go a long way toward making sure that the infrastructure powering your API is not what leads to a faulty or vulnerable API.

Eighth: lack of protection from automated threats, also a new one for 2023. Ticketmaster was famously affected by this recently, during the sale of Taylor Swift concert tickets: malicious users were able to exploit Ticketmaster's APIs to get every single ticket for the Taylor Swift concerts, at what turned out to be very cheap rates compared to what they were then able to sell them for. It was all over the press, it was called a fiasco, there was a Senate investigation; it was a whole thing. You've also all been affected by spam comments on LinkedIn, Facebook, Instagram; these happen all the time. The point of those examples is that attackers will use very normal user flows within your application, but in an exploitative manner, which makes the experience worse for everybody else. Our lab at FireTail has shown that actually 90% of requests to APIs are bots scraping for secrets: things like .env files, .git/config, GraphQL or admin endpoints. Bots are going to be looking for those to exploit. To mitigate this sort of thing, OWASP recommends mechanisms like device fingerprinting or user-flow analysis to identify and block automated threats. Sometimes when I'm googling things at a frantic pace, I get redirected to a CAPTCHA before I can reach my results: that's the sort of mitigation strategy that isn't too obstructive to normal users but will most likely vastly reduce the impact of automated threats.
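As a rough sketch of spotting that kind of secret-scraping traffic (the path list is illustrative and would need tuning; for instance, `/graphql` or `/admin` may be perfectly legitimate in your own application), a request filter might look like:

```python
# Paths that scraper bots commonly probe for (illustrative list only).
SUSPICIOUS_PATHS = {"/.env", "/.git/config", "/graphql", "/admin"}

def looks_like_secret_scraping(path):
    """Flag requests probing for secrets so they can be blocked or challenged."""
    normalized = path.rstrip("/").lower() or "/"
    return normalized in SUSPICIOUS_PATHS
```

In practice you'd feed flagged requests into whatever challenge or blocking mechanism you use (CAPTCHA, fingerprinting, rate limiting) rather than dropping them silently, so legitimate users who trip the filter can recover.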
Ninth: improper assets management. APIs routinely use versioning to account for the sometimes-conflicting priorities between development velocity and support for legacy systems. Despite releasing version two of your API, you may want to give a grace period to the legacy users on version one, so they have time to update how their applications work with your latest version. In theory this would be fine; in practice, given real-life time constraints, development teams will decide to implement the latest security best practices only on the latest version of the API and just forget about updating the previous version. You should at least be aware of what you're running in terms of APIs, of what's online. If you have zombie APIs in your stack, at least be aware that they're there; you shouldn't be running them at all, but if you are, well, it's good to have visibility.

To make this even more professional, you should maintain an API host inventory. What's an API host inventory? It's different from an API contract. The contract is much more at a tactical level: it's defined by documentation, often created through an OpenAPI spec, and it describes how you can interact with a specific application or service. The host inventory encompasses all of the API contracts; it's much more at a strategic level, and it enables security teams to have better visibility over their scope of responsibility. Properly managing your releases also falls under this category: always set up distinct environments for dev, test, staging, and production, and don't use prod data in pre-prod environments for testing. Not only are you creating a risk for potential data breaches, but it also exposes your internal teams to sensitive customer data, and that in and of itself might be a breach of GDPR or other local regulations.
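A minimal sketch of what one entry in such an inventory might record (the field names and hosts are invented for illustration, not a standard schema): deprecated hosts and PII-handling hosts get flagged for extra scrutiny.

```python
# One entry per deployed API host; field names are hypothetical.
inventory = [
    {"host": "api.example.com", "version": "v2", "audience": "public",
     "handles_pii": True, "status": "active"},
    {"host": "api-v1.example.com", "version": "v1", "audience": "public",
     "handles_pii": True, "status": "deprecated"},  # zombie-API candidate
]

def needs_attention(entry):
    """Deprecated hosts and PII-handling hosts deserve extra scrutiny."""
    return entry["status"] == "deprecated" or entry["handles_pii"]

flagged = [e["host"] for e in inventory if needs_attention(e)]
```

Even something this simple answers the two questions from the talk: what's online, and which of it touches PII.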
So how do you mitigate this? Usually through operational controls. I mentioned the API host inventory, but you should also categorize your public APIs separately from the APIs meant for internal or private use, and ensure that the APIs that can interact with personally identifiable information are tagged as such, so you can at least take a closer look at them than at the rest of your stack. As long as you maintain a really good understanding of the data flows and the connections between the services in your stack, there's a better chance you'll be one step ahead of attackers.

So, the last one; I know we're going a little over time, and this is our second run-through, thank you for being here for both. Again, it's new for 2023: unsafe consumption of APIs. We're clear that user inputs into your application are vulnerable to exploitation, but what about third-party service inputs? Unfortunately, devs tend to trust, and understandably so, other companies' APIs more than they trust their own users' inputs into their applications. The conclusion is that devs are more likely to adopt weaker security standards for input validation when it comes to those third-party service inputs. There are a lot of different exploits that can be achieved through this vulnerability; think data exfiltration, remote code execution, denial of service. All of this can be done through a compromised integration to a third-party API. So the message here, to be clear: apply the zero-trust security model to integrations as much as you do to user inputs. Treat them the same. It's not because something comes from a trustworthy company that it's meant to be trusted. To give an example: imagine what would happen if Twilio or Stripe, very commonly used services, were breached, and all the downstream integrations to their APIs were exploited through that breach. Not a good situation for anyone involved in those scenarios.
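A minimal sketch of that zero-trust validation for a third-party payload (the schema is hypothetical and deliberately simple; it is not Twilio's or Stripe's real webhook shape, and a real integration should also verify the provider's request signature): the payload is checked for shape and types before anything downstream touches it.

```python
# Hypothetical expected schema for a payment webhook; not a real provider's.
EXPECTED = {"event": str, "amount_cents": int, "currency": str}

def validate_webhook(payload):
    """Zero trust for third-party input: check shape and types before use."""
    if not isinstance(payload, dict):
        return False
    if set(payload) != set(EXPECTED):
        return False  # reject missing AND unexpected extra fields
    return all(isinstance(payload[k], t) for k, t in EXPECTED.items())
```

The same discipline you'd apply to an end user's form submission, applied to a brand-name API's callback.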
So, my call to action today. First, thank you for coming. We're looking to partner with organizations that are building APIs, ideally on AWS, and we essentially just want to learn about their problems, concerns, and complaints, what their life is about. At FireTail we've built some open-source API security libraries that we'd love to share with you and get your feedback on. So just come speak to me; if you want the slides, email me and I'll be happy to send them to you. So yeah, that was it, thank you. Any questions? I'll take that to mean that everything was clear. Thank you for coming.

[In response to a question, partially inaudible] ... they don't do reviews of how the breach happened, because they don't want to inspire copycats, but yeah, it's a fast-growing source of vulnerabilities [inaudible]. If you're at all aware of what happened with S3 buckets: that was super obvious stuff that was happening five, six years ago, right? So Amazon put a ton of effort, and all the other cloud providers put a ton of effort, into making sure that their data and storage services were secured and easy to secure. For APIs, unfortunately, there's no major vendor behind it; everybody does it differently. But the conclusion: if you keep in mind that you're doing authentication and authorization properly, you're covering 80% of it. It's just lack of awareness, right? They just don't know the feature is there, that they left it exposed [inaudible]. They haven't been asked to do it, so they don't do it, and no customer ever asks for it; it keeps you employed
right. [Audience, partially inaudible] So, a centralized password management system with a complex passphrase [inaudible]? Well, I don't want to mention the name; I'm in a large city right now [inaudible], and I don't want them to attack me, because they're ancient [inaudible]. Yeah, a growing problem. [Audience] I'm [inaudible], I'm the director of security. Oh, yep, yep.

Hi everyone. Great, hi everyone. How do you trust open-source software? I'm Naveen Srinivasan; just a little bit about me. I'm happy to be here to talk at SCaLE this year. I am all about supply chain security and open-source software. I'm a maintainer and a contributor for a few projects in the OpenSSF, a member and a contributor at Sigstore, and a contributor to the SLSA codebase; all three of these are GitHub organizations focusing on supply chain security. I've been actively contributing to open source for a few years, which helped me get the Google Open Source Peer Bonus award in 2021 and 2022. Right now I'm an independent developer looking for ways to contribute to new projects, and you can check out my GitHub profile to see what I'm doing. If you want to talk to me about anything related to supply chain security or open-source software, you can hit me up on Twitter.

Great, welcome everyone. My name is Brian Russell; I'm a product manager on Google's open-source security team. I work a lot with the OpenSSF, and in particular I work a lot with OpenSSF Scorecard. Before we really get into what Scorecard is, let's talk about the problem we're facing. We've got this blocky-looking diagram here, and we can think of it as basically what modern digital infrastructure looks like. Oftentimes, when you pull back the cover and take a look at what's running underneath any system, you've got a whole bunch of different pieces. They come in different shapes and sizes; some of them are very important, some of them are more precarious than others. But what you have is a real mix of
different things, and you don't have a lot of insight into what's actually going on in terms of how these blocks got made in the first place. You might know how important they are to your system, but understanding how they're developed, how the community operates, that part is usually pretty opaque to a lot of people. So that begs the question: what if we did have some insight into what's going on in this infrastructure? That's really what Scorecard wants to help people see. If you have these scores, you suddenly have a sense of which projects are doing really well in terms of embracing best practices for developing software securely; which projects could use a little bit of help, could be doing a little better; and which projects are actually doing quite badly, they don't really think about security when they're developing. Those are the projects you'd want to concentrate on if you're using them, or if you're bringing them into your own open-source project as a dependency. It's important to direct your attention where it's needed most.

So here's what a scorecard looks like. There are a few ways to generate it, to pull this data; this one is coming from a service, shameless plug from the Google side, called deps.dev. It lets you go in and see, for any open-source dependency, what the dependency tree looks like; it also provides scorecards, and it's a nice UI experience if you just want to look up the scorecard for a particular project. We'll get into a little more how those scores are generated and how we source them, but in a nutshell, the Scorecard project right now runs effectively a cron job that every week scans the top million and a half projects, as computed by a sort of criticality score: how important they are to the rest of open source. So feel free to follow along on this next part if you want to take a look at a
scorecard. You can pull it up on your phone, either using the QR code or by just going to deps.dev, and you can think of a project you already know about, or maybe one you're helping to maintain; any project within that top million and a half should have a scorecard associated with it. If you just want to find a project, you can always look up Scorecard itself. You should see a sequence of screens something like this. I'll say it's always a little better on a laptop, but I made the screenshots with a phone in mind. If you type in "scorecard" and look for the specific project folder associated with it, you can scroll down the page a little and start to see a scorecard.

So we're back to this list of checks; now let's talk a little bit about what these actually are. In a nutshell, we're looking for what are generally considered best practices when it comes to developing and managing a project securely, and each check is composed of a name, which you're seeing here, a score, and some underlying logic that goes out and actually looks for evidence to compute that score. I'll go into a few of the checks; I thought about going through the whole list, but let's start with the top five and go from there. If we look at the first check on the list, we've got code review, and in this case it's looking for: does this project require that someone else reviews code before it's merged in? Each check also comes with a risk level, ranging from critical to high to medium down to low. In this code review sense, there are some caveats: we certainly recognize that there are single-maintainer projects where code review is not going to be feasible, but it's basically one piece of evidence, one component, of whether your project is building a community that's looking at things together, if it's looking to develop things securely.
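Scorecard results are also available as JSON (for example, from the `scorecard` CLI's JSON output); the shape below is a simplified, illustrative version of such a result, not the full real schema, showing how you might mechanically flag the low-scoring checks for a dependency:

```python
import json

# Simplified, illustrative Scorecard-style result; the real output carries
# more fields per check (reason, details, documentation links, and so on).
raw = json.dumps({
    "repo": {"name": "github.com/example/project"},
    "checks": [
        {"name": "Code-Review", "score": 8},
        {"name": "Maintained", "score": 10},
        {"name": "Token-Permissions", "score": 3},
    ],
})

def low_scoring_checks(result_json, threshold=5):
    """Return the names of checks scoring below the threshold."""
    result = json.loads(result_json)
    return [c["name"] for c in result["checks"] if c["score"] < threshold]
```

This is the "direct your attention where it's needed most" idea from above: the low scorers are where you'd start a review of a dependency.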
things securely. I'll go through four or so other checks just to give you an idea of what each one is. I should mention, too, that for each of these checks, Scorecard works by calling the GitHub API. We're working to expand past GitHub to some other platforms, but these aren't self-certified pieces of evidence; all of this is gathered programmatically, checking for things you could check yourself as an ordinary user coming to GitHub. The next check down is Maintained: is this project still actively under development? For pretty much any project in active development there's going to be some activity, so we look across a 90-day window at whether things are being done, whether updates are being made, whether dependencies that need updating are being updated. All of those things should be happening, so we consider maintenance one of the signals Scorecard looks at. Jumping down a bit, we also want to make sure that permissions to actually change the repo are configured to be read-only by default. Each of these checks has a specific description of how it's implemented, so you can see what would make it pass, but for Token-Permissions we're really just looking at who has access and whether they have more access than they really should. We also look for dangerous workflows. There are certain patterns where, say, your project is configured to kick off CI when a pull request is opened, and someone can insert their own script to run; that would be considered a dangerous workflow, and there are a few such patterns we look for when we run this particular check. The last one I'll highlight here is Binary-Artifacts. This plays very much into the open source mentality, where we want there to always be
the actual source code, not some compiled code that we can't inspect. There are a few exceptions, since certain frameworks require binaries as part of how they build software, but we're looking for anything beyond those, trying to make sure that overall you're storing only source code in the repo itself. Different questions come up that we're trying to get better at answering over time. Can I use Scorecard on a private repo? The answer is yes: you can run the CLI tool against your repo, or set it up as a GitHub Action, and run it privately; it doesn't have to be a public repo. What if you use Git but not GitHub? The answer is yes, with limitations: anything tied to the GitHub API requires the project to be on GitHub, but we can check for a number of things that are independent of where you store your Git repo. And if you're using GitLab, the answer is "soon"; that's currently under development, something we're planning to land in the next few months. So that's generally what a scorecard looks like and how checks work, and we've seen a few examples of checks. Sorry, you have a question? So you're asking about deps.dev; we'll get to that, but the short answer is that you can install the GitHub Action, which will automatically score the project and push that score to deps.dev. It's a good question. We've also purposely left a few minutes at the end of the presentation, so if you have more questions, we'll definitely have time to talk further. Next I wanted to talk about the general state of supply chain security that we're in right now. Exactly why are we talking about Scorecard? What is the bigger landscape that this is
operating in? Well, to start with, demand for open source software is always increasing. The Linux Foundation published a blog post estimating that free and open source software constitutes something like 70 to 90 percent of modern software solutions, and on top of that, downloads of open source dependencies have been growing by an estimated average of 33 percent across the ecosystems they looked at. So we know demand is going to keep increasing. What we've also seen, and this is from Sonatype's State of the Software Supply Chain report, which is free and which I highly recommend if it's not on your radar, is what is happening with software supply chain attacks. They found, and this graph really tells the tale all by itself, that attacks are increasing dramatically. People are trying to find ways into any sort of system, and the dependencies being pulled into a system are suddenly becoming one of the easier attack vectors to hit, I think in part because other security best practices are being followed more and more, which means the software supply chain becomes one of the more vulnerable pieces. So what does this have to do with Scorecard? In that same Sonatype report, there's a section where they looked at what you can examine to see whether a project is likely to have vulnerabilities over time: is there any indicator that, if a project has certain things in place, it's less likely to be vulnerable? What they found was that Scorecard can be at least a decent predictor of whether vulnerabilities will show up over time. Vulnerabilities are reported at different times, but a project with a higher score usually indicated a lower probability of vulnerabilities actually being found.
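Since scores are served through the public Scorecard API that Naveen demos later in this talk, you can check a dependency's score programmatically. Here's a minimal Python sketch: the URL pattern follows the documented API, but the sample payload below is made up for illustration and only mirrors the documented response shape (an overall "score" field plus a "checks" list).

```python
import json
from urllib.request import urlopen

# Public Scorecard API endpoint pattern (GET, no auth token needed).
API = "https://api.securityscorecards.dev/projects/github.com/{owner}/{repo}"

def scorecard_url(owner: str, repo: str) -> str:
    """Build the API URL for a GitHub project."""
    return API.format(owner=owner, repo=repo)

def overall_score(payload: dict) -> float:
    """Read the aggregate score out of an API response."""
    return payload["score"]

def fetch_score(owner: str, repo: str) -> float:
    """Hit the live API; requires network access."""
    with urlopen(scorecard_url(owner, repo)) as resp:
        return overall_score(json.load(resp))

# Illustrative payload standing in for a real response.
sample = {
    "repo": {"name": "github.com/example/project"},
    "score": 7.5,
    "checks": [{"name": "Code-Review", "score": 8}],
}
print(scorecard_url("kubernetes", "kubernetes"))
print(overall_score(sample))  # -> 7.5
```

Pointing this at the projects you depend on gives you exactly the kind of early indicator the Sonatype finding suggests.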
And then in this report they also broke down which checks are more important than others. You can see that code review is actually the highest one, which intuitively makes sense: with more people looking at the code, you're less likely to let a mistake through. Scorecard has had adoption and some notable mentions by various foundations. I won't go through each one, but I'll highlight Eclipse: the Eclipse Foundation is basically using Scorecard to look at projects within the foundation, as a starting point for where to invest in software supply chain improvements. I think their example is a good one for a lot of organizations: if this is a problem you're starting to be concerned about, it's a good foothold for checking some basic things and seeing whether the fundamentals are covered. Rather than give more examples, we asked one of our Scorecard friends, Christopher Robinson, or CRob as he sometimes goes by, director of security communications at Intel, what he thinks about Scorecard, and he sent back this video that we thought we'd play for you. I'm CRob, director of security communications at Intel. I'm also a community leader and contributor to the OpenSSF, and I'm here today to talk about how my organization uses open source software and the OpenSSF Scorecard. Intel has been a long-time contributor to the upstream open source community, with over 20 years of involvement in a multitude of projects, and we're also one of the founding members of the OpenSSF. We're extremely excited to see projects like the OpenSSF Scorecard come into being, because it not only helps us understand the security details of the open source software we leverage in many of our products, but it also helps us understand, internally, the security qualities of our own products. Right now, Intel is making sure that all of our public GitHub repos are
reporting out through the Scorecard tool, so that we can understand where we can improve, but also so that consumers of our software understand the kinds of things we're doing. I think Scorecard is an amazing project with a ton of value in describing the security posture of open source projects up and down the ecosystem, and I can't endorse it strongly enough: if someone's considering a way to evaluate open source software, Scorecard is an excellent way to make that happen. Thank you. All right, thanks to CRob for sending that to us. Before I hand this off to Naveen to do some demos and explain a few more things, I wanted to say a note about what the OpenSSF really is. I used the name when introducing Scorecard, but let's talk about what this foundation is actually seeking to do. It was formed around three years ago at this point, and it's meant to be a place where different organizations that are very interested in making open source software more secure, not just for themselves but for the whole world, can come together: a place to have conversations about how you actually do that, to invest in different tooling, and to just try to raise the bar on making software more secure. It was formed by a few organizations, and that number continues to grow. I'll say a note, too, on how it works under the hood. If you look at this top diagram, you can see the rough life cycle from source through production to the end consumer, and the OpenSSF has divided up different working groups along this process. There are best-practices working groups trying to say, at each part of the process, what will help make things more secure; working groups around vulnerability disclosures, doing those in the right way, in a way that minimizes the impact of a vulnerability; and different groups
around tooling and other related projects in general. If any of this is of interest to you, there are a lot of ways to learn more about the OpenSSF: there are mailing lists, Slack groups, and public meetings, with recordings of those meetings that you can go back and view. They're active on social media; you can follow them, message them, or use email directly. They actually have a page where all of these things are consolidated: openssf.org/getinvolved, and all of those links are right there. So that's the intro to the problem we're working on and the organization we're working within; I'll hand this over to Naveen to talk a little more about how else you can use Scorecard. Thanks, Brian. Like Brian mentioned, there are multiple ways to utilize Scorecard: you can use Scorecard's CLI or its GitHub Action, and those have been out for almost a year and a half. But last fall we released a Scorecard API. Why did we release an API? Because we spoke to customers, and because we've been scanning these 1.2 million repositories, which I'll get into. All of this data is available for anyone to consume using BigQuery; the cron job has been running every week for the past year, so there's a phenomenal amount of data available. But the biggest problem we realized is that people like to use a standard, simple HTTP GET API to fetch this data, and that's the specific reason we exposed the Scorecard API. Where is the API available? At api.securityscorecards.dev, and we made sure the API is predictable. What I mean is, you can see here a simple example of hitting kubernetes/kubernetes: it's a simple curl command, and it results in JSON output. The API has GET and POST; right now I'm going to focus on GET, but I'll tell you where the POST
is going to be utilized. We also made sure there are no security tokens or authentication required for this API, so anybody can use it; it's free, maintained by the Scorecard team. At the moment we don't throttle the API, so please don't DDoS it. Like I mentioned, it serves weekly scans of 1.2 million GitHub repositories. One asterisk to that: for the past three months we were rebuilding the infrastructure, so we couldn't run the scans; you'd see some of the previous data, but we've re-enabled it, so you should soon see all these scans running again. Just as an FYI, we've been seeing about 150k requests per day against this API. Who is using it? We don't know; it's a free API. Sometimes we get GitHub issues where people come and ask for API rate limits, but if you're using this API and you're interested, please come talk to us; we're more than happy to understand what people's needs are. One of the things Brian mentioned was code review. There's another sister project within the OpenSSF called Criticality Score: it takes the million repositories and figures out which are the most critical, using an algorithm by Rob Pike. We used that data to figure out the top thousand critical projects, and we wanted to see what the state of code review is across them, because we realized code review is a critical thing. Using the API along with this data, we were able to graph this. Why am I showing it? Because within your own projects or your organization, you should be able to do something similar and see what the state of the projects you depend on or build looks like. Most of them are doing well; some of them need some help. Enough talk, let's go to code. What I'm doing right now over here is I'm going to hit
the Scorecard API, but for the Scorecard repo itself, piping it to jq because the output is JSON. One of the checks Brian mentioned is Maintained, and it's as simple as this: the Scorecard repository scores 10 on Maintained; scores run from 0 to 10, so we're doing well there. Let's go back and check another one, for example Token-Permissions, one of the critical checks. It gives out results like this, as JSON, so you can parse it and use it programmatically. It looks like the Scorecard repository itself needs to fix one of these, which is why our own score there is 9. Let me go back to the slides. If you're interested in building that list of the thousand critical projects yourself, I open-sourced the project for it in my personal repository; you can pull it down and build something similar. It's written in Go, but you should be able to use any modern programming language to build that. Now let's talk about usefulness; we're going to play out a little scene. I'm the contributor, and Brian is a project maintainer who cares a lot about security. "Brian, I want to submit a PR with a new dependency." "Thank you, that sounds really good, but can you tell me about this dependency? Is it in a healthy state? Would a scorecard be helpful?" "Brian, I'm not really sure what that is. What is a scorecard?" "Have you checked out the Scorecard repo?" "Okay, I hadn't, but this is great. In fact, I think I'm going to use this API to check on all my future dependencies, and I might use it in my CI/CD tool as well; it's also just so easy to use. Now I'm starting to wonder why everybody doesn't use Scorecard at this point. Thanks, Brian." We realized what the API could do for this situation, and the Scorecard team is building a feature to specifically address it. It's still not merged in, but what it does is run anytime a new dependency comes into your repository.
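You can already build a rough version of that CI gate yourself with the public API. A hedged sketch in Python: the helper names and the threshold are my own inventions, and the payloads below are made up, merely mirroring the documented response shape (an overall score plus a checks list), so treat this as an illustration rather than the upcoming feature itself.

```python
# Sketch of gating new dependencies on their Scorecard results.
# Hypothetical helpers; payloads mirror the public API's JSON shape:
# an overall "score" plus a "checks" list of {"name", "score"} entries.

def check_score(payload: dict, name: str):
    """Return the score of a named check (the jq demo, in Python), or None."""
    for check in payload.get("checks", []):
        if check["name"] == name:
            return check["score"]
    return None

def gate(payload: dict, min_overall: float = 5.0) -> bool:
    """Accept a dependency only if its overall score meets a threshold."""
    return payload.get("score", 0) >= min_overall

# Made-up payloads standing in for two API responses.
healthy = {"score": 9.0, "checks": [{"name": "Token-Permissions", "score": 9}]}
risky = {"score": 2.1, "checks": [{"name": "Token-Permissions", "score": 0}]}

print(check_score(healthy, "Token-Permissions"))  # -> 9
print(gate(healthy), gate(risky))  # -> True False
```

The threshold of 5.0 is arbitrary here; in practice you'd pick a policy that fits your project.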
It figures out what the new dependencies are and gives you a GitHub comment on the health of each one, so it's almost like running Scorecard anytime a new dependency is added to your repository. This is something we're working on, still not merged in, but I wanted to give a heads-up on what it is. You'd see a comment saying: here's the dependency, here's when Scorecard last ran for it, and here are its scores. It helps you decide what to do with the new dependency, whether to accept it or whatever else. The next thing, alongside the Scorecard API, is something else we released last year: Scorecard badges. The badge is just what it sounds like: it shows the score of your repository, or anybody's repository, so it essentially helps humans understand at a glance how safe a project is. The badge in turn uses the GitHub Action; you need to have the GitHub Action installed to get the Scorecard badge, and this is where the Action uses the API's POST endpoint to publish data. We didn't want to go install the Action manually on stage, so Brian recorded this for us. Here's an example repository. We go to the Scorecard repo, where there's a specific section on how to add a badge, and we copy that, go back to the original repository, go to the README, paste in that content, and just replace the owner and repository names. It's as simple as that: if we do that, we get a badge for the repository. One caveat: you get the badge if you install the GitHub Action, or if the repo is part of the 1.2 million repositories that we scan; if it's not in those 1.2 million, you have to install the Action to get the data. The other advantage of using the Action is that you don't have to wait for a scan to update your scores: if you install the Action, the scores get updated
automatically anytime a commit goes into your main or master branch. If you have only 30 seconds, go to deps.dev and get some scores. If you have five minutes, run the Scorecard CLI and get some data out of it. If you have 15 minutes, install the GitHub Action, which will get the score out to the API so you can use it. And if you have more than a day, take the API and bring it into your own workflow. We want to make sure anybody can get some data out of this; that's our specific goal. If you want to get involved, hit the QR code. We have a bi-weekly meeting, every other Thursday between 1 and 2 p.m. Pacific; that's the Scorecard website, and at the bottom are Brian's and my Twitter handles, so hit us up if you have any questions. Thank you. Any questions? Yes, good question; I'll repeat it so it gets on the recording. The question is how this project is related to Sigstore and SBOMs. Sigstore is part of the OpenSSF; it's under the OpenSSF umbrella, and as an FYI, Scorecard's GitHub Action uses Sigstore to attest that the Action actually ran; it stores that data with Sigstore. So they're related, but to be specific about whether we're generating SBOMs: no, Scorecard is not doing any SBOM generation. Did I answer your question? I just want to add, more generally, that these projects are all trying to tackle different pieces of software supply chain security, and the OpenSSF makes sense as the organization above them where they all fit. If you think about an SBOM, it's a software bill of materials, trying to tell you what is actually in a project, and once you have that, you could pair it with things like Scorecard to get a better sense of what those dependencies look like past just knowing they exist. And when
it comes to Sigstore, it's about being able to associate source code with a binary: Sigstore helps make that signature happen so those connections are made. So again, you might run Scorecard against a project, but when you actually get a binary, you can say, ah, that actually came from this project. Each project has its niche within making the software supply chain more secure, but they're connected or not depending on the conversation. Yes, good question. May we repeat the question? My apologies. The question is about the history of the score of a particular project: you want to go back six months or a few months to see what the score was, so you can tell whether the project is trending well or not. Yes: like we mentioned, Scorecard runs as a cron job over those 1.2 million GitHub repositories and stores the data in BigQuery, and that data is available for anybody to query, for free. So you should be able to get a history for any particular project; short answer, yes. Go ahead, do you want to repeat it, or I can take a stab? So, if I heard the question correctly: because this is an automated process, what's preventing a developer from gaming some of these checks, like making constant minor changes just so the project shows as maintained and active? I think the short answer is that there's no safeguard in place for it today. We'd love to get to the place where scores are so depended upon that people are making modifications just to try to boost their score; I don't think we're there yet, and as that happens over time, we'll have to look at safeguards to limit that kind of behavior. The biggest thing is that there are kind of two groups of checks in
this situation: there are some you could game, like the maintenance check, and others, like having permissions set correctly, that would be much harder to game. There's also the question of what the motivation for gaming it would be. If it's to boost your project because you care about keeping it in a good security place and you want it adopted, my hope is that people would instead say, I actually just want to invest in doing the right things in the first place. But it could happen; this is still a relatively new project, and we're crossing those bridges as they come. Great question. Yes? So the question is: is anyone who's looking at different kinds of compliance talking with the Scorecard project to see how those would work together? I'd say we haven't dedicated a lot of time to specifically plugging in, but that is something we're definitely interested in. In general, standards are still emerging when it comes to software supply chain security, and we're trying to be part of those conversations; it's part of why we're working with the OpenSSF and trying to be consistent in how we act across the different projects within it. I will say, if anyone here thinks a lot about compliance and would be interested in talking about this, I'd love to talk with you after this session and see if there isn't a way for us to get plugged in more. But right now we're more at the "these are best practices" level; from that standpoint there's not a lot of back-and-forth debate, but we have a long way to go on actually plugging into any sort of existing compliance. Yep; we use the GitHub API to figure out, sorry, the question is: do we understand the dependency tree, and do we understand
what the dependency changes are, so we could generate an SBOM? We don't: we haven't written code to figure out dependency differences ourselves. We utilize a GitHub API that gives us the new dependency changes, and we use that to go get the Scorecard score for each newly added dependency, so we can surface the results for those. Did I answer your question? Not quite, so to be specific: anytime a new PR comes into the GitHub repository, we ask the GitHub API whether a new dependency was added in this PR. The moment we get that result, we hit the Scorecard API, get results for those new dependencies, and populate them in a GitHub comment; that's how we're able to bring those details up as a comment. And I want to make one current limitation of what we do clear: you're right that we're not recursively walking the tree, and I don't know for sure whether we'd ever make a hard call to do that or not. Because all of the checks are at the project level, we basically leave it as an exercise for whoever consumes the scorecard to go out, look at their dependencies, and get the additional scores for them. We don't have the concept of a super scorecard, where your project's score is some combination of the best and worst scores of its dependencies; that part's not there yet. And that's why we built the API: so you can use other tools to get the dependency tree and then use the API to get the score for each piece. You're welcome. Any other questions? All right, going once, going twice. Thank you all for coming, thank you.