Good morning. Thank you for rising so early, which I know is hard to do at a conference because of the late-night parties. I am Paul Vixie, and it's true, I am part of the team that helps to secure the AWS infrastructure, and this follows a few decades of the things that Jim was just talking about. Today I'm going to tell you a kind of lesson of history. It's my history, so I get to tell it. And I think it's safe to say that no one in AWS would disagree with anything that I'm saying, but this is my position, not the company's formal position. Just so you know.

So let's begin with some evidence, and then we'll talk about what I think it means. What you can see here is that some engineers over at Forescout Technologies noticed a pattern in the vulnerabilities that they kept encountering, and they realized that there were some well-understood problems in how software goes about unpacking a DNS packet, which is a binary format, a 1980s format; today we would do this in JSON. And they said, you know, this is never going to stop if we don't start writing down the lessons that people should know before they write software like this. It's a short document, very well written, very compelling. I commend it to you for your plane ride home. But what I want to say is: it is incomplete. We also have, from this year, an example of a vulnerability in the DNS library in one of the many embedded C libraries. And in this case the issue is not how you unpack the binary format of the resource records, but rather: are you using a predictable sequence number?
Many of you are old enough to remember 2008, when Dan Kaminsky came out with his DNS flaw. I miss Dan very much, by the way. Anyway, what he found is that even if you use a pretty strong random number generator, this is only a 16-bit field, and if you're using a 16-bit field to match whether the response you just received is a response to the question you just asked, you could be fooled. Somebody can successfully guess that, and if you're using a good random number generator, then it becomes a game of statistics. When Dan first got a bunch of us together and said, hey, here's a big problem, we need to set our hair on fire, his proof of concept was: eleven minutes and a hundred megabits would give you guaranteed action; you could write something into a DNS cache that shouldn't be there.

Now, we all know that the correct fix for this is something more like DNSSEC: you start signing your records so that people can validate whether the record came from the owner of the domain, and so forth. But we also knew that we couldn't get that out there fast enough, and indeed DNSSEC, even today, 13 years later, is not widely enough deployed to solve this problem. So to find some software that is open source, inspectable, anybody can look at it, and what it does to choose the next sequence number is to increment the sequence number used on the previous transaction; well, that makes it pretty easy to guess. You make it ask a question of a server you control, then you make it ask the question whose answer you want to pollute, and you just use a sequence number one higher than the one you saw. So it should be embarrassing that in 2022 there is still widely used open-source software that has this logic in it.
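The two ID-selection strategies I just contrasted can be sketched in a few lines. This is an illustrative model, not code from any real resolver; the function names are mine, and the probability formula simply restates the "game of statistics" point for a 16-bit transaction ID.

```python
import secrets

# A DNS transaction ID is a 16-bit field: only 65536 possible values.
ID_SPACE = 1 << 16

def next_id_predictable(last_id: int) -> int:
    """The vulnerable scheme: just increment the previous ID."""
    return (last_id + 1) % ID_SPACE

def next_id_random() -> int:
    """The safer scheme: draw each ID from a CSPRNG."""
    return secrets.randbelow(ID_SPACE)

def spoof_success_probability(n_forged_replies: int) -> float:
    """With random IDs, each forged reply matches with probability
    1/65536, so n forged replies succeed with probability
    1 - (1 - 1/65536)**n: statistics, not certainty."""
    return 1.0 - (1.0 - 1.0 / ID_SPACE) ** n_forged_replies
```

With the predictable scheme, an attacker who has observed one ID wins on the first forged reply; with random IDs, even 65536 forged replies succeed only about 63% of the time, which is why Kaminsky's attack needed minutes of sustained traffic rather than a single packet.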
All right. What I have to tell you is that the references section of this particular CERT advisory refers to five other vulnerabilities that they consider to be related. Almost all of these are covered by the RFC that I pointed you at at the outset. In other words, there's a big problem, and it's an old problem.

Okay, but let's talk about how this happens, structurally speaking. We're doing this to ourselves; the call is coming from inside the house. It started with 4.3BSD, and in fact a lot of things that we take for granted today started with 4.3BSD. In 1986, Berkeley, the publishers of BSD, decided for various reasons: okay, we're going to support this new DNS protocol. But spinning up a new release, making all those mag tapes, and putting them all in shipping containers was a lot of work, so they published it as a patch. And when I say they published it as a patch, what I mean is: there was Usenet, so it was posted to a newsgroup; there was an FTP server; and there was a mailing list called namedroppers where they said, by the way, here's a patch. If people were interested, they would download the patch. This was a small industry, pre-commercialization, pre-privatization; the whole world wasn't yet using the internet. So this was not entirely crazy. It was not as crazy as it looks from where we stand today.

Getting into the details: BIND is the Berkeley Internet Name Domain, and here the word "domain" is used in its mathematical sense, as a kind of container of other things. The things it contained were a name server, which we know well (we call it BIND, but its name is named), some tools, and some changes to the library. The important change was to the C library. There was a call named gethostbyname and a related call, gethostbyaddr, that originally had simply consulted the /etc/hosts file, and we still have this on all Unix-type systems today.
Although sometimes it's in a different place, we have the idea of a local set of mappings: here's an address, and here are the names by which it may be known. So all they did was swap out that bit of code to say: all right, if some application calls gethostbyname and there is a resolver available to us, we're going to resolve it with DNS, and if there isn't a resolver, we're going to fall back to the old thing. Backward and forward compatibility. Then there was a new API, a bunch of functions whose names begin with res_ (res-underbar), and this was shipped as libresolv, so you would link to it if you knew that you wanted to access the resolver directly.

I came on the scene shortly after this, and at the time I began working on DNS, this was all abandonware; the people at Berkeley who had done it had all graduated. This later led to me founding the Internet Systems Consortium, so that there would be a nonprofit organization to maintain this and other things like it. But the thing that happened right after this was done is that everything got big. Everybody who had any kind of network device knew that they needed DNS. It couldn't just be a LAN-based server or appliance or whatever anymore.
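The lookup order that patch introduced can be sketched as follows. This is a hypothetical model in Python, not the real C implementation inside libc; the names dns_query and gethostbyname_sketch and the in-memory hosts table are stand-ins of my own.

```python
# Illustrative sketch of the fallback logic described above: try DNS
# when a resolver is configured, otherwise consult the hosts file.
# Everything here is a stand-in, not the actual libc code.

HOSTS_FILE = {             # stand-in for /etc/hosts
    "localhost": "127.0.0.1",
}

def dns_query(name, resolver):
    """Stand-in for a real DNS lookup against `resolver`."""
    return resolver.get(name)

def gethostbyname_sketch(name, resolver=None):
    if resolver is not None:        # a resolver is available: use DNS
        answer = dns_query(name, resolver)
        if answer is not None:
            return answer
    return HOSTS_FILE.get(name)     # fall back to the old behavior
```

The design point is the compatibility story: applications that never heard of DNS kept working against the hosts file, while the same call transparently gained DNS resolution on machines that had a resolver.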
They had to be able to do DNS lookups, and yet the names of the API that Berkeley published were not necessarily convenient; different embedded-systems vendors had their own naming conventions. So rather than importing the code and making a dependency on it, which wasn't technically possible in those days, they made a copy of that code and changed it to suit their local engineering considerations. Then Linux came along, and right after that we commercialized the internet, we privatized the internet, all of our friends and relations started to get email addresses. It was wonderful and creepy all at the same time. But every distro at first had to build its own C library, and so they each copied some version of the old Berkeley code. They might have known that it came from Berkeley and gotten the latest version; probably they didn't, they just copied what some other distro was using. And once again they made a local version of it that was divorced from the upstream. Then we got embedded systems; early examples would be DSL modems, but now they're everywhere. IoT is everywhere. And all of the DNS code in all of the billions of devices I just mentioned is running some fork of a fork of a fork of code that Berkeley published in 1986. This almost never gets independently re-implemented. So all of those vulnerabilities I showed you earlier, all of the bugs mentioned in that RFC, are bugs I wrote, bugs that I shipped that I shouldn't have. And they're bugs that I fixed; I fixed them in the 1990s. So for an embedded system today to still have any of those problems means that whatever I did to fix it wasn't enough. I didn't have a way of telling people.

So what can we learn?
Well, it sure would have been nice if we had already had an internet when we were building one, because then there would have been something like GitHub instead of an FTP server and a mailing list and a Usenet newsgroup. But in any era you use what you have, and you try to anticipate what you're going to have. Ultimately you've got to ship; you have a ship date; people care. Functionality per unit of time is the measure of success for technology producers.

All software should be presumed to have bugs, not just because it always has, but because that's just the safe position to take. So when you ship something, you need to have a way of shipping changes to it, and that way needs to be machine-readable. You can't depend on a human to monitor a mailing list; it has to be automated to get the scale necessary to operate.

Version numbers: I realize that in a CI/CD world we're used to just fixing what we need to fix, shipping what we need to ship, and it goes through the automation and goes live at some point. But the people who are depending on you need to know something more than what you thought worked on Tuesday. They need an indicator, and that indicator often takes the form of a date in year-month-day format. It doesn't matter what it is; it just has to uniquely identify the bug level of any given piece of software. So we have to put these version numbers in even if they serve no purpose for us as developers.

Locally, you have to say where you got code. It should be in your README files; it should be in your source-code comments. Because you want it to be that if somebody is chasing a bug and reaches that bit of local source code, they'll understand: ah, this is a local fork, there is an upstream, let's see if they have fixed this. And of course, you should automate your own monitoring of those upstreams. If there's a change, then you need to look at it and decide what it means to you. Is it a bug that you also have, or is it in a part of the code base
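The provenance-and-monitoring discipline above can be sketched mechanically: record where each vendored copy came from and which version you pinned, then flag any copy whose upstream has moved on. The paths, project names, and version strings below are illustrative; a real tool would fetch upstream versions from release feeds over the network rather than a hard-coded table.

```python
# Hypothetical sketch of automated upstream monitoring: every local
# fork records its origin and pinned version, and a periodic job
# compares those pins against what the upstream currently publishes.

VENDORED = {
    # local path          -> (upstream project, pinned version)
    "third_party/resolv": ("berkeley-resolv", "19960801"),
    "third_party/zlib":   ("zlib",            "1.2.11"),
}

UPSTREAM_LATEST = {       # what each upstream currently publishes
    "berkeley-resolv": "19990512",
    "zlib":            "1.2.11",
}

def stale_forks(vendored, upstream_latest):
    """Return the local forks whose pinned version lags upstream."""
    return [
        path
        for path, (project, pinned) in vendored.items()
        if upstream_latest.get(project, pinned) != pinned
    ]
```

Each entry this job flags is exactly the moment to open a ticket: somebody has to look at the diff and decide whether the upstream change matters to the local copy.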
you didn't import? Is it a part that you've completely rewritten? Do you have the same bug, but under a different function name, or in some other local variation? This is not optional. And your downstream should be given some way to know when you have made a change, or otherwise these bugs are going to do what the DNS bugs I just described are doing.

As a consumer, when you import something, remember that you're also importing everything it depends on. There was a famous vulnerability in the Log4j library, where it had a lot of very advanced functionality that most people didn't know about. So it wasn't a bug per se; it was kind of a misunderstanding. And a lot of the companies that turned out to be vulnerable to it were not using Log4j; they were using some other library that depended on Log4j. So when you check your dependencies, you have to do it recursively; you have to go all the way up.

Uncontracted dependencies are a dangerous thing. If you're taking free software from somebody, you're hoping that team doesn't disband, doesn't go on vacation, doesn't have a big blow-up and make a fork, so that there are two forks but the one you're using is dead. Whatever it is, it's an acceptable risk. We have no choice; we need the software that everybody else is writing. But we have to recognize that it is an operating risk.
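The recursive check matters because exposure can be entirely transitive, as in the Log4j case. Here is a minimal sketch with an invented dependency graph; the package names are illustrative, and a real scanner would read lockfiles or build manifests instead.

```python
# Illustrative sketch of recursive dependency checking: a project that
# never imports the vulnerable library directly can still depend on it
# through intermediaries, so the walk must cover the whole tree.

DEPENDS_ON = {
    "my-app":         ["web-framework"],
    "web-framework":  ["logging-facade"],
    "logging-facade": ["log4j"],      # the transitive culprit
    "log4j":          [],
}

def all_dependencies(package, graph):
    """Return the full transitive dependency set of `package`."""
    seen = set()
    stack = [package]
    while stack:
        for dep in graph.get(stack.pop(), []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen
```

Asking only "do we import log4j?" answers no for my-app; asking "is log4j anywhere in all_dependencies('my-app', ...)?" answers yes, which is the question the companies caught out by that vulnerability needed to ask.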
However, orphaned dependencies become things that you have to maintain locally, and that's a much higher cost than monitoring the developments coming out of other teams. But it's a cost you will have, as these dependencies eventually become outdated. Somebody moves from version two to version three, and you really liked version two, but it's dead code now. Well, you've got to maintain version two yourself, and that's expensive. It's either expensive because you hire enough people and build enough automation, or it's expensive because you don't. That's our choice.

So mostly we should automatically import the next version of whatever it is, but it can't be fully automated. Sometimes the license will change from one you could live with to one that you can't. You have an uncontracted dependency with somebody who might at some point decide that they'd like to get paid. So that's another risk.

Getting to the end here: you have to say what version number you need, so that as you become aware that only versions from this one or higher have the fix you now know you have to have, you can make sure that you don't accidentally get an older one. Or it might be that only one specific version is suitable for you, in which case some day that tar file is going to disappear, and you've got to worry about whether you have a local copy and what you're going to do. So it's usually better not to have a local fork of something, so that you don't have to maintain it yourself. And as you monitor that supply chain, with all the automation I've just described, every time somebody releases something, open a ticket and make it some engineer's job (we often give this to mid-level journeyman engineers) to go look at it and see if it's safe, see if it's necessary, see if it's absolutely vital, set-my-hair-on-fire, work-over-the-weekend, or we'll just get to it when we get to it. Now, if you can't afford to do these things, then free software is too expensive for you.
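The "this version or higher" rule can be sketched as a minimum-version check on dotted version strings. This is a simplification of my own; real package managers implement much richer constraint languages, and the version numbers below are invented for illustration.

```python
# Illustrative sketch of a minimum-version constraint: once you know
# the fix you need shipped in a particular version, refuse anything
# older.  Dotted versions compare correctly as integer tuples.

def parse_version(text):
    return tuple(int(part) for part in text.split("."))

def satisfies_minimum(candidate, minimum):
    """True if `candidate` is at least `minimum` (the fixed version)."""
    return parse_version(candidate) >= parse_version(minimum)
```

Comparing integer tuples rather than raw strings matters: as strings, "2.10.0" sorts before "2.9.9", which would let a fixed release look older than a vulnerable one.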
Remember that we are all in this together, but I think we could get it organized better than we have. Thank you for your time today.