Hello everyone. Welcome to the seminar. Today's speaker is Sharon Goldberg. Sharon comes from Boston University, where she is a professor. She did her PhD at Princeton with Jennifer Rexford and Boaz Barak. Today she presents her recent work on the transition from BGP to secure BGP and the Resource Public Key Infrastructure. We're a bit of an idea that's going to be here. So I just want to mention my co-authors, one of whom is Al Rogel over there, who did his undergrad at New York University. A lot of this work was done by the main student author, Robert Lychev, who's a graduate student also. So what I want to talk about is this: as we try to secure routing, which people have been working on for 20 years, we're starting to make some headway. I want to talk about what the right path is for adoption of these protocols. So just to set the stage, and so we know what part of the network we're talking about, I want to go over a little example of how interdomain routing works. I picked a very particular example here. We have this server here, a Spamhaus server, hosted at this IP address. This IP address is part of an IP prefix. I know I'm at Stanford, so I probably don't need to do this, but this is what an IP prefix is. This is a /24 prefix, which means the first 24 bits are fixed and the remaining 8 bits are free, so that's 256 addresses, one of which is the server. Now, when people want to learn paths to the server, this autonomous system (each one of these networks here is a large autonomous system) will announce the route to the prefix. So here we have autonomous system 2997 saying that if you want to reach this IP prefix, you can route to me. Of course this address is contained in this prefix, so all addresses in this prefix will be reachable through here. So this is the basics of interdomain routing. Announcements will propagate in this way.
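As a small illustration of the prefix arithmetic described above, here is a minimal Python sketch. The prefix `192.0.2.0/24` and the server address `192.0.2.53` are placeholder documentation addresses chosen for this example; they are not the addresses shown on the speaker's slide.

```python
import ipaddress

# Hypothetical /24 prefix and server address (documentation range, not from the talk).
prefix = ipaddress.ip_network("192.0.2.0/24")
server = ipaddress.ip_address("192.0.2.53")

# A /24 fixes the first 24 bits and leaves the remaining 8 bits free.
fixed_bits = prefix.prefixlen
free_bits = prefix.max_prefixlen - prefix.prefixlen

# 2 ** 8 = 256 addresses fall inside the prefix.
address_count = prefix.num_addresses

# The server is one of those addresses, so a route to the prefix also reaches the server.
server_is_covered = server in prefix

print(fixed_bits, free_bits, address_count, server_is_covered)
```

An announcement for the prefix therefore covers every address in it, including the single server address.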
So every one of these autonomous systems, whenever it wants to make a routing announcement, selects the path that it uses, mentions the path that it's using, and then adds its own name. And so each node will learn a path through the network consisting of the names of the autonomous systems along the path, and I've just replaced the autonomous system numbers with their names here. So I picked this Spamhaus server for a particular reason. People have probably heard that in March there was this attack on Spamhaus that was supposedly the largest denial of service attack that we've ever seen. This was actually a really interesting event, because a lot of different things happened during this attack. One of the things that happened, apart from the denial of service, was actually a BGP attack. And I want to show you the mechanics of this attack that happened in March. This is coming from a blog post by this network, Greenhost. This is what they posted on their blog. So this was the normal path that they were using to get to the Spamhaus server. And what happened was that this network here became adversarial, wanted to host a competing server here, and basically tricked people into using this server instead of that one. So here's how they did it. They sent a routing announcement for an IP prefix which exactly covers the address of this server. Some people might see this and find it funny. It is kind of funny because this is a single IP address. This is a /32. It is also funny because this actually worked, and traffic actually did route this way. Okay, so why does this work? So is this an AS that had been around for a long time? Yeah, so some of these popped up. So you notice that there are three ASes on this path. This one's no longer seen. And this one is Bellsurf and I don't know why it's here.
This one I think is also not seen anymore. So could it be that they popped up just for the purposes of doing this? Yeah, I think so; in particular I don't know why. Or was it that they've taken over an H2U? So I don't know if it was providing services to people. I'm not really sure how they managed it. I don't know if it was the owner of the AS or something. So there are a couple of reasons why this is funny. This is a single address. But the important thing, the reason I'm showing this example, is that the reason Greenhost chose this path is that this prefix is more specific than that prefix. And the way that routing works is that you always pick the more specific prefix. So this kind of attack will always work if you, the attacker, have a more specific prefix than the legitimate destination. There is a legitimate use case for this, and that is that if you wanted to move that host somewhere else in the network, you can still route to it. It's a good feature, right? Well, it's actually not supposed to work. Most networks are not supposed to accept anything longer than a /24. So it was kind of funny that they got away with this /32. But we've seen other events, for example the Pakistan YouTube hijack, where this was a /22 and this was a /24. And that cannot be prevented by the standard rule of not accepting anything longer than a /24. So actually, when I saw this, I was kind of surprised. And we know that this was malicious, because here we have this blog post showing it, and here's the attacker saying that we got your server. There actually is a use case for letting that work, in that it will allow you to move a host and still route to it. So it's sort of a feature, although it's not designed for that. I mean, it's an error. It's a potentially useful feature. This was an interesting recent event. This is the Pakistan YouTube one that probably a lot of people have heard of. This one happened because they were using this technique.
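The rule that the more specific prefix always wins can be sketched in a few lines of Python. This is an illustrative toy, not router code: the function name `longest_prefix_match`, the route labels, and all addresses are assumptions made up for the example.

```python
import ipaddress

def longest_prefix_match(destination, routes):
    """Return the (network, label) route with the longest prefix covering destination, or None."""
    addr = ipaddress.ip_address(destination)
    candidates = []
    for prefix, label in routes:
        network = ipaddress.ip_network(prefix)
        if addr in network:
            candidates.append((network, label))
    if not candidates:
        return None
    # The more specific (longer) prefix wins, regardless of who announced it.
    return max(candidates, key=lambda route: route[0].prefixlen)

# Hypothetical routing table: a legitimate /24 and a hijacker's /32 for one address,
# plus a /22 versus /24 pair in the style of the second incident mentioned above.
table = [
    ("192.0.2.0/24", "legitimate origin"),
    ("192.0.2.53/32", "attacker"),
    ("10.0.0.0/22", "legitimate origin"),
    ("10.0.1.0/24", "attacker"),
]

print(longest_prefix_match("192.0.2.53", table))  # the /32 is chosen for the server address
print(longest_prefix_match("192.0.2.10", table))  # other addresses still follow the /24
print(longest_prefix_match("10.0.1.7", table))    # a /24 beats the covering /22
```

The last lookup shows why filtering out announcements longer than a /24 does not help when the legitimate prefix is a /22: the hijacker's /24 passes the filter and is still more specific.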
There was a use case there, which was censorship. They wanted to block access to YouTube, so they hijacked the prefix of YouTube inside Pakistan, and the announcement leaked outside of Pakistan. And just down here, this was in 2010, when China Telecom announced a lot of prefixes and hijacked some of them. This is a thing I found very recently. There are some Chinese IEEE journals that you can read where they talk about system censorship, and it's kind of hard to understand what they're doing, but I think they might be doing prefix hijacks. So it's possible that this was somehow related to that, but I'm not sure. And then there are other events here. For example, ConEd did roughly the same thing as this China Telecom event in 2006. The only difference was that in this one, sorry, this one, the network went down, and the China Telecom one was really interesting because the traffic flowed into China Telecom and then left and continued on its merry way, and nobody knew what was happening. So in a lot of these cases the network just goes down, and in some of them it gets interesting. Excuse me. This China attack you're referring to is the one where basically 30% of the traffic to China would be directly yours. And I didn't say attack, by the way. There's no understanding of what the intent was in any of these cases, so if I said the word attack, I did not mean to say it. Is there any understanding of the intent? So for this particular one, I know that there were some forensics done, and they were trying to understand whether it was sub-prefix hijacks. Sub-prefix hijacks would look much more suspicious than announcing the exact same prefix, which could sometimes happen because of a router bug, and they found that it was not sub-prefix hijacks. So that's what we know. All right, so we know how to solve these problems: use cryptography, right?
And it actually seems kind of simple how we might solve these problems, and there are a lot of solutions that have been proposed over the past few years. I want to focus on the two that have been getting a lot of traction recently. The first one is called the RPKI. This was standardized a year ago, but deployment started even earlier than that, and what this thing does is certify IP address allocations. What that means is that we can no longer have someone claiming to own a prefix that they don't actually own, because this is a sort of trusted database that will ensure that the prefix-to-AS mapping is trusted. The important thing to keep in mind about this is that it's checked out of band. It's not like the routers, every time they get a routing message, are going to do a bunch of cryptography to figure out what this message is. There's a centralized, well, a distributed database; you download it to your local cache, you do the verifications, and then the routers check against this local cache. So it's really not a very crypto-intensive protocol for the routers. But that requires everyone along the path if I can see and try to stay this far I didn't know if it was possible or something close to that. You know that's this protocol so when I show you the details of this protocol this is just through the state actually. So in this protocol that's true. So this is the second protocol. There's been a lot of contention on how to properly design this; about 10 years ago there was a lot of discussion about it. The protocol that they're currently standardizing is called BGPsec. It's the older brother of S-BGP, Secure BGP. What this is doing is certifying routing announcements. It's actually verifying that the entire BGP path is correct, and it's signing the messages with public key signatures. So this is a much more crypto-intensive protocol: you're changing the BGP message format and putting signatures on it.
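The out-of-band check against a local cache can be sketched as follows. This is a simplified illustration in the spirit of RPKI origin validation, not an implementation of any real validator: the `ROA` tuple, the function `validate_origin`, the three result strings, and the AS numbers and prefixes (all from documentation ranges) are assumptions made for this example.

```python
import ipaddress
from typing import NamedTuple

class ROA(NamedTuple):
    """A simplified authorization record: this origin AS may announce this prefix."""
    prefix: str
    max_length: int  # longest prefix length the origin is authorized to announce
    origin_as: int

def validate_origin(route_prefix, origin_as, roas):
    """Classify an announcement against a locally cached list of ROAs.

    No cryptography happens here: signatures are assumed to have been verified
    when the cache was built, so the router only does a table lookup.
    """
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue  # this ROA says nothing about the announced prefix
        covered = True
        if route.prefixlen <= roa.max_length and origin_as == roa.origin_as:
            return "valid"
    # Covered by some ROA but matching none of them: the origin is not authorized.
    return "invalid" if covered else "not-found"

# Hypothetical local cache with one ROA: AS 64496 may originate 192.0.2.0/24.
cache = [ROA("192.0.2.0/24", 24, 64496)]

print(validate_origin("192.0.2.0/24", 64496, cache))     # authorized origin
print(validate_origin("192.0.2.53/32", 64511, cache))    # more specific prefix, wrong AS
print(validate_origin("198.51.100.0/24", 64496, cache))  # no ROA covers this prefix
```

Note that only the prefix and the origin AS are checked. Nothing in this lookup says anything about the rest of the path, which is the gap the second protocol is meant to close.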
So if you're not completely sure how these work, I promise I'll show you exactly. Oh, and where we are today is that we have about 1% deployment of the RPKI in terms of certificates being issued. So, the challenges of deploying this: we know how to design these protocols, and in fact the first designs of these were made in 2000. The challenge is getting these deployed, and what I'm going to be talking about mostly is that if you want to deploy such a thing in the Internet, you have to deal with the fact that not every network will adopt it at the same time. It has to be backwards compatible, so if you adopt this protocol you shouldn't all of a sudden lose access to a large part of the Internet that has not yet adopted the protocol, which is, by the way, one of the difficulties of IPv6, because it's not very backwards compatible. And finally, what I'm not showing here is that it should provide you some security benefits. If you're going to go to all this effort, it should do these two things plus give you some value. So what we're trying to do in this series of works is to understand how you adopt these protocols and what they give you. There are a couple of questions that we tried to answer. One of them was: what do you actually get in terms of security from these protocols? It seems like that question should be answered by the protocol itself, but what's really understood about these protocols is that if you, for example, certify routing announcements with signatures, then you know that nobody can lie about the routing announcements. But what does that actually mean about resistance to attacks? Because the routing announcements are signed, does that mean that you actually won't choose routes to the attacker? And how does this interact with the fact that this is happening on a graph where there are different networks making different decisions, using these protocols in different ways, and affecting each other?
So that's really what we're looking at in this bunch of works. Another set of questions is: what are the incentives for networks to adopt these protocols? When you retroactively secure a system, you want to give some sort of incentive to do this. We have some work on this, and I'm not going to talk today about the economic incentives of using these protocols. And then finally, the thing that I'm most excited about right now is how they change trust relationships. When you put crypto into a system, crypto is just a way of codifying trust relationships, and sometimes when you put crypto into a system you change the trust relationships, and this can be uncomfortable for people. This is particularly the case with certificate hierarchies, so this is one of the things I'm going to talk about today, and that's the focus of our other paper. So specifically, what are we looking at when we talk about the security benefits of these protocols? What we want to understand is this: as we move from a deployment of the RPKI, which is this sort of offline protocol that doesn't actually change the routing messages, towards a full cryptographic deployment of BGPsec, what kind of value do we get as we move from here to here? And when I say move, I mean that as more and more networks deploy this protocol, how much more benefit do we get moving from RPKI to BGPsec? To spoil the result: having the RPKI deployed is actually the most important thing, from what we're finding, and with BGPsec you do get some gains, but they're marginal in many cases, and I'm going to talk about that later. And then the second thing I want to talk about is, given that the RPKI is so valuable in terms of providing security, how can you actually deploy the system in a way that doesn't make people uncomfortable in terms of trust relationships being altered and new vulnerabilities being introduced into the system? That's the subject of the HotNets paper.
So there are two parts to the talk. The first part is most of the talk, and the second part is three slides. I'm going to start by talking about the security benefits of RPKI and BGPsec, and I have to start with a bunch of background telling you what these two protocols do. So first of all, what is the RPKI? The RPKI is a cryptographic certificate hierarchy, and this is the way it works. What I'm showing you here is the IP prefix allocation hierarchy. What we have over there is a regional Internet registry. What it does is allocate IP prefixes, and what I'm showing here is that it allocates this prefix to this organization, Sprint, and then another prefix to another organization, and so on. These are sub-allocations, and you can see that this prefix is a subset of that one, and similarly there. What the RPKI does is certify this hierarchy. So now you can think of each one of these entities as a certificate authority that has a certificate. Inside the certificate is a key, and the key can be used to sign a certificate for the party that it's delegating the addresses to, and so on. So there's a key here that's signing this, and this key is signing this, and the validity of these certificates depends on the signatures being valid and also on the prefixes being subsets of each other. If you look at this, it is different from the SSL certificate hierarchy in an important way: here you require that these names be subsets of each other, whereas in the SSL certificate hierarchy any CA can issue any certificate for anyone. That's not the case here. And I told you that this certifies IP address allocations. What I mean by that is that this says this prefix is allowed to be announced in BGP by this autonomous system. What happens is that a certificate will issue these objects called ROAs, which are route origin authorizations, saying that this prefix is allowed to be originated by this AS. Yes, and this is again just this key here signing a message that says this. Okay, so
these are leaves of the hierarchy and these are intermediate nodes. So what can we do with this? If we go back to our CyberBunker example, and we imagine that you were allowed to accept such /32s for now, then if we have this RPKI we would have a ROA here saying that this prefix belongs to this autonomous system, and that would exist in the RPKI. If Greenhost was actually using the RPKI, what it would do is check the validity of these two messages, and you can see that this message here is RPKI valid. This is a weak form of validity: it just says that the last hop and the prefix match. That's what RPKI valid means. And this thing is RPKI invalid, and the reason is that there's no ROA saying that this AS should announce this particular prefix. So what is our guy going to do? He's going to look at the validity, and he's going to use the valid route and drop this one. So if you look at this, probably many people think, I can easily subvert the system. So how do you subvert the system? I was waiting for the question, but since I only have 45 minutes I'll just show you. This is how you subvert the system. We have this weak notion of RPKI validity, which just depends on the last hop on the path, and so what you're going to do is just announce this route. It is RPKI valid because this AS and this prefix are in this ROA, and everything's all right. So what's our guy going to do? Now he has a decision: he has to decide between two paths that have equal-length prefixes. We no longer have the situation we had before, where the victim will always choose the bogus route because it has a longer prefix. Now they're equal-length prefixes, because these are both /24s, and so the decision is based on the path length. This path happens to be shorter, so now it's the path length that's going to determine the success of the attacker. So how do we defeat this attack? And by the way, if we go back into the history of these protocols, people were aware of this right away, and no one
would have thought that RPKI on its own would be the solution to the problem. What you need to do is actually secure the path. So now let's see how we secure the path. What happens is that we use the certificates in the RPKI to issue keys for each one of these autonomous systems, and now we have these keys that are certified by the RPKI, and they can be used to sign routing announcements. So this is a cryptographically signed routing announcement. What I'm showing here is this autonomous system saying, this is the prefix that I'm announcing, and he's saying, SCNET, you can use this prefix and this path, and that whole thing is signed by his key. The next guy on the path will use that message, put his own name on the path, authorize the party he is giving the path to to re-announce this path, and then sign that with his key, and so on. So this whole blob here is signed, and so you can build up a chain of signatures, and you know that everyone on this path has said that they are actually using this path. So why does our attack fail? This forward-signing thing, the fact that the name of the party you're giving the path to is signed, is actually key to the security of this protocol. If our attacker wanted to claim this path here, he would need to have a message from this guy saying he was authorized to announce this path, but of course he has no such message because he has no such edge, and so he cannot claim to have a path to the legitimate prefix, and the attack fails. So that's the background. What we're doing in this paper... So how often is the RPKI database updated? Very uncommonly, it's like months. It's not supposed to be very regular. I mean, this is just a set of authorizations. You can think of it as a whitelist of who's allowed to announce the prefix. You wouldn't pull a ROA out of the RPKI because you're doing something like that. So let's say I see a new announcement and it's not there in my database yet. What do I do? So I'm going to get to that
later, actually. Are you talking about during the adoption path? Yeah. That's a really interesting question. Okay, so here's the setup for our analysis. What we're going to do is assume that the RPKI is fully deployed. What that means is that prefix hijacks and sub-prefix hijacks don't happen. We just have this attack that I showed you before, which is the one-hop hijack, where the attacker is doing this. So assume that that's the only attack that we have. What we want to understand is, as we move from a world in which everybody has RPKI and no one has BGPsec all the way to a world in which everybody has BGPsec, how does security improve? The reason that we chose to ask this question was that we tried to understand, given that BGPsec is a more intensive protocol, what you are actually getting by making this transition. The way we're going to answer this question is by trying to understand how many networks will actually avoid being attacked, how many networks will avoid routing to the attacker. There's a bunch of things that we did here, but I'm just going to show you a few of them. Okay, so in particular we're interested in this question of partial deployment, because we want to move from a world where no one's got BGPsec to a world where everyone does, and in a world of partial deployment we need backwards compatibility. So if this network does not speak BGPsec (that's why it's purple), Greenhost still needs to accept legacy routing announcements from it, because otherwise it would lose access to the legacy network. You can't just turn off insecure announcements, because you would lose access to a lot of things, particularly in partial deployment. So of course this guy can exploit that. He can exploit the fact that he can send insecure announcements, and the way he's going to do it, and this is really the threat model for most of this paper, is that he's just going to do the same attack we saw before. This attack will work because this is a plain BGP announcement; this is RPKI valid over here,
this is plain BGP, and Greenhost cannot verify that this is incorrect. So now, from Greenhost's perspective, it learns two routes: a secure route that's long and an insecure route that's short. So what should it do? This is really the crux of the whole paper. If we're just thinking about security, and we're just security people, we would say you should just prefer the secure route. But if you actually think about how this thing would be deployed and you talk to operators, is putting secure routes first the most important thing? So the answer is no, they wouldn't. Did you ask them? Oh yeah, I mean, you have to ask them. We did. We asked. What did they say? I'll show you what they said. First I need to do a little bit more background and then tell you what they said. So the important thing to keep in mind is this: imagine you're running a network and you have this expensive route that's secure and this cheap route that's insecure. To understand how this works, I just want to review how people make routing decisions. This is a very simplified view; in practice this has about 12 or 8 steps, depending on what you look at. The first step that we modeled was local preference. What local preference is, is that an autonomous system can label any route according to whatever criteria it wants. So for example it can label a route as "this is a provider route, it's expensive, it charges me a lot of money," or "I get paid for sending traffic on this route." So there are all sorts of economic considerations that determine how routes are labeled. That and various other things are rolled into local preference, also load balancing. The second thing is that after you go through local preference, there's a preference for routes that are shorter, and shorter just means the number of ASes on the path. And then there are a number of other criteria, which, if you've heard of MEDs, have to do with that, with how close they are to a particular network. We didn't model this
because all of our experiments were done on graphs of the AS-level topology, and so we don't have information that can inform doing anything with this. So this is how people make routing decisions. Now this is what... So performance is often folded into things like latency and bandwidth, right? This is like me opening a huge can of worms. It actually doesn't matter at all. There's a debate in the networking community about whether AS path length actually determines performance, and a lot of people say it doesn't. Anyway, that doesn't matter for me. The point is that people do prefer shorter routes. So let's see. Okay, so this is what we would want as security people, and this is what a network operator might prefer, because local preference could include the cost of choosing a route. So he or she might prefer to use that as the first priority and then have security be the second priority, and we call this our security second model. And in the last one we prefer shorter routes over secure routes. So as a result we have three models. We wanted to ask operators what they would do, so we gave them a list of the BGP decision process and asked them where they would put security, and these are the answers that we got. We asked a hundred operators and we got a hundred answers. You'll notice that this doesn't add up to a hundred, because some of them said they didn't want to use BGPsec at all, or they said it's not standardized so they can't answer these questions. So that's what we learned: some would actually put security first, most of them would put it third, and some of them would put it second. So we did a survey at that moment. Okay, so what does our paper do? What we're looking at is this: we're going to assume that everybody uses the same security model, and then, if you want to understand what the security benefits are, which will help you explain where they are, we're trying to quantify the security benefits of BGPsec in a world where some set S of ASes is secure. So assume that
everybody uses the same security policy, either this one or one of the other two, and ask what happens if you deploy BGPsec at a set of ASes S. So that's our main question. I should note that the simulations and the theoretical results that we proved were in a particular model of local preference. For those who don't know what I'm talking about, it doesn't matter for anything I'm going to show. We used the Gao-Rexford model for local preference. What that means is that you prefer a path through a customer over a path through a peer over a path through a provider. We're also doing robustness tests of the assumptions: we're trying other local preference models and seeing if the results hold up. But this is the model for the results I'm going to show in this talk. Okay, good. So given that we have these three different models of partial deployment, if we go back to the situation we saw before, what does Greenhost do? If we're in the security third model, what's going to happen is that route length for Greenhost will trump security. In the case of this attack, this path is actually shorter than the secure route, and Greenhost would choose the insecure route. This is a classic protocol downgrade attack. What's happening is that you're exploiting the fact that he has to accept the old protocol and that he uses this kind of decision process, which allows him to prefer the old protocol over the new one, and you get him to downgrade. So for all the results that I'm going to show you, we did a lot of simulations to try to understand how often this happens. It actually happens a lot, and so this is going to account for most of what I'm showing you. So, just to understand your simulation: you just picked an AS at random and said, okay, this AS is going to prefer route length over security based on some probability? So the setting we're looking at is, assume everybody prefers route length over security. So
let's assume we're working in the security third model, and then what we do in our simulations is pick attackers and destinations and some set of secure nodes, and we see who routes to the attacker and who routes to the destination. And how about the secure nodes? Right, next slide. But everybody uses the same security policy? So let me tell you a couple of reasons why we did that. One is that it was hard to analyze if they all used different security policies, because we would then have to analyze different sets of secure nodes and different security policies, so it starts to get too complicated. Another reason is that we can show that BGP won't converge if people use different security policies, which meant that our simulations wouldn't converge. So we didn't actually try to do an experiment with multiple security policies. So you asked me how you choose the set of secure nodes. Right, so we didn't want to choose the set of secure nodes. There are 40,000-ish autonomous systems in the Internet; how do you choose the set of secure nodes? What we wanted to do was try to quantify security without actually having to do that, without having to choose the set of secure nodes, and I'm going to show you how we did that. So let's look at this graph. We're going to assume that we're in the security third model, so route length trumps security. Let's look at Greenhost. Greenhost here has a very particular property: Greenhost is doomed. So why is he doomed? In the security third model, the attacker offers him a route that's 2 hops long, whereas his legitimate route is 3 hops long, or 4. So the attacker's route is shorter than the legitimate route. If everybody in this network is secure, the attacker's route will still be shorter than the legitimate route. If I make everybody insecure, the same thing will be true: the attacker's route will still be shorter. It doesn't matter who I secure in this network; this node,
Greenhost, will always be attacked. It will always be doomed. So in the security third model, and similarly in the security second model, you can identify nodes that are doomed, and this allows us to determine the maximum benefit you can get from BGPsec, because even if we have a full deployment of BGPsec, the doomed nodes will remain doomed. They will still route to the attacker and be attacked. There's one possibility that it might be preferred over or over the peer, so people don't have a business relationship; would that possibly help alter this behavior? For example, if Greenhost had a customer that was secure? So actually in the security first model this preference for customer over peer will determine a lot of who's doomed. That's very important. I just want to call your attention to one other thing. You can classify these nodes here, nLayer and SCNET, as immune. So let's look at these nodes. It doesn't matter what the attacker is going to do here: SCNET is always closer to the legitimate destination. If nobody is secure, he's still closer. If everybody is secure, he's still closer. It doesn't matter who you secure in this network: SCNET, and similarly nLayer, will both always route to the legitimate destination. They have no need for BGPsec in the security third model. So in this graph the only one who actually benefits from BGPsec, who will change its routing decision based on what's secure and what's insecure, is this node here, NTT, because what it's learning here is a two-hop or three-hop path to the destination that's legitimate, and then a same-length path to the destination that's bogus. So if that path happens to be secure, maybe he'll choose it, but if it's not secure, maybe he prefers the other path for some other reason. So the takeaway from this is that we can actually classify nodes into three groups: doomed, immune, and what we call protectable. Only the protectable nodes actually benefit from BGPsec. I don't understand what the difference is between BGPsec and the RPKI. Sure, so what RPKI is doing,
it's imposing that the attacker can only do this attack. If there was no RPKI, what he would do is just claim to own the prefix itself, and in particular he could even claim to own a sub-prefix, and if he owns a sub-prefix, everybody in the network would route to him. So what's happening, and this is just a simple thing, is that by imposing RPKI the attacker's success is determined by his position in the topology of the network. In the absence of RPKI, what determines his success is just announcing a longer prefix; the topology doesn't matter. So you're really weakening the attacker by forcing him to announce a path that is actually for the same prefix, and you force him to actually attract the people. Right, I understand that, but what was the pretty clarifying? So in this particular picture it's not preventing him from doing any attacks. He's just attacking however he did attack, because if there was no BGPsec he's just downgrading to plain BGP and attacking BGP. So there's nothing preventing his attack. What BGPsec is doing is not changing the attacker's attack; what it's changing is the decisions of the other nodes. So for the nodes that are actually secure, looking at NTT here, if he was a secure node, these are equal-length paths, but this one will be secure and this one will be insecure, and so that means he should choose this one in the security third model. So what BGPsec is doing is changing the decisions of other nodes, but it's not changing the attacker's actions. I guess the thing I don't understand is that, if I understand it correctly, if you deploy RPKI that's really enough to give you... oh wait, is BGPsec basically that every hop in the route has to be signed? Yes, this is what it looks like. Every hop is going to be signed. So the rest of the route is encapsulated by the guys, right? So just to summarize this: what's happening here is that the attacker's success will depend on his position in the network, and that's
being forced by RPKI, because he can no longer do sub-prefix hijacks that would attract all the traffic. We did some evaluation of this in each one of the three security models. So what I'm showing here, this line... I should tell you what the Y axis is first. So what did we measure here? What we did was run a whole bunch of simulations where we picked attackers and destinations in the graph, so two nodes in the graph. We have the graph of the autonomous systems; we get this from UCLA's research. We pick an attacker and a destination, we run a simulation to see where traffic will flow, and we count how many nodes route to the legitimate destination and how many nodes route to the attacker. Then we compute the average over all attackers and destinations. So what this line shows you is the fraction of nodes that do not route to the attacker with just RPKI. So if nobody has deployed BGPsec, more than half of the nodes do not route to the attacker, even without BGPsec at all. Now what this is showing you, this is assuming full deployment of BGPsec. The way we computed this is we figured out which nodes were doomed and we subtracted them out, and this shows us an upper bound on the maximum security you can get with BGPsec, in particular with a full deployment. So a partial deployment will land somewhere between them. So what you can see is that in the security third model, the gains that you have over RPKI are about 17%, 36% in the security second model, and then you get up to 100% improvement with the security first model. So do you assume a single attacker? Yeah, we're always assuming a single attacker. Why is that?
Honestly, because there are too many parameters. So there are a couple of problems with multiple attackers. With BGPsec, if you have colluding attackers, everything fails, because they can just connect themselves with a link that might not be there. So I've never tried to actually have multiple colluding attackers with BGPsec. I mean, we had to play with routing policies and the selection of nodes, so we didn't actually have multiple attackers. Multiple attackers at the same time, too? Yeah, the power would go up significantly, because they'd be closer to more nodes. So, just to remind you, this is just an upper bound, computed by removing the doomed nodes. Here's just one example. We tried many different deployments; this is one that we picked. This deployment is securing all the tier ones and the 100 nodes of highest degree and all their stubs. For those who understand that, good; that's about 50% of the nodes in the graph. So we wanted to understand how close we get to these upper bounds, and you can see that this is how close we get. So in the security first model, if we secure about half the nodes in the graph, we get about halfway to where we could have gotten, but in these other models, if we secure half, we get much lower numbers. So why is this happening? There are a lot of effects that we found here; I'll just briefly mention some of them. One of them is the protocol downgrade attacks that I showed you; that's the main effect. Another thing that we see is: you're not secure, but someone next to you is secure, and he picked a good route, so you also picked a good route. So you get some sort of benefit from someone else being secure for you, and you pick a good route because they did. Those are the two major effects. There are other, weirder effects. One effect is: you are insecure, someone next to you is secure, and you used to use their route, but now, because they're secure, they're using a super long route through the secure part of the network. So what's going to happen
to you? You don't want to use them. So what you can actually show is that as you secure more nodes in the network, you can actually get more nodes being attacked, because the paths get longer and people avoid those nodes. So what we did is we looked at all the different reasons these numbers go up and down, and the biggest effect that we found was protocol downgrade attacks. So what you're seeing is that, let's say, a quarter of the paths in the network are secure, but people are just not using them as soon as the attacker attacks. Could you remind me whether BGPsec signs the entire path all the way to the destination, or does it sign only a partial path? So that's an important point. If we look at how this works, it's only signing the entire path. This was actually really strongly debated when they were standardizing this. Let's say that this guy is not secure. What do you do? What you do is downgrade back to BGP, so the rest of the path will be announced with insecure BGP. So you only get security if you have a path that is signed by every node on it. So if you have BGPsec fully deployed, then the entire AS path all the way to the destination will be signed, and if you have it partially deployed, let's say like in this setting here, then this path here would be signed, but as soon as it gets to Greenhost, which doesn't speak BGPsec, this would downgrade to plain BGP, so it would go from signed to insecure. So everybody, it's signed all the way down, recursively, correct?
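The all-or-nothing signing rule described above can be sketched in a few lines of Python. This is only a minimal illustration, not the BGPsec protocol itself: the function names and the AS labels are hypothetical, and the sketch assumes all candidate routes have the same local preference, so business relationships are ignored. A path counts as secure only if every AS on it speaks BGPsec, and in the security third model that fact is used only to break ties between equal-length paths.

```python
def is_fully_signed(as_path, bgpsec_speakers):
    """A path is secure only if every AS on it speaks BGPsec.

    One non-speaking AS anywhere on the path downgrades the
    whole route to plain, insecure BGP.
    """
    return all(asn in bgpsec_speakers for asn in as_path)


def pick_route_security_third(routes, bgpsec_speakers):
    """Choose a route in a simplified security third model.

    Assumption: equal local preference for all candidates.
    Shorter paths win first; security only breaks length ties.
    """
    return min(
        routes,
        key=lambda path: (len(path), not is_fully_signed(path, bgpsec_speakers)),
    )


# Hypothetical deployment: A, B and D speak BGPsec; M (the attacker) does not.
speakers = {"A", "B", "D"}

# Equal-length paths: the fully signed one is chosen.
tie_choice = pick_route_security_third([["X", "M"], ["A", "B"]], speakers)

# A shorter insecure path beats a longer secure one: a protocol downgrade.
downgrade_choice = pick_route_security_third([["M"], ["A", "B"]], speakers)
```

In this sketch `tie_choice` is the signed path through A and B, while `downgrade_choice` is the attacker's shorter, unsigned path, which is the protocol downgrade effect the talk identifies as the main reason partial deployments fall short of the upper bound.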
Which is nice, because you can say, well, I trust the guy here, but I don't trust this guy down there, so maybe it's not good. A sort of primitive question: how would that route end up at Greenhost initially? Yeah, so the way this worked was that there's this thing called the Netherlands internet exchange, where a whole bunch of networks just come and connect themselves to the exchange, and there are route servers there. So instead of physically coming over and plugging their cable into the other guy's router, they go to these exchange points where there are route servers that you can just use, and you can download routes from the route server. But don't they have a policy for discriminating? No. So people who come into the internet exchange have different kinds of policies: they'll give routes to certain people and accept routes from other people. So these two guys probably had an open policy, which meant they accepted every route on the route server. So what it's like, imagine you come to this... But doesn't that happen only in some internet exchanges? I'm sorry, that kind of policy only happens in certain kinds of... For example, there are different kinds of policy: there's open, there's selective, and I think open means you take everything. But that is how this happened. It happened in the Netherlands internet exchange. So they sort of injected this into the Netherlands internet exchange, whoever was accepting their routes saw this route, and it turns out that Greenhost didn't have a slash 32 filter on, and that's how it happened. The point of this is really just to show you that by forcing the attacker to be constrained by the topology, the success of his attack goes down, and then when you combine that with the routing policies that people use, the gains over RPKI become smaller. So that's really the main result of this paper. And then just a little bit about how we did this. We did this, the
simulation. We tried lots of different things. We used different graphs. So this graph, for those who understand what I mean, there are a lot of questions about how many edges there are in the internet, so we just threw four times more edges into the graph just to see what it would do. We tried different destinations and different attackers, and we're currently in the process of writing a journal paper where we're going to try different local preference models. We haven't seen a lot of changes. So to summarize what we showed here: what we're finding is that by forcing the attacker to actually announce paths to prefixes that are really being announced, and constraining him by the topology, the gains we're getting here are a lot more than we would have expected. We don't really need to go all the way here. Even though you can exploit the RPKI with a fairly obvious attack, the success of the attack is really constrained by the topology, and the topology is such that paths are short enough that the success of that attack is actually not that good. Legitimate paths are pretty short, so the attacker announcing a short path doesn't mean that he wins. And so it sort of raises a question: when people use BGPsec, how are they going to use it in their routing policies? If they're not using a security first policy, which does actually give us pretty good performance from BGPsec, what do we get from that protocol? So anyway, what we started doing was looking at RPKI and what it takes to actually get this thing deployed. So what I want to talk about briefly is what the hurdles and challenges are in moving to the RPKI. This is a HotNets paper, in which I'm only going to raise questions, not offer solutions; we're actually working on some solutions. So let's go back to our picture. This is the first picture we saw before. So here we have a ROA that exists in the RPKI, and this goes to your question from before. Here we have a
routing announcement that we want to be bogus. This routing announcement should be bogus, so that everybody is going to drop it, just like this. So what does that mean? Does that mean that we should have a certificate in the RPKI that is this certificate, and it should be invalid? Of course not. There's not going to be any certificate here; there will be nothing in the RPKI that connects this to this. So what does that mean? It means that what you expect from the existence of certificates is not actually the way you can build this system. If we think about certificates normally, if the certificate is valid, whatever it's validating is valid. If the certificate is invalid, like the signature is bad or something is wrong with it, then whatever the certificate is validating is invalid. And if the certificate is not there, then we have no information. That's not the case in the RPKI, for exactly the reason I showed you before. So you can have a certificate that's missing, and what that's going to do to the BGP route is make it RPKI invalid, and that's exactly the example I showed you before. Similarly, if you have a certificate that's invalid, it might just make the BGP route valid. I'm not telling you why that is, but the same thing can happen there as well. This is one of the big challenges of this system that you were bringing up: if there's a certificate that's not there, what are you supposed to do?
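To make the valid, invalid, and no-information cases concrete, here is a small Python sketch of route origin validation. It follows the usual three-state rule (a route is valid if some covering ROA matches its origin AS and prefix length, invalid if it is covered but nothing matches, and unknown if no ROA covers it). The function name, the tuple layout for a ROA, and the documentation prefixes and AS numbers are all assumptions made for illustration; nothing here is taken from the slides.

```python
import ipaddress


def origin_validity(prefix, origin_as, roas):
    """Classify a BGP route against a list of ROAs.

    roas: iterable of (roa_prefix, max_length, authorized_as) tuples,
    a hypothetical in-memory stand-in for the validated RPKI data.
    """
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, authorized_as in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        # A ROA is relevant only if its prefix covers the announced prefix.
        if route.version == roa_net.version and route.subnet_of(roa_net):
            covered = True
            if authorized_as == origin_as and route.prefixlen <= max_length:
                return "valid"
    # Covered by some ROA but matched by none: invalid.
    # Not covered by any ROA: the RPKI gives no information.
    return "invalid" if covered else "unknown"


# One ROA: AS 64500 may originate 192.0.2.0/24, and nothing longer than a /24.
roas = [("192.0.2.0/24", 24, 64500)]

legit = origin_validity("192.0.2.0/24", 64500, roas)
hijack = origin_validity("192.0.2.7/32", 64501, roas)
uncovered = origin_validity("198.51.100.0/24", 64501, roas)
```

The slash 32 announcement has no certificate of its own, yet it comes out invalid because the existing ROA covers it, which is the point made above: a missing certificate can make a route invalid rather than merely unknown.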
Does that mean that the route isn't valid and it's a hijack, or does that mean that nobody got around to actually deploying the certificate because we're in the partial deployment process? And then there's also a problem with misconfigurations. What if someone puts in the wrong certificate and causes things to be valid by mistake? Like, what happens if someone makes a typo and puts the wrong AS over here? Actually getting validity right for the system is quite difficult. So that's the influence that RPKI has on actual route validity. The next piece of influence that the RPKI has is on the routing decision. So if we go back to our simplified slide over here, what we showed was that this guy is dropping RPKI-invalid routes, so he's not going to use this route, and so he's going to route this way, which is simple. But what happens if the reason that something is invalid is that the certificate was missing, or it was wrong, or something bad happened to the RPKI, or someone compromised the RPKI? What that means is that the prefix can actually go offline: if you're not going to select an RPKI-invalid route, and something weird happens that causes the route to become invalid, you're not going to use the route. So it can actually harm connectivity. So this is actually one of the biggest deployment challenges here: how do we balance between protection against routing hijacks and protection against RPKI problems? I'm not giving you enough details to understand this, but there have been other routing policies proposed. The ones that were proposed will stop the problem of something going wrong in the RPKI and taking prefixes offline, but what they don't prevent is sub-prefix hijacks, which, as I've been saying the whole time, are the worst attacks that you can do. So you have this challenge of how you deal with the fact that things can become invalid: what are you supposed to do with
them? What does it actually mean? This is particularly a challenge when we're deploying a system, and it's one of the things we're trying to deal with. The second thing I want to show is how you can cause something to become invalid deliberately. This is something that we found when we were reading the specs, and we showed it to the designers, and they weren't expecting this to happen. So here's something that everybody knew would happen. This is a certificate hierarchy. In any certificate hierarchy in a public key infrastructure, you have certificate revocation lists. They're there because keys can get compromised and you need some way to update the certificates, right? You need to be able to say this certificate is compromised and the key is bad. So the existence of certificate revocation lists means that you now have the technical means to actually seize IP prefixes. This is something we've never had before in the internet. So today, when you get an IP prefix, you go to this guy, ARIN, at least in this region, and you say, I want a prefix, and they give it to you, and they put an entry in a database saying this is your prefix, and that's it. There is nothing that feeds directly into routers that will influence how routers make their decisions based on the allocation of the prefix. The presence of this system changes that, right? So one thing that could happen is, if Sprint for some reason wanted to prevent this AS from using this prefix, it could just revoke this certificate. So it would revoke this certificate, and the impact of that would be that all these certificates would be revoked. So this is a really blunt way to do a revocation. Certainly if this guy wanted to target this certificate, it could just revoke it. But everybody would know that Sprint did it. That's right. Also, these guys have a lot of business relationships, right? OK, so both of you are... OK, so this is actually an interesting aspect of the system. So certainly, if they were
doing a certificate revocation list, yes. However, how did they build the system? The system is built through repositories. Repositories are controlled by the issuer. So this certificate was issued by this guy, which means it sits in his repository. OK, so this is what Sprint's repository is doing. Now, what was done to make key rollover more efficient is that the way you do key rollover is by overwriting the old certificate. OK, so that's the first observation: you don't actually have to put anything on a certificate revocation list. If you did, that would be a non-repudiable way to prove that you did something, but here you can actually overwrite things, and you don't even have to put anything on a certificate revocation list. So that's a design decision that they made to make rollover efficient. Can that prevent them from still using the AS? Oh, are you saying because of people who look at the wrong key? So now I'm going to show you. And then what did you say? You said that they all have business relationships. OK, so that's not necessarily true in all parts of the IP address space. That seems to be an implicit assumption in the design, and if that's not the case... So it is true in some parts but not all parts, so we'll talk about that. There's chaos that's going to ensue if you don't have that, right? Which is to say that if I'm AT&T and I can allocate pieces of Sprint's address space, that's madness, right? Let me show you. OK, so just back to your question. Let's suppose he wants to target this. So there are two things. One, the repositories are managed by the issuer of the certificates. The second thing is that these are certificates for sets, not for names or individual points. So since I can overwrite certificates, and they're for sets, here's what he's going to do: he's going to overwrite this certificate with
the same certificate, right, except that he's going to remove one address here. That address is inside here, which means this guy can't be valid anymore. So what's happening here, and this is what we found with Kyle, who was working with me last summer, is that this particular certificate can be taken down in a very targeted way with this overwriting. We did a study of how this is possible, and when we showed this to the designers, they actually hadn't realized it. So these are some challenges: how do you actually deal with the fact that there can be revocations, overwriting, and very targeted manipulations of this hierarchy? So, to the other question, what's going on with this hierarchy? This is just one view of it. We tried to model the hierarchy; the reason we modeled it is that the deployment is too small to see what it would look like in the future. So we built a model using BGP information and data from the routing information registries and an AS-to-country mapping. So this is a map of the IPv4 address space. Imagine the IPv4 address space is a line, and you fold it up into a Hilbert curve. The result is that this thing is a slash 8, half of this is a slash 9, one quarter is a slash 10, and so on. So this is the entire address space, and what I'm showing here is the number of countries that are sitting under a certificate for each direct allocation. So the roots of this RPKI are the regional internet registries, and then one level below them, these are the number of countries that these things cover. You can see most of this is blue, which means only one country is affected by a particular certificate, but you can see that some of this is green. In particular, here are some interesting examples. Sprint is actually an interesting example. These particular addresses are interesting because they were allocated a long time ago, and then pieces of them were given out to different people, who just
took them. The countries come from a mapping from the autonomous system to the country that it's in; the RIRs have these mappings, and this is really a first-order approximation of the countries, so there might be more countries. You can see that these certificates cover things in multiple countries, so there are some interesting questions on how you manage this. And then to your question, can you detect if something goes wrong? That's actually what we're working on, and what a bunch of people are working on right now. So the need for monitoring in this kind of infrastructure to prevent these behaviors is obviously very important. We've done work on this for SSL certificates, but it's only starting now for this system. So I'm going to finish now. One thing that we're pretty excited about at BU is trying to understand how you can build this system, where we're retroactively securing a system in a way that feeds into routers, and you want to do this in a stable way. Some of the things we're looking at: one is trying to understand what to do about routing policies in the transition to the deployment of this thing. Another is looking at some of the techniques that people have developed for certificate transparency in SSL, which is a completely different system. The biggest difference is that it's not hierarchical: it's completely flat, where every certificate authority is allowed to sign a certificate for anyone, whereas this system is very specific. So there are lots of techniques there, and some of them might be used here. And we're also looking at a variety of other issues in the partial deployment as this thing gets along. BGPsec is where you would like to be, and the marginal gains in the process of getting there... and the problem is the deployment for the other... but if it represents where you agree
that you're not being exploited... there are other ways of getting there: routing policies or other protocols. So, for example, and I'll suggest things that probably don't work, it appears to be relying on human indifference, or on the operators... Presumably you can create trusted islands that consist of ASes higher up in the hierarchy, and say, I require that anybody that connects to me is secure, so that that part can be more trusted. The short paths are less likely to be that secure, but it would seem as though there would be a social approach to this, or a human approach, to try to persuade them, or of course there's a mandate. It seems as though the implications are severe enough. No, I actually do think that would work. When we were talking about solutions, that's one of the things we were thinking about: are there parts of the network where, with business relationships and topology, we would require people to do security first? We haven't done that yet, but I do think that that's a good way forward. I'm not sure exactly how you would do it, but it would probably be for certain parts of the network, where you say, let's all do security first. This seems an incredibly un-internet-like philosophy in what you've described. You'd say the telephone company or something like that, but a trusted party... On top of the fact that nobody really scales: what's the timeout on the certificates? How big is the revocation list? How do you communicate these securely? And how do you provide
security when it's a tradeoff between getting the packets through or waiting to get better certificates somehow or other? Doesn't this just sound like the wrong approach? I mean, I think I can say a couple of things. I think these are valid concerns. I think that there are a couple of things that they are working on and are aware of in the working group of the IETF. One of them is exactly how you deliver this in a secure way. So there are a lot of issues that are being discussed right now, and how you build this repository structure to communicate the certificates is another problem. This whole revocation problem has been out there for the last 30 years, unsolved. There are alternative proposals for doing certificate distribution through DNS, which also has its problems, because DNS is a query infrastructure and what you really want here is wholesale download of things, not queries. So that's one of the proposals. In terms of actually building the infrastructure, that hasn't been solved. But again, this is just the wrong approach. What is wrong? So one of the things that is interesting, and that I agree with, is that there is Sprint here; like, what's Sprint doing here? The reason Sprint is here is that in the 90s there was a lot of space, and people walked off with the space, and all of a sudden here they are back in the hierarchy and controlling the space again, which is interesting. My bet is it's never going to be... it's been around for 30 years. What's revocation when a new browser gets released every six months? So I mean, what they are saying you should do is every day download the certificates and the new view, wait for 24 hours, and if nothing bad happens... Steve, do you have a different approach?
Do nothing is the other approach. No, I think the other approach is to recognize that you need to basically have enough knowledge at the end points, because fundamentally somebody should care about this: can I get packets from A to B, and can I figure out what I'm actually getting? You just need to know the options. And this is going in the opposite direction, where you secure all of this, and the Sprints can hide more and more because of the lack of visibility, controlling things so you don't get to see what path you might expect. I mean, the internet philosophy is, if there's a path from A to B, you should be able to use it, and this says, well, not if Sprint hasn't said so. That's, I think, a really good criticism. It seems to violate the internet principle, so why is that a good idea? Is it good for the internet? OK, any more questions? Thank you very much.