Hello, our next speaker is Florian Obser, who will be talking about unwind, a privilege-separated, validating DNS recursive name server for every laptop. You have the floor.

Oh, thank you. Can you switch this over? Oh, yeah. Other people's laptops are the worst.

Okay. Good afternoon. I'm going to talk about unwind. I'm Florian. I've been an OpenBSD developer since 2012, where I mostly specialize in privilege-separated network daemons. I've also poked at things in the network stack, and as of this week my total contributions to OpenBSD have been a net negative: I deleted more lines than I added. I'm not going to talk about this because that's for [unclear], I think.

For work, I'm a senior systems engineer at the RIPE NCC, where I'm part of the team that runs the K-root name server, which basically means we have to figure out how to answer NXDOMAIN real fast. But more about the things that I do for fun.

OpenBSD has been described as a hiking club with a computer problem. So when we get out of our basements, we go on hackathons. We meet up in all these beautiful places, so we need to travel there, and you might end up on a train or on hotel Wi-Fi, or you might be in an airport. You're always stuck with what the network provides you DNS-wise. Sometimes you find yourself behind a captive portal and need to get past that. So we were thinking: can we automate this? You always fiddle with your configuration. Can we have a daemon running that handles this for you, that handles the harsh locations, DNS-wise and network-wise, no matter how they are? Maybe you find yourself in a hut in the Rocky Mountains, in the snow, behind a satellite link, I think. We have done this twice. It was awesome.

So what did people do before on their laptops?
You just run dhclient, and it takes over resolv.conf. You will get past captive portals with this, because that's how they are designed to be passed. But you're at the mercy of the operator of whatever name server the network hands you. You don't have DNSSEC, in the sense that you don't run the crypto on your laptop. So maybe the name server does DNSSEC, but it just gives you an AD flag. Can you trust that? I don't. It also probably doesn't give you privacy. That really depends on what the thing behind it does, but you have none on the first hop. And your resolving runs in the same address space. So if I run ping fosdem.org, this goes to libc, uses the stub resolver there, and talks to something where I don't know who actually runs it. Can I trust this? It hands me back a network packet, which I then need to parse.

A variation on this is: yeah, I really don't trust that network, so I put one of the quad-Xs into resolv.conf. This will very likely not get you past captive portals, and it also will not work in places where DNS is filtered.

Then, in OpenBSD we do have unbound in base, so people figured: well, we can just run that. This gives you DNSSEC validation, and you can have privacy with DoT if you configure it. It will very likely not get you past captive portals, and it will not work in places where DNS is filtered. However, it runs in a different address space.

There are some other options. There was a talk last year at FOSDEM about systemd-resolved, which seems to be trying to solve the same problem that we are trying to solve with unwind. So that was certainly an interesting talk for ideas.

Now, what does unwind promise? Again, we're taking over resolv.conf. We get DNSSEC validation, we get DoT if you configure it, it will get you past captive portals, it will work in places where DNS is filtered, and it runs in a different address space.

The architecture: we want to run this always, so we need to get this secure.
We're following the standard OpenBSD privilege-separated daemon architecture, which mostly consists of three processes. We have this design in multiple daemons already, and using that template gives us IPC over pipes with structured data, so we don't need to reparse. It gives us a config file where the grammar is very similar across the daemons, so people are already familiar with the config language. It also gives you a logging framework that logs to syslog or, if you run it in the foreground, automatically logs to standard error. And it gives you a CLI tool.

Now, the three processes. We have a parent process that just spins everything up. It's privileged; it opens port 53 and hands that over to the frontend process. It asks dhclient: did you learn some name servers? Please tell them to me. And we run the processes in a reduced-service operating mode, with a system call we have in OpenBSD. It's called pledge, and it's literally just this string, which means: we pledge to only use the stdio parts of libc or talk to open file descriptors, that's the stdio pledge; we pledge to only open files for reading; and, as a technicality, we can then send file descriptors over a socket. Over a pipe, sorry. And if you do something else in that process, it just gets killed by the kernel.

The second process is the frontend process. It gets the queries from the client, parses them, and hands structured data over to the resolver process. When it gets a response back from there, it sends that back to the client. This part also handles the control socket, and it gets informed by the kernel when an interface goes up or down or just disappears. The learned name servers also arrive here.
So this is the process that's most exposed to the outside, in the sense that this is where user interaction comes in.

The final process is where we do all our heavy lifting for DNS. This part needs to keep track of various resolving strategies; I'm going to talk later about what those are. It receives the query from the frontend process, finds the best strategy to get it resolved, and when an answer comes back, it passes it back to the frontend. Since this one does all the DNS resolving, eventually it needs to talk to the internet. So it has the inet pledge in there, which basically means you're allowed to open a socket and talk to the internet, but not a lot more.

Oh, there's one thing. Since we do support DoT, we need a cert bundle. But we don't want this process to read the whole file system. So we can tell it with the unveil system call: look, there's only this one file; only this file exists in your whole file system. If you try to open something else, the kernel will tell you it does not exist. So you cannot exfiltrate any information that's on the file system. It's just not there.

Time for a breath.

Let's talk about the resolving bit, because this is, I suppose, the interesting part for this room. I mentioned that I run a name server for a living, which means I go to all the conferences and go to the DNS talks, and what the implementers tell me is: oh boy, it's really terrible to implement a resolver. Don't do that. And so I'm not doing that. We're just standing on the shoulders of giants and use libunbound for that, which is the workhorse of unbound. For technical reasons we have a local copy in there, because the API does not expose all the things that we need, so we need to poke at some internals. But this is certainly not a fork. I do not want to maintain that. Every time there's a new release of unbound, I just want to jam that in.

So I mentioned the resolver strategies.
This is where this really gets interesting. You can run unwind without a config file, which already gives you four strategies. It can run as a recursor, basically what unbound would do. It can talk to learned forwarders. It will try to talk opportunistic DoT to learned forwarders. And it can sidestep libunbound and just use the libc stub. We'll see later why that is important.

You can also give it a config file where you configure DoT forwarders with an authentication name. With opportunistic DoT it cannot validate the cert, because it doesn't know the name. But in the config file you can put in a name, and it will then validate the cert. In the config file you can also tweak the preferences of the strategies; this is just the default that it runs with.

All the strategies run off individual libunbound contexts. So one of them can do its own recursion, the other one can talk to a forwarder, but these are different objects. And we found a way to have a shared cache here. We need to be a bit careful there, since you can only share a cache if the contexts have the same quality, which basically means they need to be able to validate. If you try to share a cache with a strategy that cannot validate because signatures are stripped, and you dump that into the same cache, real interesting things happen. And this is all single-threaded. We plug it all together with libevent.

So we need to figure out: are these strategies any good? What we do for that is ask for the start of authority record for the root zone, because we know that it exists and we know that it's signed. From there we can deduce whether a specific strategy is any good.
So we can find out: is this strategy validating, or does it strip RRSIGs or DNSKEYs, which happens on some middleboxes and on some open public resolvers? Then we assign: this thing is resolving. Or maybe we just can't talk DNS to the outside world because ports 53 or 853 are blocked. Then this strategy is clearly dead, and we will not hand any queries to it.

Another thing that we're doing is observing how good a strategy is: how fast does it answer? So we keep track of the round-trip time, put that into a decaying histogram, and calculate a median of the round-trip time over that. Running the CLI tool on my laptop, which I think was online for about a week or so, shows me this. So this is where we calculate the quality from. I suppose one interesting thing is the first column, where the round-trip time is below 10 milliseconds: all the queries that ended up there are basically cache hits. So eventually everything gets answered out of the cache, except for all the other stuff, I guess.

So we have all these strategies, we have checked whether they're any good, and how do we do the resolving now? Well, we need to sort the strategies to find the best one. Validating is always strictly better than resolving, and then we use the median RTT as a tiebreaker. We also have the preference to consider. The way we do that is we skew the most preferred strategy by 200 milliseconds. So if I say I really want recursive resolving as the preferred strategy, what I mean is: I'm willing to wait 200 milliseconds more. But if I find myself behind a satellite link, that usually means that if I try to do my own recursive resolving, an answer comes in two seconds after I ask. I'm not willing to wait that long. So yeah, you can wait 200 milliseconds more, but then just give up there.
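
The ranking rule described here (dead strategies get no queries, validating beats resolving, the median round-trip time breaks ties, and the preferred strategy gets a 200 millisecond head start) can be sketched in Python roughly as follows. This is an illustration only, not unwind's actual code: the bucket boundaries, the decay factor, the class and function names, the single `preferred` argument, and the treatment of a strategy with no samples are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Quality(IntEnum):
    DEAD = 0        # cannot talk DNS at all; never gets queries
    RESOLVING = 1   # answers, but DNSSEC records are stripped
    VALIDATING = 2  # answers and DNSSEC validation works


# Hypothetical bucket upper bounds in milliseconds (not unwind's real ones).
BUCKETS_MS = (10, 20, 40, 60, 80, 100, 200, 400, 600, 800, 1000, float("inf"))
PREFERENCE_SKEW_MS = 200  # head start for the preferred strategy, per the talk


@dataclass
class Strategy:
    name: str
    quality: Quality
    histogram: list = field(default_factory=lambda: [0.0] * len(BUCKETS_MS))

    def record_rtt(self, rtt_ms: float) -> None:
        """Count one answer in the first bucket whose bound exceeds the RTT."""
        for i, upper in enumerate(BUCKETS_MS):
            if rtt_ms < upper:
                self.histogram[i] += 1.0
                return

    def decay(self, factor: float = 0.5) -> None:
        """Age old samples so recent behaviour dominates (factor is assumed)."""
        self.histogram = [count * factor for count in self.histogram]

    def median_rtt_ms(self) -> float:
        """Upper bound of the bucket that contains the median sample."""
        total = sum(self.histogram)
        if total == 0:
            return float(BUCKETS_MS[0])  # assumption: no data counts as fast
        running = 0.0
        for upper, count in zip(BUCKETS_MS, self.histogram):
            running += count
            if running >= total / 2:
                return float(upper)
        return float("inf")


def rank(strategies, preferred=None):
    """Return usable strategies, best first."""
    def key(s):
        rtt = s.median_rtt_ms()
        if s.name == preferred:
            rtt -= PREFERENCE_SKEW_MS  # willing to wait 200 ms longer for it
        # Higher quality sorts first; lower (skewed) median RTT breaks ties.
        return (-s.quality, rtt)

    usable = (s for s in strategies if s.quality != Quality.DEAD)
    return sorted(usable, key=key)
```

With a preferred recursor whose median lands in the 200 millisecond bucket and a forwarder in the 40 millisecond bucket, `rank(..., preferred="recursor")` puts the recursor first. Behind a satellite link, where the recursor's median is around two seconds, the 200 millisecond head start is not enough and the forwarder wins, which is the behaviour described in the talk.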
So what we're doing is: we pick the best strategy, give it to libunbound to do the resolving, and start a timer that waits for the measured round-trip time in milliseconds. If we do not get an answer, we try the next best strategy, then wait that one's round-trip time, and so on and so on.

We do not cancel the already running queries, because we've put in all the work there already. They're talking to the internet, things are happening. While we're doing all the work already, we might as well get a hot cache. And we also want to know how terrible this actually is: it will skew the round-trip time in the histogram, so next time we will not actually use that strategy, because we already know that it's bad.

This is how this all kind of started: captive portals and DNS breakage. Captive portals break DNS, so it's kind of the same category. We used to have what your phone is doing, and what [unclear] is doing, and what browsers are doing: they have an HTTP check. They just go off to the internet and do an HTTP query to a well-known URL where they get a well-known answer. If they don't get that answer, they assume they're behind a captive portal, where you then need to click "here, I accept the terms and conditions" and all this.

Chrome has a URL for this, or Android has a URL for this. Mozilla has one. Apple has one. We have a CDN; we could run our own. But we could never agree on which evil corp we should trust here, and why would we be trustworthy?

So one night we were talking: we monitor DNS, we know a thing about DNS. Can we do this inline and just toss the HTTP? Turns out we can. We just need to have a look at how broken things actually are.

So you find yourself behind a captive portal and you need to agree to the terms and conditions. What the thing does is completely block you off from the internet. You can only talk to the DHCP-provided forwarders. Well, fine, talk to those and you're done.
This is simple.

Then you find yourself on the Dutch railway, and they are special. They have an open Wi-Fi. You connect to the thing, and whenever you send a query that has an EDNS0 option, they just answer NXDOMAIN to everything. This also turns out to be kind of okay, because we do the check of whether a strategy actually works: we ask for the root zone, and this thing says the root zone does not exist. Which is a cute story, but I kind of don't trust that one. So everything will have the quality dead and will not be used. And this is why we have the stub in there as well, because the libc stub does not do EDNS0, so you can actually get through that thing. So you can click the terms and conditions and everything is fine. I actually have a DoT forwarder configured, which then upgrades all of this to DoT, because even after you've gone through the captive portal, it will still answer NXDOMAIN. It will intercept your DNS.

Then maybe you're living in the Netherlands and you want to fly somewhere. Oh, no, hang on. That's the next one. Sorry.

You find yourself in situations where DNS is actually open, and these are getting a bit more difficult, because all these strategies suddenly say: yeah, this is totally working, the root zone exists, I can resolve this. Now you can of course run IP over DNS, but that's not particularly fast. The problem here is this: you get an HTTP redirect, that is, the captive portal middlebox intercepts your HTTP or HTTPS and redirects you to a thing that runs on the middlebox. But the redirect target is not resolvable on the public internet. So the recursor says NXDOMAIN. You talk to one of the quad-Xs, it says NXDOMAIN. You talk to one of the public ones, it says NXDOMAIN. It's not resolvable.
It's not resolvable So we came up with a heuristic where we're saying, okay fine For the first five minutes, um, we do not trust annex domain We just if you get an annex domain, we just go to the stop and and figure it out that way Which works, uh, real well And yeah, we we go directly to the stop not to the forwarders to do just sidestep dns zero issue at this point um Now i'm at an airport In amsterdam This is brilliant So you get a redirect And or you you you you go to the internet you talk htp. You get a redirect to the captive portal and this one is dns exiled Which works out quite nicely And so then you click here. I agree to the terms and conditions and then it forwards you to another page where it's actually opens up the internet That one is also resolvable on the public internet Uh, but the validation fails There are just no r6 on this And so, um, yeah another heuristic there, uh, we do not trust validation errors for the first five minutes um So but in so so people so I traveled with this, um A lot of developers use this for for their traveling and it gets them past things There are still some there are still some stuff that they've been to work on um So when when uh, our laptops have, uh, maybe 4g cards and they also hand us forwarders But we need to figure out and we just jam them in with with all the other forwarders But this is not correct. Uh, you should only use them when you when you actually on 4g and not wi-fi Um, and in practice this actually works out because they are not reachable when you're on wi-fi, but it's still wrong Um a much bigger problem is what are we actually doing about dns? So you can say well Be strict about this dns. 
You always validate, and if you can't validate, then you reject the answers. Well, that's kind of not helping when you're in situations where it's really not working. Of course, you can just slam your laptop shut and go to the beach, but if you really want internet, maybe you should accept this.

So there are various ideas about what we can do here. One is: do not allow a downgrade. You're in a network and you've figured out that DNSSEC validation works; then strictly require that DNSSEC validation works. If someone suddenly starts to intercept you, or actively attacks you and strips all the RRSIGs, just don't accept that. Unwind currently would discover: oh, in this new location DNSSEC is not working, so I won't bother trying. But it cannot detect that you're actively being attacked. So we need to improve on that.

Another thing is, since we're doing all the crypto on the laptop, we can actually trust the AD flag. So bubble this up to the software that actually wants to do the resolving. That one needs to decide. So if I'm doing SSH to my jump host and the DNSSEC validation fails, then SSH should prompt me: do you want to trust this? But it's not doing that, and I'm not aware of any software that currently does this. So it's ultimately not very helpful.

Another thing: it turns out that weird middleboxes, and satellite things are one of those, really like to intercept and munge your UDP packets. They actually fake answers for you. They don't do DNSSEC, so they also strip records, and everything gets weird at that point. However, they do not understand TCP. So if you really want to talk to the internet, just use TCP. But libunbound will not do the right thing here, because it cannot notice this.
So it's happy to use UDP, but we know better: TCP would actually work. So the idea is that we can detect this and then have a dedicated strategy that just does TCP.

With some of the captive portals, you go through and they only give you internet for 30 minutes. This basically means that after 30 minutes you need to reconfirm. But we already know that we have passed the captive portal, so all the heuristics would not trigger. We need to improve on that, and one idea is that we can probably work out what the redirect domains were and put them in a special pool to handle them in a more special way.

A thing that was pointed out in the systemd-resolved talk last year was that, since the captive portals are lying to you, you should really not put their answers in the cache. I think their solution to this problem is to dump the cache afterwards, once you know that you've gotten past the captive portal. But I really want to keep the cache. I put my laptop to sleep all the time and then open it again, so I want a hot cache. So we need to find a better way there.

And I hear reports of people using this behind DNS64 and NAT64 boxes. That kind of maybe works, but there seem to be weird problems, so I need to investigate that.

Anyway, that was that. Do you have questions?

Many questions.

In the case of the horrible airport captive portal, you said the solution was to ignore the DNSSEC validation failure for five minutes.

Yes.

Does that equate to a downgrade attack?

Yes, yes.

Just wanted to be sure.

Yeah. Yeah.

Because all you're doing is managing downgrade attacks in the same way.

Yes. So the idea is that we know that the network changed. The kernel tells us this: you connected to a new Wi-Fi. We treat that specially. It's a trade-off.

Many thanks for your presentation.
I would like to ask: the OpenBSD community usually uses the Comic Sans font for presentations. Why did you use another one?

First, I would like to thank the speaker for not using Comic Sans.

I will improve on that for the next time.

Oh, please.

Hi, Florian. You mentioned SSH, which I find interesting, because it's an application written by the same project as the work you're presenting. I don't know if you're aware, but SSH does have code paths for DNSSEC validation. I mean, it ships with the ability to link with ldns. That's not really worthwhile. There are patches that also link with getdns, and that really works. But the OpenSSH maintainers seem to think that this is a complete non-problem and that nobody would ever want to run it like this. So I hope that maybe your work will inspire the OpenSSH maintainers to merge the DNSSEC validation, and then the problem should be solved, for SSH at least.

I suppose so. I think the things you're talking about are in portable. I don't think that the OpenBSD one can use ldns. We don't have that in base, so this has to be in portable.

Well, maybe you should merge from portable then.

Maybe, yes.

I saw a list of other projects, but the project dnssec-trigger was not on there, and it's quite similar, I think.

I can't find it on short notice. Oh, yeah, yeah. Is it still maintained? Is that still a thing? I came across it, and, yeah, I don't really remember the specifics. It looked a bit abandoned, but, no? Okay. Okay, sorry about that then.

Several more questions. Hold on, on my way.

Okay, thanks.
I'm curious about how much of a pain in the ass this would be to port to something other than OpenBSD. Clearly there's the pledge thing, but what other OpenBSD-specific things are in there that would be horrible to port?

So the pledge thing: it's probably by now understood how to do that in portable; other software also does that. I think the kernel integration might be problematic. I don't even know if this is specific to OpenBSD or specific to the BSDs, how to learn that an interface came up or went down, these kinds of things. I suppose other operating systems...

It's not netlink, right?

No, this is a route socket.

Okay.

So yeah, I suppose there are ways to do this, and the simplest way would actually be to just use the CLI: just execute a script. If you have a way to tell "oh, my network changed", pipe this through the CLI. I think that would be the easiest way. Other than that, I don't think there are that many things that would not be portable.

Right. So like ip monitor on any Linux, which just gives you stuff over a pipe.

Yeah, okay. Any other questions? Here's one.

I was just wondering: you said that one of the strategies is to run your own recursor. For example, when you're at your office, how do you resolve printers if it suddenly doesn't trust the DHCP-provided forwarder?

Yes. Where's Otto? Otto, do you mind answering?

There's a similar question about the Firefox DoH stuff, of course, but that's not this talk. You can configure any name or domain and redirect all queries for that domain, and anything under it, to a specific resolver, which is typically, let's say, a solution to your split-horizon thing in your office or in your home office. DNSSEC can be used in that case, but can also be switched off for those names, which in general, I think, even in our office, doesn't work that well.

Thank you. All right.
I have time for one very quick question So I can't show this slide. This is actually a better one. I thought Does that good have a title? All right, thank you for your talk. Thanks