Okay, welcome, everybody who joined. Our next speaker is Petr Špaček. He is going to give a talk that is less focused on the DNS people here and more focused on the other people here. So enjoy.
Hello. As was mentioned, I'm Petr Špaček and I work for CZ.NIC. So if I talk about Knot Resolver in some fancy way, don't take it seriously, right? BIND is good as well. The talk is called Blaming DNS, and right now we will be focusing on how to detect where it is broken, not how to fix it. We have half an hour, so we are not going to fix things, just find what's broken, and that's just enough for half an hour.
So everybody knows this. It's a nightmare because it basically says nothing. Something is broken, go figure. And it might be DNS, it might be basically anything else. Right now we will focus on DNS, and that means we have to go through some of the basics, because to debug something we have to understand how it works in practice.
So this is in the schoolbook. In theory there is some recursive resolver which contacts authoritative servers and combines the answer from the pieces scattered all around the net. That's the theory, but that's not the practice at all. Of course users want to have some application, right? And the application, for example Firefox, is typically not talking directly to the recursive resolver; it's using some APIs in the operating system. Then the operating system handles the communication. So there are a couple of layers of indirection, but that's still not enough. In practice the operating system talks to something, be it a modem, your home router or whatever, and this middle device then talks to the recursive server, in theory. In practice nobody knows. I mean, this is the reality: your operating system attempts to send a packet to some IP address, then magic happens, and you get some packets back, and that's what we need to debug right now. So prepare for fun.
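To make the first layer of indirection concrete, here is a minimal Python sketch of how an ordinary application asks the operating system for an address instead of speaking DNS itself. The function name `resolve_via_os` is made up for this illustration; `socket.getaddrinfo` is the standard library wrapper around the OS resolver API, so the result may come from DNS, from a local hosts file, or from whatever else the system is configured to consult.

```python
import socket


def resolve_via_os(name):
    """Resolve a name the way most applications do: through the OS resolver API.

    The answer may come from DNS, a hosts file, or any other source the
    operating system is configured to use; the application cannot tell.
    """
    infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_via_os("example.com")` (a placeholder name) shows what an application would see on this machine, which is not necessarily what any DNS server actually returned.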
Now the difficult question is what do we do, because something is broken; where should we start? This presentation is not a universal procedure for all problems. Take it as high-level ideas and modify it as you see fit. I like to begin with the authoritative end, because that's usually, you know, somebody else's problem. And besides this, if you look at the far end you will have some expected values, the values you want to get to your local machine, which is nice because you have something to compare with, something to start with.
So okay, the website doesn't work, and now we want to get the right values, what we should see on our local machine, but the local machine doesn't work, right? To solve this chicken-and-egg problem we use some external tool to look from the other side of the network, basically. My favorite tool is called DNSViz. There are more of them, so you can find alternatives as well. It's quite an easy web app to use, but quite powerful.
I will try a live demo, so it won't work. Okay, so the web page is called dnsviz.net, easy as that. And now, how do I work with a Mac? Okay, you just enter the domain name, supposedly the one which doesn't work. This should work, right? Yes, well, this is likely to work because it's run by ICANN and it does nothing. And DNSViz will eventually come up with some nice graphs, give it a second. The thing is that, okay, it's too big, right, so how do I make it smaller? Minus. Okay, okay, awesome. We can read it now, when I get that right. Yes, perfect, it looks like this. I will zoom in, or maybe, you know what, I will switch back to the slides because it's more legible than the original one.
So that's the website: you enter the name, and then there are a couple of things to look out for. First, DNSViz has its own cache and it might show you quite an old result. In this example it's quite small, but there is information about the time when the result was generated, and in this case it's seven days ago, which means useless for debugging.
But there is a button next to the time. If you click Update now, it should generate a new result, and then you have the thing you need to see.
Okay, so now let's assume that we have a fresh result from DNSViz. In that case the important part of the page is called Notices, and if you see a lot of red and yellow signs, then there is most likely a problem on the far end. In this case it's a good idea to pick up the phone, call the domain owner and say: hey, your site is broken, do something about it. Because often it means that the problem is not local and you are not going to fix it, because you have no means to do that. So you need to get, for example, a bank to fix something, which is always fun.
If we assume that this left part is green, it means that DNSViz didn't find anything broken on the authoritative side. Then we can go to the next step and look for the expected values, the values we should see on the local machine. If you hover your mouse over some of these bubbles, DNSViz will show you the data, for example the IP address, which is available from the remote server. So that's the value we need to get, basically. When I debug something, I take a note of this value and then continue with the rest.
Okay, so now we have covered the case when it's broken on the remote side. It means: pick up the phone, call the domain owner if you can, and tell him to fix his stuff. Done. If it's not broken on the remote side, then it's a bigger problem for us. So the next step is to debug it locally. We have the value we want, and now we need to find out why it didn't arrive.
As I've mentioned previously, there is an operating system in play, and some middlebox devices, and some firewall magic. So the first thing to rule out is the local machine, the operating system. If you ping, it goes through the operating system APIs, and it might happen that something is broken locally and the problem is not even in DNS.
If you ping and it gives you IP address 1.2.3.4.5 (that would be really bad), I mean 1.2.3.4, and dig, which is a utility which talks directly to DNS and basically skips the layers of indirection in the operating system, or some of them, gives you a different IP address, it means that the operating system is doing something weird with the answer. Maybe the content of the file /etc/hosts is something weird and it prevents resolution. In any case, if the results from the ping and dig commands don't agree, it means that the problem is most likely somewhere in the operating system. So don't blame DNS in this case.
But if the results from the ping and dig commands are the same, it means that the DNS is returning something weird. Because DNSViz says the IP address should be 91.whatever.whatever, and ping says 1.2.3.4, and dig as well. So locally we are getting different values from DNS than the values seen in DNSViz. In that case it's going to be fun.
The next step is typically to open /etc/resolv.conf and look for the IP address, which is the address the operating system uses to talk to its resolver. There are a couple of options again. It's like a big debugging tree; imagine how many options there are and how often you need to branch. It's a lot of fun. If you run the dig command without the extra parameter with the at sign, which specifies the IP address, it uses the IP address from /etc/resolv.conf. So now we know where to look.
If the resolv.conf file has a localhost address in it, it means that you are talking to a recursive resolver on the local machine; it might be Unbound, BIND, systemd-resolved, whatever. The thing is that if it's local, it's easier, because you can open the log and see what happens, maybe flush the cache, restart the daemon, and see if it helps or not. It might happen that the log doesn't make any sense at all.
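The branching described so far can be written down as a small Python sketch. It makes several assumptions: the function names (`nameservers`, `is_local`, `where_to_look`) are made up for this illustration, the addresses are typed in by hand from the output of ping, dig and DNSViz, and 192.0.2.10 in the usage note is only a placeholder for the expected value.

```python
import ipaddress


def nameservers(resolv_conf_text):
    """Return the addresses from 'nameserver' lines of resolv.conf content."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Lines starting with '#' or ';' are comments and never match here.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def is_local(server):
    """True if the resolver address is a loopback address (127.0.0.0/8 or ::1)."""
    # Drop an IPv6 zone suffix such as '%eth0' before parsing.
    return ipaddress.ip_address(server.split("%", 1)[0]).is_loopback


def where_to_look(os_addrs, dig_addrs, expected_addrs):
    """Apply the ping/dig/far-end comparison and name the suspect."""
    os_set, dig_set, expected = set(os_addrs), set(dig_addrs), set(expected_addrs)
    if not os_set <= dig_set:
        # ping and dig disagree: look at the OS (for example /etc/hosts).
        return "operating system"
    if not dig_set or not dig_set <= expected:
        # ping and dig agree, but differ from what the far end publishes.
        return "dns path"
    return "consistent"
```

For example, `where_to_look(["1.2.3.4"], ["1.2.3.4"], ["192.0.2.10"])` returns `"dns path"`, and `is_local` applied to each entry from `nameservers(open("/etc/resolv.conf").read())` tells you whether the next stop is a daemon on your own machine or a box somewhere else.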
Say you see in the log that the local daemon sends a query to the internet, and the answer which came back is just nonsense: it doesn't match the query sent, and so on. In that case something fishy is most likely happening on the network. I often suspect the ISP, that they muck with the DNS. Of course they have good intentions, but they make our lives much harder. There are a couple of ways to confirm this theory or disprove it.
If I go back to DNSViz, you can see the IP address here, that's the expected value from the remote server, and there are the IP addresses of the authoritative servers on the other side of the net. So what we can do is take the address of this authoritative server, or any of them, and use dig, the at sign and the IP address from DNSViz, and the query as usual, and we can see whether we get the same result or not. If the result is not the same, it means that we attempted to communicate directly with the authoritative server and we got garbage back, which means that something fishy happens on the network.
Or, if you just didn't make the note, my way to confirm this is to use some garbage IP address. For example, this IP address, 192.0.2.1, is from the documentation block, which is by definition not routable on the internet. So if you get an answer to this query, it means that something totally shitty happens on the network. In that case pick up the phone, call your ISP and tell them: what are you doing? I don't want you to do this. Usually you will not succeed, so it's time to change ISP. There's no other way around, because they're just doing dumb things sometimes.
Okay, that was the example when we had a localhost address in /etc/resolv.conf. It might happen that there is something else, not the local address. Typically it will be the IP address of your home router, modem or something. In that case go figure out where the configuration interface for that thing is.
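The usual tool for the direct query is dig with the at sign, for example `dig @192.0.2.1 example.com`, where example.com is a placeholder name. As a rough illustration of what such a query is on the wire, here is a minimal Python sketch that builds an A-record query by hand and sends it straight to a chosen IP address, bypassing /etc/resolv.conf. The function names `build_query` and `ask` are made up for this sketch, and it only returns the raw reply bytes without decoding them.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query packet (qtype 1 means an A record)."""
    # Header: ID, flags (recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    # The name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    # Question trailer: query type and class IN (1).
    return header + qname + struct.pack("!HH", qtype, 1)


def ask(server_ip, name, timeout=3.0):
    """Send the query to server_ip over UDP port 53.

    Returns the raw reply bytes, or None if nothing came back
    (timeout or a socket error).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server_ip, 53))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # includes socket.timeout
            return None
        return reply
```

On a network that leaves DNS alone, `ask("192.0.2.1", "example.com")` should come back as `None`, because nothing is supposed to answer from the documentation block; getting any bytes back is a hint that something in the path is answering DNS queries on behalf of the real destination.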
If it's a home modem, open the documentation or something, and hopefully you will be able to find another IP address in that modem. Typically the modem forwards the queries to the recursive resolver at the ISP. So if you get the IP address of the ISP's recursive server, you can skip the modem and see if it helps or not.
Assuming we went to the web interface of the modem, for example, and got the IP address of the ISP's recursive resolver, we can again use dig, the at sign with the IP address of the recursive resolver, and the name, and see if it works. If it works, it means that we basically skipped one piece in the resolution chain: we skipped the modem and now it works, so the modem is the faulty one. So the classical trio: see logs, flush cache, restart. If it doesn't help, it's time to throw the modem out of the window, or at least call the ISP.
If it doesn't work even when you try to contact the ISP's recursive DNS server directly, well, then the ISP has a big problem. Most likely the support line is ringing all day, so don't be surprised if they don't answer your call, and again it might be time to change ISP to something more reliable.
Well, it's quite complicated to give some generic procedure, because DNS is the Wild West; anything can happen. No matter how many problems I debug, I'm always surprised by something new. So just the high-level idea: don't trust your local machine, because that's often the reason why it's broken. Use some looking glass like DNSViz, or SSH somewhere else and then send the query from the remote machine, from a different part of the network. And then there is nothing else than common sense. Just compare what you see and think: okay, we have a chain of five components which forward to each other, and it looks good at the last three, so one of the first two has to be broken, and so on.
And besides this, the ultimate recommendation is: complain loudly. If DNS is broken at your ISP, don't be silent. You know, you can work around it somehow.
You can use DNS over TLS, for example, but that helps you only for the moment and it's not helping anyone else, because it will stay broken. Other customers who are not so technically savvy will not be able to set up DNS over TLS or anything else and will get a super crappy experience. So complain loudly, please: call ISPs, send emails, if you can send email at the moment, because we need to create push and explain that there is demand for a sensible internet connection, not just port 443.
And if you haven't had enough, go to GitHub; there is a project called DNS violations, which is basically a collection of weird stuff which happens in DNS, and we would be glad if you submit the weird stuff you see in your network. On one hand it's always fun, and on the other hand it's important information about what can happen, because we as developers of DNS software need to know in what conditions the software has to work. I mean, there are 200 RFCs around DNS, and even if we read all of them it's not enough, because the real network doesn't match what's written in the RFCs at all. So please complain and share the information. Thank you for your time.
I think we have like five minutes for questions, right? So, okay, go on.
Not really a question but a comment. I also often use DNSViz, and most of the time it presents not a single error but multiple errors. Read all of them, not only the first one. It was telling me it could not find a matching signature for my DS key in the parent zone, and I ended up manually checking whether the signature was correct and all that verification, until I found out there was a second error below: it could not contact an authoritative name server. So of course it could not find a matching signature, because it could not contact the name server. So read the error messages.
Definitely, please.
Sometimes in the past the solution was to use the Google DNS, 8.8.8.8, right? Would you recommend people to override whenever they have DNS problems and use that, or try to use the DNS server that they should be using?
I mean, the network often prevents you from doing that, because if the network is doing something fancy with the DNS packets in flight, you can use any IP address you want, it might be 1.1.1.1 or the documentation address, and you will get the same crap. So it might be a good debugging step; you can definitely try it.
DNSCrypt has been a workaround for a while, until it was discontinued.
Sorry, once again?
DNSCrypt has been a workaround for a while, until it was discontinued.
Yeah, DNSCrypt seems to be dead. DNS over TLS would be better, yeah. Okay, I have seen some questions, go on.
For debugging you can also use RIPE Atlas.
Yes, if you have access to RIPE Atlas, sure, but it's... Oh, really? Okay, good to know. Well, yeah, DNS over TLS has the problem that there is still no public, stable and reliable server for it. So, well, we are looking for one, but Quad9 is not official yet.
Very soon it will be.
Yeah, the problem is that nothing has an SLA, nothing has, you know, stable encryption keys and so on, but hopefully Quad9, which is 9.9.9.9, the, you know, equivalent of the Google thingy, except it shouldn't
have the transport problem, hopefully.
As long as you actually have a good local resolver, it's way better to use that than Google DNS, because Google servers actually perform better when you use your local DNS. On the authoritative server they look at which network the query comes from. If it comes from 8.8.8.8, it comes from Google's network, so you get servers within that network, whereas if it comes from your own network, it will send you to Google servers that are close to you.
Yeah, so there was a comment that different servers might return different answers depending on where you ask from, and that's of course one complication for debugging, because if you are getting different answers from different places, then it's hard. But usually when you get an answer, it's the one which should work, because there is some intelligence generating the answer, so hopefully that will be the functional one. Okay, now go on.
I originally started the DNS violations project and, because I don't have much time lately, I invite everybody to come and help us with the project. And the second thing: if you ever think of writing a DNS server, read the violations that are currently there and please don't make the same mistakes. Also, don't make your own mistakes, make new mistakes, right? And if you make new mistakes, go log them in DNS violations. Okay, please.
So we've seen some customers using some services which are based on anycast, so it's a little bit more difficult to troubleshoot, because from your side the DNS answer is different from the customer's side, depending on the point of asking. So do you have any smart way to...
Okay, so the question is how to deal with tailored answers depending on the IP address you ask from, and whether we have some super tool for this. No, it's just complex. Well, yeah, theoretically, but then my IP address is the problem. Yeah, I mean, we probably don't have anyone from Nominum here, but they have wild stories about having
like millions of views, because every customer has its own tailored view of DNS. So, I mean, it definitely might help if you can use it, but, you know...
Take the Tor Browser.
Oh well, you can use Tor Browser, but then you have no idea where you popped up.
You can click on it and it says...
I mean, it doesn't help to debug the problem, because you will show up somewhere.
If you try it and it doesn't work for you, but you try it with Tor Browser and it works there, that's information.
Oh yeah, yeah, but basically we will get similar information from DNSViz. It's a different place in the network.
DNSViz is one place. Tor Browser is a random place, and you can just click a button to get a new random place.
Yes. Go on.
Do you know a way to find out which resolvers a process is using? Because libc doesn't update that once the process has started.
Okay, yeah, the problem is that usually when you start a process, it reads /etc/resolv.conf and then it keeps the IP address during the lifetime of the process. That's the typical behavior and, yeah, it's madness. I use Wireshark or tcpdump and look at what it does when I type in something which generates a DNS query.
To kind of bypass that, in Ubuntu by default we have a local DNS resolver, so you always point at a local 127.x address and then you always resolve locally. And that does update /etc/hosts, it does update the DNS dynamically, it does split DNS and flushes caches. It's a bit more dynamic.
Yeah, I know those things.
So although libc is limited, most of the distributions kind of, you know, have more stuff behind it. And eglibc has a patch, right?
In general I think that having a local resolver is a good thing, because you can validate locally and don't need to rely on anyone else doing DNS validation for you. But then it's more stuff which can break, and more fun. Oh, okay, I will be pretending that I didn't hear this. Go on.
One more thing about RIPE Atlas: it used to be that
you could just send them a mail and you would get sent such a node, to use your own connection as one of the data points for the RIPE network. I'm not sure if it's still so. They also had the nodes here.
I don't think they have more probes here. I think they ran out after the talk earlier.
I know that there are several people from RIPE that have brought probes here.
Still, okay, you can get points regardless of having a probe yourself, so you can use the network anyhow and just ask them for more points, because it's only a measure to make sure that people don't overload the network. So if you just need some points for doing whatever measurements, you just get more points.
So, for the record: if you want a RIPE Atlas gadget at home and they have enough of them, just fill in the web form on RIPE.net and you will get one. Okay, thank you for your time.