Because our talk is about to begin. It will be about the Domain Name System, the hierarchical, decentralized naming system that has been in use for 30 years. It will be given by Hannes Mehnert, MirageOS hacker, who has been developing security protocols, TLS, OTR and others, and is a coffee nerd. Please give him a warm round of applause to welcome him.

Thank you. As Karen was mentioning, I'm Hannes, and I will talk about the domain name system. This is part of the foundations track at this conference, so if you already know a lot of details about DNS, I don't expect you to learn anything new. And if you don't know anything about DNS, I hope to explain it to you.

So what is DNS all about? Well, its purpose is to resolve human-readable and human-memorizable host names to IP addresses. That is the only purpose of DNS. So it's basically very similar to a phone book. You may know it from the old days: a piece of paper or a real book where you can look up a person's name and find their phone number in a specific city. But DNS is not as static as a phone book. It is managed in a decentralized way, so we can push updates much more frequently. And we have a decentralized and hierarchical delegation system, so that people who own a domain name can update their own records.

DNS is used in a lot of applications all around the Internet. There's a command-line utility called host on Linux and other Unix operating systems. If you type host events.ccc.de, which is the main website and so on, you get a reply from this program. The program uses DNS, the protocol I'm talking about here, in the background and requests a record for that name. Obviously, host can also fail if the name doesn't exist at all. But here, host events.ccc.de returns an IP address, which is 195.54.164.66.

What else is DNS? Well, DNS is specified by the IETF, the Internet Engineering Task Force, as a collection of RFCs.
It started in November 1987, so more than 30 years ago, with RFC 1034 and 1035, which specified the basics of the fundamental protocol for resolving host names into IP addresses. DNS has evolved over time, as you might be able to see here. Nowadays DNS is maybe 20, 30, or 40 different RFCs. So it's quite a complex protocol, but the same basics and the same fundamentals are still used. And since DNS has been in use for 30 years, I expect it to be used for another 50 years or so. So I believe that we won't outlive DNS, that DNS will still be around when I die. That's why I care about DNS.

DNS, in other terms, is a distributed key-value store which was standardized and specified 30 years ago. It uses a hierarchical name system, and it uses delegations to get some decentralization in there. It has redundancy and caching built into the protocol itself.

So let's look at how names look in DNS. We saw, as a first example, events.ccc.de. The name system, as I mentioned, is hierarchical, so it is a tree-like structure. We have the root at the top. The root is empty here. Below the root we have, for example, the top-level domain DE, the top-level domain ORG, and other top-level domains. Below DE we have second-level domains like ccc and various others. And below ccc we have, for example, the events name. So it's a tree-like data structure.

Each domain name, for example events.ccc.de, consists of a sequence of labels which are separated by the dot character. So if you type events.ccc.de, the DNS protocol itself knows that the top-level domain name is DE, the second-level domain name is ccc.de, and the third-level domain name is events.ccc.de. Each label may contain only alphanumeric characters, which is a to z in lower and upper case and zero to nine, and dashes, but a dash may not be the first character.
The label length may only be between one and 63 characters. And the textual representation, so the representation using the dot format, which is events.ccc.de, for example, has a maximum length of 253 characters. Domain names are case-insensitive. So in whichever case you ask a DNS server for a name, you will always get the same answer.

The data format on the wire is organized around requests. Your client or your resolver sends a request to some server. The request is a triple of a name, which we just saw, a record type, and a class. The reply contains the very same information, so a name, a type, and a class. But additionally, it includes a time to live, a length, and a data field. The data field is interpreted differently depending on the type. The time to live field specifies the number of seconds for which this resource record may be cached. So we have built-in caching and cache timeouts.

What are the potential resource record types? There's the address record, which we just saw. It's also called the A record. Then we have other types like the name server record, which is also named NS, and start of authority, which is SOA.

That leaves the class, the third part of the triple, now that we have discussed the name and the type. The class is usually Internet. That is basically the only one used in today's networks. But there are others as well, from the time DNS was developed. There's also Chaosnet, which is not really used these days.

What does a DNS packet look like? We have a request, and a request is this triple. The reply is a set or a list of these triples together with a time to live and some data. A DNS packet has a header of 12 bytes. The first two bytes are a transaction ID, so some identifier, which is just echoed back from the server to the client. Then we have the next 16 bits, which contain flags, the operation code, and the return code.
And then we have four fields, which are each two bytes in length, and they contain the number of questions, the number of answers, the number of authorities, and the number of additionals. After this 12-byte header we have a variable number of these things: the number of questions specified in the question count, then answers, then authorities, and then additionals. Each of the questions, as mentioned earlier, is this triple, and all the other entries are the whole structure including time to live and data.

We have seen the host command. Now there's a different command called dig, which is used for debugging and for looking into DNS packets. On my laptop I type dig with the record type A, so I want to resolve the host name events.ccc.de with an A record. I want to get the address record of that name. And I'm asking not my default resolver but a specific name server, ns.cccv.de, which is an authoritative name server for that domain. What I get back is shown here as a textual representation. So dig also uses the DNS protocol in the background.

Here is what I get back as an answer. First I get the question repeated, so I get events.ccc.de. Then the class is Internet, and the resource record type is A, so I'm interested in address records. I also get an answer, which is events.ccc.de. The time to live is 600 seconds, which is ten minutes. Again the class is IN and the record type is A. And then the actual content, the actual data, is the IP address we saw earlier. Then I get some additional information in the authority section: which name servers are responsible for this domain. So for events.ccc.de I get times to live and then name server entries. One is ns.cccv.de, and the other one is ns.ccc.de. And then I get some additional information, namely that ns.cccv.de also has an IP address, down here.

We have now seen names and queries.
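The packet layout described above can be sketched in a few lines of Python. This is a minimal illustration rather than a complete DNS implementation: the 12-byte header and the name/type/class question triple come from the talk, while the length-prefixed label encoding and the numeric values for type A, class IN and the "recursion desired" flag are taken from RFC 1035. The function names and the transaction ID are made up for the example.

```python
import struct

TYPE_A = 1    # address record (value from RFC 1035)
CLASS_IN = 1  # Internet class (value from RFC 1035)


def encode_name(name):
    """Encode a dotted name as length-prefixed labels, ending with a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"label length out of range: {label!r}")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"  # the empty root label terminates the name


def build_query(name, txid, qtype=TYPE_A, qclass=CLASS_IN):
    """Build a query packet: 12-byte header followed by one question."""
    flags = 0x0100  # standard query with the "recursion desired" bit set
    # ID, flags, then four 16-bit counts: questions, answers, authorities, additionals
    header = struct.pack("!HHHHHH", txid, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, qclass)
    return header + question


packet = build_query("events.ccc.de", txid=0x1234)
print(len(packet))  # 12-byte header + 15-byte name + 4 bytes type/class = 31
```

Sending these bytes in a UDP datagram to port 53 of a resolver would be the next step; parsing the reply is left out because answers may use name compression, which this sketch does not handle.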
How does the delegation system work? Well, first of all, you need to set up your name servers for the specific domain or subdomain. So you need to insert into your name server some name server records and a start of authority record. And then, for the subdomain you want to serve, you need to insert something into the super domain. So into, for example, ccc.de, you need to include the delegation saying that my name servers are now responsible for this subdomain.

What is this start of authority resource record type? You can also use dig to request other resource record types, for example the start of authority, as done here. And then you get an answer, and I only copied the answer section here. You get events.ccc.de, then a time to live again, and then the start of authority record. That one contains, first of all, the name server this information originated from. Then an email address where the first @ is replaced by a dot character, which is the responsible email address for that zone. Then a serial number, which is pretty high here. And then some timers for when you should refresh things in the zone, and the minimum time to live in that zone.

Then, if you ask for the name servers of events.ccc.de, well, we just saw them as part of the authority section. But if you specifically ask for the name server records, you get the names of those two servers back. And in the additional section you again get an IP address.

DNS works such that you have servers, the servers have delegations, and the root of the system is the root servers. They are organized or deployed by IANA, and every client has information about which root name servers it can ask to do resolutions.

So how does a resolver work? On the other side we have a resolver, and there are two kinds of resolvers. One is an iterative one and the other is a forwarding one, and I will go into the details of an iterative resolver.
So my Unix system, my laptop, for example, uses the gethostbyname API to ask for the record for events.ccc.de, and that sends a query to the locally configured resolver, asking for the resource record type A and the domain name events.ccc.de.

What does the resolver do? Let's assume the resolver doesn't know anything apart from the root name servers. Then it asks the root name servers: what is the name server, who is responsible for the de domain? They reply with a list of name servers which are responsible for that. The resolver then asks one of these name servers: what is the name server for ccc.de? That one replies with those name servers. Then it asks: who is responsible for events.ccc.de? And that one replies with the IP address. This IP address is then returned by the resolver to the gethostbyname Unix function. That's how an iterative resolution works.

You will have noticed that multiple packets are sent from the resolver to various other hosts. Now, here the hosts only have names but no IP addresses. So how does the resolver actually know who z.nic.de is? That is a problem which is also called glue. A DNS answer, as we have seen earlier, needs to include an IP address if the name server's name is inside the domain that server is responsible for. So since the name server for ccc.de is named inside the domain it is responsible for, whenever you ask for a name server and the answer is ns.cccv.de, the reply also needs to say, by the way, ns.cccv.de has the following IP address, because otherwise you wouldn't be able to find out which name server or which IP to ask.

The iterative resolution I just showed on the earlier slides uses query name minimization to improve privacy. That is the concept that you don't ask the root name servers for the entire domain name. So you don't ask the root name server who events.ccc.de is.
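The iterative walk with query name minimization can be sketched as a toy simulation. Everything here is an assumption made for illustration: the `SERVERS` table stands in for real name servers, the server names are invented, and a real resolver would send NS and A queries over the network instead of looking things up in a dictionary. Only the sequence of names asked at each level follows the description above.

```python
def minimized_queries(name):
    """Names asked from the top of the tree downwards, one more label each step."""
    labels = name.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


# Hypothetical toy data: each "server" maps a queried name either to a
# delegation (the next server to ask) or to a final answer.
SERVERS = {
    "root": {"de": ("delegation", "de-servers")},
    "de-servers": {"ccc.de": ("delegation", "ccc-servers")},
    "ccc-servers": {"events.ccc.de": ("answer", "195.54.164.66")},
}


def resolve(name):
    """Follow delegations from the root; return the answer and who was asked what."""
    server = "root"
    trace = []
    for qname in minimized_queries(name):
        trace.append((server, qname))
        kind, value = SERVERS[server][qname]
        if kind == "answer":
            return value, trace
        server = value  # follow the delegation one level down
    raise LookupError(name)


print(minimized_queries("events.ccc.de"))  # ['de', 'ccc.de', 'events.ccc.de']
address, trace = resolve("events.ccc.de")
for server, qname in trace:
    print(f"asked {server} about {qname}")
```

Note that the root only ever sees the name `de` in this trace, never the full host name.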
Query name minimization was standardized in 2016 and is nowadays deployed by at least some resolvers, but not by all. It is a privacy feature because otherwise you leak to the root servers which host you are interested in.

What about caching? Well, since there's a time to live for every resource record set, every resolver can cache all the data and all the records and then reuse the cached data for new replies. A usual setup looks like this: I have my laptop over here, and on my router there is likely a DNS resolver, a forwarding resolver, which doesn't do iterative queries but just forwards every query to another resolver run, for example, by my ISP. And that resolver run by my ISP is then asking the authoritative name servers. That is a common setup. Another possible setup is that you don't ask the ISP name server at all, but you just ask your router, and that does the iterative queries.

The three resource record types I just showed you are not a complete list. There are many more. One useful type which I use quite a lot is a way to introduce aliases. If foo.example.com should always redirect to bar.example.com, we can use a so-called CNAME, or canonical name, resource record. That one will reply to every query for foo.example.com, independent of the resource record type which is requested, with: look at bar.example.com. Another resource record type is MX, or mail exchange, records, which contain a priority and the name of the mail server which is responsible for this domain. Then, similar to the address records, we also have quad-A (AAAA) records, which are used for IPv6 addresses. So A for IP version 4 and quad-A for IP version 6. TXT is a record which just contains some text data, and it's used by a variety of protocols. EDNS is an extension mechanism, and the extension mechanism is also done just by using DNS and resource record types.
So DNS has an in-protocol extension mechanism, which is pretty interesting and pretty nice. I mean, DNS, since it's 30 years old, had to be developed over time, and whenever you want to extend it, you have to be sure to stay backwards compatible, because you have a huge deployed base of servers and resolvers out there. So you don't want to end up with interoperability problems with earlier versions. EDNS was developed a long time ago, but next year there is a flag day to say: now it is actually required, and servers should behave well if EDNS records are requested by the client.

What transport does DNS use? Usually it uses UDP, because it has a very low overhead, only eight bytes of header, and a very low delay. But you can also use DNS over TCP, and that is actually done if the reply is too big for a UDP frame. If the reply is too big, it is just truncated at, let's say, 512 bytes, and a flag in the DNS header says: by the way, this answer is now truncated. The requester can then repeat the same request using TCP. TCP requires setting up a whole TCP session. If you use DNS over TCP, it uses the very same packet format we have seen earlier, but prefixed by the length of the content. And the EDNS resource record we just saw carries the maximum UDP datagram size.

What more is there about DNS? Well, first of all, if you want to register a second-level domain, you are forced to run at least two authoritative DNS servers which are in disjoint IP networks, in order to have some fault tolerance. One IP network may be offline or the routing may be broken at any given time, and in order to resolve your host name you need to have another server, so that clients will always be able to resolve it. Since you are now running multiple instances with the same data, you need a way to synchronize that.
DNS has a mechanism called zone transfer to do that within the protocol itself. The zone transfer uses the serial numbers in order to decide whether a zone transfer is actually needed, and also some timers, which are part of the start of authority resource record. So how you do synchronization, fault tolerance, and redundancy is all encoded in the protocol.

DNS also has extensions for dynamic updates. Instead of having to restart your server whenever you reconfigure it, you have an in-protocol update where you can say: now let's add this domain name with this IP address, or let's remove this domain name. Since we now have DNS updates and only these timers for synchronizing the secondaries, there's also a notification mechanism specified. The primary server, where you do the dynamic updates, notifies the secondary name servers to say: by the way, the serial number of my start of authority has increased, please update yourselves.

And since you don't want to allow updates from everybody, and filtering only by IP address might be a bit weak, there's an authentication mechanism called TSIG, which usually uses HMAC secrets, so shared secrets, which you need to pre-share across the different servers. But you can also use a Diffie-Hellman key exchange in that layer.

You can use DNS to do load balancing. If you encode multiple address records for the same name, the server will reply with all of them, in a random order, to all of the clients, so to all of the requesters. And the requester will pick the first one, or again one at random. So if you have two address records for the same domain name, it is very likely that roughly half of the clients will use one of the addresses and the other half will use the other. So we can actually do some sort of load balancing via DNS. This is also called DNS round robin.
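The round-robin behaviour described above can be checked with a small simulation. This is only a sketch under stated assumptions: the two addresses are from the documentation range 192.0.2.0/24, the server shuffles the record set for every reply, and every client takes the first record. The names `RECORDS`, `server_reply` and `simulate` are made up for the example.

```python
import random
from collections import Counter

# Two address records for the same name (documentation-only addresses).
RECORDS = ["192.0.2.1", "192.0.2.2"]


def server_reply(rng):
    """Return all records for the name, in a random order."""
    reply = RECORDS[:]
    rng.shuffle(reply)
    return reply


def simulate(clients, seed=0):
    """Count which address each client ends up using (always the first one)."""
    rng = random.Random(seed)
    return Counter(server_reply(rng)[0] for _ in range(clients))


counts = simulate(10_000)
for address, hits in sorted(counts.items()):
    print(address, hits)
```

With many clients both counts end up close to half of the total. The split is only statistical, though: caching resolvers hand the same reply to many clients, so real traffic can be less even than this model suggests.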
More DNS features: there's a security mechanism, DNSSEC, which has been specified and standardized, but it's not yet mandatory to use. And it mainly protects the communication between the iterative resolver and the authoritative name servers. DNS is also used in a multicast setting for service discovery; something like Rendezvous or Zeroconf is all based on DNS. DNS can also handle internationalized top-level domains and domain names by using Punycode encoding for them.

What are the threats to DNS? On the one side there's censorship. Anyone who runs a DNS server can filter domain names. If you use a DNS resolver like the one from Google or the one from AT&T or the one from Cloudflare, they can filter domain names at that central point and not give you any reply for domain names they don't like. Library Genesis, libgen, for example, is in such a problematic situation. In some countries governments want to filter certain websites, and they do that by requiring all the ISPs to have filters for specific domain names.

Cache poisoning is another threat. DNS by default doesn't use any authentication, and every request has very low entropy, so it's very static. An attacker, for example in the local network, can just send a huge number of replies and will maybe be faster than the real one. The entropy is low because we only have the 16 bits in the header as a transaction ID, and apart from that we can only add some entropy by modifying the casing of the domain name, and that's basically it. We can also play with the UDP source port to get some more entropy.

Another threat is definitely user tracking. If you run a central resolver which is used by a lot of people, you see a lot of information from all of those people and you can track them. So you know which domain names are around, which domain names are asked for, and so on.
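The entropy argument from the cache poisoning discussion above can be put into rough numbers. The following back-of-the-envelope sketch counts how many bits an off-path attacker would have to guess. It assumes one bit per letter when the casing of the name is randomized and an ephemeral source port range of 1024 to 65535; both are assumptions made for this example, not figures from the talk, and the function name is made up.

```python
import math


def query_entropy_bits(name, random_source_port=True, random_case=True):
    """Rough number of unpredictable bits in one DNS query."""
    bits = 16.0  # the transaction ID in the header
    if random_source_port:
        # Assumption: the source port is chosen uniformly from 1024..65535.
        bits += math.log2(65536 - 1024)
    if random_case:
        # Each letter can be sent in upper or lower case: one bit per letter.
        bits += sum(1 for ch in name if ch.isalpha())
    return bits


name = "events.ccc.de"
print(query_entropy_bits(name, random_source_port=False, random_case=False))  # 16.0
print(query_entropy_bits(name, random_source_port=False, random_case=True))   # 27.0
print(round(query_entropy_bits(name), 1))
```

With only the transaction ID, an attacker who can send many forged replies has to cover just 65,536 possibilities, which is why the extra sources of randomness matter.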
Another issue which has been discovered in DNS is so-called amplification attacks. The issue here is that your DNS request is usually very small. It has a 12-byte header and then only contains the domain name, type, and class. Whereas the response to it might be huge, especially if you use DNSSEC and have cryptographic signatures on it. So the ratio between the bytes you requested with and the bytes which were replied is very small. As an attacker, you can fake your IP address and send a resolver a lot of very small packets with that fake IP address, and the DNS server will then reply to the fake IP address with a lot of large packets. So you can amplify your attack: instead of sending all the packets to your victim directly, you send them to a DNS resolver first, and the DNS resolver will amplify them and put all of that onto the victim.

DNS over TLS is another privacy enhancement. It's specified in RFC 7858, and it's a connection between the caching resolver and the iterative resolver, so between your router and the iterative resolver. You have to verify the certificate of the iterative resolver, and the iterative resolver needs to be trusted, to avoid any man in the middle. It protects against eavesdropping and modification. So if you are in a country where DNS is censored, you can use DNS over TLS to connect to some remote server in some other country where DNS may not be filtered. And you might happen to know some people who run such an iterative resolver and who are more trusted than your ISP's servers.

I think I'm now basically near the end of my talk. I will talk a bit about MirageOS, which is an operating system with a very tiny attack surface and fewer attack vectors than a Unix system, and it's developed in a functional language called OCaml. DNS and MirageOS, how does that fit together?
Well, over the last years I implemented a DNS server and resolver, including dynamic updates and authentication using HMAC secrets. Let's Encrypt support was also implemented by other people, and I integrated it there. We have persistent storage in a Git repository, and we can provision Let's Encrypt certificates using the DNS challenge, so you can get a certificate from Let's Encrypt.

How does that work out? Well, the idea is that you start a virtual machine, a MirageOS unikernel, a so-called unikernel, with a static seed just for its private key and an HMAC secret. That one then generates its private key from the static seed, and a certificate signing request. Now it requests the certificate from a DNS server, which uses another resource record type called TLSA, specified by the DANE project. If that certificate is found and the private key matches and the certificate is still valid, then we continue and can serve some web server or some mail server. If it is not found, then the certificate signing request is uploaded to the DNS server, and the DNS server is then polled periodically for a certificate. In the meantime, another unikernel, which is a DNS server, is notified and communicates with Let's Encrypt and tries to provision that certificate. So that's how I do my deployments of DNS and Let's Encrypt signed certificates.

So I'm now at the end of the talk. To conclude: DNS is widely deployed and has been around for 30 years. I believe it will be around for at least another 50 years. It's a redundant and federated distributed key-value store, and it already includes in-protocol caching, dynamic updates, and authentication. So it's a pretty complete protocol. That's it from me. Do you have any questions?

The questions work as follows: you stand up behind the microphones, and then you're allowed a question and not a comment. And those of you leaving, please leave quietly so that we can do the Q&A. We have a question from microphone number one.
Yeah, I've got a question regarding MirageOS. What are your plans for 2019? I think we have a new release of Irmin coming up, but what do you guys have in store for next year in general?

I think that's a bit out of scope for this talk, but we are organizing hacker retreats in Marrakesh again next year. And we have some plans for the next major version of MirageOS and also for integration with the DNS and automated Let's Encrypt things. I also want to integrate some DHCP stuff in there.

We have a question from microphone number two.

What would be the pros and cons of using DNS for load balancing versus running a separate load balancer like HAProxy?

Well, with a proxy you have, in the end, a single point which needs to handle all the load, so which needs to handle all the connections. Whereas with DNS-based load balancing, you can just set up a set of servers which all get a share of the load. But with a load balancer or a proxy in front, you have the advantage that there is a single point through which all the connections flow. So you have a central instance. In some setups DNS round robin is not very suitable, but in others it is. If you need some common shared state between all of the connections, you had better use a proxy and keep the shared state somewhere. Whereas if you don't have shared state, like if you have a static website or a static server, then you can use DNS round robin and you don't lose anything.

And the next question from microphone number two.

Hi. Recently I heard about techniques like DoH, DNS over HTTPS. What are the pros and cons?

Well, DNS over HTTPS is an extension where you don't have to use UDP or TCP directly, but you can transport DNS over HTTP, which is an advantage if you already speak HTTP and you are happy with JSON and so on. But it also adds some more work, some more data, because you need to do an HTTP request to get that data.
So there are some pros and cons in various setups, and the DNS working group at the IETF is currently discussing this whole DNS over HTTPS topic in detail. One disadvantage is that usually with DNS you use UDP and you have a very low delay, whereas if you use TCP you first have to do the connection establishment, and you lose quite a few round trips until you get to the point where you can send and receive data.

I can see from the signal angel that we have a question from the internet.

Useful for ham radio, like mapping call sign to domain name or vice versa?

I didn't understand the question. Can you repeat it, please?

Do you know if there's any DNS extension useful for ham radio?

I unfortunately don't know that, but IANA, the Internet Assigned Numbers Authority, has assigned all the resource record types, and that is a central list which you can browse, so maybe you can find it in there.

There is still time for more questions, so please get out of your seat if you have a question. There will be time for your question as well. A question from microphone number two, please.

Hey, my question is about geo DNS. For example, if I want to connect to Wikipedia.org from Singapore or from Europe, I would be connected to a different data center. How is that handled by DNS?

Well, it depends. You can set up, for example, your DNS servers as anycast nodes. Anycast is a routing mechanism where you have the same IP address at different geographic locations, and the closest one is used. Using that mechanism, you can have your data center in Singapore and your data center in Europe both sharing the same IP address for the name server, and then hand out, in your Singaporean data center, an IP address which is also hosted in Singapore, and in your European data center one which is in Europe.
You can also use, well, one motivation for the EDNS flag day next year is to be able to deploy exactly that: giving the client the ability to add some information about its own IP address to the request. So if you know on the server side that your client is located in Europe, you can reply with an IP address that is also in Europe. So there are various mechanisms, and you have to find out and choose which one is suitable for you.

One more question from microphone number two.

Could you elaborate on DNS tunneling attacks as a means to exfiltrate information out of an organization without being detected?

Yes, unfortunately I didn't talk much about exfiltration. But certainly, since DNS uses these domain names, you can already include some information in a request, for example by using automatically generated domain names or by encoding some information in the domain name. Then, if you're in control of the server, you can exfiltrate data from the client or the client network to the server. That is very useful in scenarios where your client shouldn't be able to connect to the outside but DNS is fine and DNS works. DNS also works via proxies. So you can exfiltrate information using DNS and domain names. There are even implementations that let you talk IP over DNS. So if you are in a setup such as a wireless network hotspot where all internet access is forbidden and you have to pay money for it, but DNS is still open, then, if you are in control of the server and of your client, you can actually communicate via DNS over that not-paid-for wireless hotspot.

We're nearly out of time. We have two more questions, first from microphone number one.

How useful do you judge DNS to be for shipping certificates for other protocols, like SMTP or the web?

I think that would be very useful if we had a trust chain in DNS, for example using DNSSEC. If we have
DNSSEC and a trust anchor, then we could ship any other certificates just via DNS. That's what the DANE working group is working on with the DANE extension. And I think the weakest point at the moment is that DNSSEC is not widely deployed.

So the last question is from the internet. What do you think about using DNS services from big enterprises like Google, Cloudflare and so on, concerning privacy?

Yeah, you shouldn't use those. There are independent DNS resolvers run by non-profit organizations which are also subject to data privacy laws and which don't log anything. So I would recommend not using 1.1.1.1 or 8.8.8.8 or 9.9.9.9. Do not use those centrally organized DNS resolvers, because there you don't pay anything, but the company nevertheless gets all your data.

What you should do, though, is thank Hannes for his talk.