Well, my name is Pablo Carboni. This talk is about Unbound and FreeBSD, and about getting the most out of both working together. It is a story about how I fell in love with Unbound, and why I use FreeBSD. It is a story of trial and error, but with a happy ending. It does have a happy ending. Almost six years ago, and based on true events; there were quite a few of those events.

I am 42 years old, from Buenos Aires, Argentina. I have been working for about 20 years as a Unix admin, DNS admin, network admin, and so on. I love DNS and the BSDs, especially FreeBSD. I work with network devices and network protocols at a fairly deep level. RFCs: I read the RFC texts and sometimes use them to argue, in a friendly way, about bad implementations. Sometimes I am brave enough to develop with languages like C, but that is another story. These are my personal contacts: Twitter, Mastodon, LinkedIn. In the information I will show later, some sensitive details have been obscured, in any case.

Six years ago I was looking at some KPIs of DNS servers, appliances. And we noticed at the office that the CPU usage plateaued at about 60 percent, and no more. Queries per second plateaued into a flat line at the top, but nothing more. And the users needed almost 3 seconds to resolve a hostname. Who were those users? Mobile subscribers. Well, there was a hardware upgrade planned, but in the meantime... And, well, more than 2.5: the real number was 2.8. The truth. 2.8 million customers, on two hardware appliances. A plateau line from 12 p.m. at lunch until 8 p.m. And, well, 60 percent of usage and nothing more, at 20,000 queries per second per box. This is fine. Everything is exploding, but this is fine. The hardware upgrade was planned in the middle of this story, in the meantime.

Part number two: make it worse. There were some firewalls in the middle of the network. Those firewalls were almost exploding because of the high traffic of UDP packets. It was a mess. Facepalm.

Just for fun, in the meantime, I was playing; I was testing in a lab, a mini lab, with a Dell PowerEdge 1950 and Unbound under the FreeBSD operating system. Some people had said, hey Pablo, you must test it, it rocks. Okay, I will give it a try.

Well, next steps. The traffic, because it was traversing several firewalls, needed to be optimized. The job at that time was to re-engineer those firewalls and remove them entirely from the network path the DNS traffic was traversing. And the hardware that was planned was two load balancers plus four servers, with a commercial brand of software inside. It should have been easy, but not as you may expect, because some issues happened later.

Problems began. At that time, 2013, in Argentina, we had a very big economic crisis. We had economic issues, so we had issues importing hardware. We bought the hardware, load balancers and servers, but only the load balancers arrived at the company. Those four physical servers did not arrive. We had half of the infrastructure.

In the meantime, this was the lab infrastructure: a Dell PowerEdge with FreeBSD 8.4 running on the amd64 architecture. Well, Unbound 1.4.21, compiled with libevent. libevent is a library that abstracts the socket event handling for you. You can use libevent on FreeBSD, you can use it on Linux, you can install it on other operating systems. On FreeBSD it uses kqueue, from which you can expect high performance, instead of the plain select() style of socket polling.
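As an illustration only, not the exact commands from that deployment, a source build of Unbound against libevent on FreeBSD could look roughly like the sketch below. The package name, download URL and exact steps are assumptions; the dns/unbound port with its libevent option would do the same job.

    # Install libevent from packages (package name assumed), then build Unbound against it
    pkg install libevent
    fetch https://nlnetlabs.nl/downloads/unbound/unbound-1.4.21.tar.gz
    tar xzf unbound-1.4.21.tar.gz && cd unbound-1.4.21
    # --with-libevent makes Unbound use libevent, which in turn uses kqueue on FreeBSD
    ./configure --with-libevent
    make && make install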
That version was very stable, with very good reviews. No, I did not use DNSSEC, because with that hardware you would expect high CPU usage and high latency for resolving queries.

I used dnstop. It is a must, dnstop, an excellent tool; I will show it later. Among other tools, the dnsperf package contains the resperf and dnsperf tools. I used resperf in particular. It is copyright Nominum, now Akamai. It gives you the same feeling as a real stress test on a network. At least at that time the URL was still up and running: you could download up to 10 million entries of random queries that people did in the past. Samples. I do not want to ask where those samples came from. (I will sketch a sample invocation at the end of this part.)

Some readings. You must read kalomel.org. It is a must. If you get into that website, you can look for how-tos on several utilities, several protocols, several pieces of software: mail servers, DNS servers, whatever, DNS among them. And it includes, for FreeBSD and OpenBSD, how to tune, how to do fine tuning on the network stack.

Part number one of the master plan. The DNS service was exploding. We had load balancers, but no servers. My boss said: Pablo, you were playing and testing Unbound. Will you give it a try in testing slash production, yes or yes? Yes. That is how they motivated me. They said, well, let us recycle some hardware boxes, because we have no brand new hardware, and try to get the most out of them. Optimize by squeezing every little bit.

Some premises; sorry for my spelling, some premises. Some of them, for example: a cluster of load balancers. There were two sites, one cluster at each. On those load balancers you should use only 50,000 ports per VIP, not more. Well, you can go up to 60,000, or 62,000, 63,000, but it is better to use a lower quantity. So if you exceed those 50,000 ports, you must of course use another VIP, virtual IP address. And the real servers sat behind those load balancers. FreeBSD was the premise for serving the DNS queries, behind the load balancers, as a way of protection plus load balancing, to get the best out of the service. And another premise: every IP address I would use should be able to come up and start running quickly here, or here, or here. No anycast. It was impossible to use anycast, so we decided to use BGP, with a /32 in mind for every IP address.

The big picture. This was before. The firewall, the fire in the drawing, is a joke, because it was exploding, it was catching fire. One appliance per site until November 2013. At the same time the mobiles were there, querying, and querying, and querying, overloading the network, overloading the firewalls, overloading the servers, overloading everything. And the traffic went to the servers. Those appliances were resolvers; the traffic towards the authoritative servers is on the left. That is a summary of what happened at that time. I am sorry for the small fonts.

The newer picture: the same mobile subscribers, customers, but this time querying load balancers. The load balancers redirect the UDP and TCP traffic to those servers, two servers plus one server; I will let you know why. That situation held between November 2013 and May 2014. The new status was CPU usage for those servers at less than 40 percent, and no more firewalls in the middle.
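For reference, a resperf run against a test box might look roughly like this. The server address, sample file name and rates below are made up, and the exact flags should be checked against the man pages of your dnsperf package.

    # -s server under test, -d query sample file, -m maximum queries per second to ramp up to
    resperf -s 192.0.2.53 -d queryfile-example-10million -m 65000
    # dnsperf, from the same package, replays the same query set at a steady rate instead
    dnsperf -s 192.0.2.53 -d queryfile-example-10million -Q 20000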
The traffic got much lower. Well, that is the /32 I mentioned before, regarding the VIPs. Among the features of those load balancers was a BGP software daemon, so they published their VIPs themselves.

Part one of the fine tuning at the operating system level: UDP sockets, port range, backlog, NIC drivers, timings, interrupt modes and logging. You should think about how many queries per second you will receive, because if you log everything locally to a physical drive, the I/O will drag performance down considerably. Several instances; in fact it was one process, one Unbound process, using between six and eight cores, I mean one thread per core. Well, queries available, queries served per core, et cetera.

Now the trickiest part. You must touch loader.conf. You must set how many threads service the interrupt requests, the maximum netisr threads, to listen to the packets and answer quickly. Well, nmbclusters and mbufs, direct interrupt handling, the maximum queue limit: the workstream queues should go up to 10,240 here, and the same for the send queue length. There are more knobs, but it is a very long list. (I will sketch these below.)

Some more knobs, this time in sysctl.conf, I mean OIDs. The maximum socket buffer size is very important: you must raise it to 16 mega, 16 million. And the network UDP receive buffer, the same. Fast forwarding between interfaces, so packets must not wait. Well, sendspace: you should raise the TCP buffers just in case; at that time there were not so many queries over TCP. Well, the receive buffer up to 524,000, and the backlog queue for incoming TCP connections. And the list, again, is incomplete.

Well, time for Unbound. Unbound has very sane defaults, but if you want more performance, if you require it, you must change parameters like the number of threads. It is important. My recommendation is to configure fewer threads than the cores you have: if you have eight cores, I prefer to use six or maybe seven, but nothing more. Why? Because if you have terrible traffic and you must get into the console over SSH, you will be glad you left a core free. And the rest: slabs, memory lock contention. I was playing with those parameters because I am aware, but not so aware, of what they do, and the Unbound documentation says to try not to touch those memory parameters, the slabs. Well, rrset cache size and message cache size: when you execute tons of queries, those answers go into memory, and if the cache is small it will discard entries and you will see more upstream traffic, so it is better to raise those values. Message cache size, the number of outgoing ports, because queries per second implies open ports at the same time. How many queries do you want to serve per core? Because if you accept excessive queries, your cores will get hammered. And the socket receive and send buffers: again, you must raise those levels. Those are some of my recommendations, part number five of six. (A matching unbound.conf sketch follows the operating system one below.)
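To make that list of knobs concrete, here is a rough sketch of the kind of /boot/loader.conf and /etc/sysctl.conf entries involved. These names and values follow the kalomel.org style of tuning described above; they are illustrative, not a copy of the production files, so verify each tunable and OID against your FreeBSD release before using it.

    # /boot/loader.conf (illustrative)
    net.isr.maxthreads=8             # netisr worker threads for packet processing (value assumed)
    net.isr.maxqlimit=10240          # workstream queue limit, the 10,240 figure mentioned above
    kern.ipc.nmbclusters=262144      # mbuf clusters for network buffers (value assumed)

    # /etc/sysctl.conf (illustrative)
    kern.ipc.maxsockbuf=16777216     # maximum socket buffer size, the 16 million figure
    net.inet.udp.recvspace=4194304   # UDP receive buffer (value assumed)
    net.inet.ip.fastforwarding=1     # fast forwarding between interfaces (pre-FreeBSD 11 knob)
    net.inet.tcp.sendspace=65536     # TCP send buffer, raised just in case (value assumed)
    net.inet.tcp.recvspace=524288    # TCP receive buffer, the 524,000 figure
    kern.ipc.somaxconn=4096          # backlog queue for incoming TCP connections (value assumed)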
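And a matching unbound.conf sketch with the parameters discussed above, assuming an eight-core box with two cores left free for the system. The numbers are placeholders in the spirit of the talk, not the production values.

    server:
        num-threads: 6                 # fewer threads than cores, so SSH stays usable under load
        msg-cache-slabs: 8             # slabs reduce lock contention; a power of two near num-threads
        rrset-cache-slabs: 8
        infra-cache-slabs: 8
        key-cache-slabs: 8
        rrset-cache-size: 256m         # bigger caches mean fewer discarded answers (value assumed)
        msg-cache-size: 128m           # value assumed
        outgoing-range: 8192           # outgoing ports per thread (value assumed)
        num-queries-per-thread: 4096   # how many queries each core may serve at once (value assumed)
        so-rcvbuf: 8m                  # socket receive buffer; see the Q&A below about sizing this
        so-sndbuf: 8m                  # socket send buffer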
There are some tools, for example dnstop. dnstop is like the top tool of every Unix operating system, but it shows you which IP addresses are hitting your DNS service, sorted by the heaviest hitter. It shows you the IP addresses, it shows you queries per second, it shows you totals. Among its capabilities, it can show you the full text of the query, I mean hostnames, or the query types. If you have a query type like name server, mail exchange or whatever, it will appear and it will effectively say: hey, this IP address is hitting your server, check it. Well, it will not literally say that; the address will simply appear at the top, but you should check it. It does not touch anything that is running on the server, so you can execute it without any hassle, and it is lightweight. One thing to keep in mind: when you execute it for the first time, by default it only shows how many queries you are receiving on that box. If you also want the queries you are sending, answering, there are some options you must pass first, to force it to show both queries received and queries answered. (I will show a sample command line at the end of this part.)

Well, resperf: it shows you the maximum queries per second you can get out of that server, until it reaches its limit. I prefer it over dnsperf, because dnsperf runs the same stress test over the hostnames every time, while resperf sends a growing burst of queries. resperf, well, that is the explanation. Sorry, my notebook does not have enough hardware for VirtualBox or whatever, so this is a little demo I recorded on another computer. dnstop at the top; below it, resperf in action. As you can see, queries and replies. You can see below that when it has no more replies to receive it shows, let me show you, it shows here the maximum throughput. This is the maximum the DNS server can reach. Queries and replies, let us show it again. Here is the file of 10 million random hostnames Nominum provided at that time. resperf tries to send more than 65,000 queries per second and waits for the replies. And here are some query types. This is a counter of queries per second, and this is an accumulator. And that is how many times the stress test ran.

Conclusions from stress testing that infrastructure. In the first tests I got around 10,000 to 15,000 queries per second. It was cool. But I said, why not do the fine tuning by following the kalomel.org instructions? Okay, let us go. A reminder: I did not use DNSSEC at that time. After I played with the fine tuning, the box showed me 54,000 queries per second. So it was great.

The new DNS service. How do we hand the new resolver IPv4 addresses to the mobile subscribers? Extremely easy: you touch the configuration, and after the mobile subscriber's session drops and it tries to reconnect, it receives the new IPv4 addresses. And it was not a forced, massive disconnection. It was, for example, when you power off your cell phone, or you disconnect because there is no signal or whatever; that is the moment you receive the new setup, because those sessions have a timeout of roughly 24 hours. Well, I used Cacti at that time, old times, for gathering information and drawing fancy graphs, and dnstop to see how the behaviour was.

Well, the rapid deployment from the lab onward: there were several factors, but no bottleneck. FreeBSD provided nice, even excellent, performance without any hassle and with no stability or performance issues. In fact those servers were up and running, without a hang, without anything, for about six months. Well, the raw numbers for those six months: queries started at 80,000 per second in November and ended up at 120,000 in May 2014, and the response time dropped from 3 seconds per query to 0.1. So the service was running fine. The end. Well, the queries were made by mobile subscribers, and the quick and not-so-dirty solution was received so well that they said: hey Pablo, this is a nice solution.
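A hedged example of running dnstop the way described above. The interface name is made up, and the flags for counting both queries and replies should be checked against your dnstop version; on the builds I remember, -Q counts queries and -R counts replies.

    # Watch who is hitting the resolver on interface igb0 (interface name assumed)
    # -Q and -R make dnstop count both the queries received and the replies sent,
    # instead of only the received queries, which is the default view
    dnstop -Q -R igb0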
Lessons learned, and the don'ts. Don't put a firewall in the middle when you have very high DNS query traffic. Don't blindly trust your appliance: the specifications look okay, but don't follow them, don't trust them blindly. And don't go without a highly available DNS infrastructure; it is better to have load balancers and a second leg behind them. You must have KPIs for queries per second and for UDP traffic, and of course TCP traffic too; you can use dnstop. You must use dedicated load balancers for this traffic, not general-purpose firewalls. Use physical servers, not virtual servers: I would not put something like this on a virtual infrastructure if you want high performance and you have, or expect, high traffic. Use a scalable operating system and DNS server, like FreeBSD and Unbound. Unbound has several nice features I love, with security in mind; it has so many anti-bad-guy features that it is a nice piece of software.

Acknowledgements: FreeBSD, NLnet Labs, Nominum, the Measurement Factory. Thank you to Mario Czerworski; he pushed me, hey Pablo, send in your presentation, and here I am. Thank you to Alan Jules, who helped me polish my English, some words and some slides; for example, don't do that kind of animation, put raw text like the previous slide. Well, that is all. Questions? Any questions?

Hello Pablo, thank you very much. Have you considered enabling DNSSEC validation?

Yes, but performance was a huge bottleneck for those old, eight-year-old servers, so it was not the priority at that time. Today, yes, of course; but not with hardware from 2007 or 2008.

So I have a couple of topics. First, I agree with you that you should use a physical server. However, some of my colleagues at ICANN, who operate the L root server, did a number of tests using different virtualization platforms for the authoritative server they run, and they found that it was fine. I agree with you, but I just want to say that some people have tested this and found that there is no real change in the latency or the isochrony of the service when you virtualize, so you may want to soften your recommendation there a little. Second, I have a question: what is the use case for a 16-megabyte receive socket buffer?

The socket buffer: when you receive so many queries per second, the content of those sockets must be put somewhere in memory. Each socket carries some payload and so on, but multiply that by 60,000. It must store that content in that memory. That is the final purpose.

I know the purpose, but the use case in this situation would seem to be that you are willing to wait more than a second between when a packet is received by the network interface and when it is seen by Unbound. And I want to suggest that the experience of the bufferbloat project, which is online at bufferbloat.net, shows that if you cannot service the response quickly enough, if you cannot actually get it in or out in a lot less time than that, you will do more harm than good by processing it later. So I want to suggest a change to your recommendation: calculate the number of octets of query payload, and the number of queries that you would receive, worst case, in 250 milliseconds, and set your socket buffers to that, so that you will not be in a position of burning CPU resources to receive queries that you should not be answering.
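To put that 250-millisecond rule into numbers, as a rough illustration with assumed figures: at a worst case of 60,000 queries per second, a 0.25-second window, and roughly 150 bytes per query on the wire, the buffer would be about 60,000 × 0.25 × 150 = 2,250,000 bytes, around 2.25 MB, roughly an order of magnitude below the 16 MB discussed here.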
Yes. My main problem at that time was simply: stop hitting the network resources, stop hitting, of course, the nearest servers, stop hitting everything. So some parameters like that were oversized. That should be the final explanation: everything was sized just in case.

I understand. I just noted that you are an expert, this presentation will be watched, and your parameters may be copied. I want to suggest that everything we have learned about buffer sizes is to measure them in time, not in packets or bytes, and to constrain the amount of time we are willing to buffer, because yesterday's newspaper is worth less than today's, and a one-second-old query is probably better left unanswered. Not only is more not always better; sometimes more is worse. Finally, I want to say that when FreeBSD chose to move from BIND to Unbound in the base system as the recommended platform, I realized that my work at ISC was done, because I got BIND from BSD, and when the BSD I used stopped using BIND, I started looking for a new job.

I have a question about the socket buffer configuration, because you increased the TCP socket buffer to 16 megabytes.

I increased it, but again, just in case.

I think increasing the buffer size limits the maximum number of concurrent connections, in the TCP case only, so I am curious how many concurrent connections you want to accept.

I got your point. Under normal circumstances Unbound, as I remember, allowed at that time about 10 open TCP connections. I modified that; we had to modify it at the same time as the TCP parameters of the operating system. That is the explanation for oversizing by such an amount.

Another question: what sets the upper limit on the performance, even after your optimization? For example, CPU usage or memory saturation?

CPU usage went down to 40 percent per core.

So what determines the limit on queries per second? If you want more performance, more concurrency, which part limits it?

The thing is, you need additional IP addresses, first at the load balancer level. I mean, on a single server, if you want more performance, to raise it you need an additional IP address on the server, and you need to raise some of the socket buffers. But wait, those resources had limits, and I was playing with those limits. For the maximum socket buffer, the best value was up to 8 million; if I put 16 or 20 million, my server hanged, badly. 8 million was a quiet value for not running into any kind of problem. And that is my approach: the first step, besides reading the whole documentation, is stress testing until you reach some point. With dnstop, if you stop receiving, if you stop answering, you will find the right parameters, you will find the limits, and you will know when you need more servers. It was more than studying the parameters; it was playing with those parameters and doing capacity planning. It is a mix.

Okay, thank you.

You are welcome. Well, thank you very much.