All right, hello, welcome, everybody. This is the CoreDNS intro and deep dive. I'm John Belamaric, one of the core maintainers of CoreDNS, and with me is Yong Tang.
Yeah, my name is Yong Tang. I'm from Ivanti, and I'm also one of the maintainers of CoreDNS. OK, so let me continue. First of all, thanks to everyone for joining the session. I'm glad to see everyone coming here, especially since it's COVID times. Life is still tough, but I think we are seeing the end of the tunnel of this COVID period, so hopefully everyone has a better life moving forward.
In today's session, we are going to do two things. First, I'm going to give a little introduction to CoreDNS, discuss the latest updates in CoreDNS, and cover several things related to the CoreDNS community. Then I'm going to hand over to John, who will do a deep dive on CoreDNS. So if you are truly interested in CoreDNS, want to make some contributions, or even write some special plugins for CoreDNS, that's going to be a chance to learn a little bit. Hopefully you'll enjoy the session.
OK. So first of all, in case you are not very familiar, I'm going to give you a little background on CoreDNS. What is CoreDNS? CoreDNS is a flexible DNS server written in Go. It started as a fork of the Caddy HTTP server; I think 2016 is when the CoreDNS project was started. Miek Gieben contributed the majority of the original code. He made a fork of the Caddy HTTP server and transformed it into a DNS server, and that's why it was originally named Caddy DNS.
And then over the years, with contributions from so many contributors and community members, CoreDNS gradually evolved into one of the best DNS servers around. Most notably, CoreDNS has now become the default DNS server for Kubernetes. That's why we are here, because this is KubeCon. But at its core, CoreDNS is still a DNS server, as it was from the beginning.
CoreDNS is very different from other DNS servers. We all know BIND and some other DNS servers. The difference is that CoreDNS has a focus on service discovery. CoreDNS also has a very special architecture, a plugin-based architecture, which means it can be easily extended. If you want a feature and you cannot find support for it in CoreDNS, as long as you know how to write Go, you can easily write this feature yourself. In fact, later today John will walk through some of the plugins, and you can find out how easy it is to implement a plugin just for your own usage, as long as you know how to write Go.
OK, so CoreDNS itself supports different protocols. CoreDNS supports DNS, DNS over TLS, DNS over HTTPS (HTTP/2), and DNS over gRPC. DNS over gRPC is not a true DNS standard; it's more like a custom implementation. CoreDNS also supports forwarding to an upstream via DNS, TLS, or gRPC. So if you use CoreDNS to serve DNS traffic over UDP and you want CoreDNS to query an upstream DNS server, you don't always need to go through DNS over UDP. You can use other transports, such as TCP, TLS, or gRPC, which are much more reliable.
So from that standpoint, you can think of CoreDNS as more like a frontend that gives you flexibility locally, while at the same time you can use the other communication channels to guarantee the reliability of the upstream traffic.
CoreDNS also has integrations with different cloud vendors. For example, CoreDNS has integrations with Route 53 from AWS, with Google Cloud DNS, and with Azure DNS, the major cloud vendors. Another thing is that CoreDNS is fully embedded in the cloud native ecosystem. It has integrations with Prometheus, OpenTracing, and OPA, all of them cloud native projects. And of course, as I mentioned before, the biggest feature of CoreDNS is that it is now the default DNS server in Kubernetes. So whenever you use Kubernetes, you have probably already noticed that there's a pod up and running with the name CoreDNS, right?
OK, so let's go through some of the recent updates. This slide shows the CoreDNS updates since the last KubeCon, which was late last year in North America. Since then, we have released several versions of CoreDNS, from 1.8.5 to 1.9.2. The latest version, 1.9.2, was released just 10 days ago, in May 2022.
Over the past half a year or so, two plugins have been added. One is the geoip plugin, which allows you to report where the query comes from. That is a very nice feature that had been requested by the community, and we finally brought this plugin into the default CoreDNS plugin set. We also added another plugin called the header plugin, which allows you to fiddle with the header bits of your DNS message.
The CoreDNS releases over the past half a year also contain a couple of backward-incompatible changes. So if you want to update your DNS server, you may need to pay close attention. One thing is, in the kubernetes plugin, we removed the wildcard query functionality.
This may have some impact on your usage of CoreDNS, but we feel it was very much needed for security reasons. Another change that's also related to security is that in the Route 53 plugin, we removed the ability to pass a plaintext secret in the Corefile. In the past, it was possible to write the secret down in the Corefile and save the Corefile locally, but a recent security audit revealed that this may not be the best practice from a security point of view. So we finally decided to remove plaintext secrets from the Corefile. That also means that if you used this feature before, you have to find some other way. For example, you can pass the secret through an environment variable, which is much safer from a security standpoint.
Since we are talking about security, I'm going to go through several security fixes as well, but first I want to touch on CoreDNS 1.9.1. Some people may have noticed that Go 1.17.6 contains several security vulnerabilities. That impacts a lot of software, not just CoreDNS. If you use Istio or other software built with Go and you run a vulnerability scanner, you have probably noticed that the scanner reported quite a few things recently, right? That's related to Go 1.17.6. Because of that, in CoreDNS 1.9.1, which was released just a couple of months ago, we did an emergency update to bring the Go version to 1.17.8 to fix these vulnerabilities. So again, for security purposes, if you have a CoreDNS server, you probably want to update to the latest version as soon as possible.
So we talked about security. Of course, for the past year or so, security has been a focus not just for CoreDNS but for the whole software industry.
Especially if we think about early 2021, people were talking about ransomware attacks, and then later we were talking about Log4j. Both of those events were epic in terms of news coverage; people heard about them even from CNN and other news channels. People were talking about Log4j and ransomware attacks. So that's why security was a focus for the whole of 2021.
For CoreDNS, we completed a security audit recently, in March 2022. The audit was done by a third-party auditor, Trail of Bits, located in New York, and it was sponsored by the Linux Foundation. So here I want to say thanks a lot to the Linux Foundation and the CNCF for their support in allowing CoreDNS to use the resources available to us. That allows us to make great progress and also helps the community that is using CoreDNS.
In this security audit conducted by Trail of Bits, a total of 14 security issues were discovered. But I do want to say that the only high-severity issue is related to a potential cache poisoning attack. There is also one medium issue related to saving plaintext secrets in the Corefile. But this medium issue is actually not that critical because, again, it's possible to mitigate it even if you don't update your server. The rest of the issues discovered by Trail of Bits are all of informational or low severity. So we feel that CoreDNS is very much a safe DNS server.
All of the issues have now been resolved. So if you have a CoreDNS server that's running, let's say, 1.9.1, you should consider updating CoreDNS to the latest version, ideally 1.9.2, because 1.9.2 fixes all the security issues reported by Trail of Bits as well as the Go 1.17.6 issue I mentioned earlier.
The whole report is available, and we posted it in the CoreDNS repo. If you have any interest, you can certainly take a look. But all the issues have been resolved as of now, so you can just use the latest version of CoreDNS to avoid any potential security issue.
OK, another thing I want to discuss is the CoreDNS community. Of course, the growth of CoreDNS is always associated with the growth of the community. At the moment, we have 300 contributors; big thanks to everyone who contributed to CoreDNS. We have 26 maintainers, which is a pretty big number as well. And we also have 32 public adopters. By the way, if your company or institution uses CoreDNS and is willing to let its name show up, you can certainly create a PR in the CoreDNS repo to add your company or institution to the public adopter list. That by itself will make you a contributor, just by adding this entry, right? We also have 9,200 stars, and we are hoping to reach 10,000 stars soon. So let's see when we can reach this goal.
Another thing I want to mention is that for the past five years or so, CoreDNS has been participating in two programs. One is the Linux Foundation's LFX community program, which helps students work in open source. In return, students receive a small amount of money, which helps them financially. We also participate in Google Summer of Code. Both programs have been running for several years, and we have participated almost every year for the past five years. This year, there is another project, ACME support for certificate management. This project has been accepted by Google Summer of Code, so there is a student currently working on it.
Hopefully we can see the completion of the project, and hopefully that brings this nice feature to the community as well. Just one more thing I want to mention: CoreDNS plans to continue to participate in both the Linux Foundation program and Google Summer of Code in the future. So if you know any student who has an interest in the open source community and wants to contribute, you can encourage them to send an application every year, and in return they are going to receive money if they complete the project. That's pretty much the introduction. So I'm going to hand over to John to do a deep dive on CoreDNS. Thanks, John.
Hi, everybody. So we have about 20 minutes left. Before I jump into this, I want to ask a few questions so I know how much time I should spend on different parts of this. How many of you are using Kubernetes, and CoreDNS in Kubernetes? Probably almost all of you. OK, awesome. And then how many of you are using CoreDNS without Kubernetes, nothing to do with Kubernetes? OK, hey, we've got a few of you. Awesome, great.
Well, what I'm going to talk about a little bit is how you can customize CoreDNS. Now, most of you in the audience who are using CoreDNS as part of Kubernetes are probably not going to want to do this, but there's a few of you who raised your hands. We'll talk about it, but I'll try to be a little bit brief so we leave a good amount of time for Q&A, since most of you in the audience might not be too keen on this stuff.
But as Yong said, one of the really great things about CoreDNS that's different from most traditional DNS servers is this plugin architecture. The idea is that we use a sort of request processing pipeline. So a DNS request comes in, the server unpacks it and just hands it to this pipeline.
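The request pipeline described above can be pictured with a small, self-contained Go sketch. This is only an illustration under stated assumptions: `Request`, `Handler`, `logger`, `rewriter`, `static`, and `resolve` are hypothetical names invented here, and the real CoreDNS plugin interface works on DNS messages rather than plain strings.

```go
package main

import (
	"fmt"
	"strings"
)

// Request is a hypothetical stand-in for a DNS query.
type Request struct {
	Name  string   // the name being queried
	Trace []string // plugins that have seen the request, in order
}

// Handler is a simplified, hypothetical plugin: it has a name and either
// answers the request or hands it to the next handler in the chain.
type Handler interface {
	Name() string
	Serve(r *Request) string
}

// logger only observes the request and passes it on.
type logger struct{ next Handler }

func (l logger) Name() string { return "log" }
func (l logger) Serve(r *Request) string {
	r.Trace = append(r.Trace, l.Name())
	return l.next.Serve(r)
}

// rewriter changes the query name before passing the request on.
type rewriter struct {
	from, to string
	next     Handler
}

func (w rewriter) Name() string { return "rewrite" }
func (w rewriter) Serve(r *Request) string {
	r.Trace = append(r.Trace, w.Name())
	if r.Name == w.from {
		r.Name = w.to
	}
	return w.next.Serve(r)
}

// static is a data source: it answers from a fixed table and ends the chain.
type static struct{ records map[string]string }

func (s static) Name() string { return "static" }
func (s static) Serve(r *Request) string {
	r.Trace = append(r.Trace, s.Name())
	if ip, ok := s.records[r.Name]; ok {
		return ip
	}
	return "NXDOMAIN"
}

// resolve builds the chain in a fixed order and runs one request through it.
func resolve(name string) (answer, path string) {
	backend := static{records: map[string]string{"new.example.org.": "10.0.0.2"}}
	chain := logger{next: rewriter{from: "old.example.org.", to: "new.example.org.", next: backend}}
	r := &Request{Name: name}
	answer = chain.Serve(r)
	return answer, strings.Join(r.Trace, " -> ")
}

func main() {
	answer, path := resolve("old.example.org.")
	fmt.Println(answer, "via", path)
}
```

Each handler does one small thing and then calls the next one, which is what makes the pieces composable; the order is decided where the chain is built, not by the request.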
And when you're setting up your Corefile, which is your configuration file for CoreDNS, you are just enabling different plugins within that pipeline and configuring them to tweak the request in whatever way you want. That plugin architecture lends itself to extensibility, and that's what we're talking about here. There are three basic ways to extend it.
The simplest way is to enable an external plugin. We have plugins that come with the standard CoreDNS that all of you are running in your Kubernetes, because your provider has built it or pulled it from our Docker Hub. But if you wanted to do something fancy, like, I don't know, back your CoreDNS in-memory cache with a layer-two Redis cache, we have an external plugin for that. There's a whole host of them, and you don't even need to know Go to do this. They are written in Go, but it's actually super, super simple.
There are a couple of really important things to note, though. Plugins are not loaded dynamically. They're built in at compile time. And the plugin ordering is fixed at compile time. So you can't change the order in which a request is processed through that pipeline without rebuilding CoreDNS, which kind of sucks, but doing something about it is challenging for a variety of reasons.
So this is probably the most accessible way to do it, and all your prerequisites are Docker and a shell. Simple things: you clone it, you modify plugin.cfg, which tells it what plugins to compile into CoreDNS and in what order, and then you build it. I'm not going to actually step through it in a shell because we don't have time for that, but it really is quick and easy. So you clone CoreDNS. If you pull this PDF down off of the Sched website, you can just copy and paste this right out of there into your shell. cd into that directory and open plugin.cfg.
So what you'll see in plugin.cfg is a colon-delimited list. The first word is the directive, which is the word that will appear in your Corefile when you configure that plugin, and then comes the Go module that implements that plugin. This list is a little bit out of date because we forked Caddy, so it shouldn't say mholt there. And then you build it. We have a Docker image; you run this Docker command and it builds it, it emits your CoreDNS binary, and you're done. All right.
Second way: CoreDNS as a library. Here, instead of running the CoreDNS binary itself, you're embedding CoreDNS in another binary. You can use this to strip out plugins you don't care about. So how many of you, if you know, use the NodeLocal DNS feature in Kubernetes? OK, a couple of you. That project uses this technique. Essentially NodeLocal DNS, which all of you should be using, by the way, because it's much, much better, runs a little mini CoreDNS just for caching on every node, and it redirects all the DNS requests from that node to that local cache. Then for any requests that need to go to the central cluster DNS, it upgrades the connection to TCP, which fixes some kernel bugs and issues and race conditions. So how many of you have any idea about what I just said as far as DNS? OK. Hopefully you can look that up and it'll get through. But yeah, essentially, if you have weird DNS issues in Kubernetes, try NodeLocal DNS, because there are kernel issues, there are race conditions, there's conntrack filling up with UDP, there are all sorts of subtle things that can happen under load that it fixes. Anyway, it uses this technique. And I have an example you can pull off of GitHub. Super easy to build. I don't know if you know Cricket Liu; he wrote a bunch of the DNS books, DNS and BIND and all of these things.
He and I wrote a CoreDNS book, and we go through this example there. So you can go buy the book for that, and it'll be great.
All right, third way. I've used six minutes; we'll use four more minutes for the last one. So, write your own plugin. Remember, we have the pipeline: a request comes in, and we do something with it. What do we do with it? Well, we tend to classify plugins into three categories. This is not strictly necessary; it's just a best practice we use because, like a Unix pipeline, we want each of those plugins to be composable. We want you to be able to use them with the other plugins. So we want to scope them to some small piece of functionality so you can pull them, as external plugins, into different CoreDNS instances.
So when you decide to write a plugin, you should think about: am I writing a backend plugin? A backend plugin means I'm a source of data. The kubernetes plugin, which most of you are using, is actually a backend plugin. It's pulling data from the Kubernetes API server and publishing it as DNS. The Cloud DNS one and the Route 53 one are all the same thing: they read from those cloud provider APIs and then present the data as DNS. Those are backends. There are also external backends, for storing your DNS names in Postgres, for example.
Mutators are things that muck with the request. They do something to it, they change it, they deny it. So we have a rewrite plugin: somebody queries for some name, we look at that name and we say, ha ha, we don't really want you to go there. We're going to rewrite the query, send it to the upstream name server as something different, and then reply with that IP address. That's what rewrite is for.
And actually, that reminds me of something I'm not sure Yong mentioned: CoreDNS is an authoritative DNS server. In DNS, there are authoritative servers and recursive servers.
A recursive server takes a DNS request, breaks it down into its labels, goes out to each of those other DNS servers, and figures out which name server owns that particular domain. So when you look up foo.google.com, your local recursive server will go, I don't know anything about foo.google.com. I'm not authoritative; I don't have those records. So I'm going to figure out what name server does have those. And so it's going to say, OK, well, maybe it's the google.com one. Well, I don't know anything about google.com either, so I'm going to go to .com, right? I'm going to go to the root name server and sort of walk that whole tree. That's a recursive name server, and it's not what CoreDNS does. That means that unless you're using CoreDNS to resolve the names it owns or that it's pulling from some other backend source, it's going to need an external name server that it can forward its requests to. So if you're looking at your Corefile and you see the forward plugin, that's all that does. Just something to keep in mind when you're using CoreDNS. It's a huge limitation, frankly, but it's there for a reason: recursive DNS servers are really hard to write, and so we haven't done it.
Anyway, mutators: that's where we implement ACLs, cache, rewrite, things like that. Finally, configurators, that's a word, are just things that modify the state of CoreDNS itself. So the bind plugin tells it which IP addresses to bind to. The log plugin tells it what kind of logging to do, things like that. So think about the plugin you want to write as one of these.
Then you just implement four functions. These are the mandatory ones. Of course, your logic is going to have to live in some function somewhere. The Name function is super simple. The ServeDNS function is the meat: it takes the request in and does something with it. So, our example. I've used up all my time, but I'll go quick.
Again, this is out there in the CoreDNS organization on GitHub. You can step through it; it's super easy. Basically, there's a plugin here that I've written as an example that takes the response you get from, say, the upstream name server or a later plugin in the chain. It looks at the response and consolidates a given type of record into one.
So, super simple, like I said: the Name function just returns the name. Setup registers the plugin with the parsing routines so that when we're parsing the Corefile, we know what you're referring to when you use that directive. And actually, this is important: it also does this AddPlugin, and this is what inserts it in the chain. So with that plugin.cfg, basically when we read the Corefile, we're not looking at the order the directives come in to initialize these chains. We're actually running through a fixed order. That's why you have that issue I talked about earlier.
And, oh, wait, sorry, y'all, my time is already up. I'm going to skip showing you that in the interest of leaving time for Q&A, but you can check out ServeDNS on GitHub. Like I said, super easy. It just takes in a request and modifies it. How about this: contact me afterwards if you want to do this and I'll go through it with you in detail, because I want to leave you guys time to talk.
So, some resources for any of you who are interested in learning more or diving into actually modifying CoreDNS. This is where you can find us online. We have a Slack channel. It's not on the Kubernetes Slack, it's on the CNCF Slack. And, of course, GitHub is the main place where you can reach us. That's where we're most active. There is a mailing list, but we almost never use it.
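The response-side logic just described can be sketched without any CoreDNS dependency. The sketch below assumes one plausible reading of "consolidates a given type of record into one", namely keeping only the first record of each configured type and leaving other types alone. `Record` and `keepOne` are hypothetical names for illustration, not the code of the example plugin on GitHub.

```go
package main

import "fmt"

// Record is a hypothetical, simplified resource record.
type Record struct {
	Name  string
	Type  string // for example "A", "AAAA", "CNAME"
	Value string
}

// keepOne returns the answers with at most one record for every type listed
// in types. Records of other types pass through unchanged, and the original
// order is preserved.
func keepOne(answers []Record, types map[string]bool) []Record {
	seen := make(map[string]bool)
	out := make([]Record, 0, len(answers))
	for _, rr := range answers {
		if types[rr.Type] {
			if seen[rr.Type] {
				continue // a record of this type was already kept
			}
			seen[rr.Type] = true
		}
		out = append(out, rr)
	}
	return out
}

func main() {
	answers := []Record{
		{"www.example.org.", "CNAME", "web.example.org."},
		{"web.example.org.", "A", "10.0.0.1"},
		{"web.example.org.", "A", "10.0.0.2"},
		{"web.example.org.", "AAAA", "2001:db8::1"},
		{"web.example.org.", "AAAA", "2001:db8::2"},
	}
	// Consolidate A records only; AAAA records are left as they are.
	for _, rr := range keepOne(answers, map[string]bool{"A": true}) {
		fmt.Println(rr.Name, rr.Type, rr.Value)
	}
}
```

In a real plugin this filtering would sit in the response path of ServeDNS, applied to the answer section after the next plugin in the chain has produced it.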
We will get the emails and reply to you, but nobody uses it, so I don't even think we put it up there. So mostly Slack and GitHub. All right, so, questions. Yes, sir, here, we've got a microphone here.
Yeah, my question's about PTR and A records. I know that for services CoreDNS is quite consistent, but for pods, it seems that sometimes you get PTR records, reverse DNS, for pods and sometimes you don't. And I see some people on GitHub saying that, oh, that should never ever work, which I actually personally agree with, but my users like that it works.
Yeah, let me think about it, let me try to remember. So when you do a headless service, I believe you will get PTR records for the pods backing that service. I actually wrote that specification, reverse engineering it from kube-dns, but it was a few years ago, so I would have to dig into the details. But generally, I believe you get them for headless services; for cluster IP services, they don't really make sense. And for pods themselves, we don't have A records unless you have a headless service. The reason for that is really twofold. One is that, from a sort of philosophical point of view, we don't really want you thinking about pods, and that's a little heavy-handed. You actually can have pod records; I'll get to that in a second. But performance is really the main reason. If you think about what CoreDNS does in a Kubernetes context, CoreDNS sits and watches the Kubernetes API server. In modern versions, the endpoint slice is what backs a service, so whenever a pod comes or goes from an endpoint slice, we're updating the cache within CoreDNS. So we're watching and we're seeing all these events come down from the API server. If we subscribe to pods too, it's way more traffic, and it's way more load on the API server.
So in general, we don't want to do that, and that's why you won't see PTR records for pods, only for pods that are endpoints in headless services. Probably because those pods, well, it would be by the service name, I don't know, we can dig into it later. You won't see them for raw pods. A pod has got to have a service backing it in order to have any PTR record at all. Now, if these pods are participating in a StatefulSet, they're going to have that service created automatically. So that might also be where you would see them.
If I look up the IP of the pod, I get the service's DNS address sometimes.
Yes. When you look up the IP of the pod, you'll get the DNS name of the headless service that the pod is backing. Of course, if there's more than one, I don't know what we do.
Yeah, and sometimes I've seen that restarting the CoreDNS server makes it appear.
That shouldn't be the case, but we can maybe talk offline and see if somebody else has a question. Any other questions?
Hello. Is it possible, as a policy, to forbid a Kubernetes namespace from querying entries of services in another namespace? Can we write this kind of policy?
I'm sorry, I couldn't follow what you said.
(inaudible) Is it possible to write a policy to say that a Kubernetes namespace cannot make DNS requests to learn about services in another namespace?
OK. At one point in the kubernetes plugin, we had a configurable option for, I believe, naming specific namespaces for which we produce records, but I'm not sure if we had one for excluding them. We would have to check. But we do have a policy external plugin. That's the one I was showing. The directive is called firewall, but policy is the plugin that lets you do a little more with that. That's certainly a very feasible piece of functionality to add. Where I'm hesitant is, I don't want to add a bunch of stuff to the kubernetes plugin.
It's already got a bazillion options, that plugin, but we can potentially deal with it in the policy plugin. Come talk to me afterwards and we'll take a look.
So my question is, we use a lot of external services in our cluster, and because of how ndots works and the way the search list works, we end up making a lot of queries, and we see a lot of NXDOMAIN errors. So is there a way to make sure that for a particular regex, like Azure or something, we can make CoreDNS a little bit smarter so that it doesn't search for those domains?
OK, so you're saying you have concerns about CoreDNS constantly querying, let's say, a backend like Azure, Google Cloud, or Route 53. Is that the question?
Yes, we make unnecessary DNS requests because of how ndots works.
OK, so first of all, I want to mention one thing. Everyone thinks DNS is simple: it's the DNS protocol, just UDP, what's the fuss? But DNS has one very important feature: it is massively scalable. It supports the whole internet, right? A lot of people talk about distributed systems without realizing that DNS by itself is a distributed system. How is DNS able to handle that? It supports the internet through caching at different levels, so you do need the caching. That's one thing. Another thing is reliable transport between the different cache levels. That's why we mentioned DNS over TLS and DNS over gRPC, and why we mentioned connecting to, let's say, Azure with the Azure API, because that's going to be more reliable. But caching is the most important thing. Like when we talk about the NodeLocal cache.
Yeah, the NodeLocal cache will probably help with that because it caches the negative responses. I think what you're talking about is the search list.
So in DNS, there's the client-side resolver sitting on your node or in your pod. When you ask for, say, google.com, Kubernetes has configured a list of names that it tries, because you didn't fully qualify the name with a dot on the end. So the first thing is, if you control those names, just put a dot on the end and the problem goes away.
OK.
But you can't really get people to do that, because people don't do it. So we actually implemented a feature a few years ago specifically to address this problem. There are some gotchas with it, though. First of all, it means you need to watch pods, because in order to make it work, we have to figure out which namespace the pod making the request is in, in order to understand its search path. And there's a race condition. If the pod comes up and we don't get notified about the pod fast enough, then we can run into a problem where we can't preemptively figure out your search path. So it's a less than perfect solution to that problem. The NodeLocal DNS cache will probably help you because I believe it will cache those negative responses, so those queries will never leave the node. But yeah, it's unfortunately not a fully solved problem, and that's probably your best option.
I think we're out of time. I mean, I'm willing to do one more question if people have one, but I think that's it. All right. Thank you, everybody.
OK. That's it. Thanks, guys.