Good morning, good afternoon, good evening. Welcome to another edition of Ask an OpenShift Admin. I am Chris Short, executive producer of OpenShift TV. I'm joined by the main host of the show, Andrew Sullivan, but we also have our special guest, the one teammate I know who brags about his knowledge of DNS, Christian Hernandez. How are you today? He is the DNS man. Well, you couldn't have an episode about DNS without you. I know, without me jumping on, I'd be almost offended if you did, right? I am happy that you are here with us today in person instead of just participating, and I say "just" participating, on chat. I think we do that to each other quite a bit, participating in other streams, just not actually in person.
So, hello, welcome everybody to the Ask an OpenShift Administrator office hour stream. This is one of the office hour series of streams that we have here on OpenShift TV, and the goal is really for you, our audience, the people who are listening and watching, to ask us anything at any time about whatever is at the top of your mind. So please don't hesitate to do that. We more than welcome those questions, whether or not they relate to whatever we happen to be talking about today. We've already got a question. Great. That's not a question, but, like, "Interested in seeing the episode." Speaking of which, I didn't look at the staged YouTube channels to see if anybody... like last week, remember, we had a question four days early. So we did not have that this time, I don't think. I haven't seen it, at least. I always have the chat open, so I feel like I would have seen that had it come by. So, all right. Well, thank you everybody. Welcome.
And again, please don't hesitate to ask those questions at any point in time. Today's topic, if you didn't see any of the social media or other things leading into today: we're going to be talking primarily about DNS, and also a little bit about cluster networking, but primarily about DNS. The genesis for this is that Christian and I spend what feels like a substantial amount of time... I mean, relatively a lot of time, yeah... talking about DNS and DHCP. DHCP, we think, is starting to get out there, right, the requirements, the configuration, how it affects things and all that, but DNS is still largely a black box to folks. Yeah, some of us might have read the RFC. Well, understanding how it works and knowing how it works in OpenShift and Kubernetes is not the same thing. Exactly, and that's what we often see as a problem. So today we wanted to look at, maybe demonstrate a little, and explore DNS inside of OpenShift. Keep in mind that a lot of this will also apply to Kubernetes as a whole, so it doesn't necessarily matter if you're using OpenShift or not. Hopefully we'll be able to answer some of those questions. DNS is a universal standard. Yep.
It's funny, because I was an infrastructure administrator for a long, long time, and it was always the network's fault until virtualization came along, and then it was storage's fault, right? And now that we've worked past all of those, since most folks are using either hybrid or all-flash storage, suddenly it's back to being the network's fault, and a lot of times it's DNS. Yeah. Most of the time it's DNS. I have a nice haiku I could post. I was going to say, where's the haiku about that? By now everyone knows that DNS haiku. It's posted on people's walls, I would imagine.
Yeah, I very seriously was looking into how to put up one of those little sidebar things, like on the news, that just had an image of that haiku. Oh, nice. Yeah, I think there's some software that does that. Oh, yeah, it's like a virtual camera and you can post things on it. Yeah, there's a few of them.
So before we get started, before I hand over to you, Christian, and start asking you a lot of dumb questions, there were a few things that I wanted to talk about. For those who are regular watchers, regular attendees of our stream, you know that I like to cover a few things that I have seen either internally or externally that I believe are relevant, and also important things that maybe you haven't heard yet. So let me find the right tab here. It's always the tabs these days. I've actually had very few open today; I just have a lot of windows. All right, that's the one I'm looking for.
First and foremost, a bit of shameless self-promotion. We have been doing a weekly blog post after each one of our shows; they come out on Fridays. This one happens to be from last week. If you missed anything from last week, not only do we have a link to the stream itself, but we also go through and try to put in as many of the questions and as many of the things as we can find. So if you want to quickly skim... because I know one of the big things about doing a video stream is there is no transcript; there's no way to do a Ctrl+F and find a topic of interest. So I try to go through, identify the questions, pull them out, and then link to where they're answered inside of the video. Nice. These are better than our release notes. That's a lot of work, yeah. Come on... there we go. Yeah, that thing's little window gets in my way.
Yeah, I'm always trying to figure out how to move it. It just needs to... like, I have three monitors and it needs to just go down, you know, just get out. Yeah. So I will paste a link to that into the chat.
The second thing I wanted to talk about is a bug that's been in place for a while. You can see here it was first reported back in late September. I actually had not heard of or encountered this. I don't like to bring up bugs very frequently because they happen all the time, so until it's something that I feel is more serious or coming up more frequently than any generic old bug, I tend not to bring them up. I'm bringing this one up because I saw no less than three customers this week that encountered the scenario where their vSphere IPI node had gone offline. For whatever reason the worker node was offline, right? Maybe it was turned off, maybe the physical node had powered off or failed for whatever reason. And the IPI process effectively reaped that node: it deleted the node because it was going to provision a new one to recapture that capacity. The problem is that the virtual node, the VM, had some PVCs attached. The node was running, it had some pods, and those pods were using PVCs. The node failed for whatever reason, so it still had those VMDKs attached to it. And when IPI came along and destroyed the node, so too went the PVCs. So that's not good; data loss is never a good day at any point in time, ever. A couple of things to be aware of here. I'm scrolling down, and yes, there's a number of missing comments in here. I apologize.
I'm not logged into Bugzilla right now because there's customer information associated with these, so I don't want to show that information to the world. But there is a fix for this. I believe it was just committed yesterday evening or maybe early this morning, or the PR was, rather, so it should be fixed in a coming release, hopefully sooner rather than later. But for now, just be aware: if you're doing autoscaling, or using IPI and the autoscaler together to reclaim or otherwise restore those nodes, you may want to configure the autoscaler with some sort of delay, so that it will wait at least five minutes. That way, in the event of node failure, the Kubernetes scheduler will say, hey, I need to clean these up and I need to move them. Yeah, give the scheduler enough time for the workloads on the node to move to another node before it reaps that VM. Yeah, exactly.
So, another one, and man, I always feel bad bringing up two bugs. That makes it seem like a bad week even though... well, you know what, you could actually fill a whole show just looking over bugs. But how depressing would that be? So this one I've seen come up a number of times, internally especially, and I've seen it come up a few times externally as well. Essentially we have uncovered an inconsistent bug for CoreOS virtual machines deployed to vSphere using OpenShift SDN. We think this has something to do with the VMXNET drivers that are in the RHEL 8.3 kernel, the one used by CoreOS 4.7, and it effectively manifests itself as some packet loss. So if you have upgraded to 4.7, or if you've deployed a new 4.7 cluster on vSphere, and you're starting to see one, two, three, five percent packet loss, this could be the result of that. There is a workaround. Similar to the other one, I'm not logged in because there's some customer information here.
There is a workaround. It doesn't look like it's in the public comments. If anybody needs that, just let me know: andrew.sullivan@redhat.com. I'll share that with you. Effectively, it's turning off the VXLAN offload for the network adapters inside of the virtual machine. So it's just a couple of sysctls that you execute. You can test it using oc debug, or you can use a MachineConfig to set them in a recurring manner.
Okay, getting out of the things-are-broken, things-are-bad bug reports. Hey, we love failure. We love helping folks fix their problems on the show. Again, I try to preempt, or try to make you aware of things, so that you don't accidentally stumble into a bad situation. Yeah.
So the next one is a good one. I've talked about sizing in a couple of different ways, with three or four different aspects, whether it's sizing the cluster, sizing the nodes, or sizing storage, across a handful of shows here on the Ask an OpenShift Admin hour. Recently, you see March 16th, Simon Delord published a blog post here that walks through application sizing principles. A really good read; share this one with your application developer and application admin peers. It walks through all of the different relevant settings, capabilities, and things like that to help you size those clusters: both how the application is allocated resources and how those resources are then reflected inside of the cluster. And when I do the show notes for this one, I'll try to link those other episodes in there, if I remember to make a note to myself. Yeah, also your episode on resource limits is really good to link there, because going from a VM to a container is very different, because the developer is used to getting 8 CPU, 16 gigs of RAM
because of, like, scale, right? "Oh, I need to be able to..." Now that's different. Now you're essentially constraining them. "What do you mean, I only get half a CPU cycle?" It's like, well, it doesn't really work that way. So that episode of yours on resources, limits, and ranges is good too. I've got the link right here. Yeah, I recommend those users watch that as well. I'll drop it in chat here for everybody. There you go.
All right. So the last topic I wanted to talk about before we get started here is one where, and we were joking before the show started, I'm just going to kick this hornet's nest, because it's a bit of a sensitive topic, I think, internally, and I know that there are some people who are passionate about it externally as well. And that is reference architectures. Oh, you said the bad word. I know. I'm sure if any of our marketing peers are watching right now, they all just had their stomachs knot up because they're afraid of what I'm going to say. When you said it prior to the show, I was like, oh boy.
So, reference architectures. For a little bit of background: anybody who has been keeping up with OpenShift for the last roughly two to three years will remember that we used to publish reference architectures more or less every version or every two versions for various... I see somebody asked for the sizing link. It's the second one up, the sizing applications one, right above the last chat. Yeah, scroll up a little bit. That came from YouTube.
I wonder if I'll post it over there just in case. Okay. So we used to publish reference architectures. Red Hat published reference architectures regularly, and we stopped doing that for a couple of different reasons. The big one was quite simply resources. Devoting not just the people resources but the physical resources and all of the things associated with generating those documents was, quite frankly, hard. It really was. And invariably, if we used one partner's set of resources instead of another partner's, sometimes there would be hurt feelings and stuff like that. So it became a bit overwhelming in a number of different ways. So the decision was made, rather than creating reference architectures... and I would very much welcome anybody's additional thoughts or perspective on this, but the perspective was that most folks read a reference architecture not to literally implement it as written but rather to glean what information is important and what information is relevant: what decision points do I need to know to deploy an OpenShift cluster in my infrastructure? So, for lack of a better term, it was a guide. It was a consolidated source of documentation for "here's one way of producing an OpenShift cluster." So the docs team took it upon themselves to take all of that knowledge and put it directly into the documentation. So if we look at... can I type here? No, my browser is beach-balling on me. Oh, it grayed out everything. There we go. All right, back.
All right back So if we look at the documentation here We and we scroll down to for example this scalability and performance section You can see that and this section has grown with each each release They are doing exactly that they're taking as much of that information as they have as they can get and putting it directly into the documents right recommended host practices is a great one and Allowing you Our audience right the people who want to ingest those reference architectures To really understand the rationale behind those decisions right not just oh, I see that You know in the reference architecture. They said, you know, this value to x well Why why was it set to x sometimes we explain that sometimes we didn't so through the documentation The goal is to provide all of that background all of that information So you can make the right decision for your infrastructure So with that in mind, however, we didn't completely abandon reference architectures right it seems that way and Some of you may be thinking well, I have seen some reference architectures come out and you are absolutely correct We now rely on our partner ecosystem for those So i'm gonna post this blog post into the chat here So this one which is coming up on a year old now So I assume Dave will probably update it and I've known Dave for a few years now. He works over on our Partner team right they do a lot of the reviews of these types of things and the goal is Him and his team work with for example, if we scroll down here, right? Cisco and delimc and hitachi and hpe and so on and so forth everybody and those partners Create the reference architecture, which is then reviewed and contributed to by red hat So if we were to look at and i'm gonna pull up one of these, right? This is the hua packard enterprise the hpe site for open shift reference architectures. 
I'm also going to paste this in here. So if you are using HPE hardware and, let's say, you want to deploy OpenShift onto your DL-whatever servers, DL380, I don't even know what the current gen is... we can click on this 4.6 DL one, and it walks through all of the different components of their reference architecture. This is all created by our partners. Yep. So they still exist. They're still there. They're just not created, not published, by Red Hat. Yeah, it's at the vendor that you're trying to put OpenShift on, and that intuitively makes sense, because being a software vendor, Red Hat really can't recommend hardware, because there's just a ton of hardware that we're supported on. Like Andrew was saying, we don't have enough people to go out and buy every single piece of hardware and vet every possible combination. So we lean heavily on our providers like HPE and Dell EMC to do these, and we can review them. That seems to be a better use of our time than trying to build a team with an impossible task. Yeah, exactly. And together we are stronger, right? Yes, exactly. I just thought of that Simpsons episode. Yeah, I was thinking of the exact same thing. The one where they mimicked Lord of the Flies. Yes, exactly.
So I happened to click on HPE over here. You see NetApp and Cisco have one for FlexPod, NetApp has one for their NetApp HCI. There's literally dozens of partners that do these. Now, you will see one exception to this, and that is OpenStack. Because it's Red Hat OpenStack, we create a reference architecture, and actually it's two of the folks on our peer team, the field product... are they field product managers? Yeah, FPMs. Yes. So yeah, August Simonelli, [inaudible], and company, right?
So they create that OpenShift on OpenStack reference architecture, and that's because it's a whole Red Hat stack there. Similarly, and I think I've said this before, CNI plugins and CSI plugins that come from partners: we don't document those inside of the OpenShift documentation. We rely on the partners for those. The exceptions are Red Hat Virtualization and Red Hat OpenStack Platform. Those are documented by Red Hat because it's a wholly Red Hat solution. And that's all I've got. These other tabs are for our topic at hand. So I will now stop pontificating about things unrelated to DNS, and essentially I'm going to start off by playing the role that I was born to play, of dumb guy. Wait, that's my job. We can share.
So, Christian, you and I started this conversation last week or maybe the week before, and really it came out of some confusion around how DNS resolution happens for OpenShift. It turns out that this is a little more complex than you might expect. So can you elaborate on that and talk about DNS in OpenShift and the difference between node-based DNS and pod-based DNS?
Yeah, so where do I start, right, with DNS? Originally I was going to start at the beginning, what's DNS, but I imagine that at some level, since this is Ask an Admin, there's some fundamental understanding of what DNS is and kind of sort of how it works from a high level. So I'll start from the assumption that you all know what DNS is, what the root servers are, what server authority is, and all that stuff. So DNS in OpenShift is kind of a layered design. So, originally, DNS in Kubernetes... I think I'm just going to start there: DNS in Kubernetes
So, uh, I think I'm just gonna start there dns in kubernetes And what it's used for then I'll work my way up So dns is kubernetes is used for service discovery Right, so that's this is um It turns out dnn that that problem of service discovery was solved a long time ago with dns Right, there's been a lot of software that's tried to do service discovery You know people writing service discovery and in the end it just turns out like dns was the um So what was the answer pause there for a moment. What do you mean by service discovery? so by by service discovery meaning that As an application or as a service inside of open shift kubernetes in general It can reference service by a name for example, I have a front end web give a Stupidly dumb. Yeah to two tier application, right? I have you know, no j s front end and some sort of database back in I want to be able to just say database in my application and it should just Work, right? I shouldn't be as a developer or even as like an admin or as an end user I shouldn't have to care Or memorize ip addresses or what the ip addresses are or how to connect to that So it should be as simple as just saying my sql, right or database or whatever name you decide to deploy your database as so That's the idea of service discovery. I can reference something by name and it automatic automatically knows the ip address so And essentially, you know Funnily enough dns is a perfect software to do that Because that's just what it's been doing there's And that's what originally It was the original intent right for kubernetes service service discovery so in kubernetes, that's kind of like the the bare you know The the the kernel the the core of the the use of dns inside of openshift. 
Yeah, so pods are deployed that represent some microservice, front ends, back ends, database, whatever that happens to be, and then they have, quite literally, a Kubernetes Service in front of them that says "point to these pods." It discovers those IPs and gives them a front-facing, external-facing DNS name, external still being within the cluster, so that other components can say "connect me to front end" and the Service translates that into the set of pods, the set of IPs for the pods, that represent that front-end microservice. Correct. Yeah, so you can reference things inside the cluster in general by name. You can even connect to other pods by name. Everything is managed and everything has a name, and that's managed by the internal DNS server in Kubernetes. Before, that was based on kube-dns. They actually upgraded a while back to CoreDNS. Yes, and originally it was SkyDNS. Yeah, I remember compiling SkyDNS the very first time I deployed Kubernetes, in like the 0.8 days. Yeah, exactly. It was also bare-bones, dead simple.
So let me... where am I here? I was going to share my screen a little bit and go over the docs. There we go. Share screen. Make sure it's the right screen. Wherever you threw it, now you need it. Yeah, exactly. Now I need this little... there we go. Oh, I can move it. I'll move it over here, way over here on the right side, way out of the way.
So, the DNS. What's funny about searching for DNS in the OpenShift docs is that it's not the first result. There's the API for networking. So it's under Networking.
That's how I remember how to get to it. And then there is "Understanding the DNS Operator." So OpenShift deploys CoreDNS via an operator. Let me make this a little bigger. There we go. I can make it bigger if you like. Is that too big? No. Okay, 110 seems good. If you need it bigger, let us know, please. I think you can essentially copy-pasta this. Let me make sure I'm at the right cluster. vSphere, okay. I upgraded this last night, so it looks like the upgrade went okay. Not that it shouldn't, but you never know. That's very reassuring of you. Yeah, right. Thanks. See, I said DNS. I didn't say OpenShift.
So this shows the DNS operator, and just like all things in OpenShift, it's controlled by an operator. So we have a DNS operator here, and you can see the version here, and that's all good. What's really cool about this is that when you describe it, it'll give you information about your cluster. The important information is this information here. The DNS service for Kubernetes runs on the service address. The service network, from a high level... I guess now I could play dummy to you, Andrew... is an overlay network. For the most part, it's an overlay network that sits on top of your regular network: a software-defined network. I don't want to rabbit-hole too much, but it's possible to have Kubernetes running without an SDN. If you ever did Kelsey Hightower's Kubernetes the Hard Way, he shows you how to do that. It is hard. It is hard because any time you add a node you have to add routes to the routing table on all nodes, and it just makes sense to use the software-defined network.
So this is an overlay network. When you see this, this is my service overlay network, and my cluster domain is cluster.local, which is the default. So cluster.local is the internal domain for Kubernetes. Let's take a step back for a moment. The CoreDNS service inside of OpenShift is for pod resolution, or more specifically, service name resolution for pods. Correct, service name resolution for pods. So this cluster.local is meant to be only for internal resolution and has no bearing on, or no relationship with, for example, the name of your cluster, right? Whatever your cluster name happens to be, mine is usually something like something.work.lan or whatever, nor does it have, at this stage, anything to do with your upstream DNS servers, right? At this stage, no. We're at the service level, because that's where CoreDNS really shines. And then just a quick aside: that cluster IP that you see there, the 172.30 one, will come out of what you define as the service IP range in your install-config.yaml. So if you change that from the default, you could see a different IP there, and that's perfectly fine. Yeah, so it gleans that information from the range you give it, and then it will create this IP address. This is the DNS IP for your pods, and that's the cluster domain. One more quick thing before we actually drill down into the pods: you shouldn't name your cluster openshift.cluster.local. You'll have a really bad time. Essentially anything .local. That actually came up with some customers because they've had .local domains. So Kubernetes has essentially ruined that for everyone.
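To make the relationship between the service IP range and the DNS service address concrete, here is a small Python sketch. It assumes the convention that the cluster DNS service is given the tenth address of the service network, which matches the 172.30.0.10 address shown on screen for the default 172.30.0.0/16 range; the function name is made up for the example.

```python
import ipaddress


def cluster_dns_ip(service_cidr):
    """Return the tenth address of the service network.

    Assumption: the cluster DNS service takes the tenth address of the
    service CIDR, so a custom range yields a different DNS IP.
    """
    network = ipaddress.ip_network(service_cidr)
    return network[10]


if __name__ == "__main__":
    # Default OpenShift service network discussed in the episode.
    print(cluster_dns_ip("172.30.0.0/16"))
```

Changing the service network in install-config.yaml, for example to a hypothetical 10.96.0.0/12, would move the DNS address accordingly.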
So kubernetes has essentially ruined that for everyone So we have a question christian So sonics, uh, so how does this talk to the router? How does the service talk to the router? I'm assuming core dns and uh, and the dns operator There you go. Yeah, so, um, I'm actually I'm actually going to get to that workflow. So, um, I imagine you mean Yeah, um So but in general the service, um, the the service Layer is a non-routable layer, but that doesn't mean things can't get in and out So but I'll I'll talk about that workflow in a bit. So So yeah, so then this is the, um, the ip address for The the dns service inside the the service name, right? So, um, If I go to back to the docs And that does a describe Oh, and this gives you a nice Jason path. There you go. Yeah I've never seen that dollar interesting Anyone knows jq. Let me know what that dollar does. Um Um, so, uh, that's that same that's the the service network range. So as, um, As Sully was talking about this is the range and that's and that matches this this ip here. Let me clear it. Um, I always like to clear because you know, people don't like looking at the bottom I know I hate it when someone's sharing their screen and the they're like, can you clear? So my head my head could go up. Um, yeah Yeah, so, um So before, uh I wanted to go through this doc line by line, but I think I'm gonna skip around a little bit because it just makes more sense, right? So, um the core dns has a um Has a configuration And so, um, and that's this is stored in a config map, right? And by the way, andrew and chris I can ramble all in on so please give me time checks So I can get to the all the points I want to get to um And so, uh, so this is the configuration file All right, so if you do that, um, oh, I hate this managed fields. I can't wait to Kubernetes um In 4.7 in the GUI it automatically collapses those in yeah in 1.21 I guess which would be open ship 4.8. It'll It'll take take out the managed fields. 
You have to specifically ask for them. So this is the configuration file for CoreDNS. It's pretty simple. It first gives the domains it's the authority for, and then it even gives an .arpa reverse zone for IPv6 as well. That's something some of our customers ask for, IPv6 support. It'll send information to Prometheus, so it has that plugin already set, and then it has this line here, forward . /etc/resolv.conf. So let's explore that a bit. All the pods have their resolv.conf set to... let me describe that again... to this IP address, right? That's .10: 172.30.0.10. So let's get some pods. I know I have an app running here. There we go: test, this MySQL. oc -n test rsh this guy, and then do bash, because for some reason everyone likes sh. Command not found. Clear. Okay, well, bear with me. Can I do this? Yes, I can. Okay. That was Control, by the way. Yes, I didn't see that. If I do a cat /etc/resolv.conf, it has a search field. This search field breaks down pretty easily. First it does the namespace, so the namespace is a part of the DNS name. It says any time you do a lookup, first look up the name in this domain, then this domain, then this domain, and then finally this domain. The last one is your external one, correct? Yeah, this one is external, or rather my internal domain, but external to OpenShift. So if I try to... do I have dig? No. nslookup? Do I have ping? This is a very small container. Is host on there? No, this is a very small container. I think this is Alpine. So it does a search, right?
If I do an nslookup of a foobar.test.isv.local... oh, sorry, if I do a foobar, it'll automatically tack this on and look for that, then tack this on and look for that, and then it'll go down the list. Then it has the name server for this pod, which is the internal DNS server inside of OpenShift. It says 172.30. This is its main DNS server, and it's the only DNS server this pod has, so it'll only ask the internal DNS server. So what happens if this pod tries to reach outside the cluster? What if I tried to do a ping of, like, dns1.ocp4.cloud.chx? Or if I do a ping of google.com? How does CoreDNS handle that request? The answer is... where's that YAML? The answer is in the next line. It said, okay, forward everything that I don't know about to the host's resolv.conf file, and in the host's resolv.conf file, try everything sequentially, meaning check the first, second, third, fourth. One, two, three, four, five, six. Yeah, exactly. So in the pod, it sends everything to that DNS server. That's first and foremost; that's the only config in the pod. Then the CoreDNS server will say, okay, I don't know this, I'm just going to forward this on to the name servers in my host's resolv.conf file. So let's take a look at what that looks like. Oh, this is a Windows server. Interesting. Okay. oc debug... this is a server that I hack on for Windows containers, by the way. Hopefully no blue screens here. So now we're basically remote-shelling into the node here, that one node. So I do chroot... Christian, I answered in chat already, but another question: how do you change the order of name servers after the cluster has been installed?
And hopefully our answers align. Yeah. What did you answer? And I'll say yes. So for changing the order, I would put it in the machine config, or if you're using DHCP, just change the order in DHCP. If you have a particular name server you want to connect to for a certain domain, I would use CoreDNS's plugin feature, and the only plugin feature that we support right now is forwarding. I'll go over how forwarding works in a bit. Yeah, so it's in line with what I said; I just didn't mention the machine config part. Yeah, well, the machine config is if you want to actually change the order on disk. Or DHCP, right? If your DHCP is handing out your name servers, just change the order there, and the next time the lease happens, it'll flip them.
So here I'm in the node, and if I do a cat /etc/resolv.conf, notice I'm using nmstate, right? And so here it'll choose this name server, then this name server, then this name server, sequentially. You'll notice that the first entry is my IP address, right? So I could do ip addr. That's a lot of interfaces. Yeah, well, in OpenShift it's all the pods. It's all the virtual interfaces. The one you're trying to find is there, line 11. There you go. Oh, it's interface 11. Let's do that. So yeah, there you go. So this is for the hosts file, right? It'll look in the hosts file first, then it'll do DNS one, DNS two in that file. So that's how that configuration works. If I go back here, it says forward everything I don't know about to the resolv.conf, and then it'll use those name servers as well. So, very cool.
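To make the "first, second, third" ordering concrete, here is a small Python sketch that reads a resolv.conf-style text and returns the name servers in the order they would be tried. The file contents and addresses are hypothetical (documentation-range IPs), not the values from the node in the demo, and the parser only handles the nameserver and search keywords.

```python
def parse_resolv_conf(text):
    """Return (nameservers, search_domains) from resolv.conf-style text.

    Name servers are kept in file order, since that is the order in
    which they are tried.
    """
    nameservers, search = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blanks and comment lines
        tokens = line.split()
        if tokens[0] == "nameserver" and len(tokens) > 1:
            nameservers.append(tokens[1])
        elif tokens[0] == "search":
            search = tokens[1:]
    return nameservers, search


# Hypothetical node-level file; the first entry stands in for the
# node's own address, as described in the demo.
SAMPLE = """\
# Generated by NetworkManager
search example.internal
nameserver 192.0.2.10
nameserver 192.0.2.53
nameserver 192.0.2.54
"""

servers, domains = parse_resolv_conf(SAMPLE)
for position, server in enumerate(servers, start=1):
    print(position, server)
```

Reordering the nameserver lines, whether through a machine config or through DHCP, changes which server is asked first, which is the point made in the answer above.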
There are a few things I want to show. So here, right, it tells you the operator status. You just describe it and see... it was degraded earlier because I was doing an upgrade. And then the logs here in the DNS operator will give you more information. So one thing to note...
We have a question, Christian, from Fahad: for vSphere IPI, does the DHCP server have to provide DNS records for the nodes? And I think what that's asking is, does DHCP need to do dynamic DNS updates with the DNS server? So I'm going to say soft yes, it has to. There is a way to provide DHCP for only the IP addresses and not DNS. That gets a little laborious, I would say, and I would recommend using the DNS forwarder for configurations like that. I'll explain that in a bit. The DNS forwarder is probably the catch-all for a lot of these use cases.
And I'll add on to your response by saying I don't think it's required, but it is strongly encouraged. And that is specific to on-prem IPI. The on-prem IPIs use mDNS for their local node resolution. So inside of the cluster it will resolve those IP addresses without an issue, because it's using mDNS for that purpose. However, if you wanted to do external resolution, like you saw Christian do a moment ago with an SSH (I think maybe it was a debug), if you wanted to SSH to that node, or you're exposing a node port and you wanted to reference that by the DNS name of the node instead of the IP address, then you would need external DNS to have those node names, which is where DHCP updating DNS would be recommended. But as far as I know, with on-prem IPI (vSphere, RHV, etc.) they don't need node names in external DNS. I'll also add that if you do have DNS names and they're wrong or they are different, the nodes will actually take those names and use them instead. So reverse DNS will override the name that's given to
And I was trying to check, Christian: we've got 15 minutes. Yeah, okay. Cool. So thank you. So here, the DNS service runs as a stateful set. Not stateful set, daemon set, I think. Yes. So daemon set, meaning... I actually have seven nodes, and one of them is a Windows node, but the selector is set to os: linux, right? So if I do get daemon set... sorry, I was working with something else with stateful sets and I have stateful set on the mind. And so yeah, I have six. The daemon set runs one pod for each host that has this label, and it's running in every cluster. And you'll notice that there are three containers per pod, so let's do a describe on that. Describe... Yeah, what are you describing? You should know what I want. Just intuitively pick that up, right? It should read my mind. I used to work with a guy who would make a lot of typos, and I would always hear him tell the computer, because we always yell at our computers, right: do what I meant, not what I said. I have the intelligence, no wait. Do as I say, not as I do, right? Yeah.
So the container that I wanted to call out inside the pod is the dns-node-resolver, and its job is essentially to manipulate the /etc/hosts file. So if I do an rsh, and I think that's on the pod host... no, it's not here. So its job is basically to manipulate the /etc/hosts file on the pod. Notice this has the IP of the pod. So this is my IP, meaning the pod's IP address, and all the possible names associated with it, and that's the job of this... What's in your resolv.conf on the pod?
Yeah, I know we looked at that a moment ago, but just to compare. Yeah, so it has the DNS server, right? So it doesn't have to go and look up the name for the pod itself, because it has that entry in /etc/hosts. For anything else it uses that DNS server, and if that doesn't find it, then it just forwards it on using the /etc/resolv.conf on the node. So, in the last remaining five minutes, let's take a look at what I've been promising: DNS forwarding. Yes. So, CoreDNS... let's look that up. There it is. I was reading this the other day. Yeah, plugins, right? So it has the ability to do plugins; essentially CoreDNS was made to be extensible. Beyond name resolution, one of the things is that forwarding is written as a plugin, and you configure that by editing the default configuration. So let's take a look at that here. Without the dollar sign. Who put the dollar sign there? I copied that; tell docs, okay? So here, essentially, let's do this. Okay, so it's not there: spec. We're deleting this spec and adding these here. So here you're plugging into DNS, saying that anytime someone looks up foo.com, instead of going through that whole chain... So let's recap the chain: the pod looks at its /etc/hosts file locally, then it looks at its own resolv.conf, and if CoreDNS can't find the name, it will then forward it using the resolv.conf file on the node itself. So instead of doing that, when the pod asks CoreDNS, it'll say, well, for any name under foo.com, I'm just going to forward that request to these DNS servers.
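The per-zone forwarding decision just described can be sketched in Python as follows. This is an illustrative model only: the SERVERS structure, its key names, and the addresses are hypothetical and merely echo the shape of the configuration on screen, and the choice of the most specific (longest) matching zone is an assumption about how overlapping zones would be resolved.

```python
def pick_upstreams(qname, servers, default_upstreams):
    """Choose which upstream DNS servers should receive a query.

    A query whose name falls under one of the configured zones goes to
    that zone's upstreams; anything else goes to the default upstreams
    (standing in for the node's resolv.conf).
    """
    qname = qname.rstrip(".").lower()
    best_zone, best = "", default_upstreams
    for server in servers:
        for zone in server["zones"]:
            zone_l = zone.rstrip(".").lower()
            in_zone = qname == zone_l or qname.endswith("." + zone_l)
            # Assumption: prefer the longest (most specific) matching zone.
            if in_zone and len(zone_l) > len(best_zone):
                best_zone, best = zone_l, server["upstreams"]
    return best


# Hypothetical forwarders; note that an upstream may carry a port.
SERVERS = [
    {"name": "foo-server", "zones": ["foo.com"],
     "upstreams": ["192.0.2.1", "192.0.2.2:5353"]},
    {"name": "bar-server", "zones": ["bar.com", "example.com"],
     "upstreams": ["198.51.100.3", "198.51.100.4:5454"]},
]
NODE_DEFAULT = ["192.0.2.53", "192.0.2.54"]

for name in ("www.foo.com", "db.example.com", "google.com"):
    print(name, "->", pick_upstreams(name, SERVERS, NODE_DEFAULT))
```

Matching on whole labels (the leading dot in the suffix check) keeps a name like notfoo.com from being treated as part of foo.com.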
You know, in my day I ran, I don't know, a cluster of DNS servers sprawled all across North America for my company. Sometimes you do DNS delegation. Sometimes, for whatever reason, tech debt, dumb reasons, or what we call dollar-sign reasons, you have a DNS server that's kind of isolated from the other ones, and they don't know about each other. So if you have stuff out there that you want to reach, you may have to ask that DNS server. And this is essentially an array, right? So I say foo-server and the zones; I can put baz, anything under baz, and use this DNS server. What's cool about CoreDNS is that you can specify a port. You can't do that in BIND; in BIND everything's on 53. Well, in BIND you can listen on other ports, but the client can't ask on other ports. And then the bar-server: you have bar.com and example.com, forward them to these DNS servers. So in this configuration it kind of injects itself in the middle: if CoreDNS can't find the name, instead of forwarding it to resolv.conf, it'll try these servers, and then from there, if they don't know, it'll forward it to the resolv.conf locally. So this is a way to add different DNS servers in your cluster, as opposed to just the upstream ones from your resolv.conf file.
So I have a question for you, Christian, and there is also a question from Iman. I'll ask mine first because I think it'll be a quick one, which is: CoreDNS, the OpenShift DNS service, also applies to static pods that are deployed, for example the etcd pods on the control plane? Correct. Yeah, so the static pods also get that, because they take part in that SDN, right?
So they'll get that same configuration. Yep. And the five-second version is effectively this, for pod resolution: if you're in a pod and you do a dig or an nslookup or something, it first looks at its hosts file. It then looks at its resolv.conf, which will point it to CoreDNS, which is a service; there's a pod running on every one of the nodes in the cluster. That will then look across all of the other services and the forwarders that are defined in this DNS config that you have up here, before finally looking to the resolv.conf at the node level. So if it is trying to resolve something external to the cluster, you need to make sure that the resolv.conf on the node is configured correctly, right? Correct. And one last thing to interject there: the node's /etc/hosts file also comes into play at that same layer as the resolv.conf file. We were looking at the host's resolv.conf file; the /etc/hosts there also comes into play.
So the question from Iman: can I install OpenShift IPI across multiple network zones, for example one for control plane and one for workers? You know the answer to that. I do. The answer for IPI today is, unfortunately, no, because of how keepalived works: the way we have it configured, it requires layer 2 adjacency for it to work. It is possible to configure keepalived to work across layer 3, but that's not how we configure it, and that's beyond the scope of support currently. Yep. I have nothing to add; you've seen me answer that like 400 times, and I know you've answered it too. It should be a bot, right? Yeah, it should be. I think you should have that regex.
So Fahad has another question: can OpenShift span multiple ESXi clusters on the same vCenter when using vSphere IPI? That's an Andrew question. That is on the roadmap. I know that technically, if you look today at the documentation for the in-tree provisioner and all that other stuff, you can create the vsphere.conf and have it understand multiple clusters. We've had a number of customers asking about this. We did some internal testing and found that it mostly works with 4.6. I think they were at like 95% success, but they found a few things and created issues. There is a Jira issue for it; I'll have to dig it up, and if I can do that in the next four minutes, I'll post it. So not yet, but multi-DRS-cluster support is something that is on the roadmap. Cool. That sounds cool.
Another question for you: I set up DNS forwarding on my home lab, which works great. If the upstream DNS server becomes unavailable, the authentication cluster operator goes into a degraded state. Any reason for that? So, the authentication cluster operator. First and foremost, I'm going to need more information to dig into why that would be. So before I go into my long-winded answer, my short-winded answer is: I don't know. But I've had issues. Usually the authentication cluster operator being degraded is a symptom of something else. I've always found some weirdness with the router, so I would look at the router logs, I would look at the authentication logs, and I would look to see whether there's a breakdown of the DNS chain at that point. The authentication cluster operator probably uses the quote-unquote upstream DNS name, like oauth dot cluster name dot example dot com, and that's probably in the pod, and since it can't look that up.
It's probably breaking down. I'm just guessing at this point at what's happening. Yeah, no such host. Yeah, see, look there. If you're watching this and not looking at the chat, in the chat it says no such host. That's your external lookup, and I think that's what's failing the health checks. Yeah, I think the authentication operator relies on the quote-unquote well-known endpoint to verify that it's talking to the right thing, and I think that connects externally through a node port. So it looks up the node name, which relies on that external DNS; those aren't stored in CoreDNS. Those would be in either the upstream DNS or in that mDNS responder, depending on your installation type. So I think that's why that's happening. But yeah, we're just assuming. Oh, we've got two minutes. Nice. Less than that, technically.
All right, so Monster Frames is facing an issue. I asked him to join Discord, by the way, and ask in there. Yeah, I don't think we can tackle OpenStack right now. No, I am not an OpenStack expert by any stretch of anybody's imagination. Chris doesn't have the cycles to, yeah. So please follow up on Discord, or you're welcome to send me a message at andrew.sullivan@redhat.com, or send me a message on Twitter, practicalAndrew. I believe my direct messages are publicly open, so you can reach out that way. Mine are definitely open too: Chris Short, two S's. Feel free to DM me on Twitter too. And we'll connect you with the right people and get an answer for that, so don't hesitate to reach out. So Chris, uh, Christian, my apologies: any last-minute closing thoughts in the last 45 seconds or so? Yeah, I posted a link to the docs.
All I can say is, if you're having trouble with DNS, just follow the chain and see where you're falling down. Many times people blame DNS when it's not DNS, or where DNS is a symptom. So it's always good to troubleshoot with the OSI model. We had one of those even this morning. Yeah.
So thank you, Christian, really appreciate you coming on today. As always, it's a pleasure to have you; appreciate you sharing the knowledge. Thank you.
So for our audience, thank you for joining us today. As I said, please don't hesitate to reach out at any time with any of your questions. You're welcome to contact me directly via email, andrew.sullivan@redhat.com, or social media, practicalAndrew on Twitter. Christian, do you have social media or other? Yeah, so unfortunately I have a weird social media handle. Not too weird: christianh814. So my first name and last initial, 814. So now you guys know my birthday, and I expect presents. There you go. Send me a tweet; DMs are open. It's 2002, right, that you were born? Yes, yes. And as always, Chris, please take us home. Yeah, thank you all. Coming up on the channel here in mere seconds will be an OpenShift Commons Briefing talking about building a multi-cloud provider platform with Kubernetes. So stay tuned. Thank you all.