Good morning, good afternoon, good evening — wherever you're joining from, welcome to another edition of Ask an OpenShift Admin. Today we're joined by many, many people to talk about the wonderful things that are Multus and CNIs. So, Andrew, do you want to talk a little bit more about who we've brought on today and what we're doing here?

I would be happy to. I have to say that today's topic, and the one from a couple of weeks ago with SED, are two of the things that are most interesting to me from an OpenShift perspective, because they poke my buttons of being super technical and super critical to the things that are happening inside of the cluster. So I'm super excited to have everybody on today. Same.

Yeah, so before I get started, a reminder that this is one of the Office Hours series of livestreams that we have here on OpenShift.tv. What that really means is that we are here to answer your questions. So for anybody who is watching at home, watching from the office, watching from the train — wherever you happen to be — feel free to ask us any questions that are on your mind at any point during the stream. Ideally admin questions, because most of us are experts on the admin side, not on the development side. We'll entertain developer questions, but we may not be able to help as easily. If we can't help, we have a Discord channel, which I will drop in chat right now, and you can always ask there — that gives us an opportunity to go find the answer for you.

That being said, we do have a topic for every one of our shows, and today, as I said, I am super excited to talk about CNI and SDN — CNI being the Container Network Interface — and I'm going to rely on our guests here to correct me and make sure that I'm speaking correctly. So first of all, I would like to introduce and welcome Mark Curry, who is a consulting product manager with the Cloud Platforms business unit.

Thanks, Andrew. Yes, so again, my name is Mark Curry, and I'm responsible for networking with OpenShift. Today we're going to talk all about — and hopefully answer your questions about — OpenShift networking, CNIs, the plugins, and Multus: how all of those tie together and why it works that way. I'm joined by two of our top networking engineers, Doug and Tomo. If you guys could please introduce yourselves.

Hi, I'm Doug Smith. I'm a member of the OpenShift networking team, and I work on a team that we call the OpenShift plumbing team. We're interested in getting your workloads all plumbed up to the networks and the hardware, to enable advanced networking use cases. And I'm joined by the guy who does all the real work, Tomo. Go ahead, Tomo.

Thank you. Hi, I'm Tomo Kumihayashi, and I've been working on Multus for about four years, on these things with Doug. I'm very happy to answer your questions about Multus CNI and about OpenShift as well. Thank you.

Yeah, one of the things I love about Red Hat is that all of you are, I think, legitimately the smartest people I know, and yet you keep passing the buck right to the next person. You know, Mark is a consulting product manager who hands it over and says, "I know nothing, it's completely up to Doug," who hands it over to Tomo and says, "I know nothing."
"It's completely up to Tomo." I love that aspect of Red Hat.

So before we launch into that, for our audience and for the regular attendees: you know that I like to cover a couple of things that are top of mind — the things that have come up internally and externally, regularly, over the last week or sometimes the last couple of weeks. The first thing I want to mention is a bit of a shameless plug — that's the word I'm looking for. There you go, I'll get there eventually.

Let's see, let's share this window. The first thing is just a reminder that every week we publish the summary blog post from the previous week's show — or the current week, I guess, is the way it works. You can see here we have our blog post from last week, where we had Christian Hernandez on to talk about DNS and all the things going on inside of there. I'll go ahead and link that into the chat. Those come out Friday morning, so I'll have one this Friday that recaps everything that happened here. We'll share all of the links, as well as the questions and the things we've discussed, to help make that more discoverable for everybody.

The second thing I want to talk about: for those of you who currently have 4.6 clusters deployed, you've noticed — and it's been a little over a month now since 4.7 was released — that you still don't have a 4.7 release in the stable update channel. So yes, we know. Unfortunately, there have been a number of bugs found. The most critical one, the biggest one that I'm aware of, is one we've actually talked about before: when updating to 4.7 with vSphere clusters running VM hardware versions 14, 15, or 16, we begin to see sporadic, inconsistent packet loss, which of course is a bad thing. I know they're working diligently on that; there's a ton of work and a ton of discussion happening internally — I see the emails flying constantly every day — to get those things fixed for everyone so that they can open the stable update channels.

If you're curious about that whole process — what it looks like, how it works, and all of the effort and decisions that go into it — Rob Szumski published a great blog post, what, five months ago now or so, that talks about this entire process and what goes into all of it. One of the things I thought was particularly interesting: if we scroll down here, all the way to this chart — it talks about the different channels. I think last week I talked about how there's both a fast and a latest channel and they're effectively the same. One of the things Rob mentions in this post that I thought was interesting is the suggestion that you have at least one of your clusters running the fast channel. That way you know if these types of issues are going to affect you before you roll the release out to all of your other, stable clusters. So I thought that was an interesting point.
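As a quick, hedged illustration of that suggestion: one way to point a cluster at the fast channel is to change the channel on the ClusterVersion object. This is a sketch — the channel name fast-4.7 is just an example and should match the release stream you're actually tracking.

```bash
# Sketch: move a cluster onto the fast channel (channel name is an example).
oc patch clusterversion version --type merge -p '{"spec":{"channel":"fast-4.7"}}'

# Confirm the channel and see which updates are now recommended.
oc get clusterversion version -o jsonpath='{.spec.channel}{"\n"}'
oc adm upgrade
```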
And it's something that also helps us — particularly if you have the Telemetry collection turned on — because that is how we help determine when there are issues that might mean we should wait to roll a particular release into stable. So it's a great blog post with super timely information, considering it's taking a little longer than we normally expect to go from a new y-release, 4.7, to having those upgrades available in the stable channel. Do give it a read if you haven't seen it already.

All right, the second thing that I want to talk about here is one of those reminders that constantly come up — much like "DHCP is required for IPI." One of the things we get asked about a lot is compact clusters, or three-node clusters. When we announced three-node clusters — where you have just the three control plane nodes, which are also marked as schedulable for regular workloads — we used the term "for bare metal clusters." What that really means is not that it has to be installed on physical servers, but that it can be deployed to any cluster that was installed using the non-integrated, platform-agnostic — a.k.a. bare metal UPI — installation method. This is sometimes a little bit confusing, and it's one of the things the docs team is working on. You can see I'm in the 4.7 documentation and I've selected "Installing on any platform" — this is the platform-agnostic path — and we still have this "Configuring a three-node cluster" option here. So it doesn't have to be bare metal, it doesn't have to be physical-server specific; it can be virtual servers, so long as they meet the resource requirements and are installed with that non-integrated, platform-agnostic — basically platform: none, as we see here — type of method.

So just keep that in mind. The last thing I have before we kick over to Mark is a question that got asked internally earlier this week: can I convert an IPI or UPI cluster into bare metal? Can I remove the cloud provider integration? This question sprang from — I don't remember if it was a vSphere cluster or an RHV cluster, but they had deployed using, I think, IPI, and basically they wanted to begin adding physical nodes. So it's a virtual cluster — I'm going to say it's vSphere IPI — and they want to start adding physical nodes into that cluster. You can't do that; you can't mix the platforms unless it's a non-integrated cluster. So the question was, well, can I just turn it into a non-integrated cluster? And unfortunately the answer is no. And vice versa: you can't change a non-integrated cluster into a cloud provider cluster.

Okay, that's enough of me rambling. We've already got a question, but I'm going to save it — I'll just put it out there. Feel no obligation to answer this yet, team, because we're going to kind of talk about it: one, real use cases of Multus — people want to see that — and two, which kinds of CNIs have better performance for offloading, switching, filtering, and so forth. So let's try to cover that as we talk through things here. Yeah, we'll definitely cover that, Chris — remind me if I forget, but I'm pretty sure I will remember those. All right, so thank you, Greg — or Oleg, sorry, I can't read this morning either. So, enough of me rambling.
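To make the compact-cluster reminder concrete before the handoff: here is roughly what the relevant parts of an install-config.yaml look like for a three-node, platform-agnostic deployment. This is a sketch with placeholder values, not a complete install config.

```yaml
# Sketch: three-node (compact) cluster using the non-integrated method.
apiVersion: v1
baseDomain: example.com            # placeholder
metadata:
  name: compact-cluster            # placeholder
compute:
- name: worker
  replicas: 0                      # no dedicated workers
controlPlane:
  name: master
  replicas: 3                      # three schedulable control plane nodes
platform:
  none: {}                         # platform-agnostic / "bare metal UPI"
pullSecret: '<pull-secret>'        # placeholder
sshKey: '<ssh-public-key>'         # placeholder
```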
I will hand over to you, Mark, and let you take it — I know you had a couple of things you wanted to get started with.

Great, thank you, Andrew and Chris. To kick the conversation off, I don't want to presume or assume too much about what people understand, so I want to start with some fundamentals. The first thing is to make sure people have a clear understanding of what exactly a CNI, or CNI plugin, is and how it gets used. Simply put — as you heard Andrew say earlier — the Kubernetes Container Network Interface, or CNI, is really just a specification and a set of libraries for writing plugins to configure network interfaces in both Linux containers and pods. When a Kubernetes pod is spun up, it needs networking information for its interface, and it gets that from the CNI plugin.

We have a default, out-of-the-box CNI with OpenShift, as you might imagine. Our current default is based on OVS, and we call it OpenShift SDN. We have another, next-generation CNI plugin that GA'd at OpenShift 4.6 and is targeted to become our new default at 4.9; the networking we're plugging in there is Open Virtual Network, or OVN. One of the first things I want you to take away from that is that our own networking for OpenShift is itself a plugin — which implies the possibility of swapping it out for another plugin.

To that end, we also support a special CNI plugin named Kuryr Kubernetes. We worked with our OpenStack team on it, for those customers that are running OpenShift on OpenStack and prefer to avoid the double encapsulation that happens when you stack one overlay network on top of another — OpenShift on top of OpenStack. Kuryr, distilled down to the simplest explanation, is really just a way to collapse the two networking stacks down into one, that one being OpenStack's Neutron networking plugin, whatever that happens to be. So when a pod is spun up, it reaches down to OpenStack, gets networking information, and assigns it appropriately.

In addition to those that we fully support within OpenShift, there are other primary networking plugins that we also work with. Through the partnerships we've developed we fully support several of these, and each has its own market differentiators. Sometimes they fill a gap in our own default capabilities that might solve a specific, critical problem for a customer — something we otherwise either could not solve, or simply couldn't solve in a timely manner.

For each of those third-party CNIs, we don't just say, "yeah, have at it, swap it out and we'll support it." Each one has to go through a pretty rigorous certification process. One goal of that certification process is a predetermination of the lines of support between the two organizations — Red Hat and whoever that vendor is, maybe Cisco, Juniper, Tigera, whomever. This way the customer can simply call Red Hat, both organizations get woken up to whatever problem or issue they have, and we've already predetermined whose responsibility that particular problem is. Then we get to work on it without the customer worrying about, "who should I call, Red Hat or vendor X?"
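To ground that "specification plus plugins" definition before the certification discussion continues: a CNI network configuration is just a small JSON document that the container runtime hands to the named plugin. Here is the general shape, shown with the upstream reference bridge plugin and host-local IPAM purely as an illustration — the names and subnet are placeholders, not anything OpenShift ships by default.

```json
{
  "cniVersion": "0.4.0",
  "name": "example-net",
  "type": "bridge",
  "bridge": "cni0",
  "isGateway": true,
  "ipMasq": true,
  "ipam": {
    "type": "host-local",
    "subnet": "10.22.0.0/16",
    "routes": [ { "dst": "0.0.0.0/0" } ]
  }
}
```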
So that's one big part of certification: determining those lines in the sand for support. It also ensures that minimum Kubernetes and OpenShift networking requirements are met — I'll talk a little bit about that — and ensures the plugin runs the workloads that customers expect around OpenShift; it should be "normal," quote-unquote, Kubernetes networking. And it prevents plugins from doing things that decrease security. The classic scenario — and I'm sure if you've been doing administration for a while you've seen this — is you visit a page that says "here's how to do this on OpenShift," or Red Hat in general, and step number one is "disable SELinux." We want to avoid that. Let's not do that — please don't make Dan Walsh cry. So those are some of the goals of why we require certification for the things we support.

So what is the certification process for these vendors? First, they have to make sure that all the containers that make up their solution are themselves certified — we don't want somebody to build a solution on a container image they pulled from who knows where. We have them create a Kubernetes operator, and we want that to be certified. The minimum functionality of that operator is that it must be able to manage the lifecycle of the CNI plugin. They can make the operator as advanced as they'd like; it can do things like ensure the health of their particular plugin across upgrades of the plugin itself, or even across upgrades of OpenShift. Another part of the certification is that they have to pass the same Kubernetes networking conformance tests that we ourselves pass every time we make changes to our own OpenShift plugins. Now, those tests don't cover the specifics of many of the market differentiators of the third-party solution; rather, they ensure for users that standard, quote-unquote, Kubernetes networking is going to behave as they expect and that nothing was broken.

And to be clear, when we say it's a certified — you know, SDN plugin... and I just realized you keep saying CNI and I often refer to it as an SDN, so maybe after you're done here you should explain to me the difference. I think I said when we were staging this that I'm going to play the role of dumb guy, which is the role I was born to do. But just to be clear about the certification: we aren't testing, certifying, and validating that third party's functionality, right? We trust that their SDN is going to do what they say it does. We just want to make sure that, from a Kubernetes networking standpoint, it meets that minimum standard.

Correct, and thanks for pointing that out, Andrew. On the clarification: the CNI plugin is the thing that plugs into that spot in Kubernetes that can be called to get networking information. All the advanced features of networking are handled by the SDN portion, or the controller portion, of it. So, for example, OVN is really a controller that is used to manipulate OVS flows to do the things we want it to do, and the CNI plugin is the thing that assigns networking information to the interfaces of the pod.

So that's an interesting distinction, right? CNI is the Kubernetes standard for how a pod connects to the network, and the network provider — the SDN — is the one that implements that CNI interface. That's right. Okay. Yep, exactly.
So, what I'm talking about here — and we haven't gotten into secondary interfaces and secondary plugins yet, I'll talk about those in a second — let me complete this first conversation by saying that what I'm really covering here is the primary networking plugins that are part of OpenShift. I've highlighted our current default, OpenShift SDN; our next generation, OVN-Kubernetes, which is currently GA and will become our default at 4.9; Kuryr Kubernetes; and then a bunch of third-party ones.

So what are the third-party ones? In no specific order — we value all of our third-party networking solutions — and I'll try to do this mentally alphabetically: there is Calico from Tigera. We have a great relationship with Tigera and support their plugin. Some of the key reasons why somebody might choose Calico: maybe they demand BGP, maybe they like some of the advanced security features. It was Tigera that really upstreamed some of the network policy features in the very beginning, and they have some proprietary add-ons to that that some customers might appreciate. Those are a couple of big reasons somebody might choose Calico.

Another one is Cisco ACI. Cisco ACI is obviously fully supported by Cisco, and I hear a lot from customers: "Hey, I've got ACI deployed throughout my data center; maybe it makes sense for me to use Cisco ACI as the plugin." And maybe it does — because that Cisco ACI CNI plugin might interact with the rest of the ACI ecosystem throughout the rest of their data center and provide some advantage. That's really for the customer to determine.

Another one we support is VMware. VMware actually has two plugins. The first is something they refer to as the NSX Container Plugin, or NCP. More specifically, traditionally NSX has referred to the offering associated with ESX hosts, whereas NSX-T is the one more associated with Kubernetes and containers, but more generally it's just called NCP nowadays, and we do have a certified solution with them. Of course, if you're deploying on top of a vSphere environment or ecosystem, there may be some benefit to using that one. I can broadly classify that first one as proprietary to VMware — it's not open source, and it's designed specifically to work on vSphere platforms. The newer one, which they are actually just starting certification of today, is their Antrea CNI plugin. Antrea is an open source plugin that is not limited to vSphere infrastructure deployments. They have just started the certification of that plugin, and I'm expecting it to complete sometime in the second quarter of this calendar year.

And then we have others: Juniper has their Contrail plugin, which we're working with them on right now — that one is also in progress to get certified. I think I've remembered everybody; apologies if I've forgotten someone, but those are the big ones we're working with today.

Yeah, and again, we don't performance test each one of those. One of the questions was which one has the best performance — well, we don't know, certainly not for all the partners, because
we don't test them for their performance or validate their performance claims or anything like that. And I'll also say I have the same stance with CNI and SDN that I do with CSI and storage provisioners: they're all my favorite children. Yes — and my favorite one is the one that works for you and meets all of your needs. We really have no preference outside of that. Yeah, that's absolutely true.

And I would like to add to that: nowadays we also require validation with some of our layered products — like, for example, Service Mesh. If a customer uses, let's say, Calico with OpenShift Service Mesh, can we guarantee to the customer that it's going to behave properly? So we've added additional layers to that certification to do additional validation that they're not going to break some of the basic features and functions. Oh, I didn't know that — that's good to know. And OpenShift Virtualization, yeah. I didn't know that; that's good to know. One of our topics should really be the certification process, Chris — I think that's a good point, certifying those things.

Question, though: is there any hardware networking that works with OpenShift and Kubernetes right now, or is all of it SDN? Well, basically all of it has become SDN. ACI is one of those hybrids where there actually is hardware involved in some of the decision process — you can literally run scripts on Cisco routers that affect the hardware networking and influence the ACI ecosystem that's involved. So I think it's really limited to that extent.

I'm just going to address a couple of tangentially related questions real quick. One from Usame: any plan to add the vSphere CSI driver to OperatorHub? I don't believe their CSI driver is an operator, so the first step would be for them to create an operator out of it that we can then certify and put onto OperatorHub. That would be the first step. And then the other one — where did it go?

Actually, Andrew, you bring up a good question about OperatorHub, if I can interrupt for a moment. CNI operators — I don't know what the current situation is, but I believe those operators actually do show up on our OperatorHub. As you might imagine, though, today that may be more for information purposes, because CNI plugins are actually set up at install time. Got it.

Yeah, and that's another question I had for you at some point down the line: it comes up — I wouldn't say frequently, but it comes up — can I change my SDN? Can I change the CNI plugin that I'm using? Or something happens and I'm forced to change it, that kind of deal. Yeah, and I think this is a good opportunity to also go into Multus, or segue that way. Yeah — and actually, before I do that, let me qualify my previous comment: you can change CNI plugins, but let me be careful in my description. We do have a mechanism today.
That mechanism was actually created by the engineers who have joined me on this call, Tomo and Doug: the ability to flip from one primary CNI plugin to another using Multus — and I'll explain Multus in a moment. Also, when it comes to secondary plugins, you can add them along the way; they show up in any newly created pods from that point onward, and maybe Doug and Tomo can address the details of that later. But before we get to that, let me jump into Multus to get that conversation going.

Back in OpenShift 3, the telecommunications industry was really the catalyst for this, but it's definitely not just telco — it really was anybody doing any kind of network function virtualization, or NFV. When they started to transform their virtual network functions (VNFs) into cloud-native, container-based network functions (CNFs), a very early gap that was identified was the inability to have more than one network interface on a Kubernetes pod. The primary functional gaps were that people wanted these additional interfaces for network segregation — for both functional purposes like performance and non-functional purposes like security — but also for things like link aggregation and bonding for network interface redundancy.

To solve this problem, Red Hat's OpenShift SDN engineering — and Doug and Tomo were big leaders in that — and the NFV partner engineering teams — Dan Williams, Feng Pan, these folks were big players in this — formed the Network Plumbing Working Group, which you heard Doug describe earlier; he's still responsible for the networking plumbing team within Red Hat. That working group was formed as part of the Kubernetes networking SIG, during KubeCon 2017, to address some of these lower-level networking issues in Kubernetes. It was chaired by Red Hat but broadly attended, with many representatives from across the industry — I think we had somewhere around 17 different members contributing input — all with the common goal of achieving consensus on a de facto standard for implementing multiple network attachments in an out-of-tree solution.

A number of use cases were gathered, a standard specification was proposed, and we collectively agreed on what to build. What we did was build a reference implementation for the solution using an upstream project initiated by Intel, named Multus. Multus CNI is a meta-plugin for Kubernetes CNI that enables the creation of multiple network interfaces per pod and assigns a CNI plugin to each one of those interfaces. Fundamentally, there is a static CNI configuration that points to Multus, and then every subsequent CNI plugin, as called by Multus, gets its configuration defined in a custom resource object. I like to imagine Multus as a power strip: Multus is the power strip that converts the single, static CNI plug-in point of Kubernetes into multiple plug-in points, so we can add additional interfaces defined by additional CNI plugins.

In our first release of OpenShift 4, we built Multus into OpenShift as a default meta-plugin, whether you were using secondary plugins or not. So the option is always there; if you're not using additional ones, fine — it's basically just a pass-through.
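To make the power-strip idea concrete: here is a sketch of the custom resource Mark mentions — a NetworkAttachmentDefinition whose config field is an ordinary CNI configuration (a macvlan attachment with static IPAM in this example; the interface name, namespace, and address are placeholders).

```yaml
# Sketch: one "outlet" on the Multus power strip - an additional network
# described as a NetworkAttachmentDefinition. spec.config is plain CNI JSON.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-net
  namespace: demo
spec:
  config: |-
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "ens3",
      "mode": "bridge",
      "ipam": {
        "type": "static",
        "addresses": [ { "address": "192.168.10.20/24" } ]
      }
    }
```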
And if you do want to use additional network plugins, with Multus you don't have to do anything other than define the CRD and plug it in.

So, a couple of questions there. You used the terms "primary CNI" and "secondary CNI" plugins, and I'd like to ask what the difference is between those. And then my follow-on: I've also heard of CNI — sometimes in reference to Multus — being described as a pipeline, where you can pipeline those plugins together. I usually hear this from the CNV, or OpenShift Virtualization, team, because they have their cnv-bridge and cnv-tuning plugins, and my understanding is they feed from one to the next. So, if you don't mind.

Yeah. On the second one I may defer to the engineering folks on this call to talk about how that works functionally, but there is an ordering to how the CRDs are parsed and the configuration is done — that may be more of what they're referring to by "pipeline" — and maybe Doug and/or Tomo can jump in on that one.

To your first question, about the types of CNI plugins: there are broadly two different types. There are what I call the primary CNI plugins, and then what we loosely call secondary plugins. The primary plugins are the ones that basically define the primary interface on every pod in the cluster. Kubernetes itself doesn't fully understand, or doesn't treat as a first-class citizen, any of the additional interfaces; Kubernetes end-to-end is primarily focused on that primary interface. So all control plane traffic — let's leave Multus out of the picture for a moment — all Kubernetes control plane traffic and data plane traffic traditionally flows in and out of eth0 on every pod in the cluster. When you start to add secondary interfaces, Kubernetes remains on that eth0, and with those CRDs and additional plugins you can define additional network interfaces that are separated from it.

The good thing about that is it helped solve some things customers were asking for. In particular — I'll talk more about some of the different plugins, but let's choose one to discuss, and that is SR-IOV. SR-IOV is a plugin that basically allows your traffic to bypass even the Linux kernel networking stack and go directly to the NIC. As long as you have an SR-IOV-capable NIC, that's the fastest possible way to communicate from the pod to the core network of the cluster. Users were saying, "look, I don't want that to be encumbered by all the other Kubernetes traffic," for purposes of maybe performance, maybe security, or whatever. So one of the immediate and first use cases was to use that secondary plugin to separate some of that data plane traffic from the primary interface.

You asked before for a good use case for Multus — a really great use case is when customers want exactly that functionality. We were one of the first in the industry to enable high-performance multicast streaming because of this ability to plug in that secondary SR-IOV CNI plugin. Customers could redirect their traffic out there and achieve host line rate — or their NIC's line rate — for traffic leaving the pod and stepping onto the cluster network.

So we have a question, and Mark, I don't know if this is a question for you or for Doug or Tomo.
hc 631 asks: you've talked about CRDs related to Multus — so which CRDs relate to Multus, and what do they configure? And if we want, I do have a cluster that I can walk through some things on, if you want to go that far — or if you just want to use words to paint a picture, that works too. I'd like to invoke Doug and/or Tomo for that.

Sure, let me give a quick overview. Typically, when you're administering an OpenShift cluster, what you're going to look at is your networks object, and that's where you configure your networks as a whole. It's also probably a good place to memorialize the configurations you have — so if you have parameters for OpenShift SDN or for OVN, you're probably going to have them there. And you can also configure your additional networks there, in a field I believe is called additionalNetworks. That is then an abstraction over the custom resource that Multus uses, which is called a NetworkAttachmentDefinition. That particular custom resource is very simple: it's basically one field, a blob of JSON, which is a CNI configuration itself. It allows you to say, "this is the CNI configuration for this additional interface."

So I'm going to share a screen here real quick so we can walk through a couple of things, and I'll ask you to explain what we're seeing. I don't want to share the whole screen — not that there's anything to see there — I just want to share the one window, because I understand ultrawide monitors just don't work that well on the stream. I mean, they work, you just can't read them. Yeah, it's hard to read.

So I have here an Azure cluster I provisioned this morning, running 4.7.2. Doug, you were just saying that we want to look at the network CRD. If I do an oc get network, which is a global object, we see that we have "cluster," and if I add -o yaml we'll see the contents of that. This should look kind of familiar, because it's more or less the same stuff you saw in your install-config.yaml when you were setting up the cluster. You can see here it's OpenShiftSDN; if I were to substitute that for OVNKubernetes, or if I were to configure additional networks — all of those things happen with this particular object, right?

Nice, you got it. And what you would do is just add a stanza here under the spec called additionalNetworks, and the OpenShift docs detail all the particular parameters and goodies you can configure there. You also could do an oc explain on a NetworkAttachmentDefinition — hey, there you go, great, perfect. Two things to point out here that are really important: the metadata name — you're going to refer to that in pods, as an annotation, to say "hey, this is the additional network that I want," or one of many additional networks that you want — and then, in the spec, the config. That's a CNI config, in JSON, and that's how you configure an individual CNI plugin. Some of those fields are static and required for everyone — name, type (or plugins), and cniVersion are generally required — and the rest is freeform; it can be specific to the CNI plugin.
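Here is a sketch of the stanza Doug is describing. One assumption to flag: the additionalNetworks field lives on the operator-level configuration (networks.operator.openshift.io, named cluster), and the Cluster Network Operator then generates the NetworkAttachmentDefinition for you; the network name, namespace, and CNI JSON below are placeholders.

```yaml
# Sketch: excerpt of `oc edit networks.operator.openshift.io cluster`
# defining one additional network as raw CNI JSON.
spec:
  additionalNetworks:
  - name: macvlan-net
    namespace: demo
    type: Raw
    rawCNIConfig: |-
      {
        "cniVersion": "0.3.1",
        "type": "macvlan",
        "master": "ens3",
        "mode": "bridge",
        "ipam": { "type": "dhcp" }
      }
```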
Let me add one comment to that: not only the name, but also the namespace is important here. The NetworkAttachmentDefinition and the pod should be in the same namespace. So if you create an attachment — say, br1 — in the default namespace, and you create the pod that uses it in another namespace, Multus may raise an error at that point. So please create the NetworkAttachmentDefinition and the pod in the same namespace.

Got it, okay — that makes sense. I think at one point the documentation said they could be shared across namespaces, but when I tried that I was told it was a bug in the docs.

So, generally they're required to be in the same namespace. However, the default namespace is special: you can refer to a NetworkAttachmentDefinition in the default namespace from any namespace. Oh, that's interesting — I didn't know that. Wait, say that again. So say, for example, you've got 100 namespaces and you need the same network attachment definition in all 100 of them — we said, no, we've got to have a way to share it in that particular case. So you'd create the NetworkAttachmentDefinition in the default namespace, and then when you annotate the pod, you use a format with namespace slash name — you'd say default/foo, for example — and that allows you to do it. So yes, default is special. And keep that in mind for security considerations as well, if you don't want an attachment to be usable across namespaces. Yeah, very valid point. I assume that's either a permissions thing, or just delete the default namespace and you can, you know, fix it. Absolutely.

Okay, that's really interesting. It's funny, because Chris and I chat about these shows and I usually try to pick subjects that I like to think I'm at least knowledgeable on, so that I can hold the conversation — and I feel like I've learned a tremendous amount from you already. So thank you. Yes, exactly.

All right, so, Chris, do we have any questions? Once an additional network is configured in the network CRD, the NetworkAttachmentDefinition is already created, right? Yeah, our operator creates that for you — the Cluster Network Operator is what does that. Okay.

So I have a question, and in particular — I think, Mark, you referred to these as the secondary network plugins — if we were to look, for example, in the OpenShift GitHub. I'm going to share my screen again; I think that's the right screen to share, it is. If I go to github.com/openshift — if I can type and talk at the same time — and do a very cursory search for "cni" in the search field here, we get a bunch of CNI plugins. I think these are what you're referring to as the secondary plugins: Whereabouts, and here's Multus, which is not a secondary one, route-override CNI, SR-IOV, and so on and so forth.
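Before we dig into that list, here is a sketch tying together the annotation format Doug and Tomo just walked through, including the default/<name> form for an attachment that lives in the default namespace; the pod, image, and network names are placeholders.

```yaml
# Sketch: a pod requesting two additional interfaces via the Multus annotation.
# "macvlan-net" is resolved in the pod's own namespace; "default/shared-net"
# explicitly points at a NetworkAttachmentDefinition in the default namespace.
apiVersion: v1
kind: Pod
metadata:
  name: sample-pod
  namespace: demo
  annotations:
    k8s.v1.cni.cncf.io/networks: macvlan-net,default/shared-net
spec:
  containers:
  - name: app
    image: registry.access.redhat.com/ubi8/ubi-minimal
    command: ["sleep", "infinity"]
```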
So can you describe what some of these are, and maybe when I should or shouldn't use them?

Well, let me address it at a higher level first, and then, given that Doug and Tomo are authors of some of these, I'll redirect the conversation to them pretty quickly. The premise is that these secondary CNI plugins enable our customers to immediately take advantage of some benefit afforded by some other CNI plugin implementation. You didn't have to wait for it to show up in Kubernetes proper; this is something we could implement faster as a secondary CNI plugin. It may end up evolving into something that's a fundamental feature of Kubernetes, but today it's enabled as a secondary function on a secondary interface.

There are a number of different plugins that we support. Several actually came from the Kubernetes upstream reference CNI plugins, and we of course do the Red Hat thing, which is to bring them downstream and enterprise-harden those particular plugins. And then there are others that we created out of need — there was some functionality that wasn't there that our customers were asking us for.

So, for example, there is ipvlan. This is really about assigning sub-interfaces their own unique IP address while they all share a common MAC address. An example would be something like AWS, which uses ipvlan instead of VXLAN overlays for its VPC offerings, so you'd see lower latency and improved throughput with that. Another use case for something like ipvlan is customers with pods that run in VMs and want to use the VM's MAC for egress traffic, not that of the host.

So there are a number of use cases for each of these. At this point, I think what I'll do is hand it to Doug and Tomo. Doug, there's another one in here that I know for sure you created, and that's Whereabouts — maybe start with that and explain how it got started.

Yeah, absolutely. So Whereabouts is a special kind of CNI plugin. Even as we were talking about the difference between an SDN and a CNI — there are actually a number of different kinds of CNI plugins, and one of those is an IPAM CNI plugin. It's a specialized type of CNI plugin that works with other CNI plugins and provides them with IP address information. In the case of Whereabouts, we discovered that in some scenarios users were having trouble getting IP addresses to all of their workloads across the cluster. We had a lot of examples that used the host-local IPAM CNI plugin, and that IP address information is stored locally on each host. So you go to assign an IP address on two hosts, and you get the same IP address — which is of course a disappointment, because guess what, in the real world everyone has a cluster that's bigger than one node; maybe in a dev environment you just have the one host. So we realized it's not always easy to coordinate that across hosts. What Whereabouts does is assign IP addresses to other CNI plugins — and therefore interfaces to pods in Kubernetes — using custom resources.
So you just give it an IP address range, and it says, "okay, I know how to figure out which IP addresses are allocated or not," and it assigns those for you. One place we've seen this be particularly useful is in isolated SR-IOV networks. Let's say you've got a media streaming platform and all your high-performance video and audio goes out this one interface — but what if you don't have DHCP access on that network? Well, you don't necessarily need to go set up a DHCP server. You can just throw in a couple lines of config and say, "hey, can I get an IP address from this range, please?" And that's what Whereabouts does.

Very cool. And so it's not assigning IP addresses to pods that are connected to OpenShift SDN, because that has its own mechanism for it; rather, it's for pods connected to some other — for lack of a better term — SDN, or some other CNI-controlled network that needs an IP address determined. So I'll pick on OpenShift Virtualization: you said, "hey, I can create a second network interface on my host," and it's connected to some network that doesn't have DHCP, but I can use Whereabouts to manage IP assignment for it.

Yeah, absolutely. And it has a number of features, like excluding ranges. Say, for example, you're trying to play nice with other existing infrastructure in your network, and you want to say, "hey, I know I have this /24, but we have these 10 legacy IP addresses and I don't want to collide with those" — you can specify stuff like that. Very cool.

So I'm going to change directions ever so slightly, and I'm going to ask you what is a bit of a scary question — I hope I'm not alarming you — and that is: what can go wrong? When we think about SDNs — sure, I was about to say, one of my previous professions was network engineering, and it's like, oh, all the possibilities. Yeah, and a lot of times we think about things like: the SDN isn't able to instantiate itself, the nodes can't talk to each other, the VXLAN tunnels can't be created, whatever that happens to be. So, let me narrow this down: are there things that commonly go wrong, and do you have any suggestions for troubleshooting or identifying those?

Mark, do you mind if I take it away? Please do. All right, cool. So, as you know, networking — I wish it were as easy as plugging in an Ethernet or optical cable and everything just worked; that would be great. But in the meanwhile, a lot can go wrong. So let me talk about a few of the things we've done to help mitigate that for you in OpenShift. The number one thing we have, in terms of additional networks, is an admission controller: when you go and create these custom resources, it does some cursory checks on them to make sure things are formatted the right way. For some simple mistakes, it can stop you and say, "hey, can you double-check this before I go and instantiate it?"
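Stepping back for a second to the Whereabouts behavior Doug described, before the troubleshooting discussion continues: here is a sketch of what an attachment using Whereabouts IPAM might look like, with a range plus excluded sub-ranges (shown on a macvlan attachment purely for simplicity; all names and addresses are placeholders).

```yaml
# Sketch: an attachment whose IPAM is handled cluster-wide by Whereabouts.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: streaming-net
  namespace: demo
spec:
  config: |-
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "ens3",
      "mode": "bridge",
      "ipam": {
        "type": "whereabouts",
        "range": "192.168.2.0/24",
        "exclude": [ "192.168.2.0/28", "192.168.2.229/32" ]
      }
    }
```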
So that's one thing. The next thing you might see happen is, if you've made a mistake that our admission controller couldn't pick up on, Multus may pick up on it. Tomo created functionality for Multus that uses Kubernetes events: if Multus finds a problem, it creates a Kubernetes event, and you'll see those in an oc describe pod foo. Hopefully you see an event in there. Here's a common mistake we see all the time: you go to use the macvlan CNI — "hey, I want a macvlan interface on this" — and you need to specify an interface that exists on your host system. If you've specified the wrong interface name, when Multus goes and runs the macvlan CNI, macvlan CNI returns to Multus and says, "hey, didn't find eth-foo," and Multus will then create a Kubernetes event saying it couldn't create the macvlan interface because there's no interface on the host called eth-foo. So that's going to be one of the first places you get your hints.

And in terms of CNI: CNI is generally a one-shot, event-driven API. Your SDN itself may be doing things that are more complicated — it has more fault tolerance, more of a state to it — whereas the CNI events you're most concerned about as an administrator are CNI ADD and CNI DELETE. That's when your pod is created or when your pod is torn down. So if you go to create a new pod and it doesn't come up — you're doing a watch on oc get pods and it's saying CrashLoopBackOff and all that — go ahead and describe it first. That's the first thing I would say to do, because something might have happened during a CNI ADD. If you don't have enough information there to figure out why the pod wasn't created properly, there are a few other places you might want to look. One of them is the kubelet logs. The kubelet, CNI, CRI-O — there's a kind of constellation here that all works together, and a lot of those logs end up in the kubelet. The CNI API itself is all standard-in and standard-out — it's fairly basic like that — so when there's an error, it goes to standard error, and that standard error is picked up by the kubelet itself. So you should be able to rip through those logs and figure it out if you need a little more detail.

So we've got a statement and a question, and I'll go ahead and read your question so we can address it verbally. The statement, from hc 631: trailing commas, which are normally accepted as functional JSON, don't get accepted in the network object if they aren't followed by another key/value pair. So I think that was a statement that there might be some non-traditional parser behavior there. I would love to have an upstream issue created for that, because that's actually news to me about the JSON spec — I honestly thought, and maybe it's a CNI-specific bias, that those trailing commas were okay. So certainly, if you look for the container networking GitHub org upstream, that's probably a good place to start to file an issue, and it would be cool to fix that. Awesome.

And then the one that leads us on: can we change the primary CNI after install?
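Before we get to that question, a quick practical footnote on the troubleshooting flow Doug just described — a hedged sketch of where to start looking, with placeholder pod, namespace, and node names:

```bash
# 1. Look for Multus/CNI events on the failing pod and in its namespace.
oc describe pod my-pod -n demo
oc get events -n demo --sort-by=.lastTimestamp

# 2. If that isn't enough, pull kubelet logs from the node the pod was
#    scheduled to; CNI plugin errors (stderr) end up there.
oc adm node-logs worker-0 -u kubelet | grep -iE 'cni|multus'
```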
For example, moving from OpenShift SDN to OVN-Kubernetes: what happens to the pod and service configurations — do they need to be restarted? Does the node need to be rebooted? That is handled for you by the operator, and what I believe the operator does is gracefully drain the nodes and reboot them when it needs to. Okay — and I think that also answers one of the very first questions we had, which was: can you replace the SDN, substitute one for the other? I think the answer is yes, although, as Christian pointed out in the chat earlier, sometimes the options you want to use — for example OVN-Kubernetes with hybrid networking — have to be decided on before the install happens.

If I could address that one first, and then I'd like to pass it on to Doug to talk a little more about how this works. We have our first real challenge in this space with moving from our current, legacy default of OpenShift SDN to OVN. We need a way to get our customers onto our next-generation OVN networking — where all of our new development is going — without asking them to greenfield redeploy their clusters from scratch. We looked at a lot of different possibilities for moving the primary CNI plugin from one to another, and it turns out that Multus is a great tool to facilitate that move. Doug, do you want to address some of the specifics of how that works?

Yeah, totally, and in a nutshell — sorry to interrupt you, Doug, I just want to let you know that we've only got about two minutes, and we have an OpenShift Commons briefing right after this. Cool — the 20-second answer is: essentially your pod has one interface, which carries your primary pod-to-pod network traffic. What we do is plumb another interface in with your alternative SDN, and then we switch the traffic over that way. That's actually how it goes. Nice.

Yeah, Multus — I'm not sure if it was super smart forethought or accidental genius, but it seems like it solves a lot of problems. Super smart forethought, Andrew, come on. Of course — of course, that was what I was going with.

So, as I said, we only have about a minute left before we go over to the OpenShift Commons briefing, so I want to take the opportunity to thank you, Mark, thank you, Doug, thank you, Tomo, for coming on today. This has been really phenomenal — thank you so much for all of the information you've shared with us. And hc 631, I saw you said you have another question: please feel free to reach out to me or Chris — andrew.sullivan at redhat.com, or @practicalAndrew on Twitter. You're welcome to reach out to us anytime; we'll make sure the team here gets those questions and we get good answers for you. Absolutely.

Thank you again to everybody who is watching. Please keep an eye out Monday — excuse me, Friday — for the follow-up blog post that has all of the details and links we shared during the session. Thank you again, Mark, Doug, and Tomo, and have a great rest of your day, everyone. Thank you very much, everyone. Yeah, thank you. We'll see you here in a few minutes for OpenShift Commons — everybody, thank you very much for tuning in.