Yeah, let's get going. So, waiting for my computer to finish restarting. I did an update. Okay. Okay, here it is. And meeting minutes. There we go. Okay. So, if you have not added yourself to the agenda yet, please do so. Can you add me as well? Okay. And so, events. We now have three recurring weekly meetings. We have this one every Tuesday, we have the NSM docs meeting every Wednesday, and we now have the NSM use cases meeting, which has been moved from Friday to every other Monday. And so, Ramki or Prem, are you online? Yes. Yeah, if you're able to fix the calendar invite for this. Yes, yes. But it's every other week. Yes. So basically it will overlap with the... Yeah. The other thing is, if folks could push a patch to update the website; currently it still shows the old time on the community page. Yes. That will fix it. Definitely. Cool. Awesome. So, we have a talk coming up. This Friday will be Service Mesh Days. I have the schedule right here, so I will post it. Service Mesh Days is March 28th and 29th. The actual talk itself will be on the 29th; the 28th is comprised primarily of workshops. So if you are in town, definitely consider joining in. We also have the Intel Out of the Box Network Developers meetup, and that is during the afternoon on April 2nd. We will have a 90-minute talk and a hands-on workshop that people will be able to try. That is the day right before ONS starts. And I'm grabbing the link to that right now. April. Okay. And the link. There. Great. We also have ONS coming up, where we have three talks. One of them is the intro to Network Service Mesh. We have a panel discussion about using Kubernetes as a network service orchestrator.
And we have an NSM and OpenStack integration talk, which we are doing in conjunction with Sila at Ericsson. Sila has told me that she is probably not going to be able to show up, so I may end up just giving the talk myself. There's also the MPLS, SDN and NFV event in Paris, April 9th to 12th, 2019. Good question before we move on to that: so, are we still on for the LFN demo booth stuff? Yes, I mean, yes. In fact, I was just chatting with Nikolai about certain issues, so we are fully into it. Yeah. Awesome, super cool. Cool. So yeah, stop by the LFN booth and you can see ODL and NSM in action. I already mentioned the Paris event; we also have Container World coming up in April 2019. Sorry, Fredrik, just a quick interruption. I got to know that there is going to be an unconference track for ONS. So just wondering, should we submit a talk? I'm pasting the link in the chat window. This is a good opportunity, in that it's more like a free-flow discussion, if we want to properly leverage it. Yeah, that's a good idea. So do we want to do something with that, like inviting people to meet the community, and use it as a set time for people to come and talk with us? Right, exactly. And also, it's free-flow, so if you click on the link, you'll see the schedule; there are the ONAP discussions, OPNFV, and so on. This is LFN. So what we can do is share a slot and at least post it along with OpenDaylight, and then we can open up to the audience and have a free-flow discussion. Yeah, that sounds good. Ed, are you up for that as well? Yeah. It also occurs to me, by the way, that I stuck in the link for the events page. So if we're gonna do that, we want to make sure we list it on the events page on the website.
The other thing is we also need to figure out where and when we want to do the traditional NSM happy hour as well, and get that updated on the website. But it looks like they've got... yeah, let's put something for NSM in there. But I don't necessarily want to do a talk so much as, I think as you said, something free-flowing, like an NSM BoF perhaps. The only caveat is, since NSM is not technically under LFN, I'll check with Phil Robb to see if we can include it. We can probably add ODL and then we can do that. Yeah, I'll check that they're okay with us doing something in some of this time. One other quick question: the MPLS SDN NFV stuff, do we actually have anything there at all for NSM, or should I remove it from our list of events at this point since the CFP is past? Well, is anyone going who is interested? I mean, I'm okay with removing it because we don't have anything NSM-related in the event itself. Yep. Okay, cool. I have some colleagues who will attend, and I know that Daniel Bernier from Bell is going to attend the MPLS World Congress, but it could be open discussion about NSM, nothing official. Okay, cool. Cool. So mid-April, April 17th to 19th, we have Latino World with Prem giving a talk, and he's going to be talking about Network Service Mesh. We have KubeCon EU coming up in Barcelona. And with that, yeah, let's go ahead and remove those bigger ones. If you are heading over, make sure you book your hotel sooner rather than later, because the hotels have a tendency to sell out; the hotel situation is rapidly deteriorating. So if you haven't booked your hotel, absolutely do. I think we will probably have a couple of talks, a happy hour, and some other things going on there, so give me a little bit of time and I'll get that information up here. Cool. Yeah, and if you end up missing out, there is a really good transit system in Barcelona. The trains leave every five minutes or so.
And so if you don't find somewhere near the venue, you can get a hotel near one of the train lines; just make sure that the line your hotel is near gets you close to the venue, because otherwise you'll end up doing a few transfers. We also have a couple of co-located events. We have the FD.io mini summit, and, let's see, the call for papers for that closes on April 5th. Sorry, did I get that wrong? No, you're right, it's April 5th. Yeah, so if you'd like to talk about NSM at the FD.io mini summit, then definitely submit a talk. But is there a list of accepted talks for KubeCon? Very easy to answer. We have a couple of slots that are available to us in anticipation of us being a CNCF project, but of the talks that we submitted, none were accepted. This is actually not super surprising. There's been a lot of stuff floating around Twitter about how the kinds of talks that got accepted tended to be fairly within a particular range of things that were well understood by the program committee, which is par for the course for program committees. So I think effectively what it comes down to is, as we become more well-known and understood beyond simply the networking world, I think we'll do better there, but we will still have some things that we can do. Yeah, the CNCF actually reserves some slots in order to help mitigate this. They hand them out to things that they think are useful but that the community is not very aware of just yet. So I mean, we should be good; I just need to get that sorted out. Cool. So we have KubeCon in Shanghai. I don't think anyone submitted a talk for that, though. I'm inclined to drop it then, just because our events list is becoming unwieldy. Yeah, I agree. Let's remove it. And for the ONS Europe talks, the call for papers is already open. We have a little bit of time before submissions are due.
So if you intend to talk there, feel free to do so, or feel free to engage us and we'll help you put together a compelling talk. This part is unfortunate: we have MEF 2019 and KubeCon North America on the same days. There is a talk that has been submitted to MEF 2019. The call for papers for KubeCon has not opened yet. And with that, are there any other events that we need to talk about, or should we move on? So, Ed, the NSM CNCF proposal. It is now in; do you wanna talk about it? Yeah, so the CNCF proposal's gone in. We have our two sponsors. We're anticipating review of the proposal sometime in April. Now, the way I think this currently works in the TOC is, normally the technical oversight committee meets at 8 a.m. Pacific time on the first and third Tuesdays of the month. I believe what they have done, because they have a bit of a backlog of projects, is they have taken the Tuesdays at 8 a.m. that they normally don't meet, and they're now meeting at those times as well, just to process project proposals. So I think that would mean that if we were to be scheduled in April, it would be either April 9th or April 23rd. And because that is at the same time as the NSM community meeting, one of the things we can decide, when we actually get a firmly scheduled time slot, is whether we want to simply cancel the NSM community call on that day and redirect folks towards the TOC call for the project review. That makes sense, Ed. And also one request: can you share the final submitted proposal? I have the link, but it sort of has all the form fields rather than the real grand finale. That link is actually to the final proposal right there. Oh really? Yep, yep, that's the one that actually got pushed as a pull request. Oh, okay, okay. Yep, yep. Yeah, so is there value in showing up in numbers to the meeting?
I don't know; there are all kinds of different kinds of value that could come from that. I don't think there's value in the sense that you will influence one way or the other how the review goes, but some of us will at least have to go and present in that meeting, so we'll not be able to be here. And it might be the kind of thing that would be nice for the community to witness. For some people, I think there may be value there. Yeah, so I'm up for having it canceled and for us to attend the TOC meeting itself. Do we have an exact date yet? We don't. I'm currently working on the scheduling. The rough-cut estimate that I got was probably sometime in April, but the scheduling is to a certain degree up to the TOC, and I'm trying to get a clear picture of exactly all the ins and outs that go into that. Okay, that makes sense. It turns out that some of the more helpful people at CNCF have been in India the last week, and between going, being there, and coming back, it's been a bit of a slog for them. Yeah, I can see that, because they had their first Kubernetes Days event in Bangalore, I think. Yep. Well, I think it's a good idea to redirect the community call to the TOC call then. And so if you'd like to come and watch, then definitely feel free to do so. I believe that those TOC calls are all open, so there should be no issue with getting people in. Yep. Anyone else have other opinions or thoughts? Well, I think that the core people should be there, in case there are some questions we can help with. So effectively the call will be more or less obsolete; this will be a group call if they're all open. So I'm all about just moving there. Cool, that makes sense. Cool. All right, so I think our next thing is just to get the time slot organized, and then we'll put an announcement on here.
The worst-case scenario is that the announcement is made less than a week before, in which case we will put a big banner on the top of this document saying please, please go to the other meeting. Okay, so we have the CNF testbed that we're starting to do some work with. There's some stuff that needs to be done towards that; actually, if anyone wants to help out with this as well, it's a relatively easy task. Let me pull up the URL for it. There's gonna be an interesting meet-in-the-middle here, I think, because we're going to want to stand up the chains of things in the CNF testbed as network service endpoints. And then I think there's some potentially interesting stuff around the NUMA issues, the CPU pinning, those kinds of things, that may be quite interesting work. Does that match your understanding from the folks on the CNF testbed? You guys know that environment better than I do. Yeah, that sounds about right. I think we're trying to do this in multiple steps: helping with the use case of using NSM with OpenStack is one of the items, and then being able to use NSM as an option with Kubernetes for use cases in general on the CNF testbed. And so we're going at that in several ways. There are quite a few tickets running right now related to this. One of them is getting the v-switch in the pod, which is actually completed, and several other things that are related. That's just an NSM-related tie-in, but is there anything else, Fred, about the OpenStack and NSM work? Yeah, I think what you just linked, the 213. Quick question. So you do now have the v-switch running in a pod, correct? Yes. Have you guys sorted out how you're handling the CPU pinning stuff? Are you still running everything in privileged containers, for example? Yeah, we're doing that in steps. No, no, it has to be done in steps. It can't be done as a big bang. I completely agree.
Right now the testing is more on pure function, and then as we add in the pinning, we'll deal with other stuff like performance whenever we get to that. But there are some items with Mellanox and driver issues: when we moved the v-switch itself, so not the CNFs, but the v-switch, into a pod, we got that working on Intel, but the Packet systems have Mellanox NICs, and we saw some issues with the driver, so the current working code is on the Intel Packet servers. Oh, okay, that's very good news. Okay, so if the v-switch is in a pod and you're using Intel NICs, then it should be a relatively simple matter of getting everything turned into a network service endpoint or a network service client to get NSM working in those testbeds. And then everything else around the CPU pinning stuff is something that has to be figured out in its own time anyway. So, yeah, okay, there's total silence. So we may wanna break this part down; the item that's highlighted right now, the CNF testbed use of NSM, that wouldn't be 213; we probably need an epic, or I'll create a project or something that contains all the tickets. Ticket 213 there is for a Kubernetes cluster that has NSM enabled talking with an OpenStack cluster that's deployed using the CNF testbed code. So we have, I guess, two efforts happening at the same time: adding NSM to the CNF testbed so that it can be used anytime with Kubernetes clusters, and then the other item, ticket 213, which is a Kubernetes cluster using, I believe, all the Makefiles and such that are currently used in NSM for setting up a cluster, and then adding NSM to the cluster. Okay, let's not do it that way. We now have Helm charts for NSM, and so I think our best bet is going to be using the Helm charts rather than the Makefile machinery.
The Makefile machinery is delightful if you're a developer trying to work on NSM itself, but it's kind of viciously awful if you're just trying to deploy, which is why... Yeah, I was gonna bring up the same thing after looking through the code. So the one issue with the Helm charts is that I don't think they're CI'd at the moment. We need to start working on getting that stuff into the CI, because we don't wanna be breaking the CNF testbed or others. One thing that we could possibly do, to make sure that the Helm charts work: we don't have to run them with every patch; we could run them on a nightly basis instead, so that verify doesn't take too long. Yeah, I have been thinking about nightly builds, and that was definitely one of the things I was considering. Like, we just deploy a Packet cluster and do a Helm chart deploy just to verify that something is there, and then maybe do some pings; I don't know, invoke the check scripts, something like that. The one thing I want us to think about when we do that, though, is: there are absolutely things that just take too long, and we relegate those to things like nightly builds, and that's fine, that happens. But I wanna try and see if we can keep as much in line with the incoming testing that's done on a per-patch basis as we can get away with, without bloating verify times to insane levels, because that way we actually know the repo is in a good state at all times. That said, I don't think verify times that go much above about 20 minutes end up being helpful; I think they actually start causing people to do crazy-ass shit. Well, we are a little bit over 20 minutes today, I believe, and with some pending changes in the testing framework, we could very well go to 45, 50, something like that, which is not really... Can we potentially parallelize some of the testing? Yep, maybe.
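The trade-off just discussed, keeping per-patch verify under roughly 20 minutes and pushing the slow suites to a nightly run, can be sketched as a simple budgeting rule. This is purely illustrative; the suite names and durations below are invented, not the actual NSM test suites:

```python
# Hypothetical sketch: split test suites between per-patch verify and a
# nightly run so per-patch wall-clock stays under a budget (the ~20-minute
# ceiling discussed above). Suite names and durations are made up.

def partition_suites(suites, budget_minutes=20):
    """Greedily keep the shortest suites per-patch; overflow goes nightly."""
    per_patch, nightly, used = [], [], 0
    for name, minutes in sorted(suites.items(), key=lambda kv: kv[1]):
        if used + minutes <= budget_minutes:
            per_patch.append(name)
            used += minutes
        else:
            nightly.append(name)
    return per_patch, nightly

suites = {"unit": 5, "basic-dataplane": 8, "helm-deploy-smoke": 12, "scale": 25}
per_patch, nightly = partition_suites(suites)
```

The greedy shortest-first ordering is just one reasonable policy; the point is that anything that would blow the per-patch budget still runs, but only nightly.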
I mean, we have some patches being prepared to add support for namespaces, which could allow us to run different network service managers in different namespaces, which could eventually help. Like, if we have unique namespaces for each of the tests, then they probably can be parallelized. We should also talk about that, because I would expect that trying to run multiple network service managers in different namespaces, so essentially multiple network service managers per node, means there are a bunch of places where we need to tread carefully for that to work properly. Yeah, of course, of course. So, pardon me. Okay, that's good. I mean, I sort of said my piece there, which is that life is full of trade-offs. And I think that's mostly it. All right, so one option that we potentially have as well is to throw more hardware at it over time. And I think the namespacing stuff will definitely help in another aspect. One option we have is, when we spin up a cluster, we could actually spin up a persistent cluster, and that avoids the setup cost; we could add and delete namespaces on the fly if we managed to get that to work. Yeah, there's a lot of work that we need to do in order to get there, and I'm not entirely sure whether that would match standard NSM practice in production. There's a question on that as well. I think we are keeping this constant deploying and destroying of the cluster for pricing purposes, like for just having some kind of price control on what we use. I don't know, is that all? Yeah, so basically what it comes down to is being a good citizen: we should strive to make sure that we're not consuming insane amounts of resources, and that what we're consuming, we're consuming in an efficient way. Right, so right now, if memory serves, when a CI run occurs, we start up two of the smallest instances that Packet offers, and I think that runs at seven cents per instance.
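The namespace-per-test idea above can be illustrated with a tiny harness: each test gets a unique Kubernetes namespace so concurrent runs don't collide. This is a sketch only; the test names are hypothetical and the actual kubectl/deploy steps are stubbed out as a comment:

```python
# Rough illustration (not actual NSM test code) of namespace-per-test
# parallelization: give each test a unique namespace, then run tests
# concurrently. Real cluster operations are stubbed out.
import concurrent.futures
import uuid

def run_test(test_name):
    ns = f"nsm-test-{uuid.uuid4().hex[:8]}"  # unique namespace per test
    # In a real harness: create the namespace, deploy an nsmgr into it,
    # run the test there, then delete the namespace.
    return test_name, ns

tests = ["icmp-responder", "vpp-memif", "kernel-interface"]
with concurrent.futures.ThreadPoolExecutor() as pool:
    results = dict(pool.map(run_test, tests))
```

Because every test owns its namespace, there is no shared state to serialize on, which is exactly what makes the parallel run safe in principle, modulo the per-node manager concerns raised above.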
So we've got a total of 14 cents to run a CI run, which is not bad. If we were to double that to 28 cents, presuming we were actually doing something useful with it, like parallelizing our testing, I wouldn't think that's an egregious use of resources. I think the next available instance size up is about 40 cents, so it's more than double. Oh, no, but if we're going to parallelize tests, you really want to spin up two new instances; you spin up another cluster. Ah, I see, I see. Running on bigger instances doesn't really help us parallelize, but instead of running one cluster, if we were to run two clusters, for example, that means we could parallelize the test running. That's a fairly marginal cost shift. Quite frankly, I'm much more concerned about figuring out why we occasionally have zombie instances. Yeah, we do. Regardless, that's something I'm much more concerned with than the notion of starting up a second cluster. We do want to be good citizens; the CNCF and Packet have been super nice to us about all of this. But more than anything else, I tend to be someone who thinks about the world in terms of value. Back in the good old days when I first joined Cisco, one of the core values the company used to espouse was frugality, and they were super, super clear about the fact that frugality had nothing to do with how much money you were spending; it had to do with taking care to maximize the value for the money you were spending. Okay, so I think that's a good thing to keep in mind here. A quick question: you marked this in progress, the v-switch in a pod thing, Taylor. Does that mean that we haven't yet quite got the v-switch in a pod? So, on the testing, I think the last thing was enabling hyperthreading on one of the Intel machines. That was yesterday, so that we could increase the number of CNFs we can deploy; basically, for all the test cases we're trying to validate, we couldn't deploy as many CNFs.
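The cost comparison in the discussion works out as follows, using the rough per-instance figures quoted above (two smallest Packet instances at about $0.07 each, versus the next size up at about $0.40):

```python
# Worked arithmetic for the CI cost discussion. Prices are the rough
# figures quoted in the conversation, not an official Packet price list.
SMALL, BIG = 0.07, 0.40

current = 2 * SMALL   # one cluster of two small instances: $0.14 per run
doubled = 4 * SMALL   # two small clusters (parallel tests): $0.28 per run
upsized = 2 * BIG     # same two-node cluster on bigger instances: $0.80
```

Doubling the small cluster still costs less than a single big instance, which is why a second small cluster is the cheaper path to parallelism than bigger nodes.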
It's working; we wanna do more of the testing. Okay, that's super useful detail to know. Because again, that makes the next step that we would do for getting NSM working in the CNF testbed, once that v-switch is running in a pod, much, much simpler. So that's super good to know. Yeah, one quick comment here on the CPU pinning of the CNFs, or even the v-switch. As you guys understand, the NUMA affinity will be associated with the physical NIC as well, for example the Intel NIC that's been discussed here. So from a testbed point of view, I was thinking we should have a NIC per NUMA node. You don't need to have it on day one, but the final testbed probably should have a NIC per NUMA socket, and that'll be the right testing for the various CNFs running on each of the sockets. Yeah, you're absolutely correct about what makes for good results. There's an ongoing set of interesting questions in Kubernetes around how to handle the NUMA affinity of things, and the very short version is: it is never going to work the way it worked in something like OpenStack, where you just do very fine-grained mapping of stuff. That's never going to be acceptable in the Kubernetes community. That's the bad news. The good news is that there are things in progress in SIG Node and the resource management working group for actually allowing you to get what you need without doing that fine level of granularity of NUMA mapping, and those are hopefully going to land in Kubernetes 1.15. I would expect the CNF testbed would want to take advantage of that. Did that answer at all some of what your comment was? Yes; my experience is surely coming from the OpenStack side.
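The NIC-per-NUMA-socket testbed layout and the "Kubernetes does its level best" placement behaviour described here can be sketched together. The topology below (two NICs, one per socket, with a free-core pool per socket) is invented for illustration; it is not what Kubernetes actually implements, just the intent:

```python
# Sketch of NUMA-aware core selection: a pod that wants a given NIC
# should get CPU cores from that NIC's NUMA node when possible.
# The one-NIC-per-socket topology below is a made-up testbed layout.
NIC_NUMA = {"nic0": 0, "nic1": 1}
FREE_CORES = {0: [2, 3, 4, 5], 1: [10, 11, 12, 13]}

def pick_cores(nic, count):
    """Best-effort: prefer cores on the NIC's NUMA node; fall back to
    any free cores rather than failing the pod outright."""
    node = NIC_NUMA[nic]
    local = FREE_CORES[node][:count]
    if len(local) == count:
        return node, local
    remote = [c for n in FREE_CORES for c in FREE_CORES[n]][:count]
    return None, remote

node, cores = pick_cores("nic1", 2)
```

The fallback branch is the crucial difference from the OpenStack-style strict pinning mentioned above: the placement is a preference, not a hard constraint.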
So yeah, that granular pinning was required and was done there, but here, if it's not going to be so granular or strict in pinning, I understand; but the ability to exit out of the right NIC as a policy might also be a good thing to think about. One thing to keep in mind too is that in June, AMD is dropping the Rome architecture. So the actual silicon underneath all of this is about to get some drastic updates. AMD and Intel are going in drastically different directions: Intel is going to give you the ability at the hardware level to kind of customize what your NUMA zones look like, and AMD is coming up with, I forget what they call it, but basically a distributed bus across all the different dies spanning the different sockets. And even within the same socket now, if you so decide, you can carve it up into, I think, a total of like 16 NUMA zones, or you can just take advantage of the distributed bus and say all sockets are one NUMA zone. So I think when we get to maybe around this fall, even how DPDK consumes these architectures will be different, and it'll give us the ability even at the hardware level to tune, therefore not forcing Kubernetes into awkward positions, because it can be basically ignorant of all this stuff that's going on in the actual hardware itself. That would be super nice. By the way, if you have friends in AMD-land, you need to encourage them towards the Kubernetes community, to make sure that things work out in a way that's good for them if they have good architecture. Yeah, I think I've got some slides; I'll try to track them down. But the Rome architecture is very unique in the sense that they're getting away from the NUMA madness. They're saying: we're gonna homogenize all this, you're gonna pay a penalty, but it'll be minor, and we think the ease of use overcomes that small bit of, we're talking nanoseconds of latency here.
Yeah, no, if that actually works out, which I think it does, then that would greatly simplify life, because the NUMA madness is kind of awfully painful, particularly in Kubernetes. So, the comment earlier about the OpenStack way of doing things: the reason I mentioned the difference with Kubernetes is, absolutely, if you're coming from the OpenStack side, you're used to granularly specifying that this thing runs on this core, which is in that NUMA zone, because that's where the NIC is. In Kubernetes, it's a little bit more: I have this thing that wants this NIC, and Kubernetes figures out that that NIC would really rather that things using it were running in its NUMA zone, and so it does its level best to schedule any cores for that pod into the NUMA zone of the device that it's trying to consume. Does that make sense? Yeah, it makes sense. Yeah, the placement of the CNF, that makes sense, Ed. And that happens in OpenStack as well. It's just that there are a few detailed scenarios, maybe deviating from the meeting today, but there are detailed scenarios where you wanna make sure that the traffic enters at the right physical NIC so that the receive path is optimized for the CNF. The placement of the CNF is one issue; how to steer the traffic towards the right NUMA node in a server which has multiple NICs is another problem. Yeah. The unfortunate problem with the current solution, as I understand it, for NUMA zones in Kubernetes is that it has a couple of unstated presumptions. One of those presumptions is that a single pod is only really going to be using a single NIC, because it doesn't really have a good solution for the case of: I have NIC zero on socket zero and NIC one on socket one, and I have a CNF that wants to use both. It really has no meaningful solution for that problem. Sure, and I think as Jeff was pointing out, the CPU architectures are evolving and changing, right?
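The unsolved case just described, a CNF requesting NICs on two different sockets, is easy to state as a check: alignment is possible only when every requested device shares one NUMA zone. The NIC-to-socket mapping here is hypothetical:

```python
# Sketch of the dual-NIC problem: a pod can be "vertically aligned" with
# its devices only if all of them sit in one NUMA zone. Mapping is invented.
NIC_SOCKET = {"nic0": 0, "nic1": 1}

def alignable(requested_nics):
    """True if all requested NICs share one NUMA zone, i.e. the pod's
    cores can sit next to every device it consumes."""
    sockets = {NIC_SOCKET[n] for n in requested_nics}
    return len(sockets) == 1
```

A single-NIC pod passes the check; the dual-NIC CNF fails it, which is exactly why the current single-NIC presumption exists: there is no zone that satisfies both devices.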
I think it might be a good idea to put the boundary at the v-switch, so the CNF doesn't care. And if we have to add any NUMA awareness or some intelligence, that can be in the v-switch layer rather than in the CNFs. But we do want a mode that avoids this inter-socket communication, right? That bus is the bottleneck, correct? Yes, Ramki, the full vertical alignment, right? Yes, yes. The way that you picture that in AMD, at least in Rome, and potentially in Intel, is gonna change: instead of a QPI, you're now gonna have just this distributed bus that spans both sockets and all the different dies inside each socket. So, at least on the AMD side, now that they're moving back into the enterprise-class server market, they are not gonna do NUMA from the standpoint of: this memory lane with this PCIe lane with this socket all go together like this, and I've now had to cross the QPI two times and added X, Y and Z latency. Some of that is gonna be abstracted; some of it's gonna become more complicated, depending on how you decide to carve these up in the BIOS. So it's gonna take some exploration, and it's not gonna be the x86 we've known for the last several years, of: I wanna get everything into one vertical so that this memory lane comes out of the same socket as the PCIe lane that I've pinned this VNF to. So, just to answer that: one thing we have to keep in mind is that the transition to newer processors is a slow transition. Basically, the existing architecture will be there for a while, at least from what we have seen, so we need to be able to support both. Yeah, so, Ramki, on the execution of the CNF on a core which is associated with a NUMA node, you're right; we need to have an opinion on the placement part, right?
What I was trying to say was: the CNF exiting out of the server via a particular NIC, that can be hidden behind the v-switch. We've got to divide the problem into two pieces: networking, the exit and entry point to reach the container, is one aspect, and placement of the container is another aspect, right? So there's another part to this too, right? Because keep in mind, and this goes to Ed's point, we've got to decide how convoluted and complicated we'll ultimately make this. Because if you build a CNF as multiple services in separate pods and use something like memif, are you gonna try to force that all of the co-located pods are within the same memory address space, so that those internal memory interfaces don't have to cross the QPI? Or is there gonna be the potential that the first pod gets scheduled on socket one and the second pod is scheduled on the other socket, and even though each is vertically aligned, you're now still crossing the QPI for the memif, based on where that shared memory ultimately lives? I mean, we're gonna have to tease this out, and I don't want to get too bogged down, because we'll just recreate all of the headache that OpenStack gave us. Yeah, I think it's nothing to do with OpenStack, Jeff. It's just about the VNF or CNF being a very fat instance and spanning across sockets. But you're right, that kind of low-level detail NSM shouldn't worry about; that's why I was also indicating that maybe we can solve it under the v-switch layer if complexity is needed there, not that I'm saying we should go and solve all these issues. So, this has actually been super interesting and useful. I do want to move on, because we've got some other things on the agenda. The one comment I'll make in closing is that what we've mostly been talking about here is essentially a pod placement problem at the end of the day.
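The memif trade-off just raised can be expressed as a check over a placement: any shared-memory link whose two pods land in different NUMA zones ends up crossing the inter-socket bus anyway. Pod names, links, and the placement are invented for the example:

```python
# Illustration of the memif co-location question: find shared-memory
# links whose endpoints were placed in different NUMA zones, since those
# "internal" links cross the QPI. All names here are hypothetical.
def crossing_links(memif_links, placement):
    """Return memif links whose endpoints sit on different NUMA zones."""
    return [(a, b) for a, b in memif_links if placement[a] != placement[b]]

links = [("firewall", "router"), ("router", "nat")]
placement = {"firewall": 0, "router": 0, "nat": 1}
bad = crossing_links(links, placement)
```

A scheduler that wanted to honor memif locality would have to keep `bad` empty for each connected group of pods, which is the co-location constraint the discussion is weighing against simplicity.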
And the good news is, Network Service Mesh, that's not what it actually does. So we have an interest as a community in how this gets solved in SIG Node and the resource management working group, and I would highly encourage folks to participate in those spaces. We certainly care, but it's not specifically our problem to solve. It's a problem we very much need to have solved, but it's not going to get solved in NSM. So one request, Ed. The NSE is a very common hot topic for the upcoming ODL demo, for our next talks and, you know, the ONS panels, because there's always, I mean, the NSE means ONAP interaction, and I also see the OpenStack VM interaction, right? Can you talk about that? So I think what you're saying is: could we reorder the agenda priorities so that we talk about the NSE now, because we're running out of time? Yeah, much appreciated. We have some slides too; we had some good discussion on the use case call yesterday and we have some updated slides, actually. Yeah, if everybody else is okay, I'm happy to do that. Does anyone have something else on the agenda they'd like to see at higher priority than that? I think this is good. I would just like to say one sentence: for the upcoming release, the dates stayed the same, so end of April, April 23rd, as we planned. I am going to do a PR to update the site with a proposed table to reflect that. Yeah, if you could do that. And if you could just insert the salient dates in the meeting notes, because I don't remember having them. Yeah, yeah, I will. I don't remember what they were, which is part of why I asked. Okay, let's go to the next one. Many thanks. Cool, would you like to share, Ramki? Yes, thank you, Ed. Can you see my screen? Yes. Excellent. So in yesterday's use case call, we had a very good discussion.
What Nikolay pointed out was that Ed and several folks had kicked off a very nice document on the E-NSM. So what Prem and I did was sort of take it and then summarize some of our key discussion points around the E-NSM. That's fabulous. Steal from me, steal everything from me. That's exactly what we're doing. Oh, thanks. Yeah, thank you. So I'll get straight to the E-NSM here. So basically the nice picture here... So just to add to Ramki, it is the need of the hour, Ed. We are struggling a bit with it also; that's one of the reasons here. Okay. Yeah, sorry, Ramki. So the key message here is the E-NSM is a gateway or external controller function, as depicted very nicely here. On the northbound side, it speaks the E-NSM protocol; on the southbound, it speaks these equipment-specific APIs. That's what the E-NSM is about. Exactly. It speaks whatever the hell it needs to. It's not our business. Yeah. So now, what we did was essentially take this and then map out certain specific scenarios, and we'll walk through them. We wanted to start small, not blow it up, and drive through examples. So let's take one case: SR-IOV, a unique VLAN per VF, so complete hardware slicing, right? Where essentially what we're really saying is, in this topology the hardware port is already nailed to a specific node and PNF. Basically the port is known, where traffic is coming from is known, everything is predetermined. And what happens is, the key to note is, the E-NSM exposes one and only one endpoint, right? Thanks. And if you look at the control flow, NSM requests the E-NSM, and the E-NSM assigns the VLAN ID. In this case, you're still talking VLAN. Very, very simple. Yep. And in this example, we said, let's let it assign VLAN ID 100, and VLAN ID 100 gets programmed into the data plane on both sides, right? On the NSM side, on the port, and also on the PNF, right? Yep. So far so good.
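The simple SR-IOV case just described can be sketched roughly as follows. This is an illustrative model only, assuming an E-NSM that hands out one unique VLAN ID per VF request; the class and method names (`ENSM`, `request_connection`) are invented for this sketch, not real NSM APIs.

```python
# Hypothetical sketch of the simple case above: the E-NSM exposes a single
# endpoint and hands out one unique VLAN ID per VF / connection request.
# All names here are illustrative, not actual NSM interfaces.
class ENSM:
    def __init__(self, vlan_range=range(100, 4095)):
        self._free = list(vlan_range)
        self._assigned = {}          # request id -> VLAN ID

    def request_connection(self, request_id):
        vlan_id = self._free.pop(0)  # pick the next free VLAN ID
        self._assigned[request_id] = vlan_id
        # In the scenario above, this VLAN would now be programmed into the
        # data plane on both sides: the NSM port and the PNF.
        return vlan_id

ensm = ENSM()
print(ensm.request_connection("vf-0"))   # first request gets VLAN 100
```

The point of the sketch is just the control flow: one endpoint, one allocator, and the same VLAN ID pushed to both ends of the hardware slice.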
Very simple. So now enters a little more complex scenario. Uh-huh. Here, what we're saying is, there is no SR-IOV, right? Correct. Interesting one, no SR-IOV. Of course, the hardware port is still nailed to the node and PNF. And the same thing: the key is the E-NSM still exposes only one endpoint. So basically, as you can see, you have two functions to deal with on the E-NSM side, the gateway and the PNF, right? Okay. But as far as the E-NSM goes, it still exposes one and only one endpoint. So the first step here is essentially establishing a tunnel between the port and the gateway, right? Where the E-NSM assigns that VXLAN VNI, right? For the tunnel, right? That's the next segment. So what you do then is, at the end of it, you create this tunnel between the port and the gateway, right? Correct. And as far as the port goes, it has no idea that it's connecting to the gateway, by the way, right? It's all completely abstracted. You're connecting to something. Exactly. It has no idea what's happening. It doesn't even know there's a PNF on the other end. That is correct. Yes, yes. In fact, yeah, we can take this picture and maybe put a whole box around this; I think that will send a better message. I'll do that. I think that may be a good idea. And there are some boxes in some of the more recent slides that are sort of nicely color coded: this is the network service stuff. And it may even be the case that your gateway box here lives inside the physical network function in some cases, right? All it's really doing is terminating the tunnel, correct? That is correct, exactly. But yeah, I thought it would be good to depict a disaggregated scenario and put a box around this, right? Yeah, we'll do it. Yep. Communicating is hard. So, next. Once you have done the VXLAN creation, now comes the second part, right? It's not over yet. Now you are using that VXLAN tunnel to signal the VLAN ID through it, right?
More interesting. So basically what you're doing is you're still talking to the E-NSM for all of this. So now, the E-NSM assigns the VLAN ID. Remember, you fixed the VNI; you know which tunnel you're going over, right? Now we're generating a VLAN ID, 100, for that VNI, right? That's what the E-NSM gives out. And interestingly enough, you have to remember that, hey, from here you're sending VLAN ID 100 on VNI 1000, and it goes here, right? And then the gateway could do any translation. You have no idea what it could translate to. None of our business at all, agreed. Correct. But I thought it was very cool to show the end to end, to show a real deployment scenario, right? Yep, no, I get that. One thing to keep in mind as we look at this, and this is something that's super counterintuitive, because we're used to thinking about things like VNIs particularly as point-to-multipoint concepts, and so they end up being quite a bit scarcer. The truth of the matter is, let's bounce back to slide five really quickly, because it's easier to explain there, it's a simpler slide: that original tunnel is actually not parameterized by a VNI alone. That tunnel is parameterized by a source IP, a destination IP, and a VNI. Those are the three parameters that uniquely specify that tunnel. Yep. And the reason this is super interesting is that the space of VNIs from which you must select a VNI is not the global space of VNIs. It's the space of VNIs between that source and destination IP. That's it. And that ends up making the problem much simpler, because the multiplicity of things you have available is enormously larger. And so the likelihood that you would have to do something like add additional layers, like VLAN tags along that tunnel, is much smaller. Not saying it doesn't happen; I'm sure there will be cases where it happens, but the likelihood is much smaller.
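Ed's point about per-pair VNI spaces can be illustrated with a small sketch. Assuming (as stated above) that a tunnel is keyed by (source IP, destination IP, VNI), a VNI only has to be unique between one pair of endpoints, so the same VNI value can be reused freely between different pairs. The `VNIAllocator` name and layout here are hypothetical:

```python
# Illustrative sketch: a VXLAN tunnel is uniquely identified by
# (source IP, destination IP, VNI), so a VNI only has to be unique
# between a given pair of endpoints, not globally.
class VNIAllocator:
    MAX_VNI = (1 << 24) - 1          # VNI is a 24-bit field

    def __init__(self):
        self._used = {}              # (src_ip, dst_ip) -> set of VNIs in use

    def allocate(self, src_ip, dst_ip):
        used = self._used.setdefault((src_ip, dst_ip), set())
        vni = next(v for v in range(1, self.MAX_VNI + 1) if v not in used)
        used.add(vni)
        return vni

alloc = VNIAllocator()
a = alloc.allocate("10.0.0.1", "10.0.0.2")   # first VNI for this pair
b = alloc.allocate("10.0.0.1", "10.0.0.3")   # a different pair can reuse it
print(a, b)
```

Because each IP pair gets its own 24-bit space, exhaustion (and hence the need for extra layers like VLAN tags inside the tunnel) becomes far less likely, which is exactly the simplification described above.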
In fact, Ed, thinking through this, another question came to my mind. So that was one of the related topics. The key is, this is very akin to the MPLS label allocation strategy. If you go back to MPLS, basically the downstream router will assign the label, very similar, right? Now in that scenario, in the MPLS world, what happens is each router is full-blown HA, right? So you can be assured that it's highly available, because, imagine, very simply, you're talking to the gateway, which is a router, as an example, right? And it's got HA functionality built in, right? So for example, if one control plane unit goes down, then you have the backup control plane to make sure that things are stable, from sort of a control plane label exchange, or in this case, ID exchange, perspective. But if you come here, what happens is, if you're letting the node, like you said, to your point, assign the labels, basically it's all of local significance. But the question is, what if that node goes down? I mean, we have HA, but of course at a global Kubernetes cluster level, not at a node level, right? So how do we handle such scenarios, right? Well, so think of it this way. Effectively, the way we've handled it today, the current resiliency story is: if I'm a network service client and I have a connection that takes me somewhere, right? I don't really know what happens inside that connection; I just shove packets in and they come out at the other end, and vice versa. If the network service endpoint that I am talking to goes away, the particular instance goes away, what network service mesh will do today is it will attempt to auto heal.
And what auto healing means is that it attempts to go and establish a connection to a new network service endpoint that provides the same network service and has the same network service selection criteria, in the hopes that things will be more or less okay. Now, when I say more or less okay, let's be super clear: if the network service endpoint is stateful, something like a stateful firewall, unless the guy who wrote the stateful firewall wrote it so that state could be shared among replicas, you're going to have some state loss there. But network service mesh will auto heal those connections to whatever the network service endpoint is. So there is resiliency built into the system there, definitely. So the point to note here is the ID itself, right? Let's go to this very simple example, even simpler. Basically, the IDs, you have some running ID generation here, 100, 101, et cetera, right? And then this function goes down, correct? So this is the one doling out IDs, the one managing these IDs, right? Correct, this one will own an ID space. So now this node connects to some other node and moves ahead, right? Then that one will start giving out IDs from a different space, right? Not necessarily these numbers, correct? Yes, yes. Go ahead. So there, it's just that we probably have to work out the scenarios, how this is all going to line up, right? Basically, like, hey, this one gave out, say, 100 to 200, and now that region is unusable, right? Because that function, that node, died, and now it connects to a new node, and we have to see where that new range starts, right? Which may not be the same. Let me ask the simple question here. In this picture, who do you think is assigning the VLAN ID? The E-NSM is assigning the VLAN ID. Okay, good. That's my understanding as well.
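The auto-heal behavior just described could be sketched roughly like this. The registry layout and the `auto_heal` function are invented for illustration, not the real NSM implementation; the real system tracks connections and selection criteria, but the core idea is the same: pick another endpoint offering the same network service.

```python
# Rough sketch of auto-healing as described above: when the endpoint backing a
# connection disappears, pick another endpoint that provides the same network
# service and re-establish the cross-connect. All names are hypothetical.
def auto_heal(registry, service_name, failed_endpoint):
    """Return a replacement endpoint for service_name, or None."""
    candidates = [ep for ep in registry.get(service_name, [])
                  if ep != failed_endpoint]
    if not candidates:
        return None                  # no replica left: connection stays down
    # Note: if the endpoint was stateful (e.g. a stateful firewall) and state
    # is not shared among replicas, some state is lost on this switch-over.
    return candidates[0]

registry = {"secure-intranet": ["nse-1", "nse-2"]}
print(auto_heal(registry, "secure-intranet", "nse-1"))   # falls over to nse-2
```

The caveat in the comment mirrors the point made above: healing reconnects the plumbing, but it cannot recover endpoint-internal state unless the endpoint author made that state shareable.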
It's always good when we share the same understanding. And you're saying when the E-NSM goes down, then the next place that someone tries to connect to may be a different E-NSM, which may have a different set of VLAN IDs that it's allowed to manage within this context. That's correct, yes, exactly. It may be a completely different range, yes. Yeah, and the way the current auto healing would work is essentially the pod still has its kernel interface; the cross-connect to the tunnel would just change, right? So as long as it is the case that the other E-NSM does whatever has to happen to the physical network, such that now, let's say, VLAN ID 200 is assigned and goes to the same or an equivalent physical network function, the pod never sees any of that. The pod literally never sees the VLAN ID, right? It doesn't know. Yeah, ideally it should never see a VLAN ID. It should just deal with the V-switch locally. Yep. I do apologize, we're running up against the top of the hour. Yeah, so what I'm saying is, it can get a little tricky depending on different scenarios, right? I mean, in some there may be a V-switch, and there are SR-IOV cases; these are details you have to really work out, how this is all gonna come together. Because typically, how this is all handled is through some global ID management, right? But this is basically taking into account all the policies. If you recall, we were discussing the use cases, right? In some cases, you're assuming, hey, these are all point-to-point; in some it's slightly different from that. Yeah, so Ramki, is your proposal that the downstream guys should allocate, and that's a better model? And it kind of takes a page out of MPLS? So I would also love to think slightly beyond just a simple downstream allocation, because the point to note is, no matter what we do, right?
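The failover concern raised here can be modeled with a small sketch: each E-NSM instance owns its own disjoint VLAN ID range, so after a failover the replacement instance allocates from a different space and the original range is simply abandoned. The `VlanRangeAllocator` class is a hypothetical illustration, not a real NSM interface.

```python
# Sketch of the failover scenario discussed above: each E-NSM instance manages
# its own disjoint VLAN ID range, so after a failover the replacement instance
# allocates from a different space and the old range becomes unusable.
class VlanRangeAllocator:
    def __init__(self, start, end):
        self._next, self._end = start, end

    def allocate(self):
        if self._next > self._end:
            raise RuntimeError("VLAN range exhausted")
        vlan_id, self._next = self._next, self._next + 1
        return vlan_id

primary = VlanRangeAllocator(100, 199)   # original E-NSM's range
backup = VlanRangeAllocator(200, 299)    # replacement E-NSM's range

print(primary.allocate())    # 100, handed out before the failure
# ... the primary goes down; the connection heals against the backup
# instance, which assigns from an entirely different range:
print(backup.allocate())     # 200
```

Because the pod never sees the VLAN ID, the range change is invisible to it, as noted above; the open question the speakers raise is how these disjoint ranges get coordinated cluster-wide under the various isolation policies.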
Even if we do a point-to-point allocation, we are reserving a VNI, sort of from a node V-switch perspective, right? And we have to see the global effect. At least to my mind, I don't think we have worked out all the scenarios: when things go down, how the ID management is gonna happen across the board. We're gonna have to cut it short at the moment, because we're already five minutes over and we're holding people back. So can you add this to the meeting notes for next week, and we can continue the discussion? Yep. Cool, thank you so much. Thank you very much. These are super good things for you to bring up, Ramki. They are things we need to walk through. I think at the end of the day, they don't end up being tricky; it's just that the things that make them simple are super unfamiliar. And so it's really important to talk through them and make sure we've got them nailed down. No, exactly. That's what I said, because you're getting to the next level of detail on implementation, right? I want to make sure this is all fully understood and nailed down, including what the different strategies are for handling these complex policies around isolation and all those. Cool. Thanks everyone, see you all next week, same time. Thank you, bye.