Hey guys, hi Ed. Hey, just as well I muted myself. Yeah, Zoom is muted by default, which is exactly the right answer — it should be muted by default. I love that laugh. Hey, hi guys. Yep. Hi Nikolai, hi Frederick. Good morning. New folks joining — hi, Denver. Yeah, it's good to have this open.

Would somebody be willing to share the issues board? Let me see if I can find it. Can everyone see the issues board? Yep. Yep. Awesome. OK, my experience has been that it's useful to work this board backwards, because you sometimes discover that things need to move from one column to another, and if you start with the later columns and move towards the earlier columns, you don't end up repeating yourself as much.

Should we start working backwards? We've got a bunch of stuff that landed in the past week — a bunch of PRs have gone in. We've seen things like the setmac kernel chain element land, setkernel, et cetera. Anything anyone would like to particularly highlight in that grouping? OK, shall we jump into the in-progress column?

So, create getmac network service chain element. I think this went in this morning — is that correct, Denise? Yes, it is merged. Excellent. This was the piece to make sure that we could fetch the MAC address and populate the destination MAC when a request comes into a server, so that's very useful.

Andre, I know that you started poking at the create metrics network service chain element — still in progress. We arrived at an interesting discussion about exactly where in the API that should go, and I think you were queuing up some stuff to discuss in the community meeting. Yeah, at the moment I'm still experimenting with it. I think putting it into the path is not quite right, because the amount of data could be quite large. I want to have a structure similar to the path, but for the metrics, and just send it with an update event as a separate map. OK. Very similar to the connections, but just with the metrics. And right now I'm thinking about the propagation and proper handling in the chain elements. OK. So the interesting thing is to limit, with an interval, how often the client can receive metrics. The client will be able to configure the interval for receiving metrics, and if the chain is quite long, every item will send metrics at the configured interval. So even if metrics arrive more often, or the clocks on different endpoints don't quite line up, the chain won't send more than the configured amount of metrics.

So one thing I had been thinking — and this may or may not be the right answer — is that because we are periodically refreshing the connections so as to avoid timing them out, effectively when I send the connection through the system, this bumps the monitor as it goes through, since the thing has changed. So if we attach the metrics to the path segments in the connection, a client can basically pump for more metrics at will. You can essentially say: send the request through, the request comes back, it gets a metric on the request, the monitor gets updated, et cetera. But passing the whole connection with all the tokens back more often could lead us to a huge amount of data transfer. No, and that's definitely something we want to think about and consider.

Now, the other thing I have to ask is: what are we likely to actually have in terms of metrics here? For example, I think because we're dealing with virtual wires, the meaningful metrics are probably going to be something like packets received, packets transmitted, bytes received, bytes transmitted, and drops — those are probably the interesting ones to let you debug. So it's not a huge amount of data; the overhead, frankly, of sending it back is a greater amount of data than the actual data itself, because you're packaging it with other information. Yeah, but for the whole chain — for example, if we have multiple endpoints, like a VPN, an endpoint on every application. No, I understand — it's definitely something worth looking at, and I'm glad you're thinking about it. So I'm kind of excited to sort this out and figure out what we want to do there. Anybody else have anything to say on this? Cool. All right, so that's still in progress.
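To make the shape being discussed concrete, here is a minimal sketch in Go of per-path-segment metrics carried as a separate map mirroring the path, plus an interval limiter so a long chain can't emit more than the client-configured amount of updates. Every name here is hypothetical — a sketch of the idea, not the actual NSM API:

```go
package metrics

import (
	"sync"
	"time"
)

// Metrics is a hypothetical per-path-segment metrics map, mirroring the
// path structure but carrying counters instead of tokens. For virtual
// wires the interesting counters are small: rx/tx packets, rx/tx bytes,
// and drops.
type Metrics struct {
	// SegmentMetrics is keyed by path segment name; each value holds
	// counters such as "rx_packets", "tx_bytes", "drops".
	SegmentMetrics map[string]map[string]uint64
}

// intervalLimiter enforces the client-configured reporting interval, so
// that a long chain doesn't amplify the number of metric updates sent.
type intervalLimiter struct {
	mu       sync.Mutex
	interval time.Duration
	lastSent time.Time
}

// allow reports whether enough time has elapsed to send another update.
func (l *intervalLimiter) allow() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if time.Since(l.lastSent) < l.interval {
		return false
	}
	l.lastSent = time.Now()
	return true
}
```

A monitoring chain element along these lines would consult allow() before emitting an update event carrying the Metrics map, which is what bounds the data volume regardless of chain length.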
Migrating the kernel forwarder to the new-style cross-connect network service stuff. Rostislav, I know you were looking at this, and so is Ryan Tidwell, who I don't think is here. How is that going? Yeah, I saw his initial work, and actually I'm really thankful for your feedback on improving that. Yeah, I'm sure it was not the best, so. There are two things happening here. One is, we're trying to maximize code reuse, which is generally a good thing. And then we're also shifting the style with which we're doing a lot of these things, and trying to keep things very sharp and crisp and so forth. But yeah, it was good to see some forward motion on that, because I know from talking to Ryan and to Prem — and I don't remember whether I've had this conversation with you as well — that a certain amount of the kernel SDK will be recycled into the SRIOV piece. For example, the same piece that you use to set the IP addresses on a kernel interface that happens to be part of the veth pair is going to be the same code you'll probably use for setting up an SRIOV-backed kernel interface. Yeah, indeed. And also the neighbors and everything else. Yeah — the neighbors, routes, all that stuff, right? Yeah, yeah. You'll be pulling that forward. So I think the faster we get that going in the kernel forwarder, the better off we're going to be. And the good news is it breaks into small pieces, so you can work on individual small chunks of it. Yeah, yeah, I agree. Cool. Awesome.

Create authorization network service chain elements. I've seen some stuff go by from Ilya on this, but I'm not sure quite where we are with all the pieces. I know that Frederick had some comments. So how is this going — where do we stand? Do we have Ilya this week, or is he out? He was working today, but I haven't seen him connect. Okay, cool. So if somebody wants to poke him over Slack or whatever, we can loop back if he turns up. Yeah. Cool.

WireGuard remote mechanism support. Frederick and Artem, I think you guys have been kibitzing on this — do you have things you'd like to say? Yes, I'm almost finished with WireGuard on the kernel forwarder, and we need to discuss implementing WireGuard in VPP, if it's required. Yeah, I mean, that's really a question of what folks want to do. But it would be interesting to add it to VPP as well. Yeah, we can make a plugin for VPP and use it just for network service mesh. Maybe we can make a plugin for the VPP agent instead of VPP. Well, I think it ends up being two steps.
One is, if you want to run the data path through VPP proper, VPP itself has plugins, and then you need to get the stuff up into the VPP agent so that we can actually conveniently poke at it. As I know, the VPP agent has some kernel stuff inside already, not using VPP directly. So if we have the kernel piece already, maybe we can reuse the same code through the VPP agent. I think, though, that most of the stuff they're doing there is in the service of dealing with the things that are plugging into VPP, so I don't know how thematically interesting that would be to them.

So this breaks up into two questions. The first is: where do you want the data path passing through? The work that Artem is currently doing is getting WireGuard working on top of the normal kernel data path. And then the question becomes: do you want to be able to run that data path through VPP? And if you do, you need a WireGuard plugin for VPP so that you can actually handle the encap there. Do you have any comments on this, Frederick? You've been very quiet — possibly muted. Definitely muted. Yeah, sorry, I was juggling multiple things — can you give me a little more context so I can respond properly? Yeah, so we were just conferring a bit about the WireGuard stuff, and Artem had said that he's got it working through the kernel data path, which is awesome. And please note that you can work things through whatever data path makes sense for your forwarder; you don't have to work through VPP. But the question of whether and how to work it through VPP had come up, and Artem had opined that if we wanted a WireGuard data path going through VPP, we would need a VPP plugin for WireGuard.

I see. Yeah, so that's one possibility. A second possibility is that we could create something that interacts through something like memif, and that separates it out from VPP — anything that supports memif can make use of the path without having to drag VPP in as well. In terms of which path is more appropriate, I'm personally okay with either, but my recommendation would be to take a look at the overall complexity of both. Like, I don't know how complex this is to set up in VPP.

So, quick question: is WireGuard constructing the packets itself and using raw sockets to push them down to a kernel interface? Could you repeat the question? Is WireGuard constructing the packets itself and pushing them down to the kernel interface? I guess it's not clear to me, in the case of the kernel data path for WireGuard, how that mechanism works. Yeah — WireGuard first makes a handshake with the endpoint, and then it encapsulates packets by itself. I think VXLAN works almost the same, as I understand it, just without encryption and without a handshake. Okay, so let's talk a little about the life of a packet. I'm an application, and I grab an interface and open a listening socket, listening for TCP. I presume that interface somehow hands off to WireGuard so that WireGuard can do the encap. It's doing the encap and then sends the packet using UDP.
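For reference, the kernel-side setup being described can be driven from Go. Below is a sketch assuming the wgctrl library (golang.zx2c4.com/wgctrl); it assumes the wg0 interface already exists (created via ip link, netlink, or a userspace implementation such as boringtun), and the keys, names, and addresses are purely illustrative:

```go
package main

import (
	"log"
	"net"

	"golang.zx2c4.com/wgctrl"
	"golang.zx2c4.com/wgctrl/wgtypes"
)

func main() {
	// Open a client for the kernel (or userspace) WireGuard devices.
	client, err := wgctrl.New()
	if err != nil {
		log.Fatalf("wgctrl: %v", err)
	}
	defer client.Close()

	key, err := wgtypes.GeneratePrivateKey()
	if err != nil {
		log.Fatalf("key: %v", err)
	}
	// In reality the peer's public key arrives via the NSM request;
	// here we just fabricate one for the sketch.
	peerPriv, _ := wgtypes.GeneratePrivateKey()
	peerKey := peerPriv.PublicKey()

	port := 51820
	_, allowed, _ := net.ParseCIDR("10.0.0.2/32") // traffic routed to this peer

	// Configure the (already created) interface: private key, listen
	// port, and one peer whose encapsulated UDP packets go to
	// 192.0.2.1:51820.
	err = client.ConfigureDevice("wg0", wgtypes.Config{
		PrivateKey: &key,
		ListenPort: &port,
		Peers: []wgtypes.PeerConfig{{
			PublicKey:  peerKey,
			Endpoint:   &net.UDPAddr{IP: net.ParseIP("192.0.2.1"), Port: 51820},
			AllowedIPs: []net.IPNet{*allowed},
		}},
	})
	if err != nil {
		log.Fatalf("configure: %v", err)
	}
}
```

This matches the life-of-a-packet description above: packets leaving wg0 are encrypted by the module and re-injected as UDP out of the namespace's main interface.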
Okay, so yeah, you're right, Frederick — it might be a better intermediate step to try to send that using memif, because we can then reuse the stuff the WireGuard folks have got in terms of building the packet. Yeah, and to get the mechanics a little tighter in terms of how it works on the kernel side. So you end up with a particular interface: when you create a new WireGuard device, it tends to create a new interface within the network namespace that you're currently active in. And when you send a packet out of that interface, the kernel module will create the UDP packet and then inject it back into the network namespace, and it leaves out of your main interface within that particular network namespace. So it definitely creates the packets. One of the questions then becomes: how do we capture the packet in that particular respect? Or, longer term, my hope is that we could use something like boringtun or one of the other userspace WireGuard implementations that exist, and just tweak it so that we're using the kernel interface but generating the packets using the mechanism of our choice. Okay, cool. So it sounds like there's a lot of interesting work going on here. I'm super interested to see where the first piece lands. Cool. Awesome.

Can I suggest something? Please. Would it make sense to have a dedicated WireGuard forwarder? At least initially, I think that's probably a smart move. Does that make sense? This way you can combine it with either VPP or with kernel and just install both of them. I know that today we are not the best at supporting multiple forwarders, but if a client or a server or endpoint suggests or needs a WireGuard interface mechanism, then the WireGuard forwarder will be the only one supporting it, so it will get automatically selected, I think. Yeah. So I think that's definitely a smart way to start, right? Because then you just deal with the simplest route between A and B to get to WireGuard, and once we've got the simple thing working, you can decide what makes sense. Because there are two sorts of pressures in the system, in my mind. The first is that a WireGuard forwarder is the simplest way to get something working with WireGuard. And the second is that I think users are going to want to run as few forwarders as they can get away with to solve their problem, so you want to be able to reduce the number of forwarders they run. But I would say the first pressure — being able to just get A to B over WireGuard working — is the one that's strongest at this stage. And we can always come back and incorporate it into existing forwarders, particularly with the way the SDK model is being rolled out, where you've got small chunks of things. Does that make sense? Yeah, of course. Yes, I think that's actually a very interesting point. Yeah, I've got a couple other paths in mind on this as well, but perhaps what I should do is stick the variety of approaches I can think of in a Google document, so we don't eat up all the time here, and we can pass it around and solicit feedback. That's an excellent point, because we've only got 10 minutes left on this call, so we should probably get moving.

So: add NSM test suites to reduce cluster count on CI. Do you want to say something about this, Denise? Oh yes, this PR will reduce CI time and reduce the cluster count. It will make it possible to create test suites which reuse the NSM and forwarder pods across a set of tests, so we save time on deploying and cleaning up the NSM and forwarder pods. I remember, yes — it's about 15 seconds per test, which across several hundred tests is a big deal. Yeah, I remember this now: basically, right now every time we run a test we're standing up network service mesh and tearing it down, and with this we would have suites where we stand up network service mesh once, run a bunch of tests on it, and then tear it down. Oh, yes — currently this issue is blocked; I'm just waiting for a new release. Andre, how's that going? Yeah, yeah, we'll try to do it as soon as possible. I just wanted to do some cleanups for the cloud testing too. No, that's fair, that's fair. Okay, cool.
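The suite idea is straightforward in Go: stand the infrastructure up once per suite instead of once per test. A minimal sketch using the testify suite package — the deployNSM/deleteNSM helpers are hypothetical stand-ins for whatever actually deploys the NSM and forwarder pods:

```go
package tests

import (
	"testing"

	"github.com/stretchr/testify/suite"
)

// nsmSuite deploys NSM and forwarder pods once; every test in the
// suite reuses them, saving the per-test setup/teardown cost.
type nsmSuite struct {
	suite.Suite
}

// SetupSuite runs once, before all tests in the suite.
func (s *nsmSuite) SetupSuite() {
	s.Require().NoError(deployNSM()) // hypothetical helper
}

// TearDownSuite runs once, after all tests in the suite.
func (s *nsmSuite) TearDownSuite() {
	s.Require().NoError(deleteNSM()) // hypothetical helper
}

func (s *nsmSuite) TestSimpleConnection() { /* uses the shared pods */ }
func (s *nsmSuite) TestHealAfterRestart() { /* uses the shared pods */ }

func TestNSMSuite(t *testing.T) {
	suite.Run(t, new(nsmSuite))
}

func deployNSM() error { return nil } // placeholder for real deployment
func deleteNSM() error { return nil } // placeholder for real cleanup
```

At roughly 15 seconds of setup saved per test, grouping a few hundred tests into shared-infrastructure suites is where the CI time and cluster-count reduction comes from.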
Adding the SRIOV mechanism. This is where we've already got a PR starting to land from Xemic, where he's bringing the SRIOV kernel and user-space mechanisms into the API. I know we had a really good back-and-forth conversation there, Nikolai. Where I left it, it looked like you might be okay, but it wasn't clear. Yeah, okay, I will have to check again. What I wanted to add: this is the very first PR where we actually also had the very first heated discussion about it. I still somehow feel that SRIOV is not the right naming for the mechanisms there. Okay, no, that's a very valid point, and I want to get that settled. That's why I turned back and asked — we've had conversations about some of the big issues you raised, and I had lost track of the naming issue, even though I think I agree with you that it's probably a good one. Yeah, I haven't seen any further comments there. Okay, so we can take that conversation back to the PR. In general, I know Rostislav, you've been looking at some of this as well, and Ryan, moving some of the pieces forward. Any progress on the stuff we discussed earlier on the kernel bits from you guys? Nothing on my side. Yeah, I don't hear any remarks on that. Cool.

So we had a PR come through from Piotr on initialisms. This basically came down to his suggestion that we alter, a little bit, how the API handles capitalization for things that are initialisms. The one he pointed to was netNS — he was suggesting capitalizing both the N and the S, as they stand for namespace. It's an interesting conversation, and we probably want to sort out how we want to handle that. And we've already talked about the metrics stuff.

So, moving on to the to-dos. Is there anything in progress that didn't land here? Okay. First to-do: examples to test OPA use cases. This is stuff I think Ilya is working on — Frederick, is that correct? I believe so. Okay.

Then: create the network service registry client chain element to add the pod name, node name, and possibly cluster name labels to registrations. This is actually a great starter issue: it's not particularly hard, but it would involve writing a little chain element that we could stick in clients to add these labels. So, is anybody interested in picking this up and beating on it? We can bring it up in the next meeting as well. Yep, agreed.
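For a sense of scale, a chain element like that starter issue might look roughly like the sketch below: a registry client that injects pod/node/cluster labels (typically from Kubernetes downward-API environment variables) before delegating to the next element. The types here are simplified stand-ins, not the actual SDK registry API:

```go
package addlabels

import (
	"context"
	"os"
)

// Registration and RegistryClient are simplified stand-ins for the
// real registry types in the NSM SDK.
type Registration struct {
	Labels map[string]string
}

type RegistryClient interface {
	Register(ctx context.Context, r *Registration) (*Registration, error)
}

// labelsClient is a chain element that adds pod/node/cluster labels to
// every registration, then delegates to the next element in the chain.
type labelsClient struct {
	next RegistryClient
}

func NewClient(next RegistryClient) RegistryClient {
	return &labelsClient{next: next}
}

func (c *labelsClient) Register(ctx context.Context, r *Registration) (*Registration, error) {
	if r.Labels == nil {
		r.Labels = map[string]string{}
	}
	// POD_NAME and NODE_NAME are assumed to be injected via the
	// Kubernetes downward API; CLUSTER_NAME is deployment-specific.
	for key, env := range map[string]string{
		"podName":     "POD_NAME",
		"nodeName":    "NODE_NAME",
		"clusterName": "CLUSTER_NAME",
	} {
		if v := os.Getenv(env); v != "" {
			r.Labels[key] = v
		}
	}
	return c.next.Register(ctx, r)
}
```

The whole job fits in one small, composable element, which is what makes it a good first issue.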
This next one is very similar, and also a good first issue: create authorization monitor chain element. This is basically saying we should probably do authorization for monitoring, which is important. And then I realized that we don't currently have the core elements for monitoring that would let us chain them easily, which is probably something we're going to want. So that issue is there.

Porting the SRv6 mechanism. I know we finally landed the SRv6 stuff in the mono repo; it would be good to eventually get those pieces ported over as chain elements in SDK VPP agent. Okay. And then: package core adapters can't be used with package next. This was a really smart catch, Denise. I think we've merged the solution to this — is that correct? Yes, it is correct. Cool, so I'll go ahead and close that issue. And then you also had brought up, and we wanted to talk about, how we're going to handle integration tests. Oh, I have closed this issue, because I think the repo pipelining will solve this problem. Cool. You raised an important point; I wanted to make sure it didn't get lost. So, anything else we're missing on the board here?

One thing I do want to be very, very clear about: it turns out that GitHub has just added a new level of repo access that they're calling triage, which is lovely, because it means people can be added to the community who can assign things to projects, and assign themselves and others to bugs, and that kind of stuff. I think most or all of the folks on this call who are actively working on stuff have been invited to join the Network Service Mesh contributors team, which has those privileges across the repos. So if you have gotten such an invite and haven't responded, it would be good for you to respond. If you haven't gotten such an invite and you're working on things, please let me know — I may have missed some folks. That way you can make sure, for example, that your stuff lands on the board in a relatively straightforward way: if you go to an issue, you can just click and add it to the issue/PR tracking project. That way it all bubbles up to visibility. Cool, any other questions?

It sounds like we're about to start the next meeting. All right, so we'll carry over and allow a few minutes for folks to join. Fantastic. I'll use the next five minutes to make coffee. For folks just joining the call: usually we start about five minutes after the hour to give people time to join. If you could please add yourself to the attendees list, that would be fantastic. No, that's fine. A reminder to everyone that this meeting is recorded every week and posted to YouTube, so do keep that in mind and be a little cautious with what you share, et cetera. And speaking of which, is someone willing to share the weekly meeting notes? That would be super helpful. Yeah, we'll get started in another two to three minutes.

Yeah, I always enjoy the backdrop when I see the giant NSM behind people. For those of you who don't know how to do that: in the Zoom desktop version there's an option to set a backdrop, and you can load an image of your choice. So feel free to do that — even if you choose not to use NSM, you can pick your own favorite backdrop.

Okay, let's get started. So welcome to the next Network Service Mesh meeting. We hold this particular meeting every Tuesday at 8 a.m. Pacific time. We also hold an Asia-friendly meeting every other week.
Nikolai, do we have one this week, or is it next week? In my schedule there isn't one — yeah, it's the first and third, so not this week, next week. Sorry — not the SIG Network one, the NSM Asia-friendly one. So we had one last week, so there must be one next week then. Yes, there is, of course — it's every other week. Yes? Yeah, sorry, I was looking at the wrong one, my apologies. So there will be one next week at, I believe, 3 a.m. Pacific time.

We also participate in the CNCF Telecom User Group. The next one will be on February 3rd at 8 a.m. Pacific time — it's every first Monday, and on every third Monday there's also one at 3 a.m. Pacific time. The Zoom is linked from here. The CNCF SIG Network has been rebooted; it occurs on every first and third Thursday of the month at 11 a.m. Pacific time, and we participate there as well. Now, the CNCF SIG Network is a little bit interesting in this scenario, because the Technical Oversight Committee is starting to delegate the analysis of projects — especially inbound sandbox projects — to the related SIGs. And so SIG Network is one area where it would be good to see a diversity of people and ideas, so that we get a maximum amount of input. That has not started yet, but it's expected to happen soon. Actually, last week — or actually the week before that — we reviewed two projects; there were two projects presented at SIG Network, and that was the main topic there. Oh, cool — I'll go and find the recordings. Yeah, I mean, this has started to happen, and it's really interesting to watch. At first it sounds like, okay, NSM doesn't have anything to do with that, but I believe that we as a community are trying to address a more general networking problem, and it's really interesting to be on top of what's actually being proposed, what's happening, what projects are coming to the CNCF. So I think it's very relevant to what we're trying to do here.

Cool — so, major events coming up. In San Francisco, at Go SF, I am going to be giving a talk on cloud-native zero trust. The talk is going to be on how we can use a variety of CNCF projects together — such as Network Service Mesh, SPIFFE and SPIRE, Open Policy Agent, and a few others — to achieve zero trust. And I'll also be talking about not only what works, but where the gaps are as well; there are certainly some gaps that need to be filled. So if you are in the San Francisco area in that time period, please come and join me, and I will post more details as I get them. Can I suggest something? If there's a recording — or if there's no recording, whatever materials you have — could you add them to the events page on the website? Because it sounds like a really, really interesting topic, and sharing it is probably worthwhile for us. Sure, I'll ask them if they're recording it. And if not, maybe I can do something myself, where I record my own voice and try to produce something. But we'll see — worst case, I can always do something afterwards and stick it on YouTube. Okay. Cool.

So this is also right before KubeCon and CloudNativeCon Europe, which will run from March 30th to April 2nd at the RAI Amsterdam in the Netherlands. The schedule is going to be announced very soon.
And if you'd like to see the talks that we know of, Taylor has compiled a list that you can go look at; if your talk is not on that list, feel free to add it. We also have an NSMCon, which will be co-located. We are accepting proposals to talk, and there is a sponsorship prospectus that is currently open. So if you'd like to talk or sponsor, please submit — earlier is always better, but of course we won't penalize anyone for showing up later. And with that, we also have a larger room: those of you who were in last year's room, the room capacity is twice as big. Last year we had standing room only, so that'll be good. On the KubeCon side, I know that there's at least one talk accepted, from what I know of the ones that we submitted. I know that we cannot talk about it yet, but yeah — there's at least one submitted and accepted. I think we can save the happy dances till after Wednesday, when the schedule goes out, and there's a lot of hope that things will go quite well. Yeah, because I think that up till now at KubeCon we only get the maintainer-track slots, more or less. And if we have one that is outside the maintainer track — at least one, I hope a couple — then that will be great.

Cool. So we also have Open Networking and Edge Summit North America; the CFP closes on February 3rd, and the schedule will be announced in early March. There is KubeCon and CloudNativeCon China in Shanghai, which will occur in May; the call for papers closes on February 21st, so please make sure not to miss that if you are intending to go to China. I have a small update on both of these. I'm planning to submit something to Open Networking and Edge Summit, and I'm working with Michael, who presented his work here last week. We're also trying to figure out something with Taylor. And for KubeCon China: on our last Asia-friendly call, Jay mentioned that he's going to submit a talk about NSM. I don't know if we'll be able to do a maintainer track there — that's still a month away, who knows, maybe — but at least one talk about NSM will be submitted from the local community there. So yeah, that's it. Nice — I'll also put something forward as well. Los Angeles is real close to where I'm at, so it's pretty easy for me to get to.

We also have Open Networking and Edge Summit in Europe, which is going to occur in Antwerp — I believe in the same location as last year. The call for proposals closes on June 7th, and the notifications go out on July 9th. Finally, we have KubeCon and CloudNativeCon North America in Boston, for which the CFP opens on April 22nd and closes in June. So even though it's a long way away, start thinking about what type of things you would like to see, because KubeCon EU is a great opportunity to meet people who you can potentially present with.

And with that, we have a couple of announcements. We have a new NSM projects page. If you notice, this particular project is an organization-level project, not a repo-level project — that was an awkward thing to say. We are going to keep track of issues and PRs there. We had a public call just before this one: 30 minutes before this call, every Tuesday, we run the issue and PR tracking call.
So if you would like to join in on helping us with issues or PRs, this is a great place to work out what's happening and to initiate contact for help. One of the things that we're going to try to avoid — and we can do a better job of this based on today — is spending time actually solving or fixing issues on that call. It's okay to ask for a little bit of help, but there's a lot of information that needs to get packed into 30 minutes, and it's specifically designed like this so that we don't jump into the weeds.

With that: do we have Lucina on the call? I'll be doing the social media updates from now on, when I'm available to make these calls. Oh, cool — welcome, and please introduce yourself to the community. Sure. Hi, everybody. My name is Ashley. I am working with Folk Cooperative and have been helping out with a bunch of the social media for Network Service Mesh. Thank you so much for that, by the way. You're welcome — no problem, glad to help where I can.

So, as far as updates go: on the Twitter account, we have gained 10 followers since last week, and we are now sitting at 665 followers. We've followed an additional two accounts, and about 19 tweets and retweets have gone out in the last week. A lot of them have been NSMCon-related: reminding and encouraging people to register, reminding them of the CFP deadline coming up, and getting the word out there for sponsorships. There have been some retweets from the CNCF account, further promoting KubeCon, as well as the Day Zero events and the diversity scholarship applications. And there have been a couple of blogs retweeted — from VMware, as well as one on network simulations with Network Service Mesh, a really nice write-up. And then the video recaps from last week's meetings. Those are all on Twitter, and most are up on LinkedIn as well. On LinkedIn, in the last week we've gained an additional 10 followers. So it's been really good to see that we are consistently increasing the following across the board, on Twitter as well as LinkedIn, and we're hoping for further engagement as we start promoting more of the events coming up — NSMCon, as well as any talks accepted for KubeCon. And yeah, hopefully we will just continue to see that following and engagement continue onwards and upwards. If there are any other announcements that need to be made, please feel free to reach out to me, and I will continue, like I said, promoting NSMCon and any podcasts that come up, as well as any future events. Thank you very much for the update, and also thank you very much for the help — this type of stuff helps us tremendously. Sure thing. Cool.

With that, we have a very important topic on the agenda, which is our new repo pipelining. So, Ed, you have the floor. Yeah, let me go ahead and start talking through that a little bit. So we've started, as some of you probably know, breaking some things out of the mono repo. Let's actually go to the slide — next slide. So as we've grown, our mono repo — networkservicemesh/networkservicemesh — has become very large and complex. And it's not entirely obvious off the bat, because if you just count the lines of code in the system, it's actually not that large; but thematically it's doing a lot of different things, and that makes it unwieldy. And the CI for the mono repo is very, very long.
So it encourages people to make larger changes at once, because it takes so long to transit the CI. It overall discourages contribution, because you can come and bring a patch and discover that you've got these long CI cycles and something goes bump in the night for reasons you don't quite understand. And it slows development velocity in general. So these are problematic.

If you look at the current state, we've already started a bit of an experiment. In the mono repo, we've got a CI time on the order of about an hour and 20 minutes. And we've done some initial pipelining. What this means is: we've got an API repo, where we've relocated the API; and we've got an SDK repo, which has the platform-independent SDK bits — there are no dependencies there that are specific to the VPP agent, or weird kernel dependencies; it's just munging things, because it turns out there's a lot of munging we legitimately need to do. And if you look, the CI time for each of these repos when you push a patch is about a minute and 20 seconds — some are a little faster, closer to a minute, but generally that range.

The other thing we've managed to set up — and this has all been done with GitHub Actions — is that if a patch is merged into, say, API, it automatically gets pushed as a PR to SDK. So about 30 seconds after something merges to API, a PR pushing the API change forward to SDK comes in. You can go take a look, and if it passes, you can just merge it. In fact, once we gain some comfort, we can even set up the GitHub Action to auto-merge those if they pass CI — although that's not quite what we're doing yet. And then when you merge something to SDK, including the updates from API, SDK about 30 seconds later automatically pushes PRs downstream to SDK VPP agent and SDK kernel. So just adding up the CI plus transit times, you're looking at about a five-minute-30-second maximum, not counting any review time. If we got to the point where we were comfortable merging clean updates from upstream that pass CI, the transit time through this whole chain for an API change that ends up being harmless could be as little as five minutes 30 seconds. So this is what we've tried so far in terms of pipelining.

I have a couple of questions. Questions are good. So, is this 30 seconds the time for the actions to actually get triggered? No, it's the time for the actions to come up, do their thing, and finish — basically until the PR appears. Okay. So when something is merged to API, that merge causes a GitHub Action to run; that Action has to pull the code for SDK, update its dependencies, and push the PR, and that takes about 30 seconds. Okay. And then the other thing: as you said, today this would be a more or less manual process. So the API creates... Yes. Yeah — I know that because I have peeked at the slides, as people can imagine, and you're probably going to talk about this next. But if we do this automatically, then how are we going to trace back? For example, there's an API change, it automatically goes to SDK, then automatically goes to SDK VPP agent, but then something breaks — how do you revert all the previous steps? Yeah. And certainly something will break.
There's an interesting question about what to do when something breaks, right? Because there are a couple of things that can be true. Let's say you make an API change, it floats to SDK, that goes well, and it bumps up to SDK VPP agent — and you get a PR there with broken CI. Right, so you go look at the PR. One possibility is that the API change actually requires you to do some work in SDK VPP agent. It's not actually a breakage in the sense that you've got to go fix API; it's something you've got to fix in SDK VPP agent. Yeah — that's the happy case, right? Yeah. In which case you would go and fix that. But the real question is: how do you get the backtracing? One of the things I've been poking at is how to improve the backtracing, because what I'd like to do is improve the commit messages and the PR messages as things float through the system, to make it easier to trace back. Right now they just tell you that it's one of the automated updates and what repo it came from. I'd like it to actually indicate: okay, it came from this commit coming into that repo, and here's the link to the PR for that commit — that kind of thing, so you can chase it back quickly when you see a failure. But that's a little bit of work I still need to do. I kind of know how to do it; it involves a certain amount of parsing of environment variables, because all the necessary information is there in the GitHub Action, but not quite in the form you'd want to stick in your PR. Does that make sense? Yeah, yeah — let's just go through this. Yeah. And I'm sure as we get experience with it, we'll discover more ways to make it much nicer.

One of the nice effects of this as well: suppose that I'm depending on a particular component within NSM, or maybe I'm tracking master of NSM itself. If, for some reason, an update fails an NSM integration test, of course we have to go back and fix it throughout the PRs — but me, as a person who depends on that NSM-related package, I don't see a break, because that break should be isolated in a PR in most scenarios. So there are some mitigating circumstances in the areas where it breaks. Because the versions are being locked through the Go module system, we're not implicitly saying "always send me the latest"; we're saying, if you're relying on something, send me the latest one that is known to pass all tests for that specific component. So basically, I think that's going to be quite good. And hand in glove with this, by the way, we're probably going to do a lot more unit testing at the individual repo level, because we would ideally like to minimize how much we catch in integration — we want to catch things earlier rather than later.

So, moving to the proposal for where I would suggest we go — we've actually done part of this already, right? So we have an API repo now; it auto-propagates to SDK. And that's for the top-level APIs for network service mesh. Then, right, we have an SDK repo.
Now, when something gets merged into SDK, it should auto-propagate to the platform SDKs, right? Things that have platform-specific code, because they will depend on it. And then it may also propagate to command repos — we'll talk about command repos in just a second. And, sorry, back one: examples of the platform repos would be things like SDK VPP agent, SDK kernel, SDK SRIOV. When something merges there, it would of course auto-propagate down to whatever commands depend on them, which is not necessarily going to be all of them. Next — cool.

So the idea with the command repos is to have the various commands, one per repo. These are the places that would publish Docker containers, initially to a staging Docker registry, so they can be pulled by things further downstream; effectively, that's what's building and putting together the Docker images. And that would auto-propagate to things that are more like packages — a Helm repo or an operator repo. Could you go back one? I mean, the idea would be that each of these is for a single thing, right? So the Kubernetes network service manager, or a Kubernetes forwarder of some kind, would be examples of this. Okay. Next.

Yeah, a quick question. I assume that the CMDs also include the various containers — the proxy service that we have now and everything else? Yeah. So the NSMgr would be an example of a command repo; a proxy network service manager would be an example of a command repo; that kind of thing. Okay. But it has the nice effect that they only end up depending on the things that matter to them. So, for example, the network service manager command repo probably doesn't depend on SDK VPP agent, so it's not going to get updates from SDK VPP agent. Okay.

Hey Ed, another question here. Is there a place where the end state of this repository restructuring is laid out? As I was going through these slides, I had a hard time teasing that out, which confused me as I was trying to understand how the pipelining would work. Okay, so let's dig into that for a second. When you say end state — I was attempting, not necessarily successfully, to build out towards the end state, which I think we're getting relatively close to. Essentially you wind up at a place where you have a repo with your integration tests, and then you have repos for the various platform-specific things — like, "I'm going to execute this on Packet." The actual integration tests run in the integration platform repos: I'm going to run this as Kubernetes on Packet, or EKS, or maybe K8s on OpenShift, or K8s on something else. And that's where we eventually get to the integration testing, to figure out whether what we've got has trickled through successfully to a fully integrated system. Did that at all answer your question, or did I miss your point? Yeah, maybe I didn't phrase it right — that was still a useful answer, though. Not all answers that miss the question are unhelpful, right? You can be helpful and not answer the question. Exactly. No, my question was a little more simplistic: as we're breaking apart the mono repo, is there anything I can look at that would diagram how the mono repo breaks apart? You mentioned these command repos.
Now that you've talked about it, what I'm seeing here makes sense — we have command repos, and then we have an integration platform repo for the different platforms. But what does the network service mesh project look like as far as the repository breakdown goes? That's the end state; what maps to where out of the mono repo is, I think, part of what you're getting at? Yes, exactly — that's a more succinct way to put it. Okay, no, that's actually a really valuable ask, and I feel slightly silly for not having done something like that. I'd be happy to go do it and come back, but it involves picking through the directory structure of the mono repo and mapping that to the kind of repo it would turn into. Does that make sense? Yeah, I think that would be helpful, and I kind of wish I had had that context before trying to digest this. Yep.

And there's a comment here — I think you made a comment in the chat, Piotr. Do you want to speak up, or do you want me to read from the chat? I'm happy either way. Okay. So it says: the chain is fine for changes like additions in the API repo, but what about changes which change method names or parameter types? They'll require an update at least at the SDK level, so anything merged to API will block new additions until the SDKs deal with the update. And I think what you're getting at is: if I were to, say, change the name of a method in API, I'm going to have to go and quickly fix the downstream; otherwise new API changes are going to be blocked waiting for me to fix the thing that I pushed into the downstream. That's exactly what I had in mind. Yeah. I guess the point would be: yes, we're going to have to either act quickly on that, or, if nobody is willing to fix the downstream, we may have to back out the particular API change that nobody is willing to fix, so that we can get ourselves unblocked. But yeah, it's going to require us to be somewhat vigilant.

Somewhat related to this, I think: today, for example, when we switched to multi-module repos, we had challenges when we wanted to update Kubernetes or, let's say, any other dependency — the logging frameworks and whatever. So how do you think this is going to work here? I guess the API is not particularly dependent on Kubernetes, for example, or any piece of Kubernetes. It actually makes things in some ways a bit easier, because we're trying to keep things more targeted. For example, API has relatively few dependencies: I think it basically depends on gRPC and protobuf and maybe a bit of logging stuff. The same is true for SDK. The platform-specific pieces start bringing in a little more. I would expect things like Kubernetes dependencies to come into the command repos, quite honestly — that's the point at which you're dealing with Kubernetes stuff. So it does mean that if, say, we wanted to bump our Kubernetes version, we'd have to bump it for the appropriate command pieces in the system, yes. Which might get dense at some point. Potentially, yes. I mean, one of the things this will do for us is let us manage our dependencies a fair bit more tightly. There are already some things in the CI for the existing pipelined repos that will check to make sure certain things haven't crept into the go.mod files — if you look at SDK and some of the existing repos, they're making sure that we don't, for example, pull in a dependency on the mono repo, because that sort of unwinds everything. So we can manage our dependencies tightly here.
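That kind of guard is easy to express as an ordinary unit test. A sketch of one way to do it, using golang.org/x/mod/modfile to parse go.mod and fail if a forbidden module (here the mono repo path, purely as an illustration) has crept back in:

```go
package deps_test

import (
	"os"
	"strings"
	"testing"

	"golang.org/x/mod/modfile"
)

// forbidden lists module paths that must never appear in this repo's
// go.mod — e.g. the mono repo, which would unwind the whole split.
var forbidden = []string{
	"github.com/networkservicemesh/networkservicemesh",
}

func TestNoForbiddenDependencies(t *testing.T) {
	data, err := os.ReadFile("go.mod")
	if err != nil {
		t.Fatalf("reading go.mod: %v", err)
	}
	f, err := modfile.Parse("go.mod", data, nil)
	if err != nil {
		t.Fatalf("parsing go.mod: %v", err)
	}
	for _, req := range f.Require {
		for _, bad := range forbidden {
			if strings.HasPrefix(req.Mod.Path, bad) {
				t.Errorf("forbidden dependency %s (version %s)", req.Mod.Path, req.Mod.Version)
			}
		}
	}
}
```

Run as part of each repo's one-to-two-minute CI, a check like this keeps the dependency discipline mechanical rather than relying on reviewers to spot it.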
How about this: I think that logging is probably the most-used dependency — it will be used all over the different repos. So, a hypothetical situation: I want to fast-forward my CMD — I don't know, the Kubernetes network service manager — to the next logging version, but I don't want to move my API. Do you foresee any problems here? I mean, you would really only hit the usual Go-related issues there. One of the things we're actually doing: there's an option, when you update Go dependencies, to update a dependency and its dependencies. So, for example, when SDK pushes to the SDK platforms, it pushes an update not only to itself, but also to the things that SDK depends on. So if I were to come in and update the version of protobuf that I'm using in API, that would get carried along by API's push to SDK and propagate through the system as well, because we probably don't want the common dependency tree to go out of sync. I mean, it's messy if a sibling repo that you don't have a dependency on comes out of sync, but you really don't want to have 16 versions of protobuf running through the system. And that will automatically resolve itself currently as the updates get pushed.

Okay, one more question here. I think this whole discussion is really important for us as a community — to be aware of what is planned and how we're going to tackle the potential problems. To figure out if we want to actually do this, and to talk through it and figure out the right way to do it. Because even if we want to do something vaguely shaped like this, it's not at all clear to me that exactly what I've presented is the right breakdown of all the individual pieces. So my question would be: what if I wanted to add the examples somewhere in this chain? Where do you think they would fit, and should we consider splitting them also into repos, like example-dash-whatever? But this means that at some point you end up with a lot of them. Yeah, I mean, it's really a question of how you like to split up the examples. I expect that examples would be something that SDK, and possibly the various SDK platforms, might push to. For example, I know that many of the examples depend on the VPP agent, so it's very probable that examples would be something that gets pushed to by SDK VPP agent. And then you could decide whether you wanted to break up the examples into separate repos or not — that's your choice. Yeah, I don't think we should resolve it now, but yeah. Yeah, there are definitely options for that. But it also forces the issue when some dependency breaks things. Because right now, if examples falls out of sync with the mono repo, that can literally go on for a while before anybody notices, if we aren't vigilant. Whereas here, if it depended on SDK VPP agent or SDK or whatever, you immediately get a PR that shows your breakage. Yeah. The same situation applies to documentation — it's an extension of examples. Okay. Cool.
But should we move forward? Yeah. Yeah. So I want to talk a little bit about failure detection and remediation, because this is an actual question. We talked a little earlier about the case where, legitimately, you're just going to have to clean up the next step in the system — you make a change to the API, you're going to have to clean up the SDK. But now consider the case where you legitimately broke something. Let's say a PR has merged into the SDK platform for some platform, it propagates down to the various commands and through the system, and the unit tests at each level are perfectly delighted with what's happened. Then you eventually propagate to the integration platform pieces — and one of those fails. Right, so you've got a failure.

Sorry for the interruption, but am I correctly reading that in the Helm charts we will have particular tags for images — something new, instead of what we have currently? Because we have latest almost everywhere, and that sucks a bit. I didn't quite understand. Right now in Helm our images point to latest in the repository, and that leads to the problem that, for example, tests cannot be replicated after some time — that's why we have to rerun tests, because the image was updated. Okay. So basically the idea would be that when we get to actual release images, you would push the release images; but with things like Helm charts, you'd want to push a Helm chart for a particular state of the system that's been merged. Exactly — that was my question. Yeah. So you actually would have a version of the Helm chart that you can point back to, which was exactly the version of the Helm chart at that point. Not only the Helm chart, but also the versions of the images, to test the actual PR. Yes, exactly. Right. So you'll actually be able to say: this integration failed, and it failed on this Helm chart, which you can point to explicitly, and which contains the explicit versions of the images that failed. Exactly. So it should make recreation quite a bit easier, because we have a record of exactly what was tested that you can go back to.

Isn't it a matter of rebuilding them locally? Because currently we delete all the old images — we used to keep them, but then our Docker Hub accounts became... We don't actually delete them; we put them in a different registry. Okay. Oh. Deleting images is a massive pain in the ass. Exactly — it's extremely hard to delete images. So what we actually ended up doing was cleaning out our publishing registries so that they only have the release versions, and we have separate registries that we use for the... Archiving. Yeah, basically for the image-by-image builds that we do. The other nice thing here, though, is that even on the CI registries we're using now, if a command hasn't actually changed, then in this new model we don't rebuild the command and we don't push a new Docker image. So if, say, for the sake of argument, SDK VPP agent settles down, and the VPP agent forwarder command settles down and doesn't change for the next three months, we wouldn't rebuild it and we wouldn't push a new image. But anyway — you eventually get to the point where you land on the integration-test failure.
So this gets back to your point about chasing back what the root of the change was — and we will need to improve that — but you chase the failure back to wherever it failed on that PR, and you both fix it and add a unit test to catch that particular breakage, so that it never bubbles through to integration again. The goal is to progressively get to where we hopefully never actually break on the integration platforms. It'll take us a while to get there, but that's the goal. Okay. Cool.

And the advantage is, it gives us a clean road map to introduce new platforms. If someone comes in and says, "I would like to do a mumble-mumble forwarder," the answer is: that's fantastic — we can give you an SDK-mumble-mumble and a command-forwarder-mumble-mumble repo, and go play. And that doesn't get directly in the way of things. It allows for a much faster CI experience for contributors: you push something, and a minute and a half later the CI has gone through. It also tends to bias towards catching things earlier with unit tests rather than later with integration tests, which I think over time is going to be very positive. And it allows for the forming of sub-communities, which I think is very healthy. We've already got a loose grouping of folks who are focusing on the SRIOV stuff, so if we wind up with an SDK-SRIOV and a command-forwarder-SRIOV, you would naturally get sub-communities forming around those. Or, for example, we've already got some folks who've turned up who've been working on an operator for us, and this gives them a natural place to organize themselves around. I think that gets to be quite healthy. Any questions, comments?

I mean, we've had, I think, a really good discussion around this overall. This looks super great, and I have been complaining about the monolith for a while. You cannot do a cloud-native project with a monolithic repo, right? You have to go micro — it gets painful after a while. Yeah. One of the things that we pitch is: build your infrastructure like you build your applications, and start moving away from monoliths. So I'm very happy we have moved away from our own monolith.

One other thing we can consider: we have the operator, and we have Helm. If the operator becomes high quality, it may make sense to have Helm install the operator, rather than having Helm install all the deployments and so on. That would further simplify the path, even though we would add one more layer. Just something to think about, because there will be a lot of people who will install things using Helm and won't know how to use operators directly. But I think both of them combined can produce some very interesting results. That said, even if we have an operator, and even if we get it to help with upgrades and so on, we should still be very careful to design it so that if the operator breaks or does something wrong, we don't end up in a bad spot. A lot of this is just making sure that the infrastructure itself — that NSM itself, even though it's being helped by the operator — does not rely on the operator for its own success. Yeah. One of the things about operators that's both bad and good is that operators have the ability to cover for many sins. Right.
So obviously we've done a good job so far of designing network service mesh to be very resilient, so it's not sensitive to lifecycle-ish kinds of things — you don't have to do A and then B and then C in order to have life work out for you. And that's really positive. But if you have messed up and made it so A has to happen before B has to happen before C, an operator will cover that up for you. So we do want to make sure that we don't let our operator get too complicated. Yeah — keep the operator simple, and make sure we bake those properties into the auto-heal stories within NSM itself.

And with that, I don't have any other major points on this. Is there anything else we want to talk about on the repo pipelining, or should we move on to the last item on the agenda? Cool. Let's talk about Google Summer of Code. Nikolai, you have the floor. Yeah. So the CNCF is participating, and us being a CNCF sandbox project, we can submit proposals for Google Summer of Code — I have just received a reminder about this. To the best of my memory, the CNCF effectively does the sponsorship, and it's up to us to provide a contact and a mentoring person on our side. So I thought it was worth bringing up to the wider community that this possibility exists, and if someone is interested in mentoring or the like, I think it's something we might take advantage of. No, I think this could be super fun, and we have lots of little things that we could potentially work up into a Google Summer of Code project. Do folks want to brainstorm a little? We can keep this as an agenda item for next week and capture some of those ideas for GSoC projects.

I'd just like to add, in connection with the previous topic — the repo pipelining — that having a Google Summer of Code student helping us would be much easier, because you form a separate community, as you said, like mumble mumble. I'm quasi-infamous at Cisco for putting together slide decks titled "mumble mumble architecture." Pick up your mumbling. Exactly. I mean, my recommendation towards this as well: with the way we're setting up the new repositories, we can get some pretty interesting — well, things that appear to be complex but are still within the grasp of someone who's new in their career — and get some high-impact things into it. And I'm not saying unit tests are not high impact, but if you approach it from their perspective, "you're going to come join us and write unit tests" versus "you're going to come join us and write a new thing that lets you connect in with DPDK or SRv6 or something else" — it's a whole different range of excitement, one versus the other. So yeah, definitely — I think we should start a Google doc on this, start brainstorming ideas, and also invite people to be mentors. It doesn't have to be me, Ed, or Nikolai. If you're an experienced engineer and you want to help mentor someone through some of these things, you don't even have to be a full expert in NSM yourself — Ed and Nikolai can help with those details, if you are willing to help a person who is new in their career. We can also do sort of a mentor chain: yeah, we can help you mentor someone as well. Yeah, I think this is definitely true.
And in fact, I'd say that the broader the range of people we have providing mentorship within the community, the stronger we're going to be as a community. Absolutely. And I would love to see more people get into the position where they can help mentor others as well. It doesn't have to come from us; mentorship can come from many other areas as well. And vice versa, you know, I often look to many of you to help work through things as well. But yeah, with that, is there anything else that we want on Google Summer of Code? Cool.

And who posted the initialisms stuff? So I stuck it in the agenda, but this is something that Peter, have I pronounced your name correctly, Peter? It's Piotr. Okay. I can definitely try and do that; that's honestly how I would have made the attempt. I try very hard to get people's names right, and I'm really bad at it. Often I make those sorts of attempts and the poor person on the other end just says, no, just use this. So Piotr basically had raised a really interesting PR on the API repo around initialisms. And this was something I wasn't even aware was a thing, which makes me feel very silly, in the best possible way. It's the kind of get-things-right, make-the-API-lovely change that I really, really like. Do you want to explain a little bit, Piotr, what initialisms are? Basically, it's using upper case in the part of a name which is based on an acronym. Yeah. So the example that the PR hits on is netNS, capitalizing both the N and the S in NetNS. And it's based on the Go Code Review Comments, which is a set of, let's say, good advice on how to code in Go. Yeah.

And so I kind of like this approach to things, but I'd like to try and get a little bit of consistency around it if we're going to go this route, and I did want to make sure that it was something we discussed before we started making these changes. Sure. We need to be consistent across the whole project, and right now NSM in different places of the code is named differently, I mean, with different cases. And do you happen to know, I saw some comments when I was researching this that they were looking at the ability to add custom initialisms to golint. Is that actually a thing? I don't actually believe human beings are capable of consistency by themselves at this kind of thing, but I do believe in linters. It would be best if that could be updated automatically, but I think that linters cannot know about all acronyms, especially in some specialized projects. There's definitely a human decision that has to say NSM is Network Service Mesh and is therefore an initialism; that's definitely a human decision. But hopefully, once we've made that decision, we can get a linter to keep us consistent. Even simple grep across the project. Oh, yeah. If you look at the current ci.yaml, I am not above the abuse of grep. I have seen something similar with my VS Code, like suggesting that I should abbreviate it like this or like that, I don't remember. So if something for VS Code exists, I would assume that something for the linters should be possible, at least. But yeah, who knows.
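For concreteness, a small sketch of the naming convention under discussion. The package, type, and field names here are hypothetical, but the casing follows the Go Code Review Comments guidance that acronym-based parts of a name keep a consistent case.

```go
// Package nsmnames is a hypothetical example, not real NSM API code.
package nsmnames

// NetNSInfo illustrates the rule: the acronym-based part of the name is
// NetNS, not NetNs or Netns, and ID and URL stay fully capitalized.
type NetNSInfo struct {
	NetNSID string // not NetNsId
	NSMURL  string // not NsmUrl
}

// getNetNSInfo shows the same rule on an unexported name: a leading
// initialism goes all lower case, interior ones stay upper case.
func getNetNSInfo(nsmURL string) NetNSInfo {
	return NetNSInfo{NSMURL: nsmURL}
}
```

One caveat on the linter question raised above: golint's list of recognized initialisms was historically fixed in its source, so whether project-specific ones like NSM can be added is worth verifying against whichever linter the project adopts.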
So maybe a future step will be to try to find a mechanism to automatically check PRs, so that new code doesn't introduce some, let's say, semantic bugs. So it sounds like we do want to go this route in general with initialisms, and we want to investigate basically getting the linters to look after us in that regard. Am I hearing that consensus correctly? I'm fully for that. Your opinion was never in doubt. All right, excellent. Cool. Yeah. It was one of those things where, I remember having this conversation at one point in my career with somebody where we were talking about good taste, and part of the problem is that excellent engineers can have differing opinions, both of which are correct, when it comes to issues of taste. So I wanted to make sure that we had buy-in before we proceeded with something like this. All right. Okay. Are there any other last announcements before we close up? Okay. Well, I would like to thank everyone for attending, and we will see you all again at the same time next week. You all have a good day now. Thank you. Thanks. Bye.