Hello, and welcome to KubeCon. Welcome to this intro and deep dive on the CNCF TAG Network, the Technical Advisory Group for Networking. We will also talk about the CNCF Service Mesh Working Group in this session. I'm going to jump right in and introduce myself and the other co-chairs of the CNCF TAG Network. My name is Lee Calcote, founder and CEO at Layer5. Ed Warnicke is a distinguished engineer at Cisco; Ed sends his regrets for our session today. But I am joined by our other co-chair. Hi, I'm Ken Owens from Fiserv, where I lead a cloud security and network engineering team. So the three of us get to hang out with a lot of other people who join the CNCF TAG Network calls every first and third Thursday of the month. There's a lot of activity that goes on, a lot of projects that get reviewed. But before we get into any of that, I'll be quiet and let Ken tell you about TAG Network itself. Yeah, so as you know, cloud native has been a huge change in the industry. A lot of what we're trying to do within the TAG Network group is to make sure that, as workloads move to the cloud, the core components of how those workloads move — the way they communicate, the way they get data from one point to another — are handled well, because that is really critical. Cloud native networking isn't entirely different from other networking, but it is different, and what we try to do is clarify and explain those differences and why they're important. You're here to talk to us about things like service meshes, about performance, and about optimizing the way those container networks communicate with each other. And, as you'll hear from us very quickly, we want to bring in projects and innovative ideas around the networking space.
As Lee mentioned, we have these meetings twice a month, and it's important that if you have an idea, if you're doing something in this space — even if you don't know whether it's really innovative — that's OK. We would love to hear about it and talk with you more. That's what we feel our passion is: helping bring ideas forward in the CNCF. Ken, there's a question that I get fairly often, and it has to do with whether or not your organization needs to be a member of the CNCF in order to come and participate in a group like this. Yeah, that's a good question, and the answer is: you don't. You don't have to be a member, and you don't have to pay any dues or fees to be part of this community. We are an open source community. There are a lot of interesting projects that Ken, Ed, and I have had the pleasure of reviewing, commenting on, assessing, and helping steward into the CNCF and through the various stages that projects go through. So, Ken, I wonder if you might educate us a bit on that process, on what the stages are, and on why we have as many projects as we do. Yeah, definitely. It can be overwhelming when you first approach an open source community: how do you get involved, how do you get engaged? At the CNCF, it's been part of our mantra from the very beginning to be very inclusive of different projects and different ideas. We had a phrase at the beginning that we were not going to be kingmakers; our goal was not to go out and pick and choose which technologies define cloud native. As we went down this path, we thought it would be good to have a sandbox: a place for a project whose maintainers think it could be a CNCF project and would like to understand more about the CNCF.
So, like Lee mentioned, we would meet with you, and it usually starts with a presentation to a TAG meeting. The presentation is really like any product pitch: you explain what the project does and why you designed it the way you did. What we're looking for, mostly, is the value this is going to bring to the community and the problems you're trying to solve for the community. Those are really the two big things we're looking for. If there are other similar technologies, we'll ask questions about them: how do you differ? Do you differ? And like I mentioned, you don't have to differ, but we do want to make sure you understand the space you're in and how you compare with other projects in it. As you can see on this list, we have quite a few projects that do similar things to each other, and a lot of the names are similar — Open Service Mesh, Service Mesh Interface, Service Mesh Performance — a lot of mesh names out there, and a lot of acronyms that you may look at and ask: what does BFE mean? I thought that was a best-friend type of thing, right? But no, it's not. So there are a lot of interesting names and interesting activities that go into these projects. To answer the question specifically: you start off in the sandbox stage. From there, you move up to incubation, where you've proven there's basic interest in the space and you have some adoption; that's how you get from sandbox to incubation. Then, to graduate, you have to have more adoption still, and you have to have at least two or three contributors from outside your core group.
We don't want to support a project in the CNCF that is driven from one single source, where no one else outside your group has seen interest in it. If you're company A, and only company A is working on the project, and no one else in the industry sees value in helping you with it, then it's really a company A project, not an open source community project. That's the key goal: making sure the project moves beyond that. The best advice I can give — and I'll ask Lee to give his best advice next — is to go into this with the end in mind. If you want to be a CNCF project, that means you want to be open source. You want contributions from across the industry. You want to solve problems the industry sees as valuable to solve. And you want to do it with an open license and open governance in your GitHub project. Go into it from the beginning with that mindset, and it will be a lot easier to move through the different stages of the CNCF and graduate. Yeah, to add to Ken's point: creating and stewarding any one of these projects is a significant undertaking. Not only does it take multiple people, but a healthy project is one with various interests in mind — folks who get their paychecks from various places. That in part helps with the democracy, or maybe the meritocracy, that occurs within the project's community. And it helps with longevity: you don't want a project where both maintainers were on the same bus when that bus went off the cliff.
And that's a significant part of why users of these projects come, why the CNCF is so popular in the first place, and why people bring their projects to the CNCF. There are a number of reasons, but one of them is that it acts as a neutral holding ground. Some of the initiatives that Ken and I will speak to that go on inside the Service Mesh Working Group are really cross-cutting, and they need a non-partisan perspective. It's about the technology and about arriving at the best answer, irrespective of political or other interests. It's really hard to avoid some of those entirely, but for my part, I've found this is about as good as we humans can get it. Agreed, totally. In addition to shepherding and stewarding and working closely with projects, we feel we need to get more information out, and that's part of why we present at KubeCon. But we also create working groups: the Service Mesh Working Group is a very important part of what Lee and I are both interested in supporting, and we have quite a bit of information about that below. There's also the Universal Data Plane API. And we're doing white papers. This has been interesting, because you have your technical white papers — in this case, we've published a Cloud Native Networking Principles white paper — but you can also go deeper and more technical and eventually get recognized by a magazine. I'm a member of the Institute of Electrical and Electronics Engineers, the IEEE.
I was also inducted into Eta Kappa Nu, the honor society for electrical engineering. As part of that, I was able to publish a paper — an update to the thesis I did for my master's project a long, long time ago, before I had all this gray in my beard. I mentioned to my advisor, who I was working with on the paper for that publication, that I'm involved in the Cloud Native Computing Foundation, that I'm working with Lee, and that we had some really interesting work around a newer type of networking in the cloud space called service meshes. We thought it would be interesting to have a paper on the performance aspects of service meshes and how you analyze them. It came after a white paper that Lee and some of the folks he's been working with published on service mesh patterns and reference implementation work. This took that to the next level, going after a global audience with IEEE's The Bridge magazine. And looking toward the future: the IEEE has a lot of deep communication and networking publications, so my next goal with Lee is to work with the IEEE Communications Society, which I'm a member of, and try to get some of our more advanced, more technical publications in there. Those are usually very algorithm-driven, with lots of equations, so we're going to get as technical as we can with the service mesh algorithm work we're doing, and it fits well. The future paper we're working on right now covers techniques for adaptive service mesh optimization.
And that goes with a good story — I'll let Lee tell you how we got to this point: why we look at performance, why we're analyzing it, and why we want to look at adaptive service mesh optimization, because how we got here is really interesting. Well, people are adopting more and more cloud native infrastructure, and they're doing so with this kind of next-generation software-defined networking: with service meshes. There are a lot of things a mesh brings you, a lot of things you can turn on, and there are performance considerations around that. If you ask the service mesh to provide a lot of resiliency to your microservices — to help you overcome what we know to be fallible networks — then the more you're asking it to do: maybe you're asking it to retry failed service requests between your services, or to encrypt the traffic between your services, or to do mutual authentication between them, or to reroute traffic. There's a list; actually, we're going to look at a list here in a little bit. Well, that means that as an operator of that infrastructure — and as an owner, a product owner or service owner, of the workloads running on it — you have similar and overlapping concerns. From an infrastructure-centric perspective, you want to know: not only am I doing it right, but am I doing it well? Is this a well-oiled machine? And how does that compare to how others are doing it?
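To make one of those resiliency knobs concrete: retrying failed requests between services is something a mesh's sidecar proxies do on the wire, driven by configuration. Purely as an illustration — the function name and parameters here are invented, not any mesh's API — the behavior can be sketched in a few lines of Python:

```python
import random
import time

def call_with_retries(request_fn, max_attempts=3, base_delay=0.1):
    """Retry a failing request with exponential backoff plus jitter.

    request_fn is a zero-argument callable that raises on failure;
    a sidecar proxy does the equivalent for network requests.
    """
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # retry budget exhausted: surface the failure
            # back off 0.1s, 0.2s, 0.4s, ... with up to 10% jitter
            time.sleep(base_delay * (2 ** attempt) * (1 + 0.1 * random.random()))
```

Each retry the mesh performs is extra work and extra latency budget, which is exactly why "turning on more" has the performance cost discussed here.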
Part of answering that question — am I doing it well? — is comparing to others, but it's also a continual measurement you'd want to do against yourself: a relative benchmark against your own standards, your own history, your own custom workloads and custom environment. For those of you thinking, "I don't get what he's talking about": if you've been in an environment where you hear "it's slow," or "I'm not getting the response in the time I'm expecting, but I don't know why" — if you're used to hearing these kinds of concerns from your application team, that things aren't working the way they feel they should — the whole point of this is, as Lee said, to identify the right solution and the best way to optimize, instead of spending days and weeks, or in some cases years, trying to tune. This technique, which we'll get into in a few minutes, helps you fine-tune — I won't say immediately, but much more quickly than the traditional way of tweaking knobs over time and testing and testing and testing. It does that testing for you, over many iterations, as quickly as you want it to. You obviously don't want to break yourself, but you give it some parameters and say: within these parameters, go figure out the best design — and it does, which is very cool. Sorry Lee, go ahead. There are a lot of knobs to twist, as well. And speaking of a lot of knobs, or a lot of mesh-y things: there's more than one service mesh out there, and for some of those meshes there are any number of distributions. There's also been an emergence of specifications, and some emergent standards, that focus on interoperability.
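The "relative benchmark against your own history" idea above boils down to a simple check: compare this run's tail latency to a stored baseline from your own environment. A hypothetical sketch — the function names, percentile method, and 10% threshold are all invented for illustration:

```python
def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ordered = sorted(samples)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

def regressed(current_ms, baseline_ms, p=99, tolerance=1.10):
    """True if this run's p99 latency exceeds the baseline's by more than 10%.

    The baseline is your own historical measurement on your own
    workloads, not anyone else's numbers.
    """
    return percentile(current_ms, p) > tolerance * percentile(baseline_ms, p)
```

Run continually, a check like this turns "it feels slow" into "p99 is 20% above our own last-known-good."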
As it turns out, there are a couple listed here, and a potential third. For both of these — Service Mesh Interface and Service Mesh Performance — we have the fortune of these projects calling TAG Network their home TAG. SMI focuses on interoperability from a functional perspective: across the various features of a service mesh, can you address and configure those features in a universal way, using a common interface, a common set of APIs? That's really the focus of SMI, the Service Mesh Interface. If you're familiar with CNI, the Container Network Interface, that's a good analogy to draw mentally. The other one here is Service Mesh Performance, or SMP. It's about making uniform and succinct the way you characterize the performance of your cloud native infrastructure. There's a real focus on service meshes and how you measure their performance, but the longer the discussions around this project have gone on, the more folks get involved, and the more people push to expand that narrow focus from service mesh toward cloud native performance more broadly. For the project to do that successfully, though, frankly, there just need to be more of you involved. A key component of this is, as Lee mentioned earlier: when you're trying to fine-tune your latency, or your mutual authentication aspects, looking at all these different knobs, it's hard to know the right tuning. How do I tune it correctly? When you have a standard around interface conformance and service mesh performance — things that other groups in the industry have said matter, performance aspects we want to look at — you can go back to that spec and say: based on SMP, this is how I want to set these parameters.
These would be the important parameters for me to look at, and then you can fine-tune your starting point based on that. A lot of that has to do with having common tests. Think about this from a software development standpoint: as you come up with your user stories, you define tests that verify you've met those stories. This is a very similar model. When you come up with the characteristics you want for your mesh, you need a set of standardized tests that help you identify whether you've accomplished your goals. The patterns we see evolving here are the patterns we want to test, and we want to make them the standard patterns. I know 60 seems like a lot, but when we get into them in a little while, you'll see that they really fall into maybe eight or nine categories. You can put yourself into a category, and then you only have seven or eight patterns to look at instead of 60. As Ken was saying, there's a growing catalog of service mesh patterns that promotes reuse of best practices. These patterns are intended to be mesh-agnostic, and they're also intended to capture behavior, like this example of a circuit breaker. The circuit breaker is one of my favorite patterns, and being in financial services, we run these types of tests all the time. This is a test where your traffic is flowing a certain way along a certain path, you're measuring certain characteristics of your performance, and you're within the SLAs you measure against — and then something happens. You're trying to simulate a node failure, a router failure, a switch failure, a load balancer failure — something along those lines that could fail.
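The circuit-breaker behavior at the heart of this pattern — trip after repeated failures, fail fast while open, then probe again after a cool-down — can be sketched in Python. This is a minimal illustration with invented names and thresholds, not any mesh's actual implementation (meshes apply it in the proxy via configuration):

```python
import time

class CircuitBreaker:
    """Trip after N consecutive failures, reject calls until a
    cool-down elapses, then let one trial request probe recovery."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial through
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip the breaker
            raise
        self.failures = 0  # success resets the failure count
        return result
```

Failing fast while the circuit is open is what keeps a downstream failure from stacking up retries and timeouts across the whole call chain.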
And when that happens, you want to see how quickly you can recover completely from that issue; in the pattern, you're looking at how fast the recovery takes. I call it chaos testing, a little bit — it's not fully chaos testing, but we try to do it randomly, all the time, to make it more chaos-like. Not only is this initiative about cataloging, identifying, and espousing best practices as patterns, it's also about codifying those best practices into individual patterns, expressed in YAML, that can be realized against any number of meshes: the same pattern can be realized on an Istio mesh, on Linkerd, on Consul — not all of the meshes, but about 10 of them that the group tends to focus on — and the patterns are designed to be agnostic in that way. And that's really important. I found this very helpful in exactly that use case where you get a request from one source, then you have to talk to another source to get the authorization, and then to a different source to get fraud and other scoring done — and you're doing all of that at the same time, as quickly as you can, in milliseconds. Something like Nighthawk has done a great job of helping to simulate these multiple types of streams going in different directions, and it helped us fine-tune a lot of little aspects along the way that you wouldn't have thought about without a load generator to help you find them. Can I bring up one thing that I've also heard from others? I wonder if this hasn't been your experience as well.
The thing you were just saying: if each of these microservices is secure, then as each one receives a request, it needs to authenticate that request. So if you've got 50 microservices, they each receive the request and each go back to the central authentication system — 50 times. There are definitely ways to optimize that, right? Exactly, yep. OK, yeah — sounds like you've hit that as well; I have too, so it's a very real bottleneck. Along the same lines — not to take too much time — another interesting use case Nighthawk helps with is what I call big events. If you have something big going on that's going to drive a lot of traffic to your site, it helps you with those one-off events: if I'm getting a ton of traffic in one place, how do I optimize for that without taking away from the traffic I'm still getting from other places? The throughput is going to go way up for this one site, but the latency and the resiliency still have to hold. You can't have your latency go through the roof just because your throughput went through the roof. You don't really realize how the two relate to each other, but if you've ever been in this world, you know that when you get a lot of throughput, your latency climbs through the roof quickly. That's the typical response, and you want to optimize so it doesn't happen: you want to make sure that as your throughput goes up, your latency can still stay the same. And that doesn't happen through magic.
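The throughput-latency relationship described above has a classic textbook illustration. In a simple M/M/1 queueing model — an idealization brought in here for illustration, not something from the talk — mean time in system is W = 1/(μ − λ), so latency stays nearly flat until the arrival rate approaches capacity and then climbs sharply:

```python
def mm1_latency_ms(arrival_rps, capacity_rps):
    """Mean time in system for an M/M/1 queue, in milliseconds.

    One server, Poisson arrivals, exponential service times:
    W = 1 / (mu - lambda). A simplification, but it shows the shape.
    """
    if arrival_rps >= capacity_rps:
        raise ValueError("unstable: arrival rate meets or exceeds capacity")
    return 1000.0 / (capacity_rps - arrival_rps)

# With capacity for 1,000 req/s, latency is flat for a long time,
# then explodes as throughput nears capacity:
#   500 rps -> 2 ms, 900 rps -> 10 ms, 990 rps -> 100 ms
```

That knee in the curve is why a load generator run near (but not past) expected peak, as described above, is so much more informative than testing at comfortable traffic levels.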
You have to plan for it, and put controls and algorithms in place to help you manage it — optimize is really the key word. Optimize how you bring that traffic back in, because, to your point, Lee, with the authentication: you don't have to bring in 50 million requests. You can bring in five requests to represent those 50 million. It's hard to do, but it can be done. Yeah, Ken, this is a good time — given how many times you've been around the block — to note that there are concepts in the world of networking, whether in the physical days, the early virtual days, or these mesh-y days, around classifying different types of network traffic and differentiating how each class is handled. There's a term for some of this. Quality of service. Okay. Yeah, that's the big term: optimizing your network around different types of service levels, if you will. Quality of service is something that's probably happening all around us all the time — a little harder for organizations to control out on the internet — but it's only sometimes talked about in these newfangled service mesh networks, and yet it continues to be very critical there. The example you just gave is a perfect one: if you have that many authentication requests, and you also have a shopping cart request or a debit card request, maybe that request should be guaranteed to reach its recipient — and what happens if it gets bottlenecked or choked in the middle? So, it's been very nice to share all of this with you. Ken and I might be in the chat right now — I'm not sure if either of us will be up that early, but I hope so.
We'll see. Thanks, everyone — have a great rest of your conference.