Good afternoon, everyone at KubeCon Europe. I'm sorry not to be there in person today, but I'm really looking forward to talking with you about what's going on in the TAG Network and service mesh working groups. My name is Ken Owens. I'm a vice president of cloud security and engineering at Fiserv, a financial services company. I have with me today Lee. Hello, hey, my name is Lee Calcote. I'm the founder and CEO of Layer5. Ed Warnicke is also pictured here; he's another co-chair of TAG Network, and he sends his regrets today. Just to level-set, we like to do this at every KubeCon. This is, I think, the eighth or ninth one that Lee and I have done together, and the second virtual one; hopefully we'll get to see you all soon, maybe in the U.S. In the cloud native world, a lot changes in the way you think about running enterprises, and security and networking are big parts of that. TAG Network's mission is to enable the widespread and successful development, deployment, and operation of resilient and intelligent network systems in cloud native environments. What that really means for Lee, Ed, and me is that we want to help clarify and inform which projects are needed as you move into the cloud native environment. Very quickly in this session: there's a lot we've been doing over the last several years. We want to collaborate with individuals, and we really want to attract developers and engineers who are coming up with great ideas in the networking space. Gaps you uncovered as you went cloud native, as you deployed Kubernetes, as you found networking enhancements that are needed: we'd love you to bring those to us, whether they're service mesh related or firewall related. We don't hold a negative opinion about any of these technologies in use today, and we know they all need to evolve and get better.
So we definitely want to invite you in. For the most part, you'll see that the TAG Network group is very impartial. We're not trying to take over a project, and we don't want to own one; our beliefs come from the CNCF's beliefs, and we're not in the business of kingmaking. Our goal is really just to work with you and to help steward your project through the different gates and phases of the CNCF process, which used to be easier than it is today. It's a lot more complicated today to get a project through the CNCF, and so we're here to help you do that. As I mentioned before, Lee and I have been doing this for quite a long time. We were the original chairs of this CNCF working group way back in 2019, before the pandemic hit us. We had started with the Container Network Interface, CNI, and we were very involved in helping bring CNI into the CNCF. As you can see, we were pretty busy that year; we got a lot of different projects in. We didn't stop as we rolled into 2020. We continued working with projects, and Service Mesh Interface was one of the big projects we had for that year. As the year went on, we were working on Chaos Mesh and Open Service Mesh. Going into 2021, we brought in Emissary-ingress, among others. At KubeCon North America, we brought in Service Mesh Performance, which you'll hear us talk a lot about today, plus Submariner, Cilium, and Meshery, which are all big efforts we're going to be talking about in this session. At KubeCon China at the end of last year, we talked about FabEdge, and that is now a CNCF project, something we're excited about. And especially for Lee and me, since we've been talking with Istio since 2019: we're proposing that it become a CNCF project this year. Another thing we like to do: Lee and I are very active in the community.
And so we had an opportunity to provide a white paper when one of the bigger IEEE (Institute of Electrical and Electronics Engineers) committees held a special session on cloud-native networking; we were able to contribute a paper to The Bridge magazine. The Bridge is published by Eta Kappa Nu, part of IEEE and its honor society for electrical and computer engineering. And, hot off the presses, we are continuing that effort and the research Lee is working on with the university, and we'll probably be presenting more research in that IEEE publication as soon as possible. So, Lee, if you want to add anything to that. Yeah, it only gets more interesting as we go. I think the last publication we did laid the foundation for this next one, about adaptive optimization, which is the one currently under study. Actually, I think we have a slide on MeshMark a little later in this talk. Part of the reason Ken is stepping through each of these, if you can't hear it in his voice, is that there's an implicit call to participate. And yeah, this next research paper is a doozy. Cool. So we'll get into some of our projects. We've been working with a couple of different specifications, and in this environment it's really important to define specifications and be very clear about what it is you're trying to accomplish. One of the first things we did was Service Mesh Interface, SMI, a standard interface for service meshes on Kubernetes. As you know, trying to standardize anything across an evolving technology like Kubernetes is not an easy thing to do. This was a big body of work, and a lot of kudos to Lee for the work he helped coordinate across the community to get it going.
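To make SMI a little more concrete: its interface is a set of Kubernetes custom resources. As a hedged illustration (the kind and apiVersion follow the SMI spec, but the service names and weights here are invented for the example), a TrafficSplit resource rendered as a Python dict might look like this:

```python
# Sketch of an SMI TrafficSplit resource, written as a Python dict that
# mirrors its Kubernetes YAML form. Service names and weights are hypothetical.
traffic_split = {
    "apiVersion": "split.smi-spec.io/v1alpha2",
    "kind": "TrafficSplit",
    "metadata": {"name": "checkout-rollout"},
    "spec": {
        # The root service that clients address
        "service": "checkout",
        # Weighted backends: 90% to v1, 10% to the canary
        "backends": [
            {"service": "checkout-v1", "weight": 90},
            {"service": "checkout-v2", "weight": 10},
        ],
    },
}

def canary_fraction(split: dict) -> float:
    """Fraction of traffic going to backends other than the first (stable) one."""
    backends = split["spec"]["backends"]
    total = sum(b["weight"] for b in backends)
    return sum(b["weight"] for b in backends[1:]) / total

print(canary_fraction(traffic_split))  # 0.1
```

The point of a spec like this is exactly what Ken describes: any conforming mesh can interpret the same resource, so the split above is not tied to one vendor's API.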
And then we published SMP last year, a standard that describes common terminology for how you measure the performance of a service mesh. As we continue down this path of creating specifications, one of the biggest issues we have in the industry is interoperability between different types of service meshes. The first step toward standardizing that, especially with cloud native, is APIs; we really feel APIs give us a common playing ground. If we can start with a set of standards around the APIs that enable these services, that helps us build a federation model from day one, versus waiting six or seven years without a model and then trying to create one from scratch. So we're trying to go after this problem early. On that topic of standards, we thought it would be good to talk about patterns. Patterns are similar to standards in that they try to lay out the steps, the components, and the set of common controls, common behaviors, or common configurations needed to support different types of use cases. Why say pattern rather than use case? We like the term pattern; we think it's more of an engineering discipline to think in patterns rather than the other common terms you could use here. We have a GitHub site, service mesh patterns. The value here is that we're trying to enable the business language around service mesh adoption. You have a behavior you're trying to capture, and you're looking at it from the perspective of a user, usually an application developer. How is that application developer going to take advantage of these patterns? They have to be agnostic.
We can't just define one pattern for one technology and another pattern for another technology. These patterns are independent of the technology, independent of any solution you're buying. We should be able to deliver the same experience to a developer community or the end-user community without having to specify the parameters of a particular technology. And then the most important piece of this, something Lee and I have been tackling in the industry for a decade or more: it has to be reusable. The real measure of how good your pattern is, is whether it's defined in a way that different groups of individuals can pick it up and use it without recustomizing it for their use case. And if they do want to add to it, great. Lee and I are super smart people, but we may have forgotten something, and whoever is working with us on these patterns could have forgotten something too. As time goes on, you can believe there will be new innovations and new things that come up that we hadn't thought about. So the second big important thing about patterns is that they can be added to. You're not creating something that has to live forever; these are living, breathing things that change over time. As you can see here, we call them building blocks, basic building blocks. As new things come out, as new integration patterns come out, we'll be able to modify them with the community's involvement: not just Lee or me doing it, but a community effort. Yeah, and to Ken's point there, some users of the patterns will sometimes refer to one as a template; that's the way they occasionally end up treating them, as well as a common point of reference for measuring a couple of things, answering questions like: are we doing it right?
And by it, that could be, and Ken's probably going to talk about these, using a function of the service mesh: a circuit breaker potentially, or setting retries. You could set 1,000 retries, or you could set one retry. What's the common practice? And maybe more importantly, what are the considerations to account for as you're trying to find that magic number for how sensitive your circuit breaker is? Or the security posture of the mesh: how frequently your identities are regenerated, what the administrative domains are, and how those are carved up when you're in, say, a multi-cluster environment. So to the extent that we can, and to the extent that you all participate, these inevitably end up with best practices built in. And to rationalize a service mesh pattern against a service mesh specification, as Ken was saying before: one is a lot more formal and can take quite some time to define those APIs Ken was talking about, while in these patterns we're looking at templates and best practices, things that are reusable. There are a lot of patterns out there; there's a lot the mesh can do, so there are a lot of patterns being described and collected. This is what we're really excited about. Lee presented last year at KubeCon China, when we had just a few of these service mesh patterns. We had circuit breaker, we had retries; I think we might have had mutual TLS, but we didn't have multi-cluster yet. We didn't have a lot of filters created yet; I think we might have talked about a JWT transformer. As you can see, we've been adding and formalizing these patterns, and we actually have a catalog now. To me, that's really exciting for this community: there's a catalog of patterns being developed, and they're open.
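The circuit-breaker behavior Lee describes, trip after repeated failures, then allow a probe after a cool-down, can be sketched mesh-agnostically. Here is a minimal Python illustration; the class shape, thresholds, and method names are invented for the example, not taken from any particular mesh's configuration:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: open after `max_failures` consecutive
    failures, then allow one probe once `reset_timeout` seconds pass."""
    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow_request(self, now=None):
        if self.opened_at is None:
            return True
        now = time.monotonic() if now is None else now
        # Half-open: permit a probe only after the timeout elapses
        return (now - self.opened_at) >= self.reset_timeout

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now=None):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = time.monotonic() if now is None else now

cb = CircuitBreaker(max_failures=2, reset_timeout=10.0)
cb.record_failure(now=0.0)
cb.record_failure(now=1.0)         # second failure trips the breaker
print(cb.allow_request(now=5.0))   # False: still open
print(cb.allow_request(now=12.0))  # True: half-open probe allowed
```

The "magic number" question from the talk shows up here as the two constructor parameters: how many failures to tolerate, and how long to back off, which is exactly what a pattern tries to give guidance on.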
They're open for people to use and open for people to contribute to. And before we move on to what we're doing next... that's okay, you can pull it back up. This is exciting because we know there's a lot of use of eBPF and a lot of activity around OPA. If you're not familiar with OPA, that's okay: OPA is Open Policy Agent. It's really critical when you get to a service mesh especially, but in my opinion, in the cloud-native pattern generally, policy, and declaring those policies, becomes so critical and so important. And there are some interesting things you could do in a pattern: collating events, doing a single-tenant type of test, doing some pre-provisioning and seeing what happens after that pre-provisioning and analysis. From a deployment side of things, it's about making sure your intention is being met, versus wondering whether it will be met. Let's actually define what we want, deploy it, and see whether we met what we meant to meet in a pre-provisioned state before we turn it live. To me, it's really exciting that we've been working on this. And I was going to ask you, Lee: I know this is coming soon, but before we get to the coming soon, talk a little bit about the state of the current patterns we have and some of the adoption you're seeing, and then give a little overview of the exciting pieces of the new policies that are coming, where you see them being beneficial, and how soon you see them being available. Yeah, of the current patterns that are there: when a pattern is first described, it is, for certain, compatible with a given service mesh.
The goal of each of the patterns is to be service mesh agnostic, to the extent that a particular service mesh is capable of the given function in the patterns we're seeing on screen here. For circuit breaking, retries, and mutual TLS, all the service meshes I can think of have that covered. But while each mesh generally has mutual TLS, circuit breaking, and retries, they don't all implement those capabilities the same way, which is what Ken was speaking to with Service Mesh Interface and Service Mesh Performance: being able to define patterns like this in a uniform way is really reinforced by standard specs, standard APIs. So as the patterns are worked on, verifying their compatibility is part of the current ongoing work. Since we last gave a deep dive into the service mesh working group, a number of additional WebAssembly filters have been brought in. And like Ken was saying, these are open source, available to all. I was just talking about compatibility of the patterns with individual service meshes; the WebAssembly filters, by contrast, are provided in the context of a service mesh running an Envoy data plane. Last I counted, there are about five or so, depending on how you count an open source service mesh project versus a vendor's offering of that project. But there are at least five or so meshes running Envoy data planes, which means there are at least five or so that support these filters.
Interestingly, as Ken was speaking to eBPF and OPA, there's a smaller number of service meshes that are natively cognizant of eBPF and use eBPF as a foundational data-plane traffic filter. Some of them support eBPF optionally. I was going to say some of them only support eBPF, but I don't think that's true. Ken mentioned that Cilium had come into the CNCF since we last gave an update at KubeCon, and I don't think it's true that Cilium service mesh only supports eBPF as its CNI; I'd have to go look at that. One other thing I want to mention that Ken called out, and I don't know if everyone would have caught it: he was talking about this pre-provision policy. Normally, I'd expect that when people think about OPA and policies, they're thinking about authorization use cases, securing things. The logo kind of intimates that: things are secure behind here, there's a sentinel. But at the core of its capability, OPA lets you define rules and evaluate those rules, those policies. And those policies don't necessarily have to do with authorization. So to Ken's example, if you want to make sure your environment is ready to go, it's possible to define that attestation, if that's the right word, in a policy: codify what it means to be ready, and then use the engine to evaluate it. So, some interesting things. We'll see what happens in six months when we speak again, whether the coming soon has arrived. Yeah, definitely. If you're interested in helping test some of these patterns and contributing, we're, as Lee mentioned, always looking for environments; definitely reach out to us. And then there's Service Mesh Performance. It's a CNCF project, currently at sandbox level. Ken was highlighting one of its sibling projects, SMI.
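Lee's point a moment ago, that a policy engine can evaluate readiness attestations rather than just authorization rules, is worth making concrete. OPA policies are actually written in Rego; this Python stand-in only shows the shape of the idea, "codify what ready means as rules, then evaluate them against the environment's state." All of the rule names and field names below are invented for illustration:

```python
# Hedged sketch of pre-provision readiness checks evaluated as policy.
# Real OPA would express these rules in Rego; the structure is the point.

READINESS_RULES = [
    ("mTLS enabled mesh-wide",  lambda env: env["mtls_mode"] == "STRICT"),
    ("all sidecars injected",   lambda env: env["pods_total"] == env["pods_with_sidecar"]),
    ("cert rotation under 24h", lambda env: env["cert_ttl_hours"] <= 24),
]

def evaluate(env):
    """Return the names of rules that failed; an empty list means ready."""
    return [name for name, check in READINESS_RULES if not check(env)]

env = {
    "mtls_mode": "STRICT",
    "pods_total": 12,
    "pods_with_sidecar": 11,   # one pod missed sidecar injection
    "cert_ttl_hours": 12,
}
print(evaluate(env))  # ['all sidecars injected']
```

Evaluating this before turning the deployment live is the "did we meet what we meant to meet" step Ken described: the intent is written down once, and the engine reports any gap.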
And so Service Mesh Performance is, in some respects, much to my continued surprise, kind of a popular project. So many people have this question about the overhead of a service mesh. A lot of people get that a service mesh can provide a lot of value, and it's becoming commonplace: where you find Kubernetes, you find a mesh. So they want to characterize and understand the service mesh layer in terms of its performance. One of the things this project has been enabling is a specification to express that in a standard way. But it's also been espousing the fact that speeds and feeds aren't all you should measure; that's about half the ball game. The other half is the value you're deriving. So part of what you're trying to measure is overhead, in the context of how things are performing from a quantitative, cold-hard-metrics perspective. The other portion is to assign value to the functions being provided by your infrastructure, and a lot of the time that can be done in the context of business performance metrics or application-level metrics: how many shopping carts are open, how many are actually being successfully closed, how many are being impacted by your infrastructure. To give people language for those two things, performance in the context of overhead and performance in the context of value being derived, MeshMark was born as a cloud native value measurement index. MeshMark falls under the Service Mesh Performance umbrella as a component of that project. It's actually the subject of a talk this week at ServiceMeshCon, and I believe that will have already happened by the time everyone is getting this update.
So I've described the effort Ken and I have spoken to about MeshMark, and I'll peel back the onion one more step to get a little deeper. As a performance index, MeshMark measures the cold, hard overhead and performance characteristics of your cloud native infrastructure: your compute, your network, your storage, with a heavy emphasis on network. There are many utilization classes to quantify, to measure, and to understand how well the infrastructure is running. The classes themselves tend to focus on a particular type of resource. So if the network is being measured, that would be a utilization class, because the network is a resource. You can measure the network in a number of different ways: maybe latency, maybe throughput, maybe the number of errors, and on and on. And maybe some of those are more important to you than others. So as that utilization efficiency number is being calculated, maybe the one for latency is really important to you, or not, and MeshMark allows you to assign a weight or a discount, a positive or a negative weight. There's a fair bit more to this. Ken has been involved in the Service Mesh Performance project and MeshMark for some time, and a collection of folks at Intel have really been helping push this forward. So, quickly on that, to touch on what Lee said: you can also use things like cost. Cost can be one of the utilization classes, because the utilization classes are generic by nature; you can select the different parameters. In my case, where I work, security would be a very important utilization class.
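A rough sketch of the weighted utilization-class idea just described: each class gets a measured score, and the caller emphasizes or discounts classes with weights. The class names, scores, and weights below are invented for illustration; the actual MeshMark formula lives in the Service Mesh Performance project.

```python
# Hedged illustration of combining utilization classes into a single index.

def weighted_index(classes):
    """Combine per-class scores (0-100) using caller-supplied weights.
    A weight above 1 emphasizes a class; below 1 discounts it."""
    total_weight = sum(w for _, _, w in classes)
    return sum(score * w for _, score, w in classes) / total_weight

classes = [
    # (utilization class, measured score, weight)
    ("network latency",  70, 2.0),  # emphasized: latency matters most here
    ("throughput",       85, 1.0),
    ("security posture", 90, 1.5),
    ("cost",             60, 0.5),  # discounted
]
print(weighted_index(classes))  # 78.0
```

This also shows the "rerun it with different weights" conversation with leadership that Ken describes next: disagreeing stakeholders can change a weight and immediately see whether the overall index moves.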
There are different aspects and weights we would add to the MeshMark calculation for that. It sounds complicated, and it looks complicated, but when you apply it, it makes a lot of sense. You're kind of doing this in your job already, you just don't realize it: you're inherently applying a couple of metrics and your own weights in your head to what you think should be there. The issue is that you probably don't apply them equally and the same way every time, and you probably forget things that are important at the time but remember them another time. What's nice about this is, like you were saying, it gives you a standard way of doing it, backed by real information you can show to your leadership: hey, this is why we made this choice; we looked at these parameters and gave them these weights. If they disagree with the weights, you can change the weights, rerun it, and show them whether it changes the result; at least they see why and what you're doing. So it's a really important effort. Sorry, I just wanted to throw that in, Lee. Yeah, totally. The financial aspect is a great example, along with the security aspect, of how it isn't just about speeds and feeds. In some respects, I guess you could consider dollars speeds and feeds, but dollars are definitely a resource; you can spend them in different ways, OPEX, CAPEX. Anyway, the two of us are both excited about it, so check out the ServiceMeshCon talk. One of the other initiatives stewarded under the service mesh working group is Nighthawk. Nighthawk is a load generator, a layer 7 performance characterization tool. It was born in the same Petri dish as Envoy, kind of adjacent to Envoy, written in C++. It's been quite helpful as we do a lot of performance-related analysis.
This one gets me pretty excited, as there's work ensuing around adaptive load control, and I hesitate to begin describing that work because Ken will have to shut me up. But visit the URL for the adaptive load control work. Yeah. And it still has the coolest image or icon of all the projects in the CNCF, I think. Yeah, I think so. So, in the service mesh working group the focus is the service mesh, and there's a lot to cover. Ken just mentioned security; we ended up talking about performance so much, but there's security, there's performance, there's observability, there's uniformity, there's conformance. Is each of these service meshes conforming to Service Mesh Interface, the set of APIs that project defines? Well, there are tests that have been defined, and instrumentation tooling in the Meshery project, for validating whether or not the service meshes are adhering to the specs. Then there's SMP, Service Mesh Performance, which we were just talking about. Beyond it being a spec, and beyond MeshMark and the research publications coming out of that project, a few of the contributors are also working within the CNCF labs, which behind the scenes are Equinix data centers. They've been using Meshery; they've got automation in place to provision any of over ten different types of service meshes, deploy sample workloads, generate load, do a performance characterization, and retrieve the results of those tests. Those have been running for a little while, so we're up to about 40,000 test results collected. Some analysis is being performed, and some of those results are being published on the dashboard there. Ken was mentioning the patterns: about 60 patterns defined now. I suspect there are others, and as Ken was saying, please come and contribute, please come sift through the catalog.
We're probably missing some, and you might have some you'd like to define. So the initiatives, though it might not seem that way up front, are definitely intermeshed, definitely related. And as we go to wrap up, Ken, if we didn't do it already: we're calling for participation, we're calling for people to come and join us. If you can make the meeting times, we meet twice a month, and that's a great place to come, participate, and be impactful. Or, if the meeting time isn't conducive to your sleep, there's a Slack and there's also the mailing list. So don't hesitate: if you saw something that sparked your interest, the folks involved are quite amenable to suggestions and looking to collaborate. And Ken, I don't know if I missed anything there. No, the only thing to add is that we do have the meeting minutes listed on the slide, and being an open source effort, we're open to anyone coming to take a look at what we've been doing and working on. Lee does a great job of collecting interesting topics for the meetings. We have a good core group of individuals we've put together over the last several years, and we're happy to have you join us and get involved. At any point we're happy for that participation, and you can learn more about us through those meeting minutes: what we talk about and who's involved. Ken, a question for you; I get this a fair bit. You were saying, hey, there are the meeting minutes, come jump in. For clarification: do individuals, people who want to come participate, have to belong to a CNCF member organization to have a seat at the table and add to the agenda? No, absolutely not. We are very open to participation from anyone who's interested. You don't have to be a member company, and you don't have to sign anything in blood with the Linux Foundation to come in and join us.
We do ask, though; I don't have it here to pull up, but in the CNCF we do have a code of conduct, and we expect you to follow it. Whether or not you're with a member organization, you should treat everyone with respect, and all that really good stuff we want to make sure we're doing as people, as humans on this earth. We have enough of a mess to deal with without being mean to each other, right? So as long as you're amenable to being a good human, we want you in our meetings and contributing in any way you feel possible. Oh, nice. Well, that's it then, I guess: go off and be good humans, and come participate in TAG Network.