All right, welcome to our bi-weekly SMI community meeting. Today is June 10, 2020. And if you haven't done so yet, please add yourself to the Google Doc that Bridget already shared in the chat. All right, let's see where we are. Thanks for scribing, Bridget.

And I think the first action item, or agenda item, that I see is blog post suggestions. Right, so that's issue 36. Yeah. I added this one because we had talked a little while ago about getting community members to write blog posts. And I know the fine folks from Solo were interested in writing one about Service Mesh Hub as an example of an implementation. And I talked to Betty this morning. You did? I see she's on the call. I talked to Betty this morning and she was saying, yeah, they're still interested in writing that. It looks like a few people commented or gave ideas about what would be a reasonable community process for blog posts, to wit: we'd have two people approve as per usual, but they could not be from the same company as the blog post. I just wanted to circle back and see how everyone felt about that. Any comments? That's good.

Do we have anything actionable around that? I think, since it looks like we're getting kind of a lazy approval here, people are either saying yes or staying quiet, I'll write up guidelines in the blog, or in the smi-spec.io repo, and then make it very easy and clear: you know, give people a template to submit an issue to put a blog post up. Does that meet with everyone's approval? That sounds pretty good to me. So to recap, the only requirements are that the topic is relevant and that the blog post is reviewed and approved by two independent parties. Yep. And I'll write it up a little more clearly, but I did write some proposed guidelines in the issue that people seemed okay with. Basically, it can't be an ad for a commercial product.
You can mention your product, but it has to be interesting and actionable for people who are not buying something from you. Got you, makes sense to me. All right. Cool. Cool.

All right, let's see what's up next. Looks like the top-level spec update. That's pull request 169. And that's you again, Bridget. Oh, I added it, but I don't have a ton more insight. It was actually Stefan who wanted to put this in, so maybe you can clarify, Stefan. Can't hear you. Sorry. Sorry. We're talking about the spec pull request. Yeah. So we decided we want to normalize our custom resources so they are conformant with the Kubernetes object interface, I think it's called. So yeah, we need one more approval for that. And this pull request blocks the HTTP route one, because I don't want to create a pull request to add a spec to the route group until we decide that, okay, everything should have a top-level spec. If we could get someone on board today to merge this one, then I can move forward with the route. We should also discuss, if we have time today, how we are going to deal with this kind of breaking change in the SDK.

Sounds good. I approved it. And I definitely want to discuss how to deal with that in the SDK, because it seems like all the previous versions are going to be completely incompatible, and you'd have to delete all of your objects, because this isn't even something that a conversion webhook could really handle, right? Yeah. Well, it's alpha. Yeah, my proposal is to just make a new release of the SDK, create a release note, document the change in the release note, and link back to the spec, to make people aware that we are doing this. I'm in favor. I also don't see another path forward, but we probably want to make sure the Linkerd folks are okay with that, and the Consul folks are okay with that, since both of them have implemented traffic target and traffic spec.
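To make the change being discussed concrete, here is a hedged sketch of what the normalization could look like for an HTTPRouteGroup. The field names follow the published SMI spec, but the exact apiVersion strings are illustrative. Before the change, the resource's fields sit at the top level of the object; after it, they move under a `spec` key, matching the standard Kubernetes apiVersion/kind/metadata/spec shape:

```yaml
# Before (illustrative): matches at the top level of the object
apiVersion: specs.smi-spec.io/v1alpha3
kind: HTTPRouteGroup
metadata:
  name: web-routes
matches:
- name: metrics
  pathRegex: "/metrics"
  methods: ["GET"]
---
# After (illustrative): the same content nested under a top-level spec,
# conforming to the usual Kubernetes object interface
apiVersion: specs.smi-spec.io/v1alpha4
kind: HTTPRouteGroup
metadata:
  name: web-routes
spec:
  matches:
  - name: metrics
    pathRegex: "/metrics"
    methods: ["GET"]
```

Because the old and new shapes are incompatible within a stored version, this is why deleting and recreating objects comes up in the discussion above.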
So we can just ask on the channel. No, I don't think so. I mean... Oh, Linkerd doesn't have traffic target and spec. Oh, my bad. But they do have specs, right? Because do they support... Actually, they don't support v1alpha3 of traffic split, so the spec doesn't matter. Consul is the only implementation. So we probably want to just run this past them and get an approval. And also the route change will break Consul as well. Which HTTP route change? If we add the spec field. Oh yeah, to the traffic specs, right? Yes. Yeah. Okay. Well, all right, let's follow up in the channel and try to get it resolved today. And it's late for the Consul folks. But anyway, I couldn't find the one thing I was looking for, which was an SMI implementation for Consul. So I'm not sure Consul actually uses SMI. Or is that project just to... There's a showcase that this is how SMI could work with Consul. Oh yeah, you mean the adapter, right? Yeah. There's no mention of it on the Consul docs, nor in the Consul Connect Helm chart, so whoever is using it must be aware that this is purely experimental. That's fair. Yeah. Do we have anyone here on the call who could comment on that, who knows about the status there? No. OK, so what do you suggest we do about it now? We need to follow up on the Slack channel. OK. And most likely, it'll just be fine. I don't think anybody is going to disagree. I just want to make sure everybody's aware. So it's fine, I can ask Nick and take this up on the Slack channel. Oh, cool. Any other comments on that one, or can we move on to the next agenda item? Going once, going twice.

All right. The next one is by Kaliah: conflicting HTTP headers from traffic split and traffic spec. Would you like to expand on that? Yes. So hi, everyone. My name is Kaliah. I work with Michelle at Microsoft. And I was wondering about v1alpha3 of traffic split, because that allows you to specify HTTP headers. But so does traffic spec.
So what's the guidance if the user configures a traffic split with headers that conflict with the ones in traffic spec? So traffic split does not support headers; it supports a reference to a spec where you specify headers. OK. I think Kaliah meant traffic target, right? Oh, traffic target, sure. So if your access control policy has one thing, and then you create a traffic split which references a traffic spec, but actually executing that traffic split is not possible because of the policies or the specs you define and reference in the traffic target, how should an implementation deal with that? Is there any guidance around it? Have we ever talked about it? I think that's a big question on our minds. Kaliah, do you have anything else to expand on that? No.

OK, so the use case is you create two HTTP route groups with conflicting headers in them. Yeah. Then reference one in the traffic target and the other one in the traffic split, right? Yeah. OK, so the traffic split will dictate how traffic flows, and the target will block the traffic for that header. And I don't see where the conflict is. I think one of the questions I personally had was: should we mention anywhere in the spec what happens when this conflict arises? Should we explicitly state that? In my opinion, it should be the implementation that handles it. And so if there are errors that need to be bubbled up, or the user needs to understand that this split is not possible, or that errors are coming because of this conflict between the split and the access policy, that, in my opinion, is up to the mesh provider. But it would be helpful, I think, if the spec talked about this type of behavior. I don't think we explicitly state that these are independent pieces of functionality, that we don't handle them overlapping, and that overlaps or conflicts between them are something the provider has to deal with.
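As a hedged illustration of the scenario described above, imagine one HTTPRouteGroup referenced by the access-control policy and another, matching a different header, referenced by the traffic split. The resource shapes follow the SMI v1alpha3 layout as best I can reconstruct it, and all names and header values are invented:

```yaml
# Route group referenced by a TrafficTarget's access rules:
# only requests carrying "x-canary: true" are allowed through.
apiVersion: specs.smi-spec.io/v1alpha3
kind: HTTPRouteGroup
metadata:
  name: allowed-routes
matches:
- name: canary-only
  pathRegex: "/.*"
  headers:
  - x-canary: "true"
---
# Route group referenced by a TrafficSplit's matches field:
# it keys on a different header, so requests the split would
# route may still be denied by the access policy above.
apiVersion: specs.smi-spec.io/v1alpha3
kind: HTTPRouteGroup
metadata:
  name: split-routes
matches:
- name: beta-users
  pathRegex: "/.*"
  headers:
  - x-beta: "true"
---
apiVersion: split.smi-spec.io/v1alpha3
kind: TrafficSplit
metadata:
  name: website-split
spec:
  service: website
  matches:
  - kind: HTTPRouteGroup
    name: split-routes
  backends:
  - service: website-v1
    weight: 50
  - service: website-v2
    weight: 50
```

Here there is no contradiction inside either object; the question raised in the meeting is purely what an implementation should surface when the policy denies traffic the split is configured to route.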
Yeah, I think it goes back to the status discussion. I think that's right. So let's imagine, on the traffic target status sub-resource, the implementation reports that this is blocking the traffic split, and on the traffic split it reports, OK, this is blocked by the traffic target. And this way it's not specific to the implementation; anyone can look at the status and see that, OK, this route is blocked by this other definition. Yeah, on create, that would be a good thing to make sure the implementation deals with. And then I guess when a policy is applied, does that mean that the status of all relevant traffic splits gets updated too? Because I think on create, it makes a lot of sense to update the status. But if you want to deal with conflicts all the time, it just might be a heavy burden on the system. Maybe that's just what you need to do. I'm guessing it's not going to change that often. How often will you change an HTTP route group definition? Let's say it makes sense to change it every time your API, your app interface, changes. Maybe you add a new route or you add a new HTTP method to a certain route. That is when a route group could change, right? Yeah. Not every single deployment of an app will result in an HTTP route group change. But yeah, I get your point. Or even a traffic target change. Yes.

But if a traffic target blocks a split, then you'll get an unauthorized error, right? If the split points to, let's say, an unknown pod or a crashed application, then you'll get a 500-something. And from the HTTP response it's clear that, OK, this is unauthorized, and this is, I don't know, a timeout or a 503, I forget the name. So I'm guessing from a user perspective, it's clear that, OK, you get an unauthorized error and you can handle that. Yeah, that's a good point. gRPC also has a code for unauthorized. TCP is hard in this case. But then you don't have headers for TCP, so it's not an issue; UDP either. Yeah. I agree, we should mention something in the spec about this.
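Sketching the status idea above: if implementations adopted a status sub-resource convention, a blocked TrafficSplit might surface something like the following. This is purely hypothetical; the SMI spec at this point defines no status fields, and the condition type, reason, and names here are invented for illustration:

```yaml
apiVersion: split.smi-spec.io/v1alpha3
kind: TrafficSplit
metadata:
  name: website-canary
spec:
  service: website
  backends:
  - service: website-v1
    weight: 90
  - service: website-v2
    weight: 10
# Hypothetical status written by the mesh implementation
status:
  conditions:
  - type: Ready
    status: "False"
    reason: BlockedByTrafficTarget
    message: "Routes in this split are denied by TrafficTarget website-target"
```

The appeal, as noted in the discussion, is that such a report is implementation-independent: anyone inspecting the object can see which other definition is blocking it.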
So is there anything actionable around that? Can we create an issue, or is someone taking that up? Kaliah, do you have an issue already open? No, I don't have an issue open. Do you have any thoughts around this? Me? Just from the conversation. Yeah, did you want to add anything? I just wanted to give you an opportunity. OK. Yeah, I think I'm still just trying to follow along.

So I think from this discussion, there are at least three things to define, and I haven't heard any objections. First, I think we agree that we should mention what the behavior may look like when there is some conflicting SMI policy, when there's a conflict between a policy and a traffic split. So there should be some message in the spec; I think we both agree on that. Second, there's an open conversation around updating the status of the traffic split object to reflect that there might be a conflicting policy. Or if this is not something that we can create at this time, we just need to have that conversation, and the issue, around what specification we want to define for updating a status on the traffic split object. And then the third thing is just outlining what the behavior normally would be. You would get 503s, or we can outline any other behavior that we could look out for, that would be common to any implementation, just so if someone comes across this issue, they know what to look out for.

And then, is there guidance on... so are we just not going to resolve it from the implementation perspective? I think we have to get more thoughts from the other maintainers on that. Maybe, since we've had this conversation, we can ask them to watch this discussion or follow along on the issue and give us more thoughts. I don't think it should be something that the spec should define behavior for. I think it should be an implementation-specific thing, or the implementation should handle conflicts.
But I do think we should call it out in the spec and clarify what good patterns might be. OK, one second. I don't think conflict is the right word for it. So imagine I have an HTTP route group and a traffic split definition inside, let's say, my Helm chart, and I'm deploying my app to do canary deployments or something like that. Then a cluster admin comes and says, hey, I don't want traffic to be routed with this header, and blocks it. This is how a firewall works: you can enforce something. So in my mind, it's not a conflict, it's policy enforcement. Even if you have some route definition that lists a bunch of headers, a cluster admin can say, hey, from all these headers, I'm allowing only these ones. Because maybe you cannot change the route group; maybe that route group definition comes from outside your organization, from a Helm chart made by someone else, for example. Yeah, that's a good point. If you think about network policies, you can say in your app, hey, I want to call this other app, and then the cluster admin says, no, here is a network policy and I'm blocking it. In the same way with SMI, we say, I'm allowing only this kind of header. Yeah. So it's not necessarily a conflict, so we might not have to work around it; we just have to call attention to it. Yes, for sure. OK. All right. Cool.

We have seven minutes left and we are at the end of the agenda, as far as it's in the Google Doc. That's really good. So, any other business? Anything else that's not on the agenda? Does anyone want to raise something, or suggest something, or volunteer for moderating or scribing next time? Yeah, I have a question. Yes, please go ahead, Stefan. Is Microsoft working on an SMI implementation? Yeah, I think so. I don't know how much I can say, but yeah, Microsoft is working on implementing something. Related question: is there any company anywhere in the world that is not working on an SMI implementation? All right, Google isn't. That's OK, we have an adapter for them.
Speaking of which, I really need to review some PRs on that adapter. Yeah, that's a good point. What's the state of the adapter? Is it functional right now? I don't know that it is. Yeah, it is. Why wouldn't it be? I haven't tested it. Yeah, I haven't either. Yeah, I had feedback recently, I think it was with traffic metrics, that it was working. Yeah, that's correct. Well, the adapter doesn't deal with traffic metrics; it just deals with split and policy. But yeah, metrics doesn't work, because Istio has made some updates in the latest version to how they do metrics. And so we do need to update the traffic metrics implementation for Istio; I don't believe that works. And I was trying to carve some time out to do that. But to be honest, I had a really hard time, along with some of the other folks on my team, understanding SMI metrics. So I've been trying to figure out how to iterate on the document to make it a little more clear how to actually implement it. And I think some of the instructions on the metrics repo are also out of date, considering they're Helm 2 instructions and not Helm 3. And then the API server runs, but it doesn't have any storage mechanism in the pod, so you're just getting a JSON response that is in the form of the traffic metrics object. You can't really query Kubernetes to get the object back; there's no CRD or anything like that. And I'm just curious about the background of some of these decisions. So I'm going to try to carve some time out to work on the metrics stuff, and if anybody else is interested, let me know.

Yeah. My understanding is that the SMI metrics library is just a translator between Prometheus and the external metrics API in Kubernetes. So every time you ask the Kubernetes API, hey, give me a metric for a pod, this library translates that API call into a Prometheus query and returns the result as a Kubernetes metrics object.
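For reference, this is roughly the kind of payload the metrics API server hands back, as described above. The shape is a hedged sketch based on the TrafficMetrics object in the SMI metrics proposal; the pod names, values, and exact field set here are invented:

```yaml
# Hypothetical TrafficMetrics response for one edge between two pods
apiVersion: metrics.smi-spec.io/v1alpha1
kind: TrafficMetrics
resource:
  kind: Pod
  namespace: default
  name: web-775b9cbd88-ntxsl
edge:
  direction: to
  resource:
    kind: Pod
    namespace: default
    name: backend-577db7d977-lsk2q
timestamp: "2020-06-10T17:00:00Z"
window: 30s
metrics:
- name: p99_response_latency
  unit: ms
  value: "10"
- name: success_count
  value: "100"
- name: failure_count
  value: "5"
```

Since there is no CRD backing this, the object exists only as an aggregated-API response, with Prometheus acting as the real storage behind the translation layer.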
So I don't think it needs storage, because Prometheus should be the storage. Yeah. Probably some caching, though, so it doesn't bombard Prometheus if you have, like, 100 HPAs doing the same calls to the same thing. It's more like in-memory caching that could be used there. I know the metrics server, the Prometheus adapter, has something like that.

Does Flagger use SMI metrics at all, or does it plan on using SMI metrics? Who? Does Flagger plan on using SMI metrics at all? I've been thinking about this, and it's quite limited. I mean, most people use Flagger to write their own custom metrics queries. Like, OK, you have latency and error rate that come built in with any service mesh or ingress controller. Most people are interested in, hey, how many connections has the canary opened on the database? Is it over a limit? Or any kind of business metrics. So right now Flagger can call out to Prometheus, Stackdriver, and Datadog, and people are asking for New Relic and so on. So it's a different type and a different scale of metrics. That makes sense. I've been trying to... I think getting metrics on edges is really valuable, and we've been looking into how to do that. And I think some of the docs need to be updated, because you can't really do edges with services. So that's just something to look out for and something that needs to be updated. I think metrics needs a little, slash, a lot of love right now. I see we're out of time, so I'll stop rambling.

All right, we're at the top of, well, not really the hour, but the half hour. So thanks a lot, everyone, for showing up. And thanks a lot for scribing, Bridget. And see you in two weeks' time. Thanks a lot. And it'd be great if people wanted to volunteer to lead the call; Michael did an amazing job being tapped at the last minute. Otherwise, I will pick a victim next time. Thank you. Bye.