Hey everybody, welcome to our Q&A session. We've had a lot of interesting talks already this morning from Istio project members about new developments in the Istio project, new ways of using Istio, and particularly about our roadmap. This is our opportunity to hear back from you, to hear questions around the ambient mesh that we've been talking about quite a bit so far this morning, as well as around Istio in general. With me, I have the Istio TOC members, and I'll go ahead and ask you all to introduce yourselves. Why don't we start with John?

Hi everyone, I'm John. I'm a staff engineer at Google. I've been working on Istio for almost five years now. So I'm excited to chat with you all and answer some questions.

Let's go to Louis next.

Hi everyone, I'm Louis Ryan. I'm CTO at solo.io. I've been with Istio since Istio was a thing, and I'm excited to be here as well.

Lin, how about you?

All right. Hi everybody. I'm back. So excited to see you guys. My name is Lin. I've been working on Istio, first at IBM for five-plus years, and then I joined Solo about two years ago. So I'm leading the open source development at Solo right now.

Eric, last but not least.

Hi everybody. My name is Eric Van Norman. I'm a senior software engineer at IBM. I've been working on Istio now for close to five years. I'm probably the most recent addition to the TOC, or maybe Mitch was the most recent addition. So welcome, everyone. We're here for your questions; you can ask them either in the chat or in the Q&A.

Yeah, we have two ways for you to send questions. There's a Q&A tab at the top right side of your screen on the event management platform. There's also the IstioCon channel at slack.istio.io. Either of those is a great place to post questions, and we'll be monitoring them throughout the session. While we wait for the first few questions to trickle in, I've brought a few of my own to warm us up. So Istio has long championed the sidecar architecture.
That's been something we've been founded on for some time, and we've defended it. Why are we now shifting to this idea of shared proxies? Who wants to field that one?

So this is about learning. If you're going to build a software solution as impactful as a service mesh, one that does all the things a service mesh does, you're going to learn a lot of things along the way about what is a good fit, and what works for one set of users will not necessarily work well for another set of users. We started out doing sidecars because sidecars were the easiest, quickest, and most consistent way to get the set of properties that we wanted the service mesh to have. A huge part of Istio's launch was the ability to put mTLS, traffic routing, and observability into one package. And then we spent about the next five and a half years tuning, optimizing, and making that package enterprise-ready. But over the course of those five years, you learn a lot of things along the way. You also learn how to do things a little bit better. And it's probably fair to say that the discussions about Ambient were going on probably about four or five years into Istio. So maybe even three years ago, we were having those discussions. And so I think most of the questions that people ask today about Ambient are: is it as secure? We don't really get questions about whether it can observe the same set of traffic. We don't get questions about whether it can perform the same routing and operational functions for traffic management purposes. We just get that basic question: is it as secure? And it's a perfectly reasonable question. We wouldn't have done what we've done with Ambient unless we thought, and had tested and validated and checked with a lot of people, that we could do something that was both better and as secure.
And there was a blog post done by Justin and a few other folks last year talking in extensive detail about the security properties of Ambient and why it's at least as secure, and in some respects more secure, than the sidecar model. Now, they're different. So there are some variations between these things. But the whole existence of Ambient is predicated on being able to answer that question in the affirmative. If we couldn't say that it's as secure, we would never have done it. It just wouldn't be a thing. That would have been crazy. We would have looked for other opportunities to get the other things that we were trying to get with Ambient, which are a better operational model and a cheaper resource footprint, the kind of observable outcomes for the end user. We would have gone after those in a different way. I certainly encourage people to read the blog. I'll grab a link and put it in the chat for people to take a look at. So then, assuming we can answer that question, why is Ambient better than the sidecar model? And it's really a blended model. I like to talk about this: if you look at software and you're trying to optimize the system, the first level of optimization you're going to reach for in any system is, I'm going to rewrite it. So I'm going to take a Java application and rewrite it in Go or Rust. And I'll yield maybe an order of magnitude improvement in performance or resource consumption or something like that. But often that's not worth the effort. Generally, when you try to optimize a system, you want to yield two orders of magnitude improvement. And what that usually means is re-architecting and also sharing. You've got to find a better way of reusing data, reusing code, and reusing resources to yield more than one level of improvement. And so that's why Ambient is also a new architecture: to get to, in many cases, two orders of magnitude improvement in a lot of dimensions.
So that's why Ambient is done the way it's done. But that first question, the security question, was the critical one for unblocking the decisions around everything else. In Ambient, we use a node-local proxy, which is shared but has the same security properties as a sidecar for L4 traffic. And then we share the L7 behavior, which is the much more complex thing. We put that in the network, and putting it in the network allows it to be shared. And that architectural shift yields those improvements.

Anyone else want to pitch in on that one?

Yeah, I would just touch on a lot of what was said about the learnings. I had a similar question myself: not why are we shifting away, but why didn't we do this six years ago? And I went and dug up a super old document from way before I joined Istio, I don't even know if it was called Istio at that point, comparing which architecture we should use. And you can see that at the time we just didn't have as much experience with all the models. And some of the stuff that we're building on top of wasn't present back then. So we had considered a lot of these things, but we didn't have the expertise that we've gained from running service mesh in production for six years to put all the pieces together into something that delivers the order-of-magnitude improvement that Louis mentioned while maintaining the security properties and other things in that regard. So it's pretty natural, I think, for a project to get years of experience and then rethink things and improve on them.

I would add, I think our users really gave us the feedback, in a sense. I remember seeing the crazy tweets saying, look, Istio is using 90% of the resources in my cluster because of all the sidecars. And I remember users complaining to us: look how frequently we have CVEs regarding the sidecar proxy. And then I have to kind of restart everything to pick up the new sidecar proxy.
So all these operational pain points, and also the cost and resource utilization you guys have been complaining about, really helped us, I think, to land ambient where it is. And to Louis's point, security is definitely the first thing. We have to make sure it's as secure as sidecar, if not better. So it was designed with that in mind in the first place.

Eric, do you want to add anything?

No, I think that's all a pretty good discussion. We talked a little bit about where we were, and I think, like any of the other projects out there, at some point in time you might decide that where you're going isn't quite the best fit for where you should go. And instead of continuing to build things along that route, the sidecar route, you take a step back, figure out what the user really wants and needs, and make a small pivot to do that. And I think we've done well with that. We also talk about ambient mode and sidecar interoperability. So it's not one or the other. It's not like you're going to have to make the big switch. You'll be able to migrate, hopefully very easily, and get to the new mode if you decide you need it. So I think that's good.

Lin, it sounds like what you were saying is that this is not so much a backtracking or a reversal, but more like the natural progress of a project. I think of how the Apollo missions proved that manned space travel to the moon was possible, but the space shuttle made travel into low Earth orbit something like one tenth of the cost per kilogram because of reusability and optimization. Is that sort of what we're seeing in Istio now, that move towards efficiency?

Yeah, I would say definitely. I think a lot of people we talk to are extremely excited about ambient's simplified operation, right? The fact that they don't need to change anything in their application is amazing. For the longest time,
I think people were excited that a service mesh doesn't require them to change much of their application code: they simply drag that sidecar along, without language-specific libraries. But now with ambient, we're definitely taking it to the next level, where you don't need to make any modification at all, not even drag that sidecar with you. So that is huge, along with the cost saving perspective, on which we've done a lot of study. At Solo, I believe we've seen 90, even up to 95 or 99% resource savings, especially compared with the provisioned resources. We're going to do a little bit more resource calculation in the community with larger scale deployments. But from our initial testing, it does seem to be trending toward saving at least 90% of the resources you need with ambient.

Those are some really impressive numbers. So if you're just joining us, I'm live with the Istio Technical Oversight Committee. We are having a Q&A as part of our IstioCon event. If you're on the event platform, you can put questions in the Q&A section at the upper right corner of your screen. We're also taking questions from the IstioCon channel, which is on slack.istio.io. And we have our first question here from Samuel: Is it possible to run the sidecar model and ambient in the same mesh, and also in a multi-cluster mesh? Anyone want to jump on that?

So the intent absolutely is to be able to run both sidecar and ambient in the same cluster at the same time and have them interoperate. That obviously represents some complexity, and we haven't validated this for all the cases yet or tested every variation of it, even in a single cluster, but that is being actively worked on. And we have to do that. Upgrade is impossible without the ability to do that. And we need users to be able to upgrade. There are a lot of Istio users out there. And anything other than that is probably not going to allow them to move to ambient.
Some users can probably take out the cluster and do blue-green upgrades at a whole-cluster level, but not everybody can do that. And we absolutely recognize that. As for multi-cluster and also multi-network, which often go hand in hand, it's kind of the same state. We're doing active work in the project right now on multi-cluster and multi-network. They're not there yet in open source. So that's being actively worked on. There was a lot of discussion in the working group last week about that in a couple of different areas. And so that should come out in the next couple of releases.

Anything else? We've got a follow-up question that is specifically for Eric and John related to that. So Eric, I'll throw this out and you two can combine your responses, if that's all right. I have a question from Beth. She asks: in the Istio roadmap session, you mentioned a list of items that have to be in place for ambient to reach a stable release, including multi-cloud networking, trust and security hardening, observability, and ambient-sidecar interoperability. Can you drill down on those things? What still needs to be done in each of those areas?

I think there's a lot of work going on across the board in that space. I mean, we have a weekly ambient work group. I guess it's not a work group; it's an ambient development meeting where we dive into some of these things. We also have a document out there that talks about some of these things and tries to identify owners and the state of each item. Yeah, there's the link there. Thanks, John. That covers a lot of it and identifies people. Anything you want to add to that, John?

Not much, just what you said. Most of the stuff that we're talking about already exists in some form in what we have shipped for ambient, but we need to make sure that it meets a very high bar of quality before we go and tell existing enterprise customers, hey, you should go adopt this in production.
That's a very high bar to meet. So, while if we were a brand new product we might be pushing more users towards this, we kind of have to be better than Istio sidecar stability. Otherwise, we're doing a disservice to our users. So that's really what this is all about. It's making sure that things are really, really solid: that they're not going to cause weird issues or edge cases, that they're fully tested, et cetera. Including docs. Our mantra in the Istio project is that if you didn't test it, it doesn't work. So there are things that you may find, incidentally, work with ambient. Sidecar interoperability, for example, does have some degree of functionality that you can get out of it today, but we don't have thorough testing on that. And we're not confident that we should be telling you to rely on it until we've been able to test it thoroughly. So that's kind of what we're doing here.

Thanks, Beth, for the question. I have another question from, I think the name is Subankar; I apologize if I've mispronounced that. They ask: what tools are available to do a cost analysis of Istio with sidecars versus ambient mesh, if we have to highlight the advantages to our org?

Yeah, that... Louis, go ahead.

No, no, no, go ahead. Sorry. I was going to make an Excel joke.

Why don't you tell the joke first, then? That's probably more interesting than what I was going to say.

Yeah, do some math. No, you should go ahead then.

Okay. Yeah, so I'm just going to chime in. I mean, we're not perfect at measuring this ourselves, right? In the community, we're trying to do more work on this, but we've done some initial work. That typically involves deciding which applications you are going to use for the testing, right, being able to drive a certain load for the testing, and also defining what functionality you need from ambient. What deployment architecture are you looking at with ambient, right?
Do you just need a zero trust tunnel, or do you need a waypoint? Does each of your applications need its own waypoint? Or would a namespace-based waypoint be reasonable for you? On top of that, we have also developed a simple Grafana dashboard to help you visualize things when you put the resources together, right? So what we did was this: once we had defined the applications we were going to use at these scales, and the loads we were going to generate, we first ran them with sidecars for a certain time period and measured the CPU and memory resources in the cluster related to these applications and to the control plane. After that, we ran a similar load test with these applications running not with sidecars but in ambient, and then we looked at the resource utilization in terms of CPU and memory. That's how you can tell what the actual resource saving is from that perspective. The other thing I want to encourage you to look at is not only the actual resource saving, but also the resources you had to provision for sidecars for these applications, versus the resources you have to provision for ambient, for the zero trust tunnel and potentially waypoints if you need any, and then compare the savings. Because what we found is that sidecars typically need more reserved resources, which is what you pay for, by the way. You're not paying for your laptop based on your usage, right? You are paying for that whole laptop, even though you're using only 5% of it, right? Paying the cloud provider is very similar. You pay for what you reserve, and not necessarily what you actually use. So I highly encourage you to look at the resource reservation between sidecar and ambient as you do the cost analysis.
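As a rough illustration of the reservation comparison described above, here is a minimal Python sketch. It assumes one sidecar per pod, one zero trust tunnel (node proxy) per node, and a handful of waypoints. The `Reservation` class, the function names, and every number below are hypothetical placeholders for illustration, not measurements from the speakers or from any benchmark; you would substitute the requests configured in your own cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    """Reserved (requested) resources, not measured usage."""
    cpu_millicores: int
    memory_mib: int

    def scaled(self, count: int) -> "Reservation":
        # Total reservation for `count` identical proxies.
        return Reservation(self.cpu_millicores * count, self.memory_mib * count)

    def plus(self, other: "Reservation") -> "Reservation":
        return Reservation(
            self.cpu_millicores + other.cpu_millicores,
            self.memory_mib + other.memory_mib,
        )


def sidecar_reserved(pods: int, per_sidecar: Reservation) -> Reservation:
    # Sidecar mode: every application pod carries its own proxy.
    return per_sidecar.scaled(pods)


def ambient_reserved(
    nodes: int,
    per_node_proxy: Reservation,
    waypoints: int,
    per_waypoint: Reservation,
) -> Reservation:
    # Ambient mode: one shared node proxy per node, plus optional waypoints.
    return per_node_proxy.scaled(nodes).plus(per_waypoint.scaled(waypoints))


def saving_percent(before: int, after: int) -> float:
    # Percentage of the original reservation that is no longer needed.
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Hypothetical inputs: replace with your own cluster's values.
    before = sidecar_reserved(pods=200, per_sidecar=Reservation(100, 128))
    after = ambient_reserved(
        nodes=10,
        per_node_proxy=Reservation(200, 256),
        waypoints=4,
        per_waypoint=Reservation(500, 512),
    )
    cpu = saving_percent(before.cpu_millicores, after.cpu_millicores)
    mem = saving_percent(before.memory_mib, after.memory_mib)
    print(f"CPU reservation saving: {cpu:.1f}%")
    print(f"Memory reservation saving: {mem:.1f}%")
```

The same arithmetic can be repeated with measured usage instead of reservations, which gives the two views mentioned above: what you actually consume and what you pay to reserve.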
I can also link to some of the blogs we had from Solo, which, by the way, we're trying to publish on istio.io, so you can see how we were able to do some of these resource calculations.

Thanks, Lin. We have another question from Samuel: The sidecar model allows a good level of troubleshooting, but it's mostly Envoy logs. Will ambient provide a similar logging and troubleshooting experience?

That's a good question. So ambient is kind of composed of two layers, right? The waypoint is itself an Envoy that's similar in most ways to sidecars in terms of debugging. So the experience there will likely be fairly similar, in that we have logs as the first phase, and we have some tooling to inspect the state, plus metrics and tracing for other types of debugging. So a lot of that will largely stay the same, I'd imagine. The ztunnel layer, the node proxy, is its own implementation. A lot of it is inspired by Envoy debugging. We have a similar config dump API. We have logs. We have all those same sorts of things. We have istioctl support for a lot of the debugging tools. So I expect it will be fairly similar. That being said, we'd love to hear feedback on how we could improve that debugging story. We have a lot of opportunities to make changes now that we're re-architecting.

Thanks, John. And thanks, Lin, for sharing that blog post. You can find it in the chat channel off to the right side of your screen; it explains a little bit more about ambient mesh resource savings. And that's from a speaker who's going to be coming up a little bit later in our conference, Greg Hansen. All right. We have about eight minutes left. So, John, I heard you talk in your session about the need to get feedback from users. Can you give us an idea of, first, whether this is the right time for our users to be kicking the tires on ambient mesh, and why? Or maybe they should just hold back and wait until it's GA.
And then secondly, which users should really be trying this out? Do we need the power users to be doing this? Is there a user who maybe doesn't have any business trying this out, or ought to be waiting?

Yeah. Now is kind of the perfect time for trying out ambient, because it's stable enough that you can use it without it crashing immediately and just not working at all. But it's not so stable that we can't make changes to it, right? If you came to us and said, hey, I had this issue with the sidecar approach, and we needed to go make a giant backwards-incompatible change, we maybe could address it, but it would be much, much harder, right? It's very hard to change the behavior of a product that's already been rolled out in production to hundreds and thousands of users. Ambient's different, right? We can go make breaking changes on a daily basis at this point in time, but a few months from now, that will slowly start getting harder and harder. Now, who should be trying it out? It's really everyone. In particular, people who aren't currently Istio users are really valuable for giving feedback, because one of the goals of ambient is to extend the reach of Istio so that we can target users that historically haven't used Istio for various reasons, from cost to complexity to compatibility. Power users are great as well, because they're obviously using Istio and oftentimes have very intense demands for various functionality. But so are typical casual users, as they make up the bulk of Istio usage. So really, it would be great for everyone who's here, or even not here, to try it out. Of course, where you should not be trying it out is in a production environment where an outage or security issue would be a big problem, as it's currently in the alpha stage. But if you try it out in a development environment, you know, we would love to hear feedback from anyone.

Thanks, John.
I have one last question here from Sanjeev: Existing brownfield applications may already have been written to use TLS libraries. So we're finding from existing deployments that it's primarily greenfield apps that are written to not use their own TLS libraries and instead let the infrastructure or the service mesh handle it. What kind of apps are we seeing adopt Istio?

So I would actually challenge the assumptions here a little bit. We see a lot of enterprise use cases with brownfield applications. And the more brown the field, the less likely they are to be doing TLS in a good way, or at all. An application written seven years ago using SSL 3.0 is a security problem. SSL 3.0 has known security vulnerabilities. Its negotiation of ciphers has problems. And that's if it's being done at all. In truth, when you look inside most enterprises that have been around a long time and have a large, long-tenured IT portfolio, there's a lot of just plain text stuff, and VPNs or other things were being used to try and secure it. Or very, very expensive firewall products were being used to try and make sure that bad things weren't happening in the plain text realm. So there's a big, kind of bimodal distribution. There's large, long-tenured enterprise IT, think of a bank that's been around for 200 years, versus a company that started in the last 10 years and is greenish. Even in the greenish world, we still see a lot of plain text. If you talk to a database or a queuing system or some other piece of infrastructure that you bought, they're very often not doing TLS. And most people are actually just doing TLS at the edge of their network. They're not actually doing TLS internally. Now, there are definitely places that very consistently do TLS internally in libraries. gRPC and users of gRPC do that very consistently. And we see that with other consistently used, modern, frequently updated REST client libraries. But that's not the average.
It's not even close to the average. And obviously, fixing that problem is pretty valuable. Putting good TLS on top of bad TLS is also a good thing to do, by the way. So we see some of that as well.

Anyone else want to chime in on that one?

Yeah, one thing I would also say, on what kind of apps we're seeing adopt Istio: one of our goals, which we've done an okay job on with sidecars and are doing a very good job on with Ambient, is allowing you to run with any application. So even if your application does TLS in the application, you can still use Istio on top of that. And you may have two layers of TLS, which maybe sounds crazy. It's actually not so bad from a performance standpoint, right? You lose a few percentage points of CPU usage, and later, at your leisure, you can maybe change your application to offload the TLS entirely. But it shouldn't block your Istio adoption.

That's a good point, John. And I think for a lot of people, the cost of updating a brownfield app to stop using TLS is going to be massive compared to the cost of double encrypting the traffic for that brownfield application. So that's an excellent way to move forward for those users.

Yeah, I mean, nobody thinks twice about running TLS over an IPsec-based network. That's double encrypting, right? There's no difference between that and Ambient. So I think that John's answer is the right one. Don't sweat it. Let Ambient do it for you. And then you can decide what to do with your own TLS at your leisure.

And I was going to say, the other thing, and I know Louis has mentioned this in the past, is that if Ambient and the service mesh become part of the infrastructure, right, it's not even an app question at that point. It's just part of the infrastructure. The apps are the apps.

Yes. That's probably a good way to think of it. It's no longer really the applications adopting Istio, right?
The network admin or the Kubernetes cluster operator is the one adopting Istio. The apps are just writing policies to get the specific behaviors they want out of the networking infrastructure, like routing, right, or circuit breaking. But they don't own the infrastructure anymore at all. It's hidden from them.

By the way, that's exactly... Sorry. I was just going to say that's exactly why it's called Ambient.

Right. Yes. It is magically in the air.

All right. Well, we're out of time. Thank you all for joining us for the Q&A session. I'll leave you with a brief parting question, and it's been a burning question for me for five years. Which way is the Istio boat sailing? Is it going from right to left or left to right?

The small sail on the front is a spinnaker, for those of you unfamiliar with the sailing terminology. So it's going in the direction of the small sail. And I think that means it's going... Looking at your shirt right now, it's going from... That way.

Oh, this way. Okay. Yeah. This way. That's a fun one. Thank you all for joining. And I look forward to seeing you in Chicago at Istio Day North America. Bye, everybody.

Thanks, everyone.