Okay, let's go ahead and get started. My name is Tim, I'm a principal software engineer at Lilt, and this talk is about lessons from scaling AI-powered translation services using Istio. I'll be co-presenting with Malini.

Hi, I'm Malini Bhandaru. I'm actually representing Iris here today, who worked closely with Tim but couldn't be here this morning. I'm a principal engineer at Intel and a cloud native architect.

Great. So what is Lilt? Lilt is a platform for contextual AI and translation. We work with large organizations that have lots of content they need translated, and we help them with that: predictive translation suggestions, in-context learning, fine-tuning over time, these sorts of features. As you can imagine, there's been a lot of AI hype lately, so we've had to adjust and move quickly to ship the features that serve this new AI demand.

If we go back to the start, this is our initial architecture, before Istio came into the picture. Consider a single feature we had to build. There's a backend team exposing APIs over RabbitMQ; there's a front-end team consuming those APIs over RabbitMQ and doing additional work to expose that functionality over a REST API; and if we needed any custom routing, new subdomains, anything like that, we had to get the infra team involved. So for a single feature like this there were many Jira tickets, all of these teams are spread across the globe in different time zones, and the problem seems so simple: I just want to expose this API endpoint. Why do I have to create three Jira tickets and get up at 7 a.m. to talk to someone in Europe? That's what we wanted to change. We wanted to iterate more quickly and build these kinds of features independently, without all of that inter-team coordination.

This is what we landed on, and it's how Istio came into the picture. We wanted Istio to be, essentially, our API gateway: sitting on top of essentially all of our traffic, routing to different backends, a fairly normal microservice-style architecture. In this environment we have a bunch of app teams, and one of the main goals was making sure the app teams could self-serve any routing or domain provisioning they'd need over the course of building a feature. The infrastructure team manages the Istio installation and all of its configuration and can help out if anything goes wrong, but the main goal was empowering app developers to self-serve all of these needs.

One of the things we did to accomplish that was enabling individual teams to own authentication and routing. As part of the Istio API gateway, we developed an external authorization provider, so each application team didn't have to worry about authentication itself; they could just define their own authorization policies and secure their API endpoints and services.
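To make the self-serve routing concrete, here is a minimal sketch of the kind of resource an app team could own end to end. The gateway, namespace, hostname, and service names are hypothetical, not Lilt's actual configuration; the assumption is a shared ingress gateway managed by the infra team:

```yaml
# Hypothetical team-owned route through a shared Istio ingress gateway.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: feature-api
  namespace: team-a            # the app team's own namespace
spec:
  hosts:
  - api.example.com            # illustrative domain
  gateways:
  - istio-system/main-gateway  # shared gateway owned by the infra team
  http:
  - match:
    - uri:
        prefix: /v1/feature    # the new endpoint the team wants to expose
    route:
    - destination:
        host: feature-svc.team-a.svc.cluster.local
        port:
          number: 8080
```

The infra team owns the Gateway and the Istio installation; the app team only ever touches resources in its own namespace, which is what removes the cross-team tickets.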
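And for the per-team authorization: Istio's AuthorizationPolicy supports a CUSTOM action that delegates decisions to an external authorizer registered as an extension provider in the mesh config. This is a hedged sketch of that pattern; the provider name, service address, and paths are illustrative, not Lilt's actual setup:

```yaml
# Registered once by the platform team: the external authorizer.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    extensionProviders:
    - name: ext-authz                # hypothetical provider name
      envoyExtAuthzGrpc:
        service: ext-authz.auth.svc.cluster.local
        port: 9000
---
# Owned by each app team: a policy securing its own endpoints.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: feature-api-authz
  namespace: team-a
spec:
  selector:
    matchLabels:
      app: feature-svc
  action: CUSTOM
  provider:
    name: ext-authz                  # must match the provider above
  rules:
  - to:
    - operation:
        paths: ["/v1/feature/*"]
```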
Another important piece here is making inter-service communication more secure as well.

Quickly, some things we learned along the way. First, it's hard to change the engine while the car is running. We wanted to introduce this to our teams, our clusters, and our infrastructure without having to stop the world; all of the teams are busy building features, and we didn't want to interfere with ongoing work. One of the reasons we chose Istio in the first place was that it let us adopt features gradually. We installed Istio into our clusters slowly but surely, found self-contained use cases for it, proved the functionality at a small scale, and showed it off to other teams, and that built real momentum around using Istio.

Part of the rollout was making sure we had adequate documentation in place for the different stakeholders in our organization: application developers on one side, platform and infrastructure developers on the other, each with documentation tailored to what they cared about. For the app-developer use cases, we made plenty of example Helm charts that people could reference and copy into their own projects. In general, we were able to roll Istio out and grow its use across our teams and across our application.

Another key learning: introduce new things rather than replacing old things. Gradually, over time, we introduced new services and new API endpoints that used Istio and this new technology, and we phased out the old things, or kept them around where we needed backwards compatibility.

Finally, a couple of aspects of the Lilt platform that matter here. First, low-latency suggestion responses: people use Lilt to help inform the translations they provide, so these network requests need low latency, and we don't want extra network hops slowing requests down. Second, data security: for inter-service communication, we want to make sure any data sent across the network is encrypted. We don't want to compromise on data security, and we don't want to compromise on latency, and there are some cool new features in Istio that mean we don't have to compromise between those two: we can get low latency and we can get security.
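Going back to the gradual rollout for a moment, one concrete mechanism behind it is that sidecar injection in Istio is opt-in per namespace, so a single team can join the mesh without touching anyone else's workloads. A minimal sketch, with a hypothetical namespace name:

```yaml
# Opt one namespace at a time into the mesh; unlabeled namespaces
# keep running exactly as before.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a              # hypothetical team namespace
  labels:
    istio-injection: enabled
```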
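And on the security half of that trade-off, the mesh-wide mTLS lock-down is a single Istio resource. A minimal sketch; applying it in the root namespace (istio-system by default) makes it mesh-wide:

```yaml
# Require mutual TLS for all sidecar-to-sidecar traffic in the mesh.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system   # root namespace => applies to the whole mesh
spec:
  mtls:
    mode: STRICT            # plaintext connections are rejected
```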
And with that, I'm going to hand it over.

Thank you, Tim. So Lilt is a real-world use case, and this whole engagement between Intel and Lilt is about helping adoption. You have this cool technology called Istio, but how many people are actually using it? There are real application developers busy with their features; how do they leverage it? So this is really a collaboration engagement, and at this point we'd like to tell the Lilt folks about the wonderful things in Istio that they can leverage.

One of them, as part of the next steps of slow adoption (because remember, a lot of this is cultural; like Tim said, it's a running engine, you can't change everything at once, some of this has to be paced), is the crypto multi-buffer library, CryptoMB. It essentially accelerates BoringSSL's RSA operations using Intel's AVX-512 instructions: vector operations that process up to eight encryption requests simultaneously. With this you can get a significant improvement, a 23 to 25% latency reduction, which is super important for a translation task. You have something coming up live online, in a Zoom call or a Teams call, and you want to see what somebody is saying in another language; or it's text popping up on your screen and you want it translated as you scroll. It also provides more queries per second, about a 30% improvement. So with the crypto multi-buffer library we get significant gains.

What else? There's a lot of routing and filtering happening in the whole system, and this chart just shows what's going on. In Envoy there's a listener, the network filters, the HTTP filters, RBAC, and so on. Something common to all of them is that they can leverage Hyperscan, a regular-expression engine that can outperform Google's RE2, the regex matcher Envoy uses by default. That gives you significant improvement too: about a 16% reduction in latency, and more queries again, about a 20% improvement.

Last but not least, security is important. For all your communications, you want to keep requests safe and protect all those mTLS keys. This is where process-based isolation with SGX has been integrated into Istio and Envoy to provide that protection. You can use SGX at the Istio gateway, you can use it at the Envoy sidecar proxy, and you can even have a trusted certificate service inside your cluster; they can all use SGX.

Oh wait, I do want to show you something here. When I say trusted and secure, what does that mean? If you run kubectl get secrets in Kubernetes, it just pops up the secrets; they're not really very secret, they're there in the clear. But if your services are running with the keys in SGX, the same query just returns you blanks.

So that's roughly the set of things Lilt can adopt coming soon. And it's a cultural thing, so Tim has to pace his engineers, give them enough documentation, hand-hold a bit. We hand-hold with them, they hand-hold with their developers, and that's how we bring value. I can also see us moving toward ztunnel and ambient mode soon, as those mature. Currently they have about 100 pods and several services, but latency is the big issue they want to keep low, so they can translate in real time.
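For reference, enabling the CryptoMB private key provider mentioned above is a small piece of mesh configuration. This is a hedged sketch following the pattern Istio has documented for its private key provider; the 10ms poll delay is illustrative, not a tuned value:

```yaml
# Mesh-wide: let Envoy offload TLS-handshake RSA operations to the
# AVX-512 multi-buffer (CryptoMB) private key provider.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      privateKeyProvider:
        cryptomb:
          pollDelay: 10ms   # how long Envoy waits while batching requests
```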
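The Hyperscan swap is an Envoy-level setting rather than an Istio API. A hedged sketch of the bootstrap fragment, assuming a proxy image built with Envoy's contrib extensions (stock Istio proxy images don't ship Hyperscan) and wired in through a custom bootstrap override rather than written by hand:

```yaml
# Envoy bootstrap fragment: use Hyperscan instead of Google RE2 as the
# default regular-expression engine for route matching, RBAC, etc.
default_regex_engine:
  name: envoy.regex_engines.hyperscan
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.regex_engines.hyperscan.v3alpha.Hyperscan
```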
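To make the kubectl demo concrete: "not really very secret" means a CA key stored as a Kubernetes Secret is only base64-encoded, which anyone with read access can decode. A sketch of what you'd see, with the secret name and keys following the layout of Istio's self-signed CA secret and the values abbreviated:

```yaml
# Output of: kubectl get secret istio-ca-secret -n istio-system -o yaml
apiVersion: v1
kind: Secret
metadata:
  name: istio-ca-secret
  namespace: istio-system
type: istio.io/ca-root
data:
  ca-cert.pem: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0t...  # base64, trivially decodable
  ca-key.pem: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVkt...   # the private key, in the clear
```

With the SGX integration, the private key material stays inside the enclave instead, so the same query comes back blank.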
And with that, we're open for questions. Any questions? Yes, please.

Yes, same thing: SGX can protect all your keys. And in fact that's the best way to use it with mTLS. mTLS is really just the same encryption, right? But you keep your private key, and it's part of the initial negotiation: I certify myself to you, you certify yourself to me, and then we use our keys. So it's the initial handshake that SGX protects.

That's a very good question. It's available at scale in the Azure cloud, but not as available in other clouds. You can have it on-prem; Intel's Ice Lake servers have it, and Sapphire Rapids and all future products have it too. And it's process-based, so you can have a very small trusted compute base. It's efficient and fast.

Yes, it's definitely grown now; there are no size limitations, and you can also dynamically grow the memory you allocate to SGX. So, the question on this side was: are there any limitations with SGX? In the past, the amount of memory you could allocate to an SGX enclave was limited, but that limitation has since been removed. Further, there is a library OS called Gramine that you can just attach to your application and verify that everything works with it. So it's much easier to use SGX now; it's no longer the case that you have to port your application to the SGX SDK. You take your application, combine it with the library OS, check that it works, and then go for it. And the Envoy proxy works with SGX this way; that's been confirmed.

Yes, so attestation, right. There are a few steps initially to set up SGX in your cluster: there's a step where you register your machine and get certificates, and there's another library that goes with it. Once that's done, you have a few options. You could use an external attestation service like Intel Trust Authority, or you might even use one provided by your cloud provider. It's all a matter of how much you trust your cloud service provider, whether you want to keep the CSP inside your trusted compute base. So you could use MAA, the Microsoft Azure Attestation service, or Intel Trust Authority.

Any other questions? Thank you. We're very thrilled that a real-world machine learning app is using Istio, and we wish Lilt all the best. Thank you.