Hello, everyone. Thank you for attending this lightning session on confident canary deployment to production with Istio. Today, I will be sharing a case study of using the Istio service mesh at our financial platform, Oyster Financial. I am Raju Dawadi, Site Reliability Engineer at Oyster Financial and a Google Developer Expert in Cloud, and I also help build JBox and the cloud-native community in Kathmandu.

Briefly about Oyster Financial: we are a Mexico-based fintech startup targeted mostly at freelancers, startups, and small and medium-sized businesses, so that they can get a bank account and a debit card within a few days. We recently raised a large seed round in Latin America, our team is distributed across multiple time zones in the US, Mexico, Nepal, and India, and we are a mobile-first application platform.

Starting from the early days, in late 2018, we began with five services, which have grown to 18 over the span of about one and a half years, with both gRPC and HTTP services. From day one, we adopted the Istio service mesh. It helped us a lot in managing the traffic routes between those applications, keeping our cluster and services secure, and securing communication between them using mTLS and the ingress gateways; and with the integration of the AWS Application Load Balancer, we gained more security in terms of the firewall as well.

Talking about deployments and releases: we trigger a new build of the Docker image whenever a new Git tag is created. All of those services, whether gRPC or HTTP, run inside the Istio service mesh, and there are multiple namespaces for managing each tenant of those applications. The services can communicate, but we keep that secure by using policies between them, and we have a good health-checking mechanism using liveness and readiness probes.

There are two cases, serving the internal users and serving the live users, which is what gives us confidence at release time. For that, in the early days, we built two stacks of our application: one for the live users, which is called green, and another for the internal users, which is called gray. That means we had two instances of the same service, one for the live users and another for the internal users, the green and the gray. That did not quite solve things for us: running two stacks of the application was not only resource-consuming, it also felt like we were deploying everything as one big stack even though we were following a microservice architecture. Even testing in the gray segment didn't give us much confidence for releasing to the live users, and it felt like we were just using a kind of staging environment. It would be much better if we could plug any new version of a service into the green environment.

So we started applying Istio VirtualService routing based on headers, so that before releasing a version to the green users, the live users, an internal user can use the green version of the rest of the services but the specific gray version of the service under test. In this case, the internal user tests the gray version of the user service along with the green version of the rest of the services. That way we can directly plug in a test version of any of our services without impacting the live users.
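As a minimal sketch of that header-based routing, assuming a hypothetical header name x-canary carrying the gray-v1 value mentioned shortly, with illustrative service and subset names:

```yaml
# Hypothetical VirtualService: requests carrying the x-canary: gray-v1
# header (sent by internal users' mobile apps) go to the gray subset of
# the user service; all other traffic stays on the green subset.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: user-service
spec:
  hosts:
    - user-service
  http:
    - match:
        - headers:
            x-canary:
              exact: gray-v1
      route:
        - destination:
            host: user-service
            subset: gray
    - route:
        - destination:
            host: user-service
            subset: green
```

The gray and green subsets themselves are defined in a DestinationRule, which comes up next.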
And when we are confident, then we release it. There is also the integration of third parties, our partners, who used to work with us mostly in the dev and staging environments; things were not expected to behave quite the same way inside the production environment, so we had to do a lot of digging to get the proper responses from those third parties or partners. In that scenario too, testing the endpoints with internal users before releasing to the live users gave us a lot of flexibility and confidence.

For that, we use the Istio routing rules: the VirtualService and the DestinationRule. The VirtualService basically works with the mobile apps sending a header, in this case saying that they need gray-v1. The header is sent from the mobile app to the API and on to all of the services, and based on that, the routing decides whether a request is sent to the gray version or the green version. For the DestinationRule, we have two deployments with different labels, and those labels separate the green release of the app from the gray release, as sketched below.

When the internal users have done the testing and we are really good to go, we still don't release the version to all of the live users at once. We split the traffic, 30/70 or 25/75, and we gradually roll out to the rest of the users based on the error rates and the monitoring metrics, as well as our own golden signals of the business metrics. For that as well, the VirtualService, with the combination of the header logic and the weights for splitting the traffic, helped us avoid releasing the whole stack, the full version of the application, to all sets of users at once.

During this transformation we faced a few challenges, because the header has to be sent from the mobile app to the API and on to the rest of the services. In some cases, while doing gRPC calls, and in some asynchronous calls, some services didn't forward the header to the downstream service, and we had to handle that carefully.

We had also started adopting event-based services. Let's say one of the services has to subscribe or publish to a Kafka topic; if that service has to be rolled out to the gray segment, we create two topics for the one purpose, a green topic and a gray topic, so that the gray version of the service listens or subscribes to the gray topic and the green version to the green one. And say the gray version of an event-based service has to call the gray or the green version of a downstream service: in that case, the event-based service injects the header based on an environment variable set on it.

There is also the syncing with the mobile app team, because the mobile app has to send the header, so we need to be in sync with the mobile app development team as well as back-end development and operations. For that, we started keeping all of the configuration in version control so that we all stayed in very good sync.
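As a rough sketch of the DestinationRule and the weighted rollout stage described above, with illustrative names and a 75/25 split standing in for whatever ratio the rollout is at:

```yaml
# Hypothetical DestinationRule: the green and gray subsets select the
# two deployments by their differing "release" pod label.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: user-service
spec:
  host: user-service
  subsets:
    - name: green
      labels:
        release: green
    - name: gray
      labels:
        release: gray
---
# Once internal testing passes, the default route becomes a weight
# split instead of green-only; the weights shift gradually toward gray
# as error rates and the golden signals stay healthy.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: user-service
spec:
  hosts:
    - user-service
  http:
    - route:
        - destination:
            host: user-service
            subset: green
          weight: 75
        - destination:
            host: user-service
            subset: gray
          weight: 25
```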
And for gaining extra confidence, we made very good use of Kiali, which is present as an add-on in Istio, as well as Jaeger and Prometheus. With Kiali, we are able to scan our Istio configs for whether they are outdated or whether there is any problem in those configurations, and also to see the routing: which service is calling which service, what the latency is, and how all of that traffic is handled between those services, though Kiali only shows that for a small window of time.

Also, for keeping traces of our application, we used Jaeger, and by using Jaeger as an Istio add-on we didn't have to implement all of those tracing lines in each of our microservices, which would have been a big overhead; it would have cost us a lot of engineering time and resources, both.

For the monitoring part, we built a central monitoring dashboard using Grafana, pulling Prometheus metrics not only from the cluster itself but also from the Istio service mesh. By setting thresholds on that monitoring, we get very good alerting to our on-call system, PagerDuty, as well as to Slack, so we can be proactive about error responses, latencies, and those golden metrics of the business logic as well.

But while doing all of these things, there were a lot of YAMLs: the manifest files of Kubernetes as well as the Istio routes and all of those things, and it is very hard to keep them centralized. So we started using Helm for managing all of the Istio configuration as well as the application pieces, the Deployments, the Services, and so on, so that we are confident that either all of those configs are applied or none of them are. The development team also started taking ownership of adding an environment variable or tweaking a few of those configurations inside the Helm charts, which put us on a very good path toward our DevOps model, and it obviously centralized all of the resource control for the Kubernetes side; a sketch of how that can look follows below.

So by the use of Istio, we are really fast in serving our business requirements, getting things done in a very small span of time without impacting a lot of customers at once. That is a very cheering point for us. So that's it for now. Okay, bye-bye.
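As an illustration of how those release knobs can be centralized in one Helm chart, here is a hypothetical values.yaml fragment; the key names and chart layout are assumptions, not our actual chart:

```yaml
# values.yaml (hypothetical): the canary state for a service lives in
# one place, so the Deployments, DestinationRule, and VirtualService
# are rendered together and applied or rolled back as a unit.
userService:
  image:
    tag: v1.4.2
  canary:
    enabled: true
    headerValue: gray-v1   # value the internal mobile apps send
    grayWeight: 25         # percent of live traffic sent to gray
```

The chart's VirtualService template would then read these values, for example `weight: {{ .Values.userService.canary.grayWeight }}`, so promoting or rolling back the canary becomes a one-line, version-controlled change.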