So, both our speakers are from InfraCloud Technologies. Vishal is an engineer and CTO at InfraCloud Technologies. He helps companies transform their businesses by using technology and coaching people. He is also a contributor to Fission, which is a fast, open-source serverless functions framework for Kubernetes. Vishal is also an organizer of a Kubernetes and CNCF meetup. He loves good books, running, and high-altitude mountains. And coming to Ninad, he is certified in Kubernetes administration (CKA) as well as Kubernetes development (CKAD). He is also very enthusiastic about technologies in the cloud native space. On a day-to-day basis, he works on and is mainly interested in DevOps, Kubernetes, and the observability stack. In his current role, he helps customers, as a DevOps and SRE engineer, to adopt cloud native technologies. Outside of technology, he likes to read, travel, and play. So in a few minutes our session will start, and I'll be handing it over to our speakers. Just to make sure, we are going to have Q&A after the session as well, so be prepared with your questions; our speakers will be here to answer them. So yeah, thanks a lot.

Thank you for being present here. My name is Ninad Desai and I'm working at InfraCloud as a staff engineer. At InfraCloud, I mainly help clients handle their needs around infrastructure design, scale, modernization, and stability of their applications, using cloud native stacks like Kubernetes and a bunch of other tech that normally falls under the CNCF umbrella.

My name is Vishal. I'm CTO and founder of InfraCloud Technologies. My interests usually lie around Kubernetes, serverless, open source, and systems design and scaling.

Cool. So you have definitely heard the classic saying, "move fast and break things." I think this was popularized by Facebook around 2014 or so.
And this is a pretty popular saying: as a product or as a company, you should move fast and, you know, break things. But I think we are living in a world that is changing now. We definitely want to move fast without actually breaking things, and today we will learn how to achieve this with progressive delivery, in the context of software development, right? So before we go further, let's understand the word "progressive." What does it mean? If you take the literal meaning of the adjective "progressive" from the dictionary, it means something that happens gradually or in stages. And I think this is a crucial thing to keep in mind as we get into talking about progressive delivery. One thing I want to clarify up front: progressive delivery does not mean you stop doing continuous delivery. In fact, if anything, progressive delivery rests on the foundation of solid continuous delivery and builds on top of it, I would say. So what is progressive delivery? I think this diagram sums up the concept very nicely. The idea is to release software in production, or if that is not possible, as close to production as possible, in an incremental and gradual way. This could mean you release it to a set of users, or sometimes to a set of testers, but the goal is to do this in production, because anything else is just not like production. And you must be wondering why that is so important, right, testing in a production-like environment? So let's take a couple of scenarios or examples. We have all tried this approach: for example, companies try to set up the whole thing on a developer's machine, but it is nothing like the real setup. The issues come from resource exhaustion or lack of resources on your developer machine, or sometimes you simply can't run Elasticsearch or Kafka in a production-grade setup on your dev box, right?
And on top of that, you definitely don't have the real data available on your local machine. Now, the other scenario is that you try to test in production-like environments, like staging, for example, right? Staging is often touted as production, but it's not the same as production. And if you're in an industry where compliance is huge, for example under PCI, your database is never the real database, or the real size of the database; it's a smaller replica of the actual database, right? Similarly, the observability you have in the production environment may not exist in the same way in your other environments, for example. Lastly, today it is not just about the application anymore, you know? It's whole systems, which are sometimes very unique to production. For example, your configurations, the infrastructure as code, the kind of traffic that you have: it's very hard to replicate these the way you have them in production. It's almost impossible to replicate them in any other environment, right? So no matter what, you're not testing the way production actually behaves. That also means some behaviors you will see only in production, not in other environments, right? And that's where this funny adage comes into the picture: "it works in my staging," the way we talked about "it works on my machine" a lot in probably the previous decade. Cool. Now we are living in a fairly new world where things have changed quite a bit in the way we deploy, the way we observe, the way we manage, and so on, right? But the way we do software testing, where we put everything together and try to test that entire setup, hasn't changed a lot. It still follows very traditional practices, I would say. So you don't need to stop testing just because you're in production. In fact, testing in production is literally happening on a continuous basis, by your users.
But we need to leverage the state-of-the-art tooling that is available today in a way where we can test in production without impacting, you know, anybody else. For example, you can do canary releases, you can do traffic shifting or shaping, and things like that. You have of course seen the classic meme: "I don't always test my code, but when I do, I test it in production." Now, this might have sounded like just a meme a couple of years ago, or maybe half a decade ago. But today, I would say, it's reality: we can test in production in a very sophisticated and very interesting manner. Now, before I hand over to Ninad for the next slide: Ninad did a progressive delivery workshop as part of the KCD Bangalore event that happened a couple of months ago, and he talked extensively about Argo Rollouts. I highly recommend you check it out and learn about Argo Rollouts specifically. Over to you, Ninad.

Yeah, thanks Vishal. So now we have understood how progressive delivery could be a game changer in building a robust and resilient system in a true sense. The next question would be: how do we do it, and what are the ways to do it? Over the next couple of slides, we will be talking about exactly that. One thing I want to clear the air around is that deployment is not release. Unfortunately, we use these two terms interchangeably, but they are not the same. Deployment is basically creating a new version of your application, deploying new code, whereas release is all about letting the traffic from your end users be received and responded to by the new version of your application. And as Vishal has already stated, these deployments in production can be tested in three ways: you can test at deployment time, after deployment using some form of integration tests, or during a canary or any other kind of release.
And you can also do it in the post-release phase, where you are continuously monitoring and checking in some other way, let's say by doing A/B testing or chaos testing as part of post-release. So yeah, and progressive delivery is not only about canary or blue-green or A/B rollouts; it's much more than that. Now you must be wondering how we do this deployment where we are deploying but not releasing to the end user. There are different techniques available for this, so let's look into them now. The first technique I would mention is the feature flag. This is essentially a technique where you put a flag on top of every functionality or code block that you develop, and by some mechanism you, or your users, enable that flag; only after the flag is enabled is the new version of the functionality available to your users. Sometimes you might have seen the case where you are logged into an application and it shows you, at the top right, "We have a new version available. Would you like to check it out?", and once you click on it, it takes you to the newer version of the application. That's where feature flagging is used. Apart from that, there is the very popular blue-green technique, which I think most of us have heard about. What you do here is: let's say you have version 34 of your application and you want to move to a new version, let's call it version 35. You create an exact replica of the infrastructure with the new version of your application, do all your end-to-end testing there, and only once you are confident that the new version is working fine do you switch over to it and tear down the old version of your application. Apart from that, there is another technique, which we call canary.
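To make the feature-flag idea concrete, here is a minimal sketch. The flag store, flag name, and function names are all hypothetical; in practice you would use a feature-flag service such as LaunchDarkly or a config system rather than an in-memory dictionary:

```python
# Hypothetical in-memory flag store; a real system would query a
# feature-flag service or config backend instead.
FLAGS = {
    "new-checkout-ui": {"enabled": True, "allowed_users": {"alice"}},
}

def is_enabled(flag_name, user):
    """Return True only if the flag exists, is switched on, and this
    user is in the rollout group."""
    flag = FLAGS.get(flag_name)
    if flag is None or not flag["enabled"]:
        return False
    return user in flag["allowed_users"]

def render_checkout(user):
    # The new code path is deployed, but only released to flagged users.
    if is_enabled("new-checkout-ui", user):
        return "new checkout page"
    return "old checkout page"
```

The key property is exactly the deployment-versus-release split from above: the new code is live in production, but only users behind the flag ever see it.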
Canary has been quite massively adopted as part of progressive delivery, but do you know why we call it canary and not anything else? The term canary deployment comes from an old coal mining practice. These mines often contained hazardous gases that could kill the miners. Canary birds are more sensitive to airborne toxins than humans, so miners used to bring them along to detect these gases. A similar approach applies to canary deployment: instead of putting the entire set of end users in danger, as in a big-bang deployment, we start releasing our new version to a small subset of users. We see whether they are able to do all their activities correctly, whether all the functionality is working fine, and then gradually, in incremental steps, we roll the new version out further until it reaches all the end users. Apart from that, there is another technique called A/B testing. Let's say I'm part of a product development team, and we keep having these huge debates where one person says we should have our login icon at the top right corner, someone says no, we should have it at the center, and someone else says somewhere else entirely. When you have these kinds of debates, and let's say you are not able to figure out which approach your users would be most likely to adopt, that's where, just like in a blue-green deployment, you can create another instance of your application and roll out both versions: one version might have the login button at the center, the other at the right. You distribute them, perhaps geographically, and then, based on the adoption percentage from your users, or, say, the conversion rate, you figure out which version users adopt and like more. Then you go with that version and tear down the other one.
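The percentage-based split at the heart of a canary (or an A/B test) can be sketched in a few lines. This is an illustrative toy, not how a real ingress controller or service mesh implements it:

```python
import random

def pick_version(canary_weight):
    """Route one request to 'canary' with probability canary_weight (0.0-1.0),
    otherwise to the stable version."""
    return "canary" if random.random() < canary_weight else "stable"

# A rollout then simply ramps the weight in stages, e.g. 5% -> 25% -> 50% -> 100%,
# pausing at each stage to check metrics before increasing it further.
random.seed(42)  # seeded only so the simulation below is repeatable
counts = {"canary": 0, "stable": 0}
for _ in range(10_000):
    counts[pick_version(0.05)] += 1
# counts["canary"] ends up close to 500, i.e. roughly 5% of the traffic
```

An A/B test uses the same mechanism with two full versions instead of a small canary slice, and the decision metric is adoption or conversion rather than errors.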
Apart from that, there is also a technique called shadowing, or dark traffic testing, or you can say mirroring. What we do here is copy the user traffic and send it to the old version as well as the new version of your application. The new version also receives the traffic, but it does not respond to those requests; only your old version responds. These dark launches, or this shadowing kind of testing, help you find the issues that your functional or end-to-end testing would not be able to catch in a controlled environment. Okay. Apart from that, there is also a technique called tap compare, which is a kind of internal testing mechanism, I would say. What we normally do in this technique is capture your production traffic, route it to the new version as well, compare the results from the old version and the new version, and based on the responses decide whether we should move forward or not. And apart from that, there is also a method where you use canary with a traffic shifting mechanism. You can use service meshes or something like the NGINX ingress controller to gradually shift traffic as part of your canary release. At the same time, with the help of your internal teams, you can pass some custom headers to your ingress controller and do your own testing, while also using the real user traffic to determine whether you should promote further or not. I will now let Vishal take it from here.

So we looked at various techniques for doing progressive delivery, right? But this is technically a lot of work, and it requires a fair amount of preparation. The first thing I would say is that, in the context of your organization, this takes time. If you look at this paper from the Facebook folks, they tried to do it only for stateless services in the beginning.
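As a sketch of that header-plus-weight approach with the NGINX ingress controller (the host and Service names here are made up), a second Ingress marked as canary can send, say, 10% of traffic to the new version, while requests carrying a specific header always hit the canary, which lets internal teams test it regardless of the weight:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-canary
  annotations:
    # Mark this Ingress as the canary twin of the main myapp Ingress.
    nginx.ingress.kubernetes.io/canary: "true"
    # Send roughly 10% of regular traffic to the canary Service.
    nginx.ingress.kubernetes.io/canary-weight: "10"
    # Requests with this header always go to the canary,
    # so internal testers can opt in deterministically.
    nginx.ingress.kubernetes.io/canary-by-header: "X-Canary"
spec:
  ingressClassName: nginx
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp-canary
                port:
                  number: 80
```

Service meshes like Istio or Linkerd offer the same weight- and header-based routing with their own resources; the idea is identical.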
And even that took them probably a year to get right, to get it, so to speak, perfect. Only after they got it right over that first year or so did they actually roll it out to other services and other areas of the application. So it's important to spend the time and get it right rather than hurry into it, I would say. Beyond that, it also takes preparation, because if you're doing something like this where you're testing in production, you don't want any unintended consequences for other users, so to speak, right? For example, if you can't make a sound decision about whether the new version of the system is producing more errors or fewer errors, you probably will not be in good shape to make that call, right? So you need a good observability system. Similarly, you need a good rollback and roll-forward mechanism, and you need the right schema changes and backward compatibility built in, so to speak. So changes like these require technical work, but they also require you, as an organization or a team, to culturally understand the trade-offs and make the right decisions here. Now, let's look at the AWS rollout example that they have publicly talked about. Typically an application goes through four phases: we have source code, then we build, test, and eventually take it to production through some sort of pipeline, right? But let's look in a zoomed-in way at each of these areas and at how AWS does this for their services. In the source and build area itself, they have an automated pipeline for every piece that gets deployed, whether it is infrastructure or application code, and it goes through a bunch of unit tests, builds, static analysis, and all that stuff. So a fair amount of verification is done in the source and build phases as well. Now if you move to the test stage, for example: there too, there are at least a few pre-prod kinds of environments that a change goes through.
And every environment has its own purpose. For example, alpha or beta might be more about validating the functionality, whereas gamma is almost like production: your monitors and other alarms are as good as production's, and you get a pretty good sense of how well the service is performing, right? Now at AWS scale, when you promote to production, which is the next step in the logical flow, you again don't want to affect users, because your users are globally spread and using the service across various time zones and so on. So the production rollout actually follows a fairly detailed wave-based approach, basically. You move slowly from one stage to the next, and based on certain decision points in the previous stage, you decide whether to continue to the next stage, or to pause at that stage and roll back, and so on, right? One very important concept there is bake time, or soak time as it is called in many companies and industries. The idea is that a lot of times when you deploy a change, the effects, which might be good or bad, are not immediately visible within five or ten minutes. Sometimes it might take twenty minutes, sometimes it might take an hour, right? So based on how far along you are in the progressive delivery pipeline, and on how big the change is and how big its impact could be, your bake time could sometimes be an hour, and sometimes it could be twelve hours, before you decide to move to the next stage. And then you have some metrics that you have thought through and designed beforehand, and you take the call on whether this is a positive sign and you can move ahead, based on whether those metrics are meeting their required, anticipated levels, basically. Having talked about this, I'll let Ninad continue and talk about some of the tools in this space. Yeah, thanks.
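The bake-time gate can be sketched roughly like this. The function names, metric, and thresholds are illustrative; a real pipeline would wire this to its monitoring system rather than a callback:

```python
import time

def bake_and_decide(get_error_rate, bake_seconds, max_error_rate, poll_every=60):
    """Hold the release at the current wave for bake_seconds, polling one
    pre-designed metric. Roll back as soon as the metric breaches its
    threshold; promote only if the whole bake window passes cleanly."""
    deadline = time.time() + bake_seconds
    while time.time() < deadline:
        if get_error_rate() > max_error_rate:
            return "rollback"       # metric breached: stop the wave
        time.sleep(poll_every)      # wait before checking again
    return "promote"                # bake window passed with healthy metrics
```

A longer `bake_seconds` for later, higher-blast-radius waves is exactly the "sometimes an hour, sometimes twelve hours" trade-off described above.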
So now let's move to the tools that can help you do this kind of controlled rollout as part of progressive delivery; I mean, the tools that can help you enable progressive delivery for your requirements. There are currently a couple of tools available, like Argo Rollouts, Flagger, Spinnaker, LaunchDarkly, or Optimizely. Flagger and Argo Rollouts are the major ones in use, I would say. One is developed by the Argo community and the other by the Weaveworks team. My personal opinion is that both tools are really cool. Both have really good features, except that Argo Rollouts comes with a GUI. If someone needs guidance on which tool to go for, I would say: if you already have Flux CD or a similar system in place, maybe go for Flagger; if you have already been using Argo CD, then maybe go for Argo Rollouts, since it is part of the same ecosystem. And I have put this example here, which is part of the hands-on workshop I recently conducted on how you can do progressive delivery using Argo Rollouts. I just wanted to demonstrate how easily you can convert your existing Deployment objects into a Rollout. All you need to do is first have the Argo Rollouts controller installed on your Kubernetes cluster. Then, if you have a Deployment, you convert it into a Rollout, and instead of the RollingUpdate strategy, which is the default, you change it to either canary or blue-green. In terms of Services, where you had one Service, you will create two: one will handle traffic for the canary and one will handle traffic for your older, stable version. Okay? And as I said, when you are doing this canary, you might need to do some kind of testing. This testing can be performed manually, but the best and preferred approach is to do it in an automated way and let the system decide whether to go ahead with the new version or not.
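As a sketch of that conversion (names like `myapp` and the image tag are placeholders), a Deployment becomes a Rollout by changing the `apiVersion` and `kind` and swapping the `RollingUpdate` strategy for a canary strategy that points at the two Services:

```yaml
apiVersion: argoproj.io/v1alpha1   # was apps/v1 for the Deployment
kind: Rollout                      # was Deployment
metadata:
  name: myapp
spec:
  replicas: 5
  selector:
    matchLabels:
      app: myapp
  template:                        # pod template is unchanged from the Deployment
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:v35
          ports:
            - containerPort: 8080
  strategy:
    canary:                        # replaces strategy: RollingUpdate
      canaryService: myapp-canary  # second Service, receives canary traffic
      stableService: myapp-stable  # existing Service, receives stable traffic
      steps:
        - setWeight: 20            # send 20% of traffic to the canary
        - pause: {duration: 10m}   # bake before increasing the weight
        - setWeight: 50
        - pause: {duration: 10m}   # a final promote follows if all is healthy
```

The `steps` list is where the gradual rollout and bake times from the earlier slides are encoded; an `analysis` step can be added between pauses to automate the promote-or-abort decision.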
And for that, Argo Rollouts provides a feature called the AnalysisTemplate, where you can customize things: create a job, or use metrics generated from pretty popular tools like Datadog or Prometheus, and so on, to decide whether your newly rolled-out service is working correctly or not. And these are some of the best practices, I would say, that I learned from my experience as part of the InfraCloud team, where we happened to help a healthcare industry client achieve progressive delivery using the canary and traffic-splitting techniques with Argo Rollouts. So make sure to run it enough times, and understand that this is not something you are going to get right in one shot; it needs to be done as an iterative process. Okay, and yes, in case you want to do a deeper hands-on dive, please do check out the recording from the KCD Bangalore event where we did this workshop. And these are some blogs that you can definitely check out to get a more in-depth understanding of how progressive delivery can work. Yep, that's it from our end. Do let us know if you have any questions and we will be happy to help. Thanks for your time, and we are happy to take any questions.