Ooh, look at your glasses. I love them, man.
Well, thank you, Karun. Super excited to be here on the main stage with you guys. Thank you so much for joining us. I wanted to get the show started with some T-Mobile branding. That's why you see the T-Mobile branding. But let's get started with the main show here.
Ever wondered what's this thing with the Joker? Are you trying to scare people out? Not really. That's not my intention. I'm sure everybody here at some point has seen the movie The Dark Knight, where Heath Ledger played the Joker, and he did absolute justice to the role. It's a shame he's not here with us today, but like they say, a wise man would live on forever. And notice what he says about chaos: he calls chaos fair. So let's talk about chaos, chaos engineering for Cloud Foundry.
But before that, hang on. Violent, chaotic Joker, and now peaceful and charismatic Mahatma Gandhi. What's the connection? I don't understand this. I promise there's a connection to all of this. First off, I'm not a crazy guy to relate the Mahatma with the Joker here, but the Joker talks about chaos. The Mahatma talks about customers. I come from a family of medical professionals, and my dad, who's 72 years old and a psychiatrist by profession, never loses focus on one element in his life. And for him, that's his customers, and his customers are his patients. So I always wondered what the perfect definition of a customer is. And in that search, I found this definition of a customer, and the Mahatma summarizes it by saying, like, the customer is everything. Indeed, the customer is everything.
Moving on. This is our charismatic and energetic leader, Brian King, senior vice president for Technology Services Division and Operations at T-Mobile. In fact, this inspirational leader will not stop talking about his customers.
Every stage he is on, he opens up with this mission statement, which is focused on delivering awesome customer service and an awesome customer experience to his end users. In fact, we're so proud of this mission statement that this year we've launched this new campaign called Powered by TSTNO, where it's all about delivering awesome capabilities to our customers. And we fundamentally believe that somewhere, somehow, we're making an impact on how we deliver capabilities to our end users.
Thinking about Cloud Foundry, it's really that enabler for developer productivity. I'm sure you all know about it: it focuses on helping your developers rapidly build, test, and deploy software in the cloud. Our ecosystem with Cloud Foundry at T-Mobile is easily the largest in the world. We have 13-plus foundations, roughly 39,000 containers, and 700 million daily transactions happening, with 3,000-plus mission critical applications and over 100 project teams.
Did you say 39,000 or 36,000?
Yeah, good call. So, when I did the slide deck last week, it was around 36,000. And just recently, as of last night, we shot up to 39,000. That kind of shows the revolution that the container strategy is having at T-Mobile.
And if these numbers are not convincing, let's focus on agility, and agility redefined. Cloud Foundry is all about these rapid changes that it's bringing to our ecosystem. It's all about faster apps, more frequent changes (in fact, all daytime changes, roughly 1,000 changes that have happened over the last year), fewer incidents, zero downtime deployments, fail fast and fail forward, elasticity. These are some of the various benefits that Cloud Foundry has brought to our ecosystem. Everybody here knows that DevOps is a buzzword at many companies, but at T-Mobile it's no longer a buzzword. We fundamentally believe that if you write a piece of code, you own it all the way to production.
Now, you may be wondering what's going on with all of these connections here. I spoke about chaos, I spoke about customers, I'm talking about agility redefined. But for me, the one thing that keeps me up is: what's going to happen with chaos? How do we stay focused on that delightful customer experience?
Great. So T-Mobile is not a single application company. We have 3,000-plus applications belonging to multiple internal customers running in a shared foundation. So performing a chaos attack on the infrastructure is definitely going to create customer impact at multiple levels. Hence, we came up with the concept of performing chaos attacks at the application level. What that means is you take an application belonging to a customer and do a specific, targeted chaos attack on that application and also its dependencies. So we have Turbulence++, which is a wrapper around an open source solution called Turbulence, for the infrastructure level attacks. And Monarch is a brand new open source solution from T-Mobile that we are using for the application level chaos attacks.
Let's zoom into the infrastructure level chaos attack. So Cloud Foundry is a collection of multiple virtual machines. In this context, application instances are the containers that run inside the Diego cells. So what happens if one of your virtual machines goes down? What happens to the application instances, the containers, running in it? And what happens if a process running inside a Diego cell goes down, say the rep process, which is responsible for maintaining the lifecycle of the containers running in that Diego cell? Also, Cloud Foundry is designed by default around timeout mechanisms. So what happens if you inject latency between the Gorouter and the Diego cell? All these sophisticated, very targeted, specific, proactive attacks can be performed with Turbulence++ today. Now let's look into the application level chaos attack.
Like I said earlier, you can pick a specific application and do targeted, sophisticated attacks on that application and its dependencies without impacting any other application running in that Diego cell or in that cluster itself. That is what we mean by an application level chaos attack.
Now I would like to explain this with a demo. So we have a front-end web application UI, which is a Spring Boot application. And it is backed by Apple and Samsung microservices. Both microservices are backed by MySQL, a shared database instance. Now what can go wrong in this? Let's say latency is injected between Apple and the MySQL database. What happens to the UI behavior itself? And what happens if the Samsung microservice goes down? Imagine these are single-instance microservices. What happens if the Samsung instance itself gets killed? What is the UI going to look like? And what happens if the connectivity between the MySQL database and the Samsung microservice is blocked? All three of these attacks can be injected with Monarch.
Now, we wanted to make sure the UI looks absolutely fine even when these attacks are happening. So that's the reason we are using Circuit Breaker, a Spring Cloud service, on our UI component. Whenever the Circuit Breaker gets triggered, you will see the next generation Apple phone and Samsung phone. I'll show you what it looks like. Can we go into the demo, please?
So we are using the cf CLI and we are listing the applications. As I said, there are the Apple and Samsung microservices, with an extra S, and then there is a UI component. So these are the three applications deployed. And now these are the three services. You have Circuit Breaker bound only to the UI component, and you have a database service of type MySQL, and Apple and Samsung are the applications that are bound to this database. And there is a service registry so that all three applications can discover and talk to each other.
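
As a rough illustration of the fallback behavior described above, here is a minimal circuit breaker sketch in Python. It is not the Spring Cloud Circuit Breaker used in the demo; the class name, the thresholds, and the "next-generation" fallback string are all assumptions made for illustration.

```python
import time


class CircuitBreaker:
    """Illustrative circuit breaker (hypothetical, not the Spring Cloud one).

    After `max_failures` consecutive errors the circuit opens and every call
    is answered by the fallback until `reset_after` seconds have passed.
    """

    def __init__(self, max_failures=3, reset_after=5.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock          # injectable so the behavior can be tested
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, primary, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Circuit is open: do not touch the failing dependency.
                return fallback()
            # Half-open: allow one trial call; a single failure re-opens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result
```

In the demo's terms, `primary` would be the call from the UI to the Samsung or Apple microservice, and `fallback` would return the placeholder product that the UI shows while the attack is running.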
So you have all three of these applications bound to it. Let's go into the UI. So this UI is a very simple UI. It's going to list random Apple and Samsung products at the click of a button. When you click this button at the top, it makes a call to the two microservices, the Apple and Samsung services. And that's how it keeps listing the Apple and Samsung products. Now, the response time is the time from when the request is sent until the response is returned from the microservice. So you can see the response times here. So the UI looks good, right?
So now let's get into the first attack. For the first attack with Monarch, first of all we need to discover where the application is running. The first attack is on the Samsung service. So we use a discover API method, which is written in Python. For that, we have to say where the application is deployed by providing the org name, the space name, and the application name. In this case, it's the Samsung service. It takes a while to load the app object. Once you have the app object loaded, you are ready to perform a block service attack on the database component. So you are blocking traffic from Samsung to P list hyphen DB, which is the database component. When I trigger this, again, it takes a couple of seconds, and then the block service is initiated by Monarch. And when you go into the UI, you see the next generation product of Samsung.
That's the next generation. I love it.
So what is happening here is that a fallback service is getting called and the circuit breaker is put into action, without impacting the Apple microservice. There's no impact on the UI whatsoever, but still it's falling back on the circuit breaker. Now, rolling back this attack is very simple. Use unblock_services, and it takes a few seconds, and then once it's unblocked, you can see that the UI is back in action. There you go. So Samsung is back in action again.
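
The discover, block, verify, and roll back sequence above can be sketched as follows. This is an in-memory stand-in, not the Monarch library: `FakeApp`, the helper `blocked_services`, and the org, space, and service names are hypothetical, and only the method name `unblock_services` is taken from the talk (`block_services` is assumed by analogy).

```python
from contextlib import contextmanager


class FakeApp:
    """In-memory stand-in for a discovered app (hypothetical, NOT Monarch)."""

    def __init__(self, org, space, name, services):
        self.org, self.space, self.name = org, space, name
        self.services = set(services)   # services the app is bound to
        self.blocked = set()            # services currently cut off

    def block_services(self, *names):
        unknown = set(names) - self.services
        if unknown:
            raise ValueError(f"{self.name} is not bound to {sorted(unknown)}")
        self.blocked |= set(names)

    def unblock_services(self):
        self.blocked.clear()

    def query(self, service):
        # Simulates the app talking to one of its bound services.
        if service in self.blocked:
            raise ConnectionError(f"{self.name} -> {service} is blocked")
        return f"rows from {service}"


@contextmanager
def blocked_services(app, *names):
    """Run an experiment with services blocked and always roll back."""
    app.block_services(*names)
    try:
        yield app
    finally:
        app.unblock_services()
```

Wrapping the attack in a context manager is one way to make sure the rollback runs even if the checks made during the experiment raise an error, so a failed experiment does not leave the application cut off from its database.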
Now keep an eye on the latency times there, the time taken by these two services, before I get into the next attack. The next attack we'll be performing is on the Apple service. So it's the same thing: we go and discover where the Apple service is running. We provide the org name, space name, and app name. Once the discovery is complete, we're going to, yeah, let's just look at the UI. The response times are pretty good, like it takes 100 milliseconds, probably, yeah. Now I'm going to introduce latency by using a method called manipulate_network. I'm going to inject 100 milliseconds of latency with a standard deviation of 10 milliseconds. When I do that, the app object is now manipulating traffic into the Apple service. As you can see, the latency of Apple is four times that of the Samsung service. So that's how you can simulate introducing latency at the service level or at the service-to-database level.
So let's get into the third attack. Again, we just roll back the latency attack with unmanipulate_network. Now, crash random instance. What happens if the Apple instance, the only Apple instance, is crashed? How is it going to impact the UI? So I'm going to show you the cf events for Apple: there are no crash events. And the UI looks perfectly fine. Now I'm going to trigger crash random instance with an instance count of one on the Apple service. When I do that and you go to the UI, that's the next generation Apple product. So it's falling back for the Apple service. And the circuit breaker is in action. And Samsung has no issue at all. That's the beauty of Cloud Foundry and Circuit Breaker. Cloud Foundry is self-resilient by default, so when an instance is crashed, it will bring up a new instance in its place. You can see in the cf events for Apple that the crash events were caused by Monarch.
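
To make the latency parameters concrete, here is a small sketch of what "100 milliseconds with a standard deviation of 10 milliseconds" means statistically. It only simulates the added delay in Python; it does not shape any real traffic and is not how Monarch implements the attack. The function names, the baseline value, and the choice of a normal distribution are assumptions.

```python
import random


def injected_delay_ms(mean_ms=100.0, sd_ms=10.0, rng=random):
    """Draw one artificial delay from a normal distribution, never negative."""
    return max(0.0, rng.gauss(mean_ms, sd_ms))


def observed_response_ms(base_ms, calls, mean_ms=100.0, sd_ms=10.0, seed=0):
    """Simulate response times of a service whose traffic is being delayed.

    `base_ms` is the (hypothetical) response time before the attack; each
    call gets an independent extra delay drawn by `injected_delay_ms`.
    """
    rng = random.Random(seed)   # seeded so a run is reproducible
    return [base_ms + injected_delay_ms(mean_ms, sd_ms, rng)
            for _ in range(calls)]
```

With these settings most individual delays land between roughly 80 and 120 milliseconds, so the jitter is visible in the UI's response times without the added delay ever dropping to zero.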
And if you look at the Apple application itself, the desired state and the actual state counts are both set to one, which means Apple is back in action, and you have the UI. So that's the third attack. Now, can we go back to the slides, please? Yeah.
So what do you think about Monarch? Do you want to show this to your internal teams?
Yeah. I'm actually very impressed with the toolkit itself. So thank you so much for the demo. In fact, I want to take this a step further and talk about some of our Chaosim customers. I call them Chaosim respectfully, no offense intended. Chris and Rob are here somewhere with us. If not, they've been partying hard last night. Oh, I see you here. So these gentlemen are fond users of Cloud Foundry. They have several development teams who are our fond customers. And these are truly inspirational leaders who are doing bigger, larger things with Cloud Foundry at scale. One of the questions that often comes back to us is: what else can they do with Cloud Foundry? How else can they achieve better application resiliency? So I'm very excited to take Monarch back to them and say, hey, this is how we can achieve better application resiliency. So are you with us? Thank you, guys.