kind of gets started. So thanks again, everyone, for joining. Before we go over a couple of demos from the community and a little update on the landscape, is there anyone new on the call who would like to introduce themselves and say hi to the group before we move on?

I'm here. This is Tammy from Gremlin. This is the first time I've been able to join; I've been traveling a lot, so the time didn't work, but I'm in San Francisco right now, and it's great to be here. I do chaos engineering at Gremlin, and I previously did chaos engineering at Dropbox, where we got a 10x reduction in incidents and had no high-severity incidents for 12 months. Before that I did it at the National Australia Bank for many years, really since 2009. So yeah, great to be here. Thanks for having me.

Awesome, great to bump into you again, Tammy. Anyone else? Cool, all right, moving on. So, on slide five: we've been discussing this landscape and fighting a bit over how to categorize the different tools out there. I was getting a little frustrated, so I decided to start as simple as possible. I went out and did a pull request to the CNCF Cloud Native Landscape, starting with four projects for which I could find a high-quality SVG logo, information on Crunchbase, and so on: the various requirements we have for the CNCF landscape. So I issued a PR; if there are other projects you want to add, please let me know, I'm more than happy to add them. I think we'll start with a flat structure first, and later on we can fight about how we want to sub-categorize things, because we definitely had some difficulty figuring that out. Hopefully people are okay with this approach of starting simple and iterating, but I wanted to open it up to feedback from the group before moving on.
I guess silence means there's no disagreement.

I'll have a look after the meeting.

Okay, yeah, it's pretty straightforward. I just started with four, and we'll go from there. Eventually, once those are in the interactive landscape, we have a design team that could take that and break it apart into categories, but first I just want to collect the information that's out there.

I think simple is a good place to start.

Cool. Yeah, I was getting a little frustrated after working on things for a while; I just needed to get something out there so we can start iterating. So I'll continue with that and give you an update in a couple of weeks, but hopefully it should get merged soon. In terms of community presentations, we have two things today: one from Michael on fire drills, and the other one from our... Cool, thanks Michael. All right, I think you're up next, Uma, so feel free to take over once Michael stops sharing his screen.

Perfect. All right, thanks Chris. Let me share my screen; can you see it?

I can. Thank you.

Karthik and I are going to talk a little bit about what Litmus is; we're still in the early stages. We've been toiling for some time over what to call Litmus. You can call it a tool, but our vision really is to take the best tools in open source and use them together, so we actually call it a framework: a framework for chaos engineering, right now for Kubernetes. The tagline is "chaos engineering for stateful workloads on Kubernetes."
Stateful workloads are really the tough area right now: all the problems show up the moment you bring in stateful applications, and the underlying storage and networking play a major role in the stability of a stateful workload. At the moment Litmus provides a set of e2e tests as Ansible playbooks, and each Litmus test is a playbook that runs inside a container. In today's demo we're going to show what a Litmus test looks like on GitHub, introduce chaos into a MySQL app, and see what Litmus does: how it helps in introducing chaos and checking whether the application keeps working as expected or not.

Primarily, Litmus is meant to be used by developers and DevOps teams in their Kubernetes clusters and CI/CD pipelines, and sometimes before putting things into production. For example, when a Kubernetes cluster is being upgraded, how do you make sure it is going to work for your DevOps teams? That's when, as a DevOps architect, you can take certain Litmus tests, run them in a pre-production pipeline, observe, and then roll out. This particular test has three configurable variables: how to publish the logs, what kind of storage to use underneath, and the type of chaos you want to apply.

Let me quickly show the anatomy of the test. The project is hosted as a separate repo in the OpenEBS organization. There are multiple tests here; we are in the process of moving much of our e2e tests onto the Litmus framework, but right now we have a couple of sets of tests already moved over. To get started with Litmus, it's just a matter of cloning the project; you get all the tests, modify them according to your needs, and then run a test using a kubectl command. Every test has a file called run_litmus_test, and that is what kick-starts the actual Litmus test.

Let me show you an example. For the MySQL application, the Percona application, we have two tests here: one is a storage benchmark and the other is data persistence. The test directory has the actual set of tasks to be run, run_litmus_test has the configuration for controlling your test, and the MySQL part is really the application itself. As you can see, we are currently using OpenEBS as the underlying storage; you could instead use Rook or Portworx, or whatever else follows the Kubernetes way of attaching persistent volumes to the pod.

For this particular test we currently support three types of chaos. One is through Pumba; as I think we heard from Alexei last session, Pumba is a chaos tool that can introduce two kinds of chaos: network latencies, and Docker stop/start operations. In this test we're going to kill an application pod using Docker stop through Pumba. The test can also be used to introduce other types of chaos: you can use kubectl, where Litmus sets things up so that the pod gets affected, and then see what happens to the underlying application; similarly, node drain means the Litmus job will go and drain one of the nodes. Those are the kinds of configuration parameters we provide.

Before we do a quick demo, I want to take you through the setup and the demo flow. This is a Kubernetes cluster on Google Cloud's GKE, and we have a couple of nodes with GPDs (Google persistent disks) configured as the data store, and we are going to run an application that uses this data. What we do is run a Litmus job that does the following: it launches a Litmus pod, which performs the real test; the Litmus pod launches the MySQL pod and makes sure the underlying data connectivity is set up through OpenEBS, or whatever is configured as part of the test.
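Under the hood, the run_litmus_test file mentioned above is submitted with kubectl and runs the Ansible playbook inside a container as a Kubernetes Job. A rough sketch of what such a manifest might look like; the image name, playbook path, labels, and env var names here are illustrative assumptions, not the actual repo contents:

```yaml
# Hypothetical run_litmus_test.yml; submit with `kubectl create -f run_litmus_test.yml`.
apiVersion: batch/v1
kind: Job
metadata:
  name: litmus-mysql-data-persistence
  namespace: litmus
spec:
  template:
    spec:
      serviceAccountName: litmus          # needs RBAC to create/delete pods in the cluster
      restartPolicy: Never
      containers:
      - name: ansible-test-runner
        image: openebs/ansible-runner:latest   # illustrative image name
        env:
        - name: PROVIDER_STORAGE_CLASS         # the three knobs described above
          value: openebs-standard              # underlying storage to test against
        - name: CHAOS_TYPE
          value: pumba                         # e.g. pumba | kubectl | node-drain
        - name: LOG_DESTINATION
          value: stdout                        # where to publish the logs
        command:
        - ansible-playbook
        - ./percona/tests/mysql_data_persistence/test.yml   # hypothetical playbook path
        - -v
```

The Job wrapper is what makes the test pipeline-friendly: CI only has to create the Job and wait for it to complete, and the playbook inside does the deploy/chaos/verify work.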
Then it launches the chaos framework, Pumba, and introduces chaos. The Litmus pod watches the MySQL pod to see whether it's running fine; the moment chaos is introduced, the MySQL pod gets killed and Kubernetes relaunches it. The same process can happen again and again; you keep introducing chaos, and you can configure how many times to do it. After the chaos injection ends, the fifth step is to really verify the data: okay, the chaos has been introduced and our pod has been rescheduled somewhere; if it's not scheduled at all, that's a failure, and even if it is scheduled, is it really connected to the data, and am I seeing the right tables underneath? Then Litmus cleans up one by one, removes the Litmus pods, and gives you back the cluster in the same state. The idea is that with Litmus, DevOps teams can easily take a given test end to end into their pipelines.

Let me quickly run through this test. I'm on one of the Kubernetes nodes on which Litmus will run; in my Litmus test I've currently chosen to publish the logs onto a local node, so we'll see the test results come out here. As part of the demo I have three windows: in one I'll be watching the pods in the litmus namespace, in another I'll be observing the logs coming out of the Litmus pod, and in this window I'm going to kick off the test. Let me show the test configuration again: I've chosen OpenEBS as the storage class, Ansible just puts all the logs onto stdout, and for the chaos type I'm using Pumba.

So I'm running the test and watching the litmus namespace. It's already started; this is the container it's creating, and now I'm observing the logs in this window. As you can see, it has already deployed the application, which is coming up; you can see the OpenEBS volume controller, and three replicas are already deployed. Once the MySQL application comes up, you'll see Pumba getting launched and chaos being introduced, and then you can watch how Percona behaves. In the meantime, let me check; yes, it did create it, so you'll see our result.json file here when the test completes.

As you can see, the application is running now. We configured the entire test to finish in about two to three minutes, and you can see some test data being written. Pumba has been launched; that's the chaos tool used for this particular test, and the moment it comes up it starts introducing the chaos, which is simply "kill this pod," the application pod, and then we expect the pod to come back up. This pod has gone into an error state and we are waiting for Kubernetes to reschedule it; it's rescheduled, and again it's in that cycle of getting killed and coming back. Just to keep the demo short, we've set a small duration for the chaos. In a real test you'd account for the fact that Kubernetes sometimes puts the pod back onto the same node, so the best practice would be to introduce the same chaos, say, ten times, observe the pod moving across multiple nodes, and finally check whether the data persisted.

So it's coming back, and here Pumba is going down, which means the chaos injection is done. Litmus then checked whether the data persisted and put that into our result file. This is our most primitive way of recording the result, and you can see that the test has passed. That's the basic flow: introducing chaos and verifying whether the data is still there or not.
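The loop the demo walks through (inject chaos, wait for Kubernetes to reschedule the pod, verify the data, record a verdict in a flat result.json) can be sketched in a few lines. Everything below is a simulation against a toy in-memory "cluster"; the class, function, and field names are made up for illustration and are not Litmus APIs:

```python
import json
import os
import tempfile

class ToyCluster:
    """Stand-in for a Kubernetes cluster: a pod that gets killed and
    then 'rescheduled', with its data either intact or lost."""
    def __init__(self):
        self.pod_running = True
        self.data_intact = True

    def kill_pod(self):                  # what Pumba's docker-stop chaos does
        self.pod_running = False

    def reconcile(self):                 # what the scheduler/controller does
        self.pod_running = True          # relaunch the pod

    def tables_present(self):            # stand-in for "am I seeing the right tables?"
        return self.pod_running and self.data_intact

def run_chaos_test(cluster, rounds=10):
    """Inject chaos `rounds` times; fail if the pod never comes back
    or the data is gone after a reschedule."""
    for _ in range(rounds):
        cluster.kill_pod()               # 1. introduce chaos
        cluster.reconcile()              # 2. wait for reschedule (simulated)
        if not cluster.pod_running:      # 3. not scheduled at all -> failure
            return "Fail"
        if not cluster.tables_present(): # 4. scheduled but data lost -> failure
            return "Fail"
    return "Pass"

def record_result(path, verdict):
    """Most primitive possible result file, like the demo's result.json."""
    with open(path, "w") as f:
        json.dump({"test": "mysql-data-persistence", "status": verdict}, f)

path = os.path.join(tempfile.gettempdir(), "result.json")
verdict = run_chaos_test(ToyCluster(), rounds=10)
record_result(path, verdict)
print(verdict)  # → Pass
```

A CI step consuming the result file only has to parse that one JSON object and gate the pipeline on `status`, which is why even this primitive recording format is pipeline-friendly.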
One typical way to use this: everything we just did could be put into a pipeline and repeated as many times as you want, and the Ansible jobs can be configured to alter the playbooks' configuration parameters automatically. These tests are really written to be used in a friendly way by DevOps teams. So that's the quick demo; hopefully it made sense. Any questions?

Yeah, I've got one. Since you've developed Litmus, implemented it, and started using it, has it helped your thinking about OpenEBS resiliency in general? I mean, have you changed your ways of approaching that in OpenEBS?

Oh yes. In fact, Litmus was born out of OpenEBS. We started writing these e2e tests and introducing chaos as part of the development of OpenEBS, and then we realized we were using multiple types of tools and should put it all together as a project. Even if you have a stable product out there, your end users need to run the same tests again when they put their applications into production. That's when we thought we would open-source it and make it into more of an infrastructure, a framework, that puts everything together. So yes, we follow all of this inside OpenEBS.

All right, yeah. So it's a community project; as I was saying, over the next few weeks to months more tests will be moved into the Litmus framework, and we would like to see various application developers and users bring their expertise into these tests and use them for their own needs and requirements.

All right, any other questions? Thank you.

Cool, thank you Uma and Karthik. So, not too many things before we close out the meeting. If you go to slide 14, I'm just calling out, essentially, mainly the white paper.
There has been some discussion and iteration there, so I encourage the group to continue. And of course the landscape: I have the initial PR out, so if you have another project you want to add, please do, and once we build that up we can have more discussions about breaking it apart into subcategories.

To wrap things up, a gentle reminder on slide 16: the first chaos conference is happening in San Francisco, hosted by our friends at Gremlin. We'll be there, and it would be great to have some folks show up. We're also going to be doing a chaos engineering track at KubeCon + CloudNativeCon in Seattle in December. The chaos engineering working group will be entitled to essentially two talks at KubeCon, so I'm going to try to figure out how best to divvy that up with the group, but essentially I'm looking for introductory content and maybe an overview of the different tools out there, with demos. We don't have to figure that out right now, but it's something to keep in mind.

Other than that, any other questions? I'm always seeking volunteers for community demos, so if someone wants to volunteer for next time, please let me know. Our next meeting will be the second Tuesday of August, August 14th, Pacific time. Any questions, thoughts, volunteers for next time?

If nobody else volunteers, I'll try to come up with something; we've got some cool stuff coming up, but I don't know if I'll be ready.

Pressure, pressure makes diamonds.

Yeah, pressure makes diamonds. It'll be good.

Deadlines are good.

Yeah, exactly. And then, I don't know if there's someone from Lyft here too; I'd be curious to see whether Lyft would be willing to talk about some of the stuff they do, especially since Envoy has some baked-in ability to do chaos testing.

Yeah, that's me, exactly. There are a couple of things that I could demo. I'm not sure who I need to talk to before doing these things, but okay: there's a red-line test that we run across all of our services, where we adjust the load-balancing weights through the Envoy discovery service. We also do fault injection through Envoy. But yeah, I'll look into that.

No worries, I know Pete Morelli well, so I'll just tell him to give you permission; it'll be all good. Fantastic. Okay, cool. It'll be good, because not a lot of people know about those Envoy features, so it would be good to disseminate that a little more. Sound good? All right, any other questions? Otherwise we'll wrap it up, and I'll tentatively slot Sulbane and Zack, depending on whether they have content, for next time on August 14th. Okay, cool, take care everyone, and I'll see you next time. Enjoy the rest of your Tuesday.

Thank you very much. Thank you. Have a good one, everyone. Bye.
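For reference, the baked-in Envoy fault injection mentioned at the end of the call is done with Envoy's HTTP fault filter. A fragment like the following aborts a small slice of requests and delays another; the field names are from Envoy's v3 API (check them against the version you run), and the percentages and status codes are purely illustrative:

```yaml
# Illustrative fragment of an Envoy listener's HTTP filter chain with fault injection.
http_filters:
- name: envoy.filters.http.fault
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.fault.v3.HTTPFault
    abort:
      http_status: 503            # fail these requests outright
      percentage:
        numerator: 1              # ~1% of traffic
        denominator: HUNDRED
    delay:
      fixed_delay: 2s             # add 2s of latency
      percentage:
        numerator: 5              # ~5% of traffic
        denominator: HUNDRED
- name: envoy.filters.http.router   # the router filter must come last
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
```

Because the filter is configured per listener or per route, the fault percentages can be ramped up gradually and scoped to a single service, which is what makes it practical for the kind of production chaos testing discussed above.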