Hello, everyone. Welcome to KubeCon. I am very happy to have this opportunity to talk here. Today, my topic is "Make Cloud Native Chaos Engineering Easier: Deep Dive into Chaos Mesh".

Before my sharing, let me introduce myself. My name is Stephen. You can also call me Chen Wen. I am a maintainer and the founder of the Chaos Mesh project. At present I work at PingCAP, and my chaos engineering and Chaos Mesh journey also started at this company. Some people may not be familiar with it. PingCAP is an open source infrastructure company. It has open sourced many well-known tools, such as TiDB, a distributed database, the CNCF graduated project TiKV, Chaos Mesh, and so on. At PingCAP I work as the tech lead of the chaos engineering team, and I practice chaos engineering on TiDB and TiDB Cloud to improve their stability.

As most people know, IT services have become more and more complex, because they may use different distributed systems, a cloud-native architecture, and so on. A cloud-native architecture improves the scalability and flexibility of applications, but it also brings many challenges: more unstable network requests, more disk failures, more power failures, and all other kinds of failures. Faults can happen anytime, anywhere. Many faults, such as an unstable network or disk failures, can't be avoided, and writing tests for these issues and debugging them is very hard. When we developed the distributed database TiDB, we had the same issues, and we wanted a solution to improve its stability. Finally we found chaos engineering. Here is the official definition of chaos engineering.
It says that chaos engineering is about breaking things in a controlled environment, through well-planned experiments, in order to build confidence in your application to withstand turbulent conditions. In this definition, the first keyword is "experiments". It requires that users who run chaos experiments work hard to figure out the cause of problems and find potential problems, rather than just conducting a simple test. The other keywords are "controlled" and "well-planned", because chaos engineering is not about breaking things randomly without a purpose. We use Chaos Mesh to find potential problems, not to affect the normal use of systems and applications.

At the beginning of my chaos engineering journey, I also ran into some problems. In our testing environments we have many different clusters and many different versions, and we need to test all of them. I found it very difficult to manage and schedule so many chaos experiments. We also found it difficult to inject low-level failures into this database, because many tools, such as tc, iptables, FUSE, and so on, couldn't be used directly in a Kubernetes environment. So we needed a tool that could provide low-level fault injection, at the system and kernel levels. At that time, about three years ago, I researched many open source tools, but none was able to achieve this goal. So finally we decided to develop the tool ourselves, and we named it Chaos Mesh.

What is Chaos Mesh? The answer actually changes all the time, and it is related to the evolution of Chaos Mesh. At the beginning of the journey, our goal was simple: we just wanted a fault injection and management tool that ran well and provided low-level fault injection, and we just used it to test TiDB.
After we did this for a while, we realized that this tool could be a general-purpose tool, so we open sourced it and named it Chaos Mesh. After we open sourced it, it immediately attracted the attention of the community, and we received a lot of feedback and many requirements. With the help and the push of the community, Chaos Mesh gradually evolved into a powerful chaos engineering platform: it has an easy-to-use web UI for designing chaos scenarios, you can use it to define status checks for your application, and it offers more types of fault injection, and so on. During this journey Chaos Mesh also joined the CNCF, and this year it was accepted as a CNCF incubating project. The goal of Chaos Mesh is now clear: making chaos engineering easier. This goal directs the future evolution of Chaos Mesh; it directed Chaos Mesh version 1 and it directs Chaos Mesh version 2. At present Chaos Mesh is still very young, and we have many steps to take to achieve this goal.
Next, let's dig deep into Chaos Mesh. This is the Chaos Mesh architecture. You can use Chaos Mesh like a Kubernetes plugin, because all chaos types and workflow objects are defined with CRDs, such as PodChaos, NetworkChaos, StressChaos, IOChaos, Workflow, and so on. If you are familiar with Kubernetes and with the operator pattern, you will be familiar with Chaos Mesh.

The architecture is very simple. You can see that Chaos Mesh includes four components. The first one is Chaos Dashboard, a web UI for users to manage and observe chaos experiments. The second is Chaos Controller Manager. It is the core component: it schedules and manages the chaos experiments, and it also includes a native workflow engine. Chaos Daemon is the executing component. It runs as a DaemonSet and is deployed on every Kubernetes node. And this is Chaosd. Chaosd is similar to Chaos Daemon: it is also an executing component, but it is deployed on non-Kubernetes nodes and registers itself with Chaos Mesh. In this way you can inject failures into both non-Kubernetes targets and Kubernetes targets from one unified dashboard.

This picture shows the whole workflow of Chaos Mesh. You can see that this workflow can be divided into three parts: here is the first part, here is the second part, and here is the third part. The first part is about the user. Users can use the kubectl apply command or Chaos Dashboard to apply chaos experiments or chaos workflows. Either is okay. If you define your chaos experiments in YAML files, you can use kubectl apply. If you want to use the web UI, you can use Chaos Dashboard: you just fill in the forms, and it then submits the chaos
experiment to the Kubernetes API server. Chaos Controller Manager then receives the watch events from the Kubernetes API server, which may be create, update, or delete events. When Chaos Controller Manager receives an event, it processes and schedules the chaos experiment, and then sends the injection request to Chaos Daemon. As I introduced, Chaos Daemon is the executing component; it runs as a DaemonSet and is deployed on each Kubernetes node. Chaos Daemon receives the request from Chaos Controller Manager and then enters the target pod's network namespace or PID namespace. If your chaos experiment is a NetworkChaos, Chaos Daemon goes into the target pod's network namespace and works with its network interface, for example to set tc rules, iptables rules, and so on. If your chaos experiment is a StressChaos, Chaos Daemon goes into your target pod's PID namespace and starts the stress-ng program to burn your CPU, your memory, and so on.

Chaos Mesh defines multiple CRD types based on the different fault types. I introduced these on the previous slides: PodChaos, NetworkChaos, StressChaos, Workflow, PhysicalMachineChaos, and so on. Here is a simple example of PodChaos, defined in a YAML file. This chaos experiment simulates a random TiKV pod being killed. The label selector here selects all TiKV pods with this label, and the mode means that the chaos experiment selects one random pod from those that match the selector. This is the basic chaos object. If you want to define scheduling rules for chaos experiments, you can use the Schedule object. It works like a CronJob in Kubernetes.
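
To make the PodChaos and Schedule objects described here more concrete, the following is a minimal Python sketch that builds both manifests as plain dictionaries and prints them as JSON, a format kubectl also accepts. The field names follow the chaos-mesh.org/v1alpha1 API as commonly documented and should be checked against the installed version. The object names, the namespaces, the TiKV label, and the history and concurrency settings are assumptions made up for this illustration, not the values shown on the slide. The five-minute cron rule matches the interval mentioned in the talk.

```python
import json

API_VERSION = "chaos-mesh.org/v1alpha1"


def pod_kill_spec(namespace, labels):
    """Fault definition: kill one random pod among those the selector matches."""
    return {
        "action": "pod-kill",
        "mode": "one",  # one random pod from the matching set
        "selector": {
            "namespaces": [namespace],
            "labelSelectors": labels,
        },
    }


def pod_chaos(name, namespace, spec):
    """One-shot PodChaos experiment."""
    return {
        "apiVersion": API_VERSION,
        "kind": "PodChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


def schedule(name, namespace, cron, spec):
    """Schedule object that creates a new PodChaos on every cron tick."""
    return {
        "apiVersion": API_VERSION,
        "kind": "Schedule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "schedule": cron,
            "type": "PodChaos",
            "historyLimit": 5,              # assumed value for this sketch
            "concurrencyPolicy": "Forbid",  # assumed value for this sketch
            "podChaos": spec,
        },
    }


# Hypothetical namespace, label, and object names, chosen only for the example.
spec = pod_kill_spec("tidb-cluster-demo", {"app.kubernetes.io/component": "tikv"})
one_shot = pod_chaos("tikv-pod-kill", "chaos-testing", spec)
recurring = schedule("tikv-pod-kill-every-5m", "chaos-testing", "*/5 * * * *", spec)

print(json.dumps(one_shot, indent=2))
print(json.dumps(recurring, indent=2))
```

Saving either printed document to a file and passing it to kubectl apply -f would submit it to the API server, assuming Chaos Mesh is installed and the selector matches real pods in the cluster.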
The schedule on this slide defines the PodChaos experiment with a cron rule, and the object creates a new PodChaos object every five minutes. After you apply this chaos experiment, you can check the result on Chaos Dashboard, and you can also check the results on your application monitoring. For example, in the monitoring of this application, after we apply the Schedule object you will find that the QPS drops and then returns to normal, and this happens every five minutes.

Next, Chaos Mesh also includes a native workflow engine. It is designed for Chaos Mesh and is used to design chaos scenarios, to manage a group of chaos experiments and status checks of the application. Here is an example of the Workflow object. This manifest contains three parts: the name, the entry, which is the entry point of the whole workflow, and a set of templates. At present Chaos Mesh supports five types of templates: Serial, Parallel, Chaos, Suspend, and Task. The Serial and Parallel templates are the rules for how the actual tasks run. The Chaos template defines a specific chaos action and is translated into the different chaos types, such as PodChaos, NetworkChaos, and so on. The Task template defines custom tasks; for example, we can use it to define status check tasks. This is a very simple example. You can define your workflows in YAML files, and you can also define them on Chaos Dashboard through the forms.

Chaos Mesh also defines multiple selectors for users to define the scope of chaos experiments, including the namespace selector, label selector,
annotation selector, node selector, and so on. This selector means that the scope of the chaos experiment is the pods with this label. You can also see these selectors on Chaos Dashboard: here are the namespace selectors, label selectors, annotation selectors, and so on.

Next, security is another key point of Chaos Mesh. We implemented an authorization mechanism based on Kubernetes RBAC. Users can create a specific service account with limited permissions, for example with a manager role or a viewer role. A token is generated for the service account, and you can use this token to log in to Chaos Dashboard; that user then has only the limited permissions. In this way you can protect important namespaces and limit the scope of chaos experiments.

Next, demo time. So that most users can try Chaos Mesh easily and quickly, the Chaos Mesh website provides interactive tutorials. These tutorials were contributed by Chaos Mesh community members and are built on Katacoda. This time I will use an interactive tutorial for the demo.

Okay, let's start the demo. First, we need to open the Chaos Mesh website. You can find the interactive tutorials on the website, and you can click here to start them. You can see that this tutorial introduces some information about Chaos Mesh, and next you can click here to start the demo. It may need some time to prepare the environment. In this demo a test Kubernetes cluster is created by Katacoda, so you can wait a moment here. Here are the steps of this tutorial. There are six steps, and you can follow them to start your first chaos experiment. Here is a shell command; you can click it. This command shows the Kubernetes cluster information. You can check the
Kubernetes cluster, and here you can click this command to check the Helm version, because we need Helm to install Chaos Mesh on this Kubernetes cluster. You can see that the Helm version is version 3. Next, continue.

When the Kubernetes cluster is prepared, we need to install Chaos Mesh. Here we use Helm, which is also what we recommend. First we add the repo, and then we set the version; here we use the latest version, 2.1.5. We click here to install Chaos Mesh. Wait a moment. This command installs Chaos Mesh on the Kubernetes cluster. It may need some time because it has to pull the images. Then we use this command to check the status of Chaos Mesh. It may need some time until all Chaos Mesh components are running; you can see that they include Chaos Controller Manager, Chaos Daemon, and Chaos Dashboard. Wait a moment. Now you can see that all components are running. Before you continue to the next step, you need to make sure all components are running.

When you have installed Chaos Mesh, you can click this link to access Chaos Dashboard. It's always the same link. Wait a moment. Maybe we need some time to access the network. Maybe something is wrong with my network. Let's check the status. All components are running. And here you can see that Chaos Dashboard is open, and we can access it through this link. First we get an overview of Chaos Mesh, which includes how many experiments, schedules, and workflows are running. You can click here to create a new workflow, here to create a new Schedule object, and here to create a single chaos experiment. All are okay. To save time, we will use a YAML file to define the chaos experiment in this demo. You can also read this document to learn how to
use Chaos Mesh, which chaos types Chaos Mesh includes, and so on. Let's continue.

In this demo we also need a target application, so I will install one. We call it web-show. This application is a web UI that shows the network delay from its pod to kube-system, and the delay is recorded in the application. First, this command installs the application. We apply the web-show Deployment, since the application runs as a Deployment, and here we install the Service so that we can access the application. Here you use this command to check the application's status. We need to make sure the application is running before continuing to the next steps. It may need some time, because it has to pull an image from Docker Hub. Wait a moment. You can see that the web-show application is running. You can click here to open and access the application. You can see that it records the network delay from the application pod to kube-system, and the latency is about 1 millisecond.

Next I will start a NetworkChaos to inject a 10-millisecond delay. You can click here to check the NetworkChaos definition. This YAML file defines a NetworkChaos: the action is delay, the selector is web-show, and the latency is 10 milliseconds, so it injects a 10-millisecond delay into our target application. You can also use this command to make sure that the selector selects the target pod, the web-show pod. To save time, we just apply this YAML file to the Kubernetes cluster, and it will inject the network delay into our target
application. You can see that this NetworkChaos is created, and you can use this command to check that it has been created on the Kubernetes cluster. You can also check the result on the web-show UI: the network delay rises to 10 milliseconds, which shows that the chaos experiment is working.

You can also add an annotation to your chaos experiment object to pause the chaos injection. You can click this: this command adds an annotation to the NetworkChaos object for the web-show network delay, and the annotation is experiment.chaos-mesh.org/pause. Now you can check the result. The network delay has returned to normal. You can also check and observe the chaos experiment on Chaos Dashboard. You can see that this chaos experiment is paused. You can click here to get the detailed information; here the dashboard shows the events of this chaos experiment, here is the definition, and here is the basic information. You can also start it, and you can also archive it. All are okay. If you want to start it again, you can also remove the annotation with this command, which removes the annotation from the chaos object. You can check the result: the NetworkChaos has been restarted by the command.

You can also just use kubectl delete to delete this NetworkChaos. Check the result on the web-show application: the delay returns to normal. You can also check the chaos experiments on Chaos Dashboard. You can see that the NetworkChaos has been deleted. When you delete experiments, schedules, or workflows, they are archived here. There are also other demos about the Schedule object, PodChaos, and so on. If
you are interested in Chaos Mesh, you can try this. It's very easy: you just open the Chaos Mesh website and follow the interactive tutorials, and you can try Chaos Mesh and start your first chaos experiment. It's very cool. OK, next let's turn back to our slides.

Next is about future plans. In the future we will continue to focus on ease of use and observability. We plan to provide more comprehensive status inspection mechanisms and a simpler and more powerful report, and to improve observability through events, logs, and metrics. About security, we plan to develop a new component to force the recovery of chaos experiments that run into problems. We also plan to provide a plugin mechanism for extending complex chaos types: users can use it to develop their own chaos types, such as RabbitMQ chaos, Redis chaos, and so on. We will also build a chaos hub for users to share their own chaos types and workflows, to lower the cost for new users of running chaos experiments.

Next is about the Chaos Mesh community, and some resources you can find there. I think the Chaos Mesh website is a good place for users to start their chaos engineering. You can find the documentation on the website; it guides users through how to install Chaos Mesh and how to quickly start their own chaos experiments. We also have interactive tutorials for you to create experiments online and quickly experience Chaos Mesh; I showed how to use them in my demo. You can also find some user cases on the Chaos Mesh blog. And here is our Chaos Mesh YouTube channel. We have monthly community meetings and we upload the videos to this channel, so you can find the meeting recordings there. You can also join the Chaos Mesh channel. The whole community is very active here; you can find me and many other community members. And if you have any
questions, you can discuss them with us on this channel. All community members are very happy to discuss and share their Chaos Mesh journey with you. And here is our GitHub. We welcome you to create an issue on the Chaos Mesh GitHub if you have a feature request, and you can also just submit a pull request to help us improve this project. Chaos Mesh is still young. I welcome everyone to participate in its evolution and to join our community. Okay, that's all of my sharing. Thanks very much, and thanks for your time.