Thanks everyone, and thanks for joining this session, so I'll just go ahead and start. Our session is about multicluster GitOps application delivery, and it's about the integration between the Argo CD pull model and OCM's hub-and-spoke pattern. Hopefully you were attracted by the Argo CD pull model portion, and we will go ahead and discuss it in more detail.

Before diving in, I want to give a quick introduction about ourselves. My name is Maggie Chen, and I am from the Open Cluster Management application lifecycle squad. Our OCM project covers a lot of components, and we mostly work on application deployment. I am a full stack software engineer; I work on both the UI and the backend, but today I will be focusing mostly on the backend. Xiangjing is our team lead; he is a principal software engineer, and we work on the same project. If you are interested in our code base, feel free to check our GitHub pages, listed here. We also have our LinkedIn profiles linked in the schedule, so feel free to contact us on LinkedIn.

So you may wonder, what is OCM? OCM stands for Open Cluster Management. It is a CNCF Sandbox project, so hopefully community approved, and it does multi-cluster, multi-cloud Kubernetes cluster management, so that you can deploy applications, policies, and so on to managed clusters in a scalable environment. We use a hub-and-spoke architecture, which I'm going to show you in a diagram shortly. Lastly, with OCM you get a centralized view of your entire fleet.

On this diagram, this is our hub-and-spoke pattern. You can see that on the hub cluster we have the Placement resource, which produces a PlacementDecision. The Placement binds cluster sets, so you are able to select a set of clusters to deploy your applications or policies to. The Placement itself has predicates, so you can do label selection or claim selection, and in the status you can see how many clusters are actually selected. On the top right, we have a YAML template that shows the Placement. You can see I'm using the cluster set "global"; with OCM you get a default global cluster set that selects all of the managed clusters you have. You can narrow that down with just a label selector, saying, for example, I only care about clusters on Amazon, so you can do "cloud In Amazon", or any other provider condition. Another resource I'll briefly cover is the ManagedCluster. You can run "oc get managedcluster" or "kubectl get managedcluster", whether you are using OpenShift or not, and that way you get the list of managed clusters and which cloud providers they belong to.

As of today, Argo CD mostly just uses the push model, which is quite simple. On your hub cluster, you have your Argo CD operator, and it creates an ApplicationSet that generates a set of Applications. Argo CD's application controller will then connect to the remote clusters and deploy the resources from those Applications to the remote clusters. So it's just one way, from the hub to the managed clusters. What we are proposing, and what has now been merged by the Argo CD community, is the pull model. You can see that aside from the Argo CD icon, we now have the OCM icon, which is the proposed integration, so we now sit in between the Argo CD hub and the managed clusters. Same thing on the hub cluster: you still have the Argo CD operator, and it generates the Applications.
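Before going further into the pull model, here is a minimal sketch of the kind of Placement described a moment ago, assuming the default global cluster set and a cloud=Amazon label selector; the name and namespace are placeholders, not taken from the talk:

```yaml
# Minimal Placement sketch: select clusters from the default "global"
# ManagedClusterSet whose "cloud" label is "Amazon".
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Placement
metadata:
  name: aws-placement          # placeholder name
  namespace: openshift-gitops  # placeholder namespace
spec:
  clusterSets:
    - global
  predicates:
    - requiredClusterSelector:
        labelSelector:
          matchExpressions:
            - key: cloud
              operator: In
              values:
                - Amazon
```

Once applied, the Placement status reports how many clusters were selected, and a PlacementDecision listing the matching cluster names is generated alongside it.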
Back to the pull model: instead of connecting to the managed clusters and deploying resources directly, our OCM propagation controller sees those Applications and creates ManifestWork objects wrapping the Application objects as payload. Our agent on each managed cluster sees them and pulls the Application from the hub down to the managed cluster. As you recall from the previous slide, there were no Applications on the managed clusters; now we have each Application on each managed cluster. One prerequisite for that is that you need to install the Argo CD controller on the managed clusters. There are a few different ways to do that. First, you can use OCM's policy templates to deliver it to each managed cluster. Or you can use what Dan Garfield just demoed this morning (thanks for coming): you can use Argo CD of Argo CD to install those operators on each managed cluster. So there are different approaches you can take. Then the Argo CD controller will see those Applications and perform local, in-cluster deployment.

So why do we want to do this? It provides a few advantages. The first one is that it scales better; we have some scale tests that Xiangjing is going to cover shortly. We also increase security by no longer requiring cluster credentials to be stored in a centralized location, which is the hub. Lastly, it reduces the impact of a single point of centralized failure: if your hub fails, you still have the Applications on your managed clusters, and they can still perform local deployment. Now I'm going to pass over to Xiangjing for more details.

Okay, so here we can see the whole Argo CD pull model architecture. It looks very complicated, there are a lot of pieces, but let me try to break it down into three major parts. The first one is located here; we call it propagation, and actually we have the local propagation and the remote propagation. I will talk about it in detail in the next slides, but before that, I just want to highlight that the purpose of this propagation is to deploy the Argo CD Applications to the managed clusters. We are not deploying resources to the remote clusters; we just deploy the Argo CD Application templates to every managed cluster that our placement decision has selected. That is the first major component, located here.

The second part is the local deployment. Once the Argo CD Application is deployed on each of the managed clusters, the native Argo CD application controller reconciles it. It pulls resources from the Git repo or Helm repo and deploys those resources, but just for this given managed cluster; it's not talking to all the other managed clusters. As Maggie just mentioned, for each managed cluster we have just one Argo CD instance running on that managed cluster.

And here are the last major components, shown here. All of these components exist for one goal: after we deploy all the Argo CD Applications to all managed clusters, we need a way to fetch the Argo CD application status from all the managed clusters. We collect it into a database, and then we have an aggregation controller that finally aggregates all the status, including the overall status and the per-cluster condition status, and combines it into one final resource. That's the new CRD we introduced in the pull model; we call it the MulticlusterApplicationSetReport.
And that report is located in the same namespace as the original Argo CD ApplicationSet that the user created.

Okay, so let's talk about some details of each major workflow. The first one is the two propagations. Here is the user: first, they create the Argo CD ApplicationSet in the hub namespace. At this point, the Argo CD ApplicationSet controller reconciles it to do the local propagation. This is functionality provided entirely by Argo CD; we don't own it. After this propagation, if the number of clusters is, let's say, in the thousands, then this native Argo CD ApplicationSet controller will create thousands of Argo CD Application templates in that same namespace.

Then the second propagation is owned by us. We introduced a new controller, which we call the propagation controller. It reconciles to do the further propagation, which we call the remote propagation, to propagate these Argo CD Application templates from the hub to a given managed cluster by using our ManifestWork API. The idea is that, using this ManifestWork, we are able to specify a cluster namespace where we want to deliver a specific workload for that given managed cluster.

So here, let me show you some details; hopefully you can see it clearly. This is a sample of the ManifestWork that's supported by our Open Cluster Management community. There are two highlights. One is the namespace: this namespace is actually the cluster namespace. As Maggie just introduced in the Open Cluster Management introduction, for every managed cluster we have a cluster namespace created on the hub. So if you want to deploy something down to that managed cluster, you have to create a ManifestWork in that cluster namespace.

Under the spec, we have two major sections. The first section is the manifestConfigs; here we're defining some feedback rules. The purpose is that once the Argo CD Application is deployed to the managed cluster, we want to fetch some of that Application's status back from the managed cluster. By using these feedback rules, we're able to define them using JSONPath. As you can see here, we are actually fetching two statuses: one is the health status, the other is the sync status. Those are the overall statuses reported in the Argo CD Application status. That's the first section.

The second section is the workload. This defines which manifests we want the ManifestWork to deploy to the managed cluster. As you can see here, this manifest is actually the original Argo CD Application template, so there's nothing special here. But one thing I want to highlight: because of the pull model, we are leveraging the Argo CD instance running on each managed cluster to do the local deployment, so the destination here has to be the local, in-cluster server URL provided by Argo CD. If you are familiar with Argo CD, then you have probably seen this server URL many times.

Okay, so once this Argo CD Application is deployed to the given managed cluster, the next step is totally owned by native Argo CD; we don't own it.
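For reference, here is a rough sketch of what such a ManifestWork could look like; the cluster namespace, application name, and repo URL are placeholders, not the exact sample from the slide:

```yaml
# ManifestWork created in the managed cluster's namespace on the hub.
# It wraps an Argo CD Application and defines JSONPath feedback rules
# so the hub can read back the overall health and sync status.
apiVersion: work.open-cluster-management.io/v1
kind: ManifestWork
metadata:
  name: guestbook-app            # placeholder
  namespace: demo-managed-0      # placeholder: the cluster namespace on the hub
spec:
  manifestConfigs:
    - resourceIdentifier:
        group: argoproj.io
        resource: applications
        namespace: openshift-gitops
        name: guestbook-app
      feedbackRules:
        - type: JSONPaths
          jsonPaths:
            - name: healthStatus
              path: .status.health.status
            - name: syncStatus
              path: .status.sync.status
  workload:
    manifests:
      - apiVersion: argoproj.io/v1alpha1
        kind: Application
        metadata:
          name: guestbook-app
          namespace: openshift-gitops
        spec:
          project: default
          source:
            repoURL: https://github.com/example/guestbook   # placeholder
            path: guestbook
            targetRevision: main
          destination:
            # The local, in-cluster API server address, so the Argo CD
            # instance on the managed cluster deploys to its own cluster.
            server: https://kubernetes.default.svc
            namespace: guestbook
          syncPolicy:
            automated: {}
```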
So again, we just deliver the Argo CD Application template to the managed cluster, and then the Argo CD application controller reconciles it, pulls the resources from the Git repo or Helm repo, and does all the local deployment, but just on that managed cluster, and all the status is reported on that managed cluster as well.

Okay, so the last major part: once we have all of these Applications deployed on all the managed clusters, the next step is to find a way to collect all of this multi-cluster application status from all the managed clusters. Here we have three major components to support this status collection and aggregation.

The first one is our OCM search component, which we are leveraging. A search collector runs as an agent on each managed cluster, managed by our OCM components, so it is able to collect all the Argo CD application status from each managed cluster. It then saves all of this status into the search PostgreSQL database running on the hub cluster, for later queries.

The second component is the resource sync controller, which is provided by our pull model components. It periodically runs a search query to fetch the resource lists and any possible error conditions from all the Argo CD Applications deployed on all managed clusters.

The third component we are using here is the aggregation controller. I just want to mention that the aggregation controller and the resource sync controller run in the same pod, as sidecar containers. The reason is that once the resource sync controller fetches the application status, it generates an intermediate layer of this status as YAML-format files in a shared volume in the pod, and that shared volume is then picked up by the aggregation controller. From here, you can see the aggregation controller combines two statuses. It fetches the overall status from the ManifestWork, as I just showed; we specified the JSONPaths to fetch the overall status, including the health status and the sync status. And second, it fetches the per-cluster condition status from the intermediate YAML output. Finally, by combining all of this information, we are able to generate the final MulticlusterApplicationSetReport in the same ApplicationSet namespace. So this is how the application status collection and aggregation works.

Here I also posted a YAML file for reference. As I just mentioned, our resource sync controller fetches all the cluster conditions from all the managed clusters by using the search component, and it generates this YAML-format result into the shared volume in the pod. Each ApplicationSet has one YAML-format output here. The first section shows the resource list we get from the first application deployed to the first managed cluster. We also have all the cluster conditions: we have the cluster name, and we have the sync status and health status. The sync status and health status are empty here because we use the ManifestWork to get the overall sync status and health status instead. But we also have this conditions list, which is used for reporting any possible error conditions this application could have. If there are no error conditions, the conditions list will be empty.
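To give a sense of the end result, here is a rough sketch of what the final MulticlusterApplicationSetReport might look like once the overall status and the per-cluster conditions are combined; the field names are approximate, and the cluster, application, and resource names are placeholders:

```yaml
# Approximate sketch of the aggregated report living next to the ApplicationSet.
apiVersion: apps.open-cluster-management.io/v1alpha1
kind: MulticlusterApplicationSetReport
metadata:
  name: guestbook-appset          # placeholder, matching the ApplicationSet name
  namespace: openshift-gitops
statuses:
  summary:
    clusters: "2"
    healthy: "2"
    notHealthy: "0"
    synced: "2"
    notSynced: "0"
    inProgress: "0"
  resources:
    - apiVersion: apps/v1
      kind: Deployment
      name: guestbook
      namespace: guestbook
    - apiVersion: v1
      kind: Service
      name: guestbook
      namespace: guestbook
  clusterConditions:
    - cluster: demo-managed-0
      healthStatus: Healthy       # overall status via the ManifestWork feedback rules
      syncStatus: Synced
      conditions: []              # error conditions from the search query, empty when healthy
    - cluster: demo-managed-1
      healthStatus: Healthy
      syncStatus: Synced
      conditions: []
```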
Okay, I think that covers the architecture overview. So let's talk about the prerequisites for being able to use this pull model. We actually have four prerequisites, but I just want to say that almost all of them are required by Argo CD itself; in order to use the pull model, we have to follow the Argo CD way of setting up these conditions correctly.

The first one is that we need to install the Argo CD operator, or the OpenShift GitOps operator, which are essentially the same thing, on the hub and on all the managed clusters, and it has to be installed in the fixed namespace, openshift-gitops.

The second prerequisite is that on the hub cluster, we have to import all the managed cluster secrets into the openshift-gitops namespace. That's required by the ApplicationSet controller, because if there is no cluster secret located in the Argo CD server namespace, then that managed cluster won't be recognized by the ApplicationSet controller, so the first, local propagation will fail; the Argo CD Application won't be propagated for that managed cluster if there's no such cluster secret in the namespace.

And the third one is also required by the Argo CD application controller. Right now, if we want to deploy something out to the managed clusters into a specific namespace, we have to set a particular label to let that namespace be managed by the application controller. Otherwise there is no RBAC permission and the deployment will fail. Okay, then let me hand back to Maggie to talk about the other prerequisites.

Yeah, so I feel like this is important: I just want to say that we have constant meetings with the Argo CD community. We attend their meetings quite frequently, we made proposals, we have PRs, and we went through iterations of those PRs. It's currently in the official Argo CD documentation, and you can find the usage there; it is specifically for Open Cluster Management with the pull integration. You will need the skip-reconcile annotation to enable the pull model on the Application.

So now I'm going to show you an actual, oh, sorry, I think someone was taking a picture, sorry for interjecting. Okay, so now I'm going to show you the official demo. It's a recording, because I don't want anything surprising to happen. But before I click play, I just want to point out the terminals; sorry, I can't enlarge them. Essentially we have terminals for the hub and two managed clusters, and I'm going to perform the ApplicationSet deployment on the hub cluster. Now I'm listing the managed clusters, and you can see I have demo zero, demo one, and my local cluster. I am now applying the ApplicationSet itself, and you can see the placement and applications are created successfully.

Let's take a look at the ApplicationSet itself. Let me scroll up and pause right here. You can see that I am using a placement called bgdk-app-placement that selects the two demo managed clusters; I'm going to show you the placement shortly. But I do want to point out that under the metadata annotations there are some OCM-specific annotations, such as ocm-managed-cluster, which injects the managed cluster name, ocm-managed-cluster-app-namespace, which is the one we defined, as well as skip-reconcile, and you can find more documentation on these in the official Argo CD docs.
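To make those pieces concrete, here is a rough sketch of an ApplicationSet template carrying the pull model markers, plus a target namespace carrying the managed-by label from the third prerequisite. The generator config map, repo URL, and resource names are placeholders, and the exact annotation and label keys should be checked against the official Argo CD pull integration docs:

```yaml
# Sketch of an ApplicationSet driven by an OCM placement through the
# cluster decision resource generator, with the pull model markers on
# the generated Application template.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: guestbook-appset
  namespace: openshift-gitops
spec:
  generators:
    - clusterDecisionResource:
        configMapRef: ocm-placement-generator   # placeholder config map name
        labelSelector:
          matchLabels:
            cluster.open-cluster-management.io/placement: guestbook-placement
        requeueAfterSeconds: 30
  template:
    metadata:
      name: guestbook-{{name}}
      labels:
        apps.open-cluster-management.io/pull-to-ocm-managed-cluster: "true"
      annotations:
        argocd.argoproj.io/skip-reconcile: "true"                        # hub does not deploy it
        apps.open-cluster-management.io/ocm-managed-cluster: "{{name}}"  # injected cluster name
        apps.open-cluster-management.io/ocm-managed-cluster-app-namespace: openshift-gitops
    spec:
      project: default
      source:
        repoURL: https://github.com/example/guestbook   # placeholder
        path: guestbook
        targetRevision: main
      destination:
        server: https://kubernetes.default.svc
        namespace: guestbook
      syncPolicy:
        automated: {}
---
# Target namespace labeled so the local Argo CD instance is allowed to manage it,
# assuming the OpenShift GitOps managed-by convention.
apiVersion: v1
kind: Namespace
metadata:
  name: guestbook
  labels:
    argocd.argoproj.io/managed-by: openshift-gitops
```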
Aside from those annotations, you can see the rest of the spec, like destination and project, is just what you would usually have when you create an ApplicationSet; we didn't make many variations to the spec. Now let's take a look at the placement. For this placement I didn't use the AWS label; I'm using environment set to demo, and you can see I'm using the global cluster set. I have two clusters selected, and that's also reflected under the status. In terms of the PlacementDecision, you can see I have a list of decisions that shows my two cluster names (a rough sketch of such a PlacementDecision follows after this demo walkthrough).

Now let's take a look at the Applications. You can see they are deployed on the hub, but the actual managed cluster deployments are not done from the hub; we have an Application on each managed cluster itself, and you can see they were created by OCM's propagation controller. The resources are also deployed: you can see the ReplicaSet, the Deployment, everything is in a healthy state, including the Route as well. Let's take a look at the demo Application itself. It's the same Argo CD Application template; it has a list of resources and their statuses, and they are all healthy, which is great.

And on the hub cluster, let's verify those statuses through OCM's MulticlusterApplicationSetReport. This is a resource created by OCM, and you can see that we have a list of resources, and under the cluster status they're all healthy. The resources are listed here, and we have the overall status saying we have two clusters selected, both of them are healthy, and they are all synced. So that's the success path. In general you could also have a progressing, stuck, or actually failed status, and I do want to show you that.

So now I am deploying an ApplicationSet that's going to get stuck. You can see it's actually in a synced state but progressing on the managed cluster. Let's take a look at the Application on the managed cluster; it should reflect what you just saw. Even though some of the resources are synced, you can see we have certain resources stuck in a progressing state, and we do have the messages: they are waiting for some rollouts, and they will never finish because we intentionally made them wait for something. We do want to make sure the MulticlusterApplicationSetReport also reflects this, and as you can see, under the summary we now have two clusters, none of them are healthy, and they are all in progress. If you scroll up, you still have the list of resources, but the health status is now stuck in progressing. They're still synced, so that's the good part. By the way, all of the ApplicationSets are using the same placement, so they are all deploying to the two managed clusters.

So now I am applying the third ApplicationSet, which is going to fail. On one of the managed clusters, I just want to verify that it failed, which is the case: it's no longer synced, the health status is missing, and it's out of sync. On the Application itself you can see, if you scroll up, that the health status is in a missing state and it is out of sync. I do want to point out one limitation we have: even though those resources failed, ideally they should still be reflected on the hub, in the MulticlusterApplicationSetReport, but that's not the case. If you look at the resource list, we only have a Service kind, which is not the full picture, because you can see that we have a Deployment that failed. So that's one of our limitations, really a bug, that we should fix.
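As mentioned above, here is a rough sketch of the PlacementDecision behind that demo placement; the names are placeholders standing in for the two demo managed clusters:

```yaml
# PlacementDecision generated for the demo Placement; the decisions list
# is what the ApplicationSet's cluster decision resource generator consumes.
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: PlacementDecision
metadata:
  name: bgdk-app-placement-decision-1        # placeholder
  namespace: openshift-gitops
  labels:
    cluster.open-cluster-management.io/placement: bgdk-app-placement
status:
  decisions:
    - clusterName: demo-managed-0
      reason: ""
    - clusterName: demo-managed-1
      reason: ""
```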
I just want to point that out, but aside from that resource-list gap, we do have the notHealthy and notSynced statuses, which are accurate, and under the conditions we also have those failure messages reflected from the managed cluster up to the hub cluster. That's pretty much it for my demo portion.

So let me go ahead and discuss the limitations. I already covered number two, which is that all of the resources from the managed cluster should be reflected on the hub, in the MulticlusterApplicationSetReport, and that doesn't happen today for the failed state. Another limitation we have is that resources are only deployed on the managed clusters; with the pull model, they are not deployed on the hub cluster. And lastly, in the pull model we exclude the local cluster as a target managed cluster.

There are a lot of different ways to reach us if you're interested in the pull model or want to enhance it. We have the GitHub page, we have the website and docs. If you want to interact and reach us, we have a Slack channel, a YouTube channel, a mailing group, and community meetings. You can also reach us on LinkedIn if you're interested as well. Xiangjing, do you want to talk about the scale tests?

Yeah, okay, so as Maggie just mentioned, there are some limitations that we have already identified, and that's partially because of limitations in our search component and in ManifestWork. Firstly, the ManifestWork feedback rules I just showed currently only support fetching string data types; they do not support fetching list data types. That's why we have to rely on the search component to collect those resource lists and the lists of error conditions. But for the search component, one of the major constraints is that it can only find resources that are actually deployed on the managed clusters, which means a resource has to have a UID; otherwise, those failed resources won't be fetched by the search collector. Those are the two limitations we identified, and we have a roadmap plan under consideration that I just want to share with you.

For ManifestWork, we already have a proposal to enhance the ManifestWork API so that it will be able to fetch list data as raw JSON. With that, we would fetch both the overall status and the per-cluster condition status from the managed clusters directly from the ManifestWork status. Once that is ready, we will probably drop the search component dependency in a future release, which will make things more efficient.

The other thing is, I just want to spend maybe 30 seconds talking about some of the performance tests. They're still underway, we're not finished, but we just want to demonstrate the scalability advantage of using this pull model. Before sharing our own test, I want to share two numbers, thanks to our OpenShift GitOps operator team. They talk about two numbers for native Argo CD today. The first one is the total number of applications it can control: they found that once the number of applications using the push model goes over 1,000, the Argo CD application UI has some performance issues, so it's not stable. And the second number: they found that once one Argo CD instance deploys resources to over 100 clusters,
it also hits some performance issues. Here, we already ran some performance tests using this pull model, and our scenario was that we deployed 20 ApplicationSets to 500 managed clusters, so the total would be 10,000 applications. It took around two and a half minutes until we got all the managed clusters reported in our final MulticlusterApplicationSetReports. Those are the rough performance numbers I wanted to share here. Okay, so that's it. Thanks. Any questions? Go ahead, yep.

I saw in the demo that the ApplicationSet deployment had an annotation for targeting a managed cluster. If you don't use that annotation, are you basically saying you want it to just go to all managed clusters, or is there a way to have an ApplicationSet go to all managed clusters by default?

Yeah, if you do not specify the predicates, it's just going to use the cluster set, which is the global cluster set, and that deploys to all clusters. The predicates are there if you want to select a subset of clusters, but you definitely have the choice to deploy to all clusters. To be more specific, you can essentially delete everything under the predicates in the spec; it's just an option for you to narrow down the list. Thank you. No problem.

On a scale of one to ten, where one is pre-alpha and ten is in production and bulletproof, where would you put OCM? Or, another question: do you have customers that are using it today in production?

Do we have any real customers yet? I think it's still under development, and that's why we wanted to be here to raise some awareness. We are going to develop it more in the next release, but I would say this is in an alpha state. And you can see we have a lot of prerequisites, which is not ideal, and I am looking forward to collaborating with Argo more so that maybe we can reduce the number of prerequisites to maybe just one. Okay, thank you.

Right, and for this pull model feature, we are planning to release it in our downstream product called ACM; that will be released this June, in 2.8, and the feature will be published along with that release.

Are there any other questions? If not, thanks everyone for joining. We are looking forward to seeing more contributions in the future. And thanks, Argo. Thank you. Thank you.