Hi, good afternoon, and welcome to my presentation. It's going to be a little more high-level than the ones you have seen in this room before. I will share with you the experience I have collected together with my team in building and implementing applications that were deployed across multiple public clouds, or in a hybrid cloud model, where we had connectivity between a private cloud and public clouds.

My name is Yaroslav, but you can call me Eric as well. I have been with Red Hat for almost seven years now, but I must admit with shame that this is my first time at DevConf, so I'm very happy to be here. Before we get started, I would just like to ask: who of you in the room is from Red Hat? Okay, quite a few.

This journey with these kinds of architectures started for me around five years ago. If you can recall, this was the time when Red Hat released OpenShift 3. Actually, some customers were involved even earlier, so they started implementations of this kind of architecture together with us shortly before OpenShift 3 was released. In this presentation I will try to share with you some findings and interesting aspects of building these architectures.

When we started five years ago, it was a totally new space for everyone. However, if you look at research like this now, you can see that nowadays most customers who are willing to deploy applications in the public cloud will use some sort of multi-cloud or hybrid cloud strategy, so it has become mainstream. Also, the tooling and software I will talk about has matured over these five years, making these kinds of projects much easier than they were in the past.

So there are quite a lot of good reasons why customers use these architectures, and there are three major groups.
I would say the first one is really cost management. Public cloud is not free; in fact, it can be quite expensive in some scenarios, so these architectures might be a good way to contain that cost. Then there are some technical reasons, like increasing availability or reducing latency. And there might also be compliance or regulatory requirements, especially related to data management and data transfer across different geographies.

These are the most typical topologies I worked with within this five-year journey. Most projects started with people experimenting with the public cloud: they typically began by putting non-production workloads into the public cloud, and only later moved production there as well. So the typical starting point for many customers was to run non-production environments in the public cloud while keeping production in-house.

The second use case was where we wanted to build a solution or system distributed across multiple geographies. If you are working for a global company, or you are involved in global projects, sometimes there are applications that need to be available across the globe, serving customers from Europe, Asia, and the US. That typically means deploying instances of the application across geographies, in different clouds.

The other use case is about scaling. There are some types of applications that typically require limited resources, but at certain points in time, during peaks or other events, they need to scale up dramatically. This is also a good case for leveraging a public cloud, where additional resources are available on demand.

And the other scenario is when we layer our application.
This is typically done in such a way that we have a data layer sitting in our private cloud, while some APIs or front ends are deployed in public clouds and communicate with the back ends running in the private cloud.

So this is more or less the setup for my presentation, and it has essentially two parts. One is more for developers: how to architect the application and how to manage CI/CD and application deployments across multiple or hybrid clouds. The other is more about operations: how to manage things like networking, data, application management, monitoring, and security.

One more question for you: who of you is more on the developer side, and who is on the operations side? Okay, we have about 50-50, a good mix. I guess there are also some DevOps folks who do both.

All right, let's get started with the application. As I mentioned in the beginning, this journey started at the time when Red Hat was releasing OpenShift version 3, which was based on Docker at the time. Containers are really the key component of those application architectures, and in these projects there was always a requirement to containerize the application in order to achieve the maximum portability needed for multi-cloud or hybrid cloud deployments.

Of course, we had two types of situations: either we wanted to migrate an existing application, or we wanted to develop a completely new one. With regard to existing applications, there are basically three strategies that work well in real life.
The first strategy is called rehosting, and this is where we simply lift and shift our application. In essence, we take our application as is and containerize the layers, or the application itself, depending on whether the architecture is a monolith or has more layers. This works well for some types of applications, especially for technologies that are reasonably modern and not really legacy stuff.

For other applications, another possible migration strategy is called re-platforming. Here we basically keep the existing application as is, but any new capabilities introduced to the application over time are built using containers and designed to run in containerized environments. Then, of course, we need to build some kind of integration layer, as you can see on the slide, between our existing system and the new components we introduce. This is a more complex approach than the previous one, but it still works well in many scenarios.

And then we have the third approach, which is refactoring. These are typically the most complex projects, where we really take the effort to rewrite our application into a new containerized architecture. This might also mean that we migrate from a monolithic architecture to microservices, or make any other needed architecture changes. These are typically big and complex projects that take a lot of time and are costly, especially compared to the previous two approaches.

When we talk about developing new applications, we standardized on the cloud-native approach. What that really means is that we standardized on containers as the runtime, microservices as the architecture, APIs as the standard of communication between microservices, and DevOps as the process to
manage CI/CD for the applications.

Once we had a containerized application, we needed a platform to manage and deploy it in our multi-cloud or hybrid cloud environments, and of course that platform is OpenShift. By the way, who of you is familiar with OpenShift? Okay, most of you, so no surprise.

What OpenShift gives us is flexibility of the platform, and basically a consistent user experience across different clouds. So if we combine OpenShift with the portability of containers, we have a pretty good foundation to be successful in deploying our applications into multi-cloud or hybrid cloud environments.

Containerization, however, while it works in many scenarios, doesn't work in all of them. There are, and will continue to be, some workloads running in virtual machines where we find it too complex or too expensive to take the effort to migrate them to containers. On the other hand, we still want to leverage the benefits of having OpenShift as a consistent platform for running our applications. There is an effort in the community to build a platform that will let you run virtual machines natively on OpenShift. At the moment it is still a community effort, but if you are interested in this kind of use case, I especially encourage you to have a look at the KubeVirt and MetalKube projects, which are the main projects that will provide Kubernetes and OpenShift with the capability to run KVM-based virtual machines natively. So there is hope that natively virtual-machine-based workloads will also be able to leverage the benefits of OpenShift.

Okay, so we have the containers and we have the platform to run our application. Now the first thing we need to implement is
CI/CD. Our first attempt at CI/CD in a multi-cloud or hybrid environment was really to leverage existing tools and knowledge, which in our case meant Jenkins. Initially, the only change we made in our pipelines was to introduce multi-cluster deployment of the images that were created and tested during pipeline execution.

This worked pretty well for us, but at some point we had a number of situations where the deployment to some of the clusters that were part of the platform failed. Because this step was part of our pipeline, this really meant that our whole pipeline failed. We started building solutions in Jenkins to handle these situations, but at some point we realized this was pretty much a road to nowhere: there would always be some new networking issue we hadn't covered, and Jenkins is not the platform for replicating container images across different clouds or environments.

Maybe a bit coincidentally, at pretty much the same time, Red Hat made the acquisition of CoreOS, and with CoreOS we also acquired a container registry called Quay. Quay provided one functionality that was really the one we needed: geo-replication. With Quay geo-replication, as you can see here in the picture, you basically push an image to one instance in your distributed Quay environment, and Quay distributes the image across all registered or connected instances. So this was the way we moved forward: we started to leverage Quay, deployed across multiple clouds in different geographies, and we gave Quay the responsibility of replicating the images across the data centers.

Another challenge here was the management of application configurations across different data centers. We had an OpenShift
cluster deployed in each data center, and we needed to replicate application configurations across those clusters in a consistent manner. Now it's probably pretty obvious that for this kind of challenge you would leverage GitOps, but when we were facing that problem three or four years ago, it was not that obvious.

You probably know by now the idea behind GitOps: we keep our application configurations, which are YAML-based, in a Git repo, and then we have an engine that can replicate those configurations across multiple registered Kubernetes or OpenShift clusters. At the time, we made the decision to leverage Argo as that engine, and Argo actually gave us some very nice functionality.

First of all, you can register multiple clusters, which can be deployed anywhere across different data centers, clouds, and geographies. It of course synchronizes the application configuration with the Git repo. What was also very important, it gives us the possibility to make some overlay configurations that are specific to one data center. If you think about a Kubernetes or OpenShift application, there are a number of configurations that are specific to your data center: things like credentials, so typically ConfigMaps and Secrets, and ingress configurations; they are specific to every cluster or application instance. So we needed the ability to make overlay configurations specific to a cluster, and we got that with Argo.

Then, of course, Argo synchronizes the configuration to each of the clusters. And what is also very important and very useful about Argo is this last thing: Argo will monitor the configuration of your application on each of the clusters, detect if someone locally makes changes to the configuration, and revert that change back to the configuration that is defined in the Git repo.
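As an illustration of this setup, an Argo CD Application can be declared roughly like the following. This is my own minimal sketch rather than a slide from the talk; the repo URL, overlay path, and cluster API address are hypothetical placeholders:

```yaml
# Hypothetical Argo CD Application: sync manifests from a Git repo to one
# registered cluster, and automatically revert local drift (selfHeal).
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app-eu-cluster
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/org/app-config.git  # placeholder repo
    targetRevision: main
    path: overlays/eu-cluster   # per-cluster overlay (e.g. Kustomize)
  destination:
    server: https://api.eu-cluster.example.com:6443      # placeholder API endpoint
    namespace: my-app
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual changes made directly on the cluster
```

One such Application would be registered per cluster, so each destination can point at its own overlay path while sharing the same base configuration in Git.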
This also reverses, a little bit, the way we manage the configuration: we no longer manage it via the command line or the web console of OpenShift or Kubernetes, but instead we leverage the Git repo as the single source of truth.

Over time, GitOps might become even easier, because our community has introduced the concept of operators. With operators you can basically package the application configuration together with the application, which nowadays makes some application configurations easier: you can package some application configurations into operators, so that your app configuration becomes simpler. There is also a community effort to build a kind of standardized solution for this use case, for federated deployments of applications, but this is still work in progress, early days, and I don't see it moving forward fast. So I think the GitOps approach is still the best way to manage application configuration in multi-cloud or hybrid cloud environments.

Okay, so now the second part, maybe a bit more related to operations and infrastructure: networking. From the networking point of view, hybrid and multi-cloud introduce a number of challenges. The first challenge is how you manage ingress traffic. Instead of one cluster with an ingress load balancer in front, you will have multiple clusters, each with its own local load balancer. What you will need here is a global traffic manager (GTM), which can distribute the traffic between these multiple clusters. A GTM is typically a DNS server, so the traffic does not really go via the GTM; it is a DNS server with some advanced capabilities related to geolocation, for example. As a DNS server, it might give you a different IP address
of one of these load balancers depending on where you are sending your request from, and this is quite important in some scenarios, especially if you want to distribute the traffic according to the geography of the requester.

The other challenge is the service mesh in multi-cloud and hybrid cloud. Who of you is familiar with service mesh? Okay, some of you are. A service mesh is very useful when you have a microservices architecture: with a service mesh you can manage all aspects of the network communication between the microservices. Think of security, think of policy-based routing; it also gives you some very nice monitoring and observability. On the other hand, it is quite a heavy and complex component of the Kubernetes or OpenShift architecture.

A service mesh gives you at least three options for how to deploy and manage it in a multi-cluster environment. The first two options you can use if you have completely separate networks between your data centers.
For example, this might be the case if you deploy your application in multiple clouds. The third option requires that you have a single network between your clusters. There are some tweaks in each of the approaches, so it is really not easy to analyze and decide which option to use. In my experience, we typically used the second option, where we wanted to avoid having multiple control planes in the mesh: we wanted to keep the service mesh configuration in a single location and have the service mesh managed in a central way. This image shows you that scenario.

In regard to networking, there is also some community effort to make it easier to connect multiple clusters. One project to which we at Red Hat also contribute is Submariner. This project tries to build a kind of VPN that runs on top of Kubernetes, so that you can easily connect different networks without needing to go deep into the infrastructure layer to open tunnels and so on. There are other projects, Lighthouse and Coastguard, which also aim at making networking for multi-cluster scenarios easier, but these are still community efforts, so not yet generally available.

The next thing is data replication. Data replication basically has two complexities. The first one is technical: how can we send the data in an efficient way between different continents, or even between different networks. The other one is about cost: the cost of data replication between public clouds can be very substantial.
So You cannot underestimate that factor and you should always Think and analyze whether this won't make your Architecture or your use case And unacceptable So in regards to data replication there are basically three Three possible scenarios, of course if if you don't have to the best way is to do not To do not replicate the data But if you have to you can theoretically consider three options. So either you will use some Infrastructure layer so storage layer replication solutions or You have two options on the applications layer. So either you replicate the data using some application-level technology or you leverage the Data partitioning and and you don't really replicate, but you just partition your data across different clouds The infrastructure-based synchronization of course especially in public clouds scenarios Might be impossible, especially if you are thinking about some hardware level replications however, there is a Development and progress in building software defined solutions to replicate the data the one which which is now part of Red Hat offering is is is Nuba Cloud object gateway. So these solutions let you Replicate the the object storage Between different different cloud or private and public cloud The kind of downside of the solution is that it it Exposes the data using the S3 API. So this is not feasible for any kind of application You need to your application to be to be able to leverage S3 API But once you have this this You meet this requirement Nuba can replicate for you The the object storage across different Different clouds in a transparent away from the application point of view So but this is this is quite new so Even even in Red Hat we just Introduced this as part of of our new container storage version for which was released. 
just a week or two ago, I believe. So what we typically used to do in our past projects was to leverage application-based replication solutions. This is an example: the multi-cloud architecture of Red Hat Single Sign-On, a platform based on the Keycloak project. As you can see, this platform uses two persistence layers: one is a relational database, which in this example is a MySQL database with Galera for multi-master replication, and the other is a data grid, an in-memory cache to offload some data from the database. As you can see here, both Galera and JDG have built-in cross-data-center replication functionality, built using custom protocols running on layer 4, so TCP or UDP. This is typically the most feasible way to implement data replication whenever one of your data centers is in a public cloud, because, as I said before, you won't be able to replicate the data using infrastructure solutions.

A similar approach is to use a messaging platform. AMQ offers the Interconnect functionality, which lets you exchange messages between distributed data centers, so that users can send a message into one data center and some consumers might consume the same messages in a different data center. This is again based on a layer-4 replication solution built into AMQ Interconnect.

The next thing is management. Nowadays multi-cluster and multi-cloud management is a very hot topic, and many vendors have started to build or offer solutions in that area. In the case of Red Hat, first of all we have significantly improved the process of installing and upgrading OpenShift. I'm referring here to OpenShift 4, which you can nowadays install using the full-stack automation strategy, where you basically need to provide around five to ten
parameters, and you will have a cluster deployed automatically in, let's say, half an hour. There are also two hosted offerings: OpenShift Dedicated, which is hosted on AWS, and the other hosted offering is on Azure. So from the installation and upgrades point of view, there is not much more effort in managing a multi-cluster OpenShift environment versus a single-cluster or private cloud environment.

We are also offering, in a software-as-a-service model, the OpenShift Cluster Manager console. But this console is not really there to manage the platform; it is more for us to offer and provide you some services like subscription management, updates, and some proactive support services, rather than to address these multi-cluster management use cases.

However, because we joined the IBM family, there is one more product which looks like it will be our go-to platform for multi-cloud or multi-cluster management: IBM Multicloud Manager. A significant part of its functionality is the management of Kubernetes clusters, and it already offers quite advanced capabilities in regard to visibility of your clusters: you can register multiple clusters to the console and monitor them and their metrics from a single instance. It also has some nice functionality related to workload deployments, which might at some point complement what we are currently doing with GitOps. It offers you some functionality for operations too, so updates, patching, and so on, and I think that over time this will be integrated with OpenShift 4 capabilities. The platform itself can be deployed on top of OpenShift. I'm expecting that soon you will hear more about that product as our solution for multi-cloud or multi-cluster management, both on-premise and as a SaaS offering.

The next thing is monitoring. OpenShift comes with a built-in monitoring stack based on
Prometheus and Grafana. There are basically two scenarios for how we can leverage that monitoring stack in a multi-cloud or multi-cluster environment.

The first approach, which I have experience with, was to deploy a separate Grafana instance where we define dashboards that connect to the Prometheus instances running on the different clusters or clouds. The other approach, which is pretty new, is to leverage the Prometheus federation capabilities: you can have a Prometheus instance that pulls the metrics from the Prometheus instances running on your clusters, and then use Grafana to create dashboards on top of it.

In these two scenarios, the big concern was really the cost of getting the data out, especially in public cloud scenarios: getting the data from the public clouds to Grafana in the first case, or to the federating Prometheus in the second. I was involved in one project where we did some tests to compare the volumes of data for some dashboards that were built in Grafana. In that project we found that using Grafana to collect metrics directly from Prometheus was cheaper than letting a central Prometheus federate with the other Prometheus instances. But I cannot say if this is a general rule.
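For reference, the federation approach just described is configured on the central Prometheus roughly as below. This is a sketch with hypothetical endpoints; in practice the `match[]` selectors control how many series you pull, which directly drives the data-transfer cost discussed above:

```yaml
# Hypothetical scrape config for a central Prometheus that federates
# selected series from per-cluster Prometheus instances via /federate.
scrape_configs:
  - job_name: 'federate'
    scrape_interval: 60s
    honor_labels: true          # keep the labels set by the source Prometheus
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="kubernetes-nodes"}'   # pull only selected jobs
        - '{__name__=~"job:.*"}'       # e.g. pre-aggregated recording rules
    static_configs:
      - targets:                       # placeholder per-cluster endpoints
          - 'prometheus.eu-cluster.example.com:9090'
          - 'prometheus.us-cluster.example.com:9090'
```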
Maybe if you somehow optimize or reconfigure Prometheus, you can be more efficient with the second scenario.

And the last topic is security, a very important thing. OpenShift is by design a very secure platform. On every layer, starting from the operating system (you may have heard presentations today covering operating-system-level security), throughout all the layers of OpenShift, you have built-in security controls that let you run it in a very secure manner. Also, from the container images point of view, there is a very nice container scanning functionality that is part of the Quay container registry.

But what was a really amazing part in OpenShift was the ability to define and enforce multi-cluster security policies, which would be deployed and enforced across multiple clusters at once. We have been playing with Open Policy Agent, which is a pretty cool solution for defining security policies. It runs as an admission controller. If you know admission controllers, they are kind of low-level interceptors of requests to the Kubernetes API, so they have access to all the data that is sent to the etcd database. With those admission controllers you can easily analyze any data, any YAML content, that is sent to the API. But Open Policy Agent has no multi-cluster support, so if you use that solution in a multi-cluster environment, most probably you will need to leverage a GitOps tool, or some other replication tool, to manage the security policies in a centralized way.

The IBM Multicloud Manager, the tool I mentioned in regard to management, also has quite a nice compliance module where you can define compliance policies, and it has a built-in capability to replicate them across multiple clusters. If we look at the products available on the market, I think this is the best solution for this challenge.

So I put a couple of links here for you.
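Before the links, one concrete illustration of the Open Policy Agent idea: a Rego policy can be packaged as a ConfigMap, so the same GitOps tooling discussed earlier can replicate it to every cluster. This is my own hypothetical sketch, not something shown in the talk; the registry name is a placeholder, and the label convention assumes OPA's kube-mgmt sidecar:

```yaml
# Hypothetical: a Rego admission policy shipped as a ConfigMap, replicable
# across clusters with GitOps; it rejects Pods pulling images from
# unapproved registries.
apiVersion: v1
kind: ConfigMap
metadata:
  name: approved-registry-policy
  namespace: opa
  labels:
    openpolicyagent.org/policy: rego   # picked up by the kube-mgmt sidecar
data:
  policy.rego: |
    package kubernetes.admission

    deny[msg] {
        input.request.kind.kind == "Pod"
        image := input.request.object.spec.containers[_].image
        not startswith(image, "quay.example.com/")   # placeholder registry
        msg := sprintf("image %q is not from the approved registry", [image])
    }
```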
So if you want to go deeper into some of the topics I briefly discussed, here are the links. I will leave the presentation here for you to download, so feel free to learn more about this stuff. And I think that's pretty much all from my side. I'm almost out of time, but if you have any questions, I think I can take one or two.

[Inaudible question from the audience] Sorry, can you repeat that? I can't hear you. The download page? Well, I uploaded this presentation to the conference schedule page, so if you go to my presentation in the schedule, you should have a link there. Hopefully. Any more questions?

[Inaudible question from the audience] For data transfer, it's also a matter of optimizing, but at the time when we did some tests, it was probably two times more expensive to let Prometheus get the data from the other Prometheus instances, versus letting Grafana execute the PromQL queries against Prometheus.

Okay, I don't see any more questions. So thank you very much for your time.