So welcome, everybody, and thanks for joining. I'm Mikhail Skargakis. I've been working at Red Hat for four years, and during the past year I've been part of the OpenShift on Azure team. Together with the Microsoft team on the other side, we've been building OpenShift on Azure. The goal of this presentation is that after you leave this room, you have a good grasp of the service we are building.

A quick walkthrough of the agenda. I'm going to give a brief overview of the service, then walk you through the user experience; note that most of it is still a work in progress, especially the portal parts. We're going to dive into the architecture of the product, and I'll explain the Azure- and OpenShift-related terms as they come up. I'll touch briefly on the integrations we have between OpenShift and Azure, the teams putting in the effort to bring this service to market, the support we're going to offer our customers, a rough timeline, and a couple of references for you.

Before I get started, a quick show of hands. How many of you are familiar with OpenShift? Please raise your hands. Cool, so about four-fifths of the room. How many of you are familiar with Azure? Raise your hands. Okay, much less. No worries, I'm going to explain things as we go.

So first, the overview. Our main goals with this service are two things: ease of use, of course, and scalability for our customers. We want to make it easy for them to stand up clusters, manage clusters, upgrade, scale up, scale down, or even delete them. And of course we want to do it in a way that is scalable, so we can match customer needs. This is the first OpenShift service offering in the public cloud that we are offering jointly with Microsoft. The clusters are going to run in customer subscriptions, with SLAs. Right now, we are in private preview one.
We offer no SLAs yet, but once we release private preview two and then GA, we are going to offer SLAs. Red Hat is responsible for most of the workload, and that's fair, because we build OpenShift; we know how to operate OpenShift. The architecture is highly opinionated. Some of you already know that OpenShift is one of the most complex projects out there, with lots of configurables, so we try to pare the configurables exposed to our customers down to a bare minimum. That makes it easier both for customers to understand how to stand up a cluster and for us to maintain lots of different clusters whose configurations are as similar as possible. And we are compliant with a bunch of different standards.

Now, the user experience. Assuming you have already logged into the Azure portal, the first thing you're going to do is create a cluster; note that the portal is still a work in progress. This is the first view you will see when you create a cluster. You have to provide us with a few pieces of information, such as your subscription, which I think is going to be filled in automatically, and the resource group. How many of you are familiar with resource groups in Azure? Raise your hands. Okay, so for the rest of you: resource groups are like namespaces. You can think of them as namespaces in Kubernetes or OpenShift; a resource group is a group of resources, and we are going to run your cluster inside one. Then you have to provide a couple of details for your cluster, like the name of the cluster, the region where you want it to run, the version of OpenShift, and, at this point, a DNS prefix. In the future we are going to support vanity domains. You will also be able to configure networking, so if you have existing networks in Azure, you will be able to reuse them and run your clusters inside those networks.
Then there's how you will view your clusters and, of course, all the cluster info. This is a list view of all of your clusters. As I said earlier, we want to make it easy for customers to run lots of clusters: they may need their own CI clusters, their staging environments, and then their production environments, and you will be able to see all of them in one aggregated view. You will also be able to see details of each cluster: its status, whether its deployment succeeded or not, the OpenShift version it is running. You will be able to navigate to the OpenShift web console, and see some information about the resources your cluster has allocated; this is still a work in progress. You will also be able to view metrics, alerts, and logs for your clusters directly in the portal; this is a work in progress both on the front end and the back end at the moment.

So if you need more capacity, what do you do? You scale up. It's going to be really intuitive: you will just have this slider here, or you can type the number in directly, and you scale up. That's it.

This is how the user experience looks today on the command line. You do an `az openshift create`; that's already there. You provide your resource group and the name of the cluster, and at this point you also need to provide the location and the FQDN. `az openshift list`, I think last time I tried, is still not there. `az openshift show` is for showing a description of your cluster, `az openshift scale` scales your cluster up, and `az openshift upgrade` is still a work in progress; we don't allow upgrades at this point in private preview one.

Do you have any questions so far? Because I think right now it's demo time. So the question is whether we support more Azure regions, like Azure Government. At this point I think we support a couple of regions, like East US and West Europe.
Eventually the plan is to support more regions, including government regions, and the more regions we support, the better for us. So the plan is to support them. Any more questions so far? You are doing good.

So, I'm a CLI guy, let's do it that way. I have started a container here with the Azure CLI set up for OpenShift. First, let's look at the resource groups in our subscription. I'm going to create a brand new resource group, just to be on the safe side. I give it the name of the resource group; the name of the cluster can be different or the same as the resource group, it doesn't really matter. I pass the region, the one closest to us, and then the FQDN, which is the name of the cluster, and right now there is also a DNS prefix (a suffix, actually) that we require in order to deploy the cluster for you. Eventually we are going to support vanity domains. And I think we are good to go with this. What do you say, a typo? I don't see it. Oh, right, we actually have to create the resource group first, so I was doing it in the wrong order, sorry. First you create the resource group where you want your cluster to run, and then you create the actual cluster, and that's the `az openshift create`. And I think that's pretty much it here.

So what happens behind the scenes? We're going to dive into the architecture now. I'm going to explain what happens behind the scenes when I requested that cluster just now. Don't worry about the flowchart; I'm not going to go into deep detail. So what happened is: I requested the cluster, I just did an `az openshift create`. Behind the scenes, the request ends up at a Microsoft endpoint, and eventually a service behind that endpoint, which we call the OpenShift resource provider, takes care of that request. That service is built by Microsoft, or more precisely, we co-develop it with Microsoft.
What we at Red Hat are doing is developing a set of plugins that Microsoft vendors into their service. You can see what those plugins do at the bottom of this flowchart. For example, our plugins are responsible for validating the request. They are also responsible for generating all the configuration for the cluster: we generate all the OpenShift configurables, and we also generate the ARM templates. How many of you are familiar with ARM templates? ARM stands for Azure Resource Manager, and an ARM template is essentially an accumulation of different Azure resources. For example, you have your load balancers in there, and you have your VMs, which we run as scale sets; you can think of scale sets as something like replica sets in Kubernetes or OpenShift. So we generate all of the configuration, and once we have the ARM template, we pass it to Microsoft, and Microsoft deploys it for us.

Once everything is deployed and all the VMs are up, we have two startup scripts: one for masters and one for all the other nodes. When the masters come up, one API server, one controller manager, and one etcd instance run inside each master VM. On the node side, nodes just run a kubelet service, and that service has a bootstrap kubeconfig, so it knows where to find the masters and asks to join the cluster. Once everything is up and running, everything magically joins the cluster, and eventually we run a health check that waits for everything to settle down. Once everything is healthy, it signs off on the cluster, and users are able to use it.

This is how it's going to look inside the customer subscription. The numbering of the resource groups is a bit off here; for example, the resource group that I created earlier is resource group number three.
A user, on your right, starts with their own resource group and requests an OpenShift cluster inside that resource group. What happens behind the scenes is that on your left side, at the bottom, you have the OpenShift resource provider, the service we are building with Microsoft, and that service creates a managed application, which is resource group number two. A managed application is a way for Microsoft to limit customer access to specific Azure resources, and here we are limiting access to the cluster, so users are not able to go and mess with it; they have limited access. Red Hat SREs also get access, via tooling, and that access is likewise going to be limited. We're not going to have system admin, for example; we're going to have a predefined set of actions, which I'll talk about later.

This is how resource group number one looks (sorry, not number three), which is where all the resources of the cluster live. At the top we have public DNS; we use Azure DNS. Then we have the load balancers: one for the masters and one for the router pods that serve your applications. At the bottom we have the private network where all the machines run. There are the master nodes, where we run the control plane; by default our architecture is three master nodes. There are the infrastructure nodes, where all of the infrastructure services of OpenShift run, for example the Docker registry and the router pods. And then there are the application nodes, which are for user workloads. We are using Azure storage, of course: blob storage for the Docker registry, and etcd gets SSDs, because if you're familiar with etcd, it needs low-latency disks to work well. And for dynamic provisioning, we have configured Azure disks for customers.
So customers by default also get SSDs. Okay, let's have a look at my demo. It's still running.

I'm going to touch briefly on the integrations, which I've already started doing. At the top we have DNS and the load balancers from Azure. We are using AAD, which stands for Azure Active Directory, for authentication; right now it's the only way to authenticate into your cluster, and in the future we may support more authentication mechanisms. We have integration with the Azure service broker, so you are able to request external services and access them from your cluster. For metrics, we actually already have a Prometheus and Grafana instance running in the cluster, but at the moment these are not accessible to our customers. Eventually I think it's going to be self-service, where customers will be able to request Prometheus with the click of a button and have it deployed for them. VM scale sets are what we use for running VMs; scale sets are a fairly new concept in Azure, superseding, I think it was, availability sets. And of course we are using Azure storage for etcd, the registry, and the user applications.

Still running. So yeah, by default your cluster gets DNS routing, so from the web console, on your left, you will be able to go directly to your application; you are familiar with OpenShift routes. This is based on Azure DNS. We use AAD; this is how it looks in the web console, and we are going to see it later in our own web console. And Azure disk dynamic provisioning for storage, as I said earlier: SSDs by default for our customers.

Before I move on to the teams, are there any questions so far? Yeah, so the question is how we are going to handle specific admin tasks, like installing operators. We are going to have limited access to the cluster. All operators are going to come with the service: you'll be able to install them with every new version of our service.
I'm going to have a slide later where I explain a couple of the admin tasks that we are going to have predefined for admins. So as an administrator, you are not going to be able to install an operator while the cluster is running; it's going to be predefined. That's because we want to avoid excessive configuration and we want to have everything under control with every version of the service that we release. So installing an operator after the fact is not a pattern we're going to follow.

Any other questions? So, yes, the question is related to cross-region support, if I understand correctly. And a question about availability zones, about how OpenShift handles provisioning storage across availability zones; I'm not sure I fully understand it. Right now you create your cluster in a specific region inside Azure, so if that region is a single availability zone, then you can only run everything in that zone. As I said earlier, you have to give us a region in order to create the cluster. So maybe you are asking about cross-region support: everything is going to be in a single region at this point. It's high availability across machines, not high availability across regions. As I showed earlier, one of the things you have to provide to create the cluster is a region; you see the second field there? I passed West Europe when I created my cluster. I'm not really familiar with that case, we can catch up afterwards, but I think once you are in a region, everything is handled automatically for you, so you don't need to set anything up first. Yeah, I'm not familiar with the Amazon side, sorry.

So the cluster is actually up and running; let's go and have a look. I forgot to request one node.
I wanted to request just one node, but anyway. I have to trust the certificate, of course. The first thing you see once you log into the cluster is that you get redirected to Microsoft's Azure AD; this is how AAD works. You have to accept those permissions, and once you do, you get redirected to the web console. And now we are able to create an application. So our application is almost up and running.

Let's go back to the other questions. So the question is about upgrading the cluster. I'm going to touch on that later; I have a couple of slides. But for now: upgrades are going to be both user-triggered, so users will be able to request an upgrade to a newer version, and triggered by us, for example for security fixes, or key rotation if a secret from the infrastructure is leaked. I have a couple of slides later that will cover this.

Any other question? So the question is about wildcard certificates. This is something we have been discussing with Microsoft, and something we will eventually do. It's not planned yet, but we have discussed it and we want it eventually.

Any other questions before I proceed with the teams? Can you repeat that? The default options for storage. Basically you get an Azure disk, a premium disk, which is SSD-backed. So by default, when you request a persistent volume, you just get an SSD. I don't remember the exact disk types, but I can look that up later if you want a more specific answer.

Any other question? So the question is whether we support autoscaling. At the moment, no, but I think we may evaluate it in the future. Other questions so far? The network on the nodes: we are using OpenShift SDN. So once you are inside the private network in Azure, you have OpenShift SDN, which you get by default.
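To make the storage default concrete: in Kubernetes and OpenShift, the disk you get for a persistent volume request is determined by the default storage class. A premium (SSD-backed) Azure disk class looks roughly like the following; treat this as a sketch, since the class name and exact defaults in the service may differ.

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azure-premium            # illustrative name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Premium_LRS   # premium, SSD-backed managed disks
  kind: Managed
```

With a default class like this in place, any PVC that does not name a storage class gets a premium Azure disk dynamically provisioned for it.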
So I'm going to move forward with the teams. We work with Microsoft in a couple of different ways. We have weekly BlueJeans calls where we sync up, we have our own dedicated Slack channel, and of course emails, lots of emails. And as Red Hat, we do all of our work on GitHub, out in the open, so there's a lot of back and forth in issues and pull requests. We also do occasional face-to-face meetings, which are very helpful for moving forward with delivering this service.

Currently, the teams are structured like this: Microsoft is mostly on Pacific Standard Time, while we are spread across different regions and time zones. The reason is that we are doing SRE and we want a follow-the-sun model, which is what other companies such as Google also do: an SRE wakes up, works on this product during normal work hours, and once they go to sleep, someone in another time zone picks up the work. Everyone works only during normal working hours.

This also answers your earlier question: we are going to have limited access to the cluster, via Geneva. Geneva is an internal stack that Microsoft has for monitoring, alerts, and logging, and we are going to integrate with it, initially at least, because it's a very mature stack and it helps us a lot to kickstart this service. Our SREs are not going to have system admin access, for obvious reasons. Instead, we are going to offer a set of predefined actions via Geneva. As an admin, you will be able to see the cluster status, rotate any kind of credentials in the cluster, restore from backups, and do minor updates, whether it's an image that needs to be updated, a secret, or something else.

How do we do development? As I said earlier, on GitHub. We have a couple of different repos that we are using.
openshift-azure is the repo where we develop the production code; it's where all the plugins that we ship to Microsoft are developed, and we also develop a couple of custom images for our clusters there. Then we have azure-misc, which is an accumulation of different SRE tools and some CI-related tools as well. For CI we are using Prow and ci-operator, which is what the rest of the OpenShift engineering teams are using too, so we're taking advantage of that. And we have a lot of different tests, I can tell you that. In our e2e suites especially, we are testing everything: the administrative tasks we offer, etcd backups and restores, key rotations, scale up, scale down, upgrades. We run origin conformance, and we run a couple of other suites that our QEs have been building on their own. Lots of different tests.

Let me go back to my demo. So, do you guys want to scale up this cluster? Any takers? Okay, let's spend those dollars. Unfortunately, I meant to request a cluster with one node, but by default we get four machines. Let me prove to you that the cluster is actually going to be scaled, because as an end user I don't have access to the nodes. Actually, let's sign into the cluster from the command line. As an end user, I don't have access to view infrastructure; I have limited access, so I cannot view my nodes. But here is my namespace, with all my pods, and you will see that since we only have four nodes at the moment, two out of our five pods are running on the same node. Let's actually delete this build, it's annoying. So I'm going to scale up the cluster now. Unfortunately, I can only go up to five; I wanted to go from one to five, but I'm going to go from four to five, so it's not that fancy, sorry. You have to pass the new count, and I think that's pretty much how you scale a cluster.
What happens behind the scenes is that a new VM comes up and joins the cluster just like any other VM we request. So apparently the build pod... it's okay, let's proceed with the talk.

How are we going to do support? First, let's go over the service responsibilities. Everything related to the cluster lifecycle, such as installation, upgrades, and monitoring: upgrades are actually shared between customers and us, but most of the cluster lifecycle falls under our purview. Security upgrades and support are also ours, Red Hat's and Microsoft's. Customers have limited access to the cluster, but they are still able to manage their own users and their own quota; they can do that because we offer them a set of customer administration roles. Image registry management comes with a caveat: we deploy and upgrade the Docker registry as part of the service, but users are still responsible for cleaning up their images, so image pruning is under their purview. And of course, the application lifecycle and application monitoring are still the customer's responsibility, as is external service integration.

Any questions so far? So the question is whether this is going to replace AKS. Probably not. It's a different product: AKS is a Kubernetes offering that Microsoft offers on their own, and this is a different service that we offer jointly with Microsoft. Any other questions? How do we manage shared storage, from a customer's perspective? You get all of this for free by using OpenShift. For example, when you have a clustered database, you use a stateful set, and you just scale up your stateful set; when you scale it up, it requests more storage, so you get more storage. The other part of your question was about how we do etcd backups, or shared storage backups.
For our customers, we expect that to be handled by OpenShift and Kubernetes at some point. I know there is a snapshot API, and we expect it to be integrated with Azure, so as an end user you will be able to say, I want to snapshot this application, and that is going to be offered by the product. It's not something we are going to handle as part of the service; we expect to have it inside the product. The other part of your question was whether we, as a service, take backups of etcd. Right now we do backups in two different ways: we have a cron job that runs periodically, every hour, and takes a backup of etcd, and we also take a backup of the cluster before every admin action. If something goes wrong, we have a separate process that is going to be exposed via Geneva: you select the backup you want to restore from, and it's going to be really simple for an administrator to do. And you will, of course, be able to choose from the different backups I mentioned earlier. There's something else I'm missing here; anyway, if I remember it, I'll mention it.

Any other questions? So the question is whether you can skip image pruning, or rather, whether you can bypass our registry and not use it at all. Yes, you can do that. The OpenShift registry has the ability to work as a pass-through registry, so you can use it as a cache for other registries, and you can still use whatever registry you want inside your pod specs. So if you don't want to use the internal registry, you can simply not use it.

Any other questions? Okay, let's look at how the scale-up went. Yeah, so it's there. So if I go back to the web console and scale back down to four and then back up to five, we should see the new pod getting scheduled onto the new node.
Okay, so I think it's better to just delete it from here, because scale-down chooses a pod at random to delete. Any other questions so far? So you see that the new pod lands on the new node, the fifth node.

How are we going to do support? It's going to be almost the same experience as what our customers and Microsoft's customers deal with today. Everything starts from the portals, the Red Hat portal and Microsoft Help, for every customer, but all of them get redirected to a single ticketing platform where they can open new tickets. And there is going to be a ticket exchange between Microsoft and Red Hat, so that issues end up with the right party and everyone gets the issues they are supposed to work on. And that's pretty much it for support.

A rough timeline. We had the official announcement back at the last Red Hat Summit. We launched private preview last October, with no SLAs. Private preview 2 is just around the corner; I think in private preview 2 we are going to start offering SLAs, though I'm not really sure about that. And we are planning for GA by the middle of this year.

And a couple of references for you. At the top you have the OCP VM image that we are publishing; as part of our service we build the VM image that we use in our clusters, and that link over there points to it in the Azure Marketplace. As for material, at least for private preview, all the documentation can be found in the GitHub openshift-azure repo, and the last link is the interest list for our customers. With that, I think I can open up the floor to any other questions you may have.

No more questions? The question is whether there are plans to support KubeVirt. At this point we have no plans; we haven't really discussed it.
If it's requested by our customers, and lots of customers request it, I think we're going to add it eventually, but right now we haven't really discussed it. Other questions? Yes, so the question is whether there are plans to support federation. At this point we just want to make it easy for customers to spin up lots of clusters. It's a good question, actually, because if we say we want to support lots of different clusters, customers are eventually going to ask for federation. Right now I don't think there are any plans, and with the way we designed our service, it may not even be possible; I'm not really sure about it. But the same answer as the previous question applies: if it's requested, we are going to seriously consider it. More questions? So, you will get all of those from Geneva, yes; eventually everything. If we run into a problem and we don't have the information to debug it, what do we do? We go and extend the cluster status action that you saw earlier, this action over here. Eventually it's supposed to have all the information we need to debug issues. Other questions? Okay, thanks a lot for your attention. Thank you.