All right, everyone, let's get started. Today I'll share a little secret with you. Specifically, I'll let you in on what the end users of our own internal private clouds have been doing for the past two years. They've been using a cluster-as-a-service capability, which we developed internally, to spin up large numbers of complex, clustered environments inside of OpenStack. Those cluster environments run HPC jobs, big data workloads, and many other types of workloads as well.

My name is Piotr Ryvachovich. I'm here with Bright Computing, where I'm responsible for looking after our cloud integration, that is, integration with various cloud platforms. To give you a little background, Bright Computing has been around for almost a decade now. What we do is enable system administrators to very easily turn a pile of hardware into a fully functional cluster. More recently, we took all of that experience and moved towards OpenStack with our own OpenStack distribution, Bright OpenStack. And specifically, with our cluster-as-a-service capability, we've given the end users of OpenStack deployments a power which, up until then, was reserved for administrators: the ability to easily spin up entire clusters within OpenStack and have their own management and monitoring interfaces to those clusters.

So let's jump right into it. We all know clouds are cool; that's why we're all here, of course. But what I'd like to argue during this short talk is that there is a small gap in the way clouds are consumed right now. Specifically, something is missing from the typical as-a-service consumption model. We all know the typical OpenStack services, like Sahara, which lets end users of the cloud spin up Hadoop-as-a-service deployments, or Magnum, which lets you spin up container environments directly within OpenStack.
And those are very good services, and they do well at what they're designed to do. However, there is a certain set of users who require a bit more functionality. They want better monitoring and more elaborate management functionality; they want more control over their cluster. That's the gap we're trying to fill with our cluster-as-a-service approach to consuming cloud environments.

Looking at the agenda for today: we'll start off with a brief introduction to clusters, so we'll briefly define what clusters are. We'll have a quick overview of the cluster-as-a-service functionality. We'll talk about the use cases, both our internal use case and the use cases which others might find interesting when applying cluster as a service. We'll then discuss why cluster as a service can be considered a stepping stone towards enabling your users to deploy, and monitor, anything they want as a service on your clouds. After that, we'll take a bit closer look at how cluster as a service is implemented.

OK, so let's start with: what are clusters? What I have over here is a brief definition from Wikipedia: a computer cluster consists of a set of loosely or tightly connected computers that work together so that they can be viewed as a single system. That "single system" part is very important, and it's what distinguishes a cluster from a collection of random VMs spun up inside of OpenStack. Typically, a cluster consists of one or more head nodes, which store the configuration of the cluster, and then some set of general-purpose compute nodes. To break that down a little, those general-purpose compute nodes can be specialized in one way or another. So we can talk about compute nodes which run HPC workloads, or hypervisor nodes, big data nodes, storage nodes, Ceph nodes, and so on.
So there can be many types of clusters, but the bottom line I want to stress here is that all clusters have to be managed one way or another, because that's one of the things that distinguishes them from a simple collection of VMs or a collection of physical Linux boxes.

What I have over here is a simple diagram of an example cluster. In the center, you can see the head node; that's the main node of the cluster. On the right-hand side, you can see a collection of compute nodes. In this example, those happen to be OpenStack compute nodes, so we can see some controller nodes, some Ceph OSD nodes, and so on. But what makes that collection of nodes a cluster is the set of cluster management daemons running on all of those nodes. That's our approach to cluster management: we run a cluster management daemon on every node of the cluster, with a central daemon running on the main node, which orchestrates the management and monitoring of the entire system. On the left-hand side, you have the interfaces an administrator can use to access the cluster: a cluster management GUI, a command line interface, or simply an API, using, for example, Python.

So we've covered briefly what clusters are. Now let's have a very rough overview of what providing clusters as a service means. In a nutshell, cluster as a service, or CaaS, is a set of components which we've built around OpenStack, and specifically around OpenStack Heat. As you know, Heat is the orchestration framework of OpenStack, and we use Heat to orchestrate the creation of clusters; we simply add some functionality on top of that. What CaaS allows you, or your end users, to do is, first of all, to very rapidly provision entire cluster environments inside of OpenStack. It can be as fast as 10 to 15 minutes, depending on your hardware.
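To make the Heat part concrete, here is a minimal sketch in Python of the kind of Heat (HOT) stack definition such a service might generate for a head node plus compute nodes. The resource names, flavors, image name, and template version are illustrative assumptions, not Bright's actual template.

```python
import json

def build_cluster_template(n_compute, head_flavor, node_flavor, image):
    """Build a minimal HOT template: one head node plus n compute nodes.
    All names and properties here are hypothetical examples."""
    resources = {
        "head-node": {
            "type": "OS::Nova::Server",
            "properties": {"flavor": head_flavor, "image": image},
        }
    }
    for i in range(n_compute):
        resources[f"node{i:03d}"] = {
            "type": "OS::Nova::Server",
            "properties": {"flavor": node_flavor, "image": image},
        }
    return {
        "heat_template_version": "2015-10-15",
        "description": "Virtual cluster: 1 head node + compute nodes",
        "resources": resources,
    }

template = build_cluster_template(3, "m1.large", "m1.medium", "rhel7-base")
print(json.dumps(template, indent=2))
```

A real CaaS layer would also declare networks, ports, and volumes in the same stack, so that Heat can create and tear down the whole cluster as one unit.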
Those cluster environments, right after being provisioned, come with a choice of packages defined by the user. So the cluster can be specialized to run HPC workloads, Hadoop workloads, Spark workloads, and so on. I mentioned earlier that what we're trying to achieve with cluster as a service is to give end users capabilities similar to those normally reserved for system administrators. In this case, all the clusters provisioned inside of OpenStack with our cluster-as-a-service solution come with a very powerful management and monitoring interface, which users can use to gain more insight into the workloads they're running inside of OpenStack. And above all, all of that can be accessed via a single pane of glass.

So let's have a look at the typical architecture of CaaS. What you can see over here at the very bottom are simply physical boxes; that's the physical layer of your typical OpenStack cloud. We have some physical boxes which run Linux, and some kind of management software to manage it all; in our case, that's our own cluster management software. With that, the system administrator who manages this layer is capable of spinning up various environments on top of that physical cluster: physical HPC deployments, big data deployments, or OpenStack private clouds. Now, this is where it gets really interesting. If you have your OpenStack private cloud deployed on top of our cluster management software, you can enable your users to create their own clusters inside of OpenStack. What has shown up over here is simply a virtual layer: a virtual cluster created by the end user of the OpenStack cloud. Again, you can see the virtualized hardware, Linux, management software, and whatever software the user wants or needs on top of that. And of course, you can have many of those, as many as you like, in fact.
And you can even go as far as to create your own OpenStack deployments inside of that OpenStack deployment, which is really handy for testing and development.

This slide shows who is responsible for which components. At the bottom, like we said, is the administrator, who can give some of the users access to some of the components deployed on the physical layer, like, say, the physical HPC deployment. The virtualized clusters running inside of OpenStack, on the other hand, are managed completely by the users who created them using cluster as a service. Now, what's really cool is that all of those components are capable of dynamically bursting into external public clouds. What that means is that if the administrator or the user runs out of the resources located on-premise in their own private cloud, they can very easily extend the pool of available resources with resources available in, say, Amazon EC2 or Amazon VPC.

What you see over here is a screenshot of the management interface which is available to the end users, or the administrators, right after creating a cluster. In the left pane, you can see the resources currently defined within the cluster; in the main pane, you can see the currently selected resource. In this case, we're looking at a virtualized cluster which has been provisioned with Hadoop and Spark. Next screenshot: a similar user interface, but this time we have a cluster with Kubernetes deployed on top of it. And this screenshot covers the monitoring capabilities available to the end user right after creating the cluster within OpenStack. The clusters get created with a whole set of different metrics, already tailored to the components which are part of the cluster.
So we can have over here HPC-related metrics, Hadoop metrics, Spark metrics, Ceph metrics, or OpenStack metrics. What's also interesting is that it's possible to extend that collection of metrics with your own metric collection scripts. If you want, you can simply create a simple script which produces a number and plug it into our monitoring framework. Overall, this gives you a lot of insight into whatever you're doing inside of OpenStack.

So, our use case. As I said before, we're a software product company; our product allows administrators to deploy and manage clusters. So naturally, our primary use case is testing our product inside of our own private cloud. That's actually how this entire project started. A long time ago, for development, QA, and support, we simply used physical clusters for all of those tasks. But that's obviously not manageable. So at some point, we moved to using KVM and a set of scripts to dynamically create clusters inside of a small KVM cluster. But again, that quickly turned out not to scale. So about two years ago, we moved to using our OpenStack integration to deploy our own OpenStack private cloud, and we now do all of our development, QA, and support on that cloud. It then turned out that many of our customers and prospects really liked this idea and wanted similar functionality inside of their own private clouds. So we decided to productize that approach, and that's how cluster as a service came to be.

So again, some more screenshots. Over here, we can see a screenshot of the management interface to a Spark deployment running inside of OpenStack. And over here, we have a comparison of our management interface with the typical Horizon dashboard. The Horizon dashboard is, of course, also available.
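As a concrete illustration of the metric-extension point mentioned above: a custom metric script, in its simplest form, just emits a number that the monitoring framework samples periodically. The talk doesn't specify Bright's exact plugin contract, so treat the "one number on stdout" convention below as an assumption. This hypothetical sketch reports the number of live processes on a Linux node:

```python
#!/usr/bin/env python
# Hypothetical custom metric script: emit a single number on stdout.
# A monitoring framework that periodically runs "a script which
# produces a number" could chart this value over time.
import os

def running_process_count():
    """Count numeric entries in /proc, i.e. live PIDs (Linux only)."""
    return sum(1 for entry in os.listdir("/proc") if entry.isdigit())

if __name__ == "__main__":
    # Simplest assumed plugin contract: print one number and exit 0.
    print(running_process_count())
```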
But what we also do is let you manage the most commonly accessed functionality of the Horizon dashboard from within our cluster management interface. Overall, this gives you a single-pane-of-glass experience across both your physical layer and your virtual layer.

As for other use cases, there are many, of course: training environments, R&D deployments, virtualized labs, security and compliance. That last one is quite interesting. If you're processing sensitive data, you might want to create, say, isolated HPC environments inside of OpenStack, each running a specific set of workloads for a specific tenant, so that you can be quite sure they are nicely isolated from one another.

Like I said before, cluster as a service is only a stepping stone towards a bigger picture: enabling your users to provision anything as a service. In our case, we use cluster as a service to provision Hadoop, Spark, Ceph, and virtualized OpenStack environments, but you don't have to be limited to that. In fact, you can provision any type of clustered environment you want: any third-party clustered software, or your own clustered software. The way it works in our environment is this: our end users first specify the Linux distribution they want their virtualized cluster based on, say RHEL 7. After that, they select a set of packages which will get pre-installed on all of the virtual nodes of that virtual cluster. Again, many combinations are possible depending on what an individual user needs. It can be, say, a set of Hadoop packages, a set of Spark packages, or maybe a combination of Ceph and OpenStack packages for virtualizing OpenStack. After that, the user simply has to select the version of the cluster management software which will be used to manage that deployment.
And this is an actual command line which our users use on a daily basis to spin up tens of different clusters every single day within our own internal private cloud. You can see we're selecting the OpenStack flavor for the head nodes and for the compute nodes, the number of compute nodes, the name of the cluster, the Linux distro to be used, the disk layout, and maybe a custom script which will customize the cluster after it has been created. And there you go: all the user has to do is press Enter, wait a bit, grab a coffee, and when they're back, their clusters are waiting for them. They can simply log in to the management interface we saw before and start working with the cluster. So that was the command line interface; we also feature a Horizon dashboard integration, which allows you to easily create your clusters directly from within Horizon.

So, anything as a service. As I hope you can by now clearly see, by combining the cluster-as-a-service functionality with whatever third-party component you want, you can very easily arrive at an anything-as-a-service solution. In the case of combining cluster as a service with Hadoop, you end up with Hadoop as a service. But in principle, it can also be OpenStack as a service, Ceph as a service, or HPC as a service; the possibilities are limitless. In the case of HPC as a service, for example, the clusters will be pre-installed with various HPC libraries like MPI, OpenMP, or CUDA, various tools and compilers, and the workload manager of choice. Again, we integrate with all of the major workload management systems out there. In the case of a Kubernetes-as-a-service approach, the cluster will be provisioned with a pre-configured Kubernetes deployment, to which a user can log in and immediately start creating Docker containers.

So, we've had a high-level look at cluster as a service.
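The cluster-create options described above map naturally onto a small CLI. Here is a hypothetical sketch in Python of what parsing such a command might look like; the tool name and flag names are invented for illustration and are not Bright's actual interface.

```python
import argparse

def build_parser():
    """Hypothetical 'cluster-create' CLI mirroring the options from
    the talk: flavors, node count, cluster name, distro, disk
    layout, and an optional post-create customization script."""
    p = argparse.ArgumentParser(prog="cluster-create")
    p.add_argument("--head-flavor", required=True,
                   help="OpenStack flavor for the head node")
    p.add_argument("--node-flavor", required=True,
                   help="OpenStack flavor for the compute nodes")
    p.add_argument("--nodes", type=int, default=4,
                   help="number of compute nodes")
    p.add_argument("--name", required=True, help="cluster name")
    p.add_argument("--distro", default="rhel7", help="base Linux distro")
    p.add_argument("--disk-layout", default="default")
    p.add_argument("--post-script",
                   help="script run to customize the cluster after creation")
    return p

args = build_parser().parse_args([
    "--head-flavor", "m1.large",
    "--node-flavor", "m1.medium",
    "--nodes", "8",
    "--name", "spark-dev",
])
print(args.name, args.nodes)
```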
Let's have a bit closer look at the actual implementation and how it ties in with OpenStack. In a nutshell, as I think I've mentioned, cluster as a service is simply a collection of components written in Python which wrap around OpenStack, and specifically around Heat, the OpenStack orchestration framework. Internally, we chose to use Ceph for storage purposes, because it provides very nice copy-on-write functionality, which you can use to snapshot volumes or instantiate entire clusters from copies of previous volumes. In principle, however, that can be any storage system you want.

The way it works is: the user runs a script, or uses the Horizon dashboard, to kick off the creation of the cluster. Our cluster-as-a-service components then construct a definition of a Heat stack, which they push to Heat, and Heat orchestrates the creation of the networking and of the VMs. That creates the main node of the cluster, the head node, which either gets installed via a PXE boot, going through the full installation process, or gets simply instantiated via the copy-on-write functionality. After that, we create the individual nodes of the cluster. Those nodes PXE boot off of the head node and get provisioned with the software image and software packages which are, at that point, already pre-installed on the head node. The last step is simply to customize the cluster by deploying OpenStack, Hadoop, Ceph, or whatever else has been requested by the user.

Yeah, that's it. Any questions? OK. Yeah, thank you very much.
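The end-to-end flow just described can be summarized as an ordered sequence of steps. This is an illustrative outline in Python, not Bright's actual code; the function name and step strings are made up, and a real implementation would call the Heat and Ceph APIs at each step.

```python
def provision_cluster(name, n_nodes, use_cow_clone=True):
    """Illustrative outline of the CaaS provisioning flow described
    in the talk. Returns the ordered list of steps as strings."""
    steps = []

    # 1. Build a Heat stack definition and push it to Heat, which
    #    orchestrates creation of the networking and the VMs.
    steps.append(f"heat: create networks + VMs for {name}")

    # 2. Bring up the head node: either a full PXE-based install,
    #    or a fast clone from a Ceph copy-on-write snapshot.
    if use_cow_clone:
        steps.append("head node: instantiate from copy-on-write snapshot")
    else:
        steps.append("head node: full PXE install")

    # 3. Compute nodes PXE boot off the head node and receive the
    #    software image and packages already staged there.
    for i in range(n_nodes):
        steps.append(f"node{i:03d}: PXE boot + image provisioning")

    # 4. Final customization: OpenStack, Hadoop, Ceph, or whatever
    #    else the user requested.
    steps.append("customize: deploy requested workload stack")
    return steps

for step in provision_cluster("demo", 2):
    print(step)
```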