OK, I think it's about time. Let's get started. First, thanks for coming. This talk is about Compass, a system that helped us streamline the OpenStack deployment process. My name is Shuoyang. I'm a principal architect for cloud computing at Huawei's US R&D Center. Before joining Huawei, I worked at Google for four years on their infrastructure team. First, I'd like to discuss with you what problem we are trying to solve. There are a lot of similar topics at this conference, so hopefully after this talk I can give you my vision, our vision, of why this system is special. And honestly, I'm also giving this talk to seek help, so that together we can build a system for the long run. To talk about Compass, I need to set the context. When we started the Compass design, we set ourselves a pretty ambitious goal: to deploy any complex distributed system onto general-purpose commodity servers. With that goal, our primary design decision from day one was to build a system where extensibility is the first priority. Having said that, despite the original design goal, we were not trying to limit ourselves to deploying OpenStack; but in our first working system, we streamlined the OpenStack deployment process, and I will show you that in our demo. And this will be open source: we will open-source all our code, which is 100% Python. As I said, our design goal was to make the system extensible. We made the architectural design as modular as possible, at least at the philosophy level, and the implementation is genuinely plugin-based. To give you a concrete example, we built the system to deploy an OpenStack cluster onto Huawei's hardware, then added about 200 lines of code for an HP hardware plugin, and that worked fine for us. So that gives you a sense of how cheap it can be to extend the system. We have successfully deployed several clusters this way, and this is our wiki page.
Before talking about the system itself, I'd like to bring up a concept called the data center as a computer. I think we share this vision with some industry veterans; there's a book with this title published by several Googlers, and I believe the last author of that book is also a pioneer in software-defined networking. So let's look at what a computer looked like in the 90s. In the 90s, we had CPUs, we had disks as the storage device, and we had NICs as the networking device. Fortunately, at that moment, we had an open source operating system called Linux. But if you really go back to the 90s, the experience of using such a system was not as pleasant as it is today. So tools like LILO and GRUB were invented, and nowadays, if you want to use a Linux computer, you just bring a live CD and everything works in a blink. Let's fast-forward to today's data center. We have a similar picture: any commodity server as the data center's CPU, the storage server as the data center's storage capacity, and the switch as the data-center-level networking gear. Equally fortunately, as of today we have OpenStack, a truly open source data-center-level operating system. But what is missing? As I will show in a later slide, several vendors are trying to work out a streamlined deployment system, and I think it's still a very vibrant area for people to work on. I attended several talks here, and when the speakers asked how many of you use any particular tool to deploy OpenStack, I saw fewer than 10% raise their hands. I think that's the reality, and that's why. I'd also like to say why Huawei wants to build a system like this: because Huawei is a full hardware vendor across the data center solution portfolio. It was ranked number one in storage revenue growth and number two in x86 server revenue growth.
And needless to say, Huawei has a pretty strong presence in the networking gear space. We love OpenStack because OpenStack enables us to build a truly distributed data center solution. As I said, this is a pretty vibrant area, and several vendors are working very hard on systems to solve this problem. I think that's a great thing; competition is good for the customer. I can quickly go through them. Crowbar, from Dell originally, was a pioneering effort, probably the industry's first attempt to solve this problem. But it's a Ruby web application, and to be fair, from my view it's a little bit of a conglomerate; I'll explain what I mean by that. Second, especially at this conference, people are talking about TripleO. It's a great idea, very attractive for the OpenStack community itself, but in my view, do we want to tie a general problem to one particular distributed system? That's a debate we can have. Then Fuel is another great web application; it's a Puppet-based system. And DevStack: any OpenStack developer in this community must have used this tool when writing code, but I think it's more of a simple single-system test tool, at least in this community's general view. So let's think about this. As I said, I keep saying data center. By the way, does anyone know which company's data centers these pictures show? Yeah, great. The bottom picture shows Google's data center. And the upper one, do you know? That's a Rackspace data center. All of these came from my Google search results. Anyway, let's really think about what can be automated and what cannot. First of all, all the tools I just mentioned, including Compass itself, are trying to solve the zero-touch software deployment problem. But before that, you need a rack-and-stack process. You need to wire everything up.
You need to bring the rack to the right place. If you are interested, I've heard of several robot start-ups doing this kind of work. Furthermore, and it's a big furthermore, if we really have some AI breakthrough, we could ask an AI system to help us design and configure the whole data center layout. So, how many of you attended my other talk yesterday? Can you give me a show of hands? OK, so less than half. In that case, I'd like to show you a real video clip recorded from our first system deployment. This is the physical diagram view of our installation, and I'll show you what happened. Sure. Yeah, that's a great question. These are the commodity servers, the Huawei servers, this is a Huawei switch, and this is the router. All we need is network connectivity. And this is the Compass VM. Yes, that's the management plane; the data plane is separated, and I can show you that in this video. As I said, we haven't automated the rack-and-stack step. This is our R&D data center, and this is the network layout. Compass is a RESTful server, and we have a purely client-side UI. This is what Compass is trying to automate. As I said, it's a purely client-side web UI with a wizard-based process that keeps consuming the RESTful API to drive the whole flow. This screen is basically reminding the operator to have the right wiring connections. The next step shows that Compass provides the capability to automatically discover the servers connected to a particular switch. This gives us network awareness, topology awareness: once you have the management IP of the switch, you can find all the servers connected to it. Normally you would select all, because that reflects the real case of deploying an OpenStack cluster onto a bunch of racks, but here, for demo purposes, we select just these discovered servers.
For this discovery, we just walk the SNMP MIBs. Every vendor provides them, and most vendors implement the standard MIBs on the switch side. Whenever a server links up, the switch will catch it. No, it's a control-plane interface; we can discuss this offline if you are interested. This one is the wizard step for the operator to input all the credentials. After finishing this step, the next step is network configuration. We know network configuration is the most error-prone step in setting up a cluster. The operator can provide a set of IP ranges for the different networks. Here, the management plane and the public network need to be configured explicitly, but the tenant network and the storage network can be used as-is, because those are not what the end user will see. So here you provide the public network IP range. By the way, this UI itself is just a showcase of how much of the RESTful API's capability can be consumed; I will show you in the slides. Basically, the RESTful API approach allows third-party UIs to work with our backend. This is a step for operators to automatically... That's the OpenStack private network, yes, the Quantum network for the tenant. This one shows the step that automatically assigns hostnames to the servers, because if you think of the process of assigning IPs to a couple hundred nodes by hand, that's a pretty time-consuming step. Oh, I'll answer that question when I talk about that particular function module. As I said, since we have the network topology information, we can show the real-time progress of the deployment along with the network topology view. And not only that, we can also show you the list view, just a regular view of your servers.
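The MIB walk mentioned above can be sketched in a few lines of Python. This is an illustration, not Compass's actual code: it parses captured output in the style of Net-SNMP's `snmpwalk` over the standard BRIDGE-MIB forwarding table, rather than talking to a live switch, and the sample data and function name are assumptions.

```python
# Sketch of switch-based machine discovery via the standard BRIDGE-MIB
# forwarding table (dot1dTpFdbPort). A real system would query the
# switch's management IP over SNMP; here we parse captured output so
# the example is self-contained.

SAMPLE_WALK = """\
BRIDGE-MIB::dot1dTpFdbPort.'28:6e:d4:aa:bb:01' = INTEGER: 3
BRIDGE-MIB::dot1dTpFdbPort.'28:6e:d4:aa:bb:02' = INTEGER: 4
BRIDGE-MIB::dot1dTpFdbPort.'28:6e:d4:aa:bb:03' = INTEGER: 4
"""

def discover_machines(walk_output):
    """Return (mac, switch_port) pairs the switch has learned."""
    machines = []
    for line in walk_output.splitlines():
        if "dot1dTpFdbPort" not in line:
            continue
        left, _, port = line.partition(" = INTEGER: ")
        mac = left.split("'")[1]  # the MAC is embedded in the OID index
        machines.append((mac, int(port)))
    return machines

for mac, port in discover_machines(SAMPLE_WALK):
    print(f"port {port}: {mac}")
```

Because the table is part of the standard MIB, the same walk works across vendors, which is what makes the topology awareness described in the talk cheap to obtain.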
I think it takes a little longer on server 3, which is the controller node, because it needs to install the database, RabbitMQ, and the other packages. So, yep, it's complete. As I said, this is the graph view, the topology view, and we also have a list view. After that, we are done with the deployment. Let's play with the deployed system a little bit. You log into a project and create a couple of networks provided by Neutron; when we recorded this, the project was still called Quantum. This is the second network, I think, and we have this network up, and we created a virtual machine and connected it to the network. So this is the topology; you can keep playing around. I'll stop the video here, and let's proceed. So that's the high-level idea of what Compass is trying to solve. Although it's a general-purpose system we are trying to build, I'm showing you a real system we have deployed. Let's think about why we would like to build a general-purpose deployment system. In our view, the lifecycles of different deployments look pretty much the same. From a hardware vendor's perspective, you provide a pool of resources connected through networking gear, you need to somehow deploy a host OS or some kind of hypervisor, and after that you need to deploy a pool of processes, whether that's Ceph or OpenStack, along with the correct configuration. By correctly configuring every process in this pool, you form a distributed system; that's how a distributed system works. And if you look at it, each layer of resources already provides us programmability: SNMP is the interface to control the networking gear, IPMI controls the servers, for OS provisioning we have tools like Cobbler or Razor, and for process deployment and configuration updates we have Chef, Puppet, and Ansible; you can list a bunch of them.
So Compass does not rebuild any tool in this toolchain. We are trying to glue them together, building another layer of software so that you do away with repeated boilerplate code and the operator can focus on the problem they want to solve: what kind of system do you really want to deploy? By doing that, we hope to lift some of the burden from the operator. As I said, from day one we really wanted to build a general-purpose deployment system, so we have a strong philosophy about what Compass should be: we provide programmability, and we provide extensibility. Programmability means that instead of building a web application, we built a REST API server. Extensibility has several interpretations from different angles. First of all, we build functional modules with a plugin architecture; as I said, to extend our support to the HP switch, we added 200 lines of code. And we carefully designed the boundary of what Compass is not to be. We want to build a system that does not reinvent any wheel; we want to work with the existing mature tools. That's why, with just 5,000 lines of code, we are able to deliver this streamlined process. This is an internal architectural view of the system. As I said, the UI I showed you is written entirely in a client-side JavaScript MVC framework, and we have a RESTful API that provides programmability to the end user. We have the hardware discovery module, the package deployment module, and the OS provisioning module, and we plan to add more modules because we know we can automate even more of the process. As I said, all the vendor-specific or toolset-specific code lives in this layer, and that's how we achieve 200 lines of code to support another device. Also, think about this: right now we work with a Chef server to do configuration management. It's a similar concept; you can write a similar plugin for another configuration management tool.
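The plugin idea described above can be sketched as a small registry plus a base interface, so that a new vendor is just a short subclass. The class names, decorator, and method signature here are illustrative assumptions, not Compass's real API:

```python
# Sketch of a plugin registry: vendor-specific code lives behind a
# small interface, so supporting a new switch vendor is a subclass
# plus a registry entry. Names here are hypothetical.

class SwitchPlugin:
    """Interface every switch vendor plugin implements."""
    def learned_macs(self, snmp_session):
        raise NotImplementedError

PLUGINS = {}

def register(vendor):
    """Class decorator that records a plugin under its vendor name."""
    def wrap(cls):
        PLUGINS[vendor] = cls
        return cls
    return wrap

@register("huawei")
class HuaweiSwitch(SwitchPlugin):
    def learned_macs(self, snmp_session):
        # Walk the standard BRIDGE-MIB forwarding-table OID.
        return snmp_session.walk("1.3.6.1.2.1.17.4.3.1.2")

# A vendor that also speaks the standard MIB needs almost nothing new;
# in practice the extra lines mostly handle vendor quirks in output.
@register("hp")
class HpSwitch(HuaweiSwitch):
    pass

def get_plugin(vendor):
    return PLUGINS[vendor]()

print(type(get_plugin("hp")).__name__)
```

This is the structural reason a new device can cost on the order of 200 lines: only the quirks go into the subclass, while the orchestration above the interface is shared.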
Our belief is that this capability is what enables enterprise environments. Think of it this way: in some enterprise environments, you have already purchased Puppet licenses, or you have already purchased Ansible server support. If we can enable those toolsets to work in your system, that would be great. The same goes for the OS provisioning tool. I don't want to spend too much time on this. So, programmable means we want to model all the software deployment steps as RESTful resources. These include machines, switches, clusters, and hosts. Something that deserves mentioning here is a concept we call the adapter. An adapter is a plugin module to discover, configure, and deploy hosts. I'll give you a concrete example: OpenStack, installed and configured by Chef, is an adapter. If in your system you have Puppet, and you have some equivalent of cookbooks in the Puppet world, whatever Puppet's term for it is, then that's another adapter that allows us to install OpenStack. I'll leave this for the Q&A session; I'd like to discuss this concept with you. And this is all about the RESTful concept: we try to mimic OpenStack's API standards as much as possible. I don't want to spend too much time on this either. Then let's look at how, once this system exists, the end user will use it. Basically, a deployment of your target system becomes a list of RESTful calls. Think of when you want to deploy an OpenStack cluster: you say, help me find all the available machines, and, as in our previous example, select all the machines I want to deploy the system onto. Then help me find an adapter; at this moment, I want to deploy OpenStack. Given that, the adapter provides you a list of roles, groups of functionality. You say, OK, I understand that for this particular target system, you have this set of roles.
For most of the existing tools, I think, when people say the installation process is streamlined, they basically mean a fixed model: deploy this system automatically, laid out however the tool has decided it should be. But with Compass we can say: as a user, I can program the system. I want node number one to host the controller, and nodes two, three, and four to host compute. That's just to give you a flavor. After that, you poll the progress, and hopefully you are done. So another key principle we want to keep throughout the system's design and implementation is not to be a conglomerate: we want a plugin architecture. At this moment, as I said, we use Chef as our configuration management engine, but if people have requirements for working with Puppet or Ansible, we'd love to hear your feedback. We also have a plugin architecture for hardware. As I showed you, we provide networking-gear-based environment discovery so that we have topology awareness. Another thing, a personal interest of mine right now, is OCP, the Open Compute Project, which is open-sourcing hardware. If anyone in the audience is interested in this concept or comes from that community, I'd love to discuss it with you. Another aspect of not building a conglomerate system is that we believe we need to learn from several decades of programming experience: we need to separate the roles in a system. We need library writers, and we need application writers. A library writer, in this Compass context, provides the policy. For example, if you have an HA configuration in your OpenStack cluster, you don't want to put two databases onto the same server; that's policy. You also provide all the snippets and kickstart files for the OS-level description. Having that, it's basically like libc, right?
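The flow described earlier in this section (find machines, pick an adapter, program roles onto specific nodes, poll progress) can be sketched as a sequence of REST calls. The endpoints and payload shapes below are illustrative guesses at such an API, exercised against an in-memory fake so the example runs on its own:

```python
# Sketch of deployment as a list of REST calls, driven against a
# stand-in for the Compass server. Endpoint paths and payloads are
# hypothetical, not Compass's documented API.

import json

class FakeCompass:
    """In-memory stand-in for the REST server."""
    def __init__(self):
        self.machines = [{"id": i, "mac": f"aa:bb:cc:00:00:0{i}"}
                         for i in (1, 2, 3, 4)]
        self.roles = {}

    def get(self, path):
        if path == "/machines":
            return self.machines
        if path == "/adapters":
            return [{"name": "openstack-chef",
                     "roles": ["os-controller", "os-compute"]}]
        raise KeyError(path)

    def post(self, path, body):
        if path.startswith("/hosts/") and path.endswith("/roles"):
            self.roles[int(path.split("/")[2])] = body["roles"]
            return {"status": "ok"}
        raise KeyError(path)

api = FakeCompass()

machines = api.get("/machines")            # 1. discover available machines
adapter = api.get("/adapters")[0]          # 2. pick the OpenStack adapter
controller_role, compute_role = adapter["roles"]

# 3. program the cluster: node 1 is the controller, nodes 2-4 compute
api.post("/hosts/1/roles", {"roles": [controller_role]})
for m in machines[1:]:
    api.post(f"/hosts/{m['id']}/roles", {"roles": [compute_role]})

print(json.dumps(api.roles, sort_keys=True))
```

The point of the sketch is the shape of the interaction: the same call sequence could target a Hadoop or Ceph adapter, with only the role names changing.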
You write libc and you let the application writer use it. The application writer's job then becomes issuing a bunch of REST API calls. That's why we shouldn't have a system that asks the user to repeat themselves constantly. In the demo case, we provided a library, right? And you can imagine that for Hadoop, or for Ceph, which is of particular interest to this OpenStack community, you could provide your own library. For the last part, I'd like to talk a little bit about our vision: we have already deployed OpenStack, so what do we want to do next? Along these three axes, we have successfully deployed OpenStack on top of CentOS, and Ubuntu is the second OS we are supporting; we have some last-minute tweaks, but I think we can claim that's a successful step. The next step is that we should be able to support Hadoop, Ceph, or any other complex distributed system; we should be able to support other host OSes, or other hypervisors; and we should be able to work on different hardware. And if you think about it, as we said, do not repeat yourself: once you write a library here, and somebody else has already written the modules for a platform, your library should work across all these platforms. So instead of multiplying the code base, you grow the code linearly. Another kind of extensibility is toolchain extensibility. At this moment we support Chef, and we use Cobbler as our OS provisioning tool. We certainly see other tools in the market, and as I said, especially in an enterprise scenario, you may already be committed to a particular toolset. We shouldn't require users to switch their toolset because they chose Compass. The same applies to OS provisioning; there are lots of other alternatives out there. So I think that concludes my talk. As I said, our ultimate goal is to build a general-purpose deployment system.
We have deployed OpenStack cloud infrastructure smoothly, and the system basically provides programmability to the end users. We really came here to give this talk to seek opportunities to collaborate. If you share this vision, we hope we can discuss it and build something that lasts. By the way, we will open-source this soon under the Apache 2.0 license. I welcome any questions and discussions. Please. I think it should be done by the end of November; it shouldn't be later than that. Yeah, you're welcome. So, great. I think the question is: if something goes wrong during the deployment process, can you recover, or can you redo something? That's a great question. There are several aspects. As I said, this UI is just a proof of concept showing that you can consume these API calls, and this video was made a couple of months ago. Our current UI allows the user to go back: if, at a later step, you feel some earlier configuration is not correct, you can always go back. Great, great question. I think we decided early on what this project is not to be. This project itself will not do the monitoring or anything like that. The reason is not that we are not capable of doing it; actually, we are doing it. But if you really think about what makes a project live long, you should commit yourself not to do certain things, so that we can work with other tools together to build a great system. Thank you. Thank you for coming.