All right, can everybody hear me? We'll get started here. So my name is Peter Yamasaki. I am Director of Product Management. See if we can get that first slide up here. I'm Director of Product Management at AMD, in our data center server systems business unit. So you might ask, why does AMD care about OpenStack? Our business unit is a recent acquisition by AMD; we were formerly known as SeaMicro, and we were building fabric compute servers. So this is a new business unit and direction for AMD, and we'll talk about how OpenStack applies. What we want to talk about in this presentation is using OpenStack to provision and manage physical servers, not just VMs. So let's define provisioning of physical servers. There's a common term out there today; I think Canonical, Mirantis, and several others have been using it, and I think it's a great term: Metal as a Service. So how do we define Metal as a Service? Well, it's dynamic provisioning of raw hardware servers and their associated compute, network, and storage resources, to provide dedicated hardware as a service. It's an extension of infrastructure as a service, so you can put it under the infrastructure-as-a-service umbrella. Before I jump on, I'd just like to ask; I think we have a lot of people here. We have OpenStack developers. We have vendors like ourselves developing hardware and infrastructure to be used by OpenStack. And we have providers and enterprises who are actually building OpenStack clouds. So I'd like to get a show of hands here: how many people are actually building clouds for their companies, or building public clouds? All right, more than half of the people in the room. That's great. And of you, how many see value in, and have a need for, also provisioning physical servers, not just VMs? Wow, more than I expected. I thought it was a niche. So why Metal as a Service? A lot of people have said, and I've been in a lot of arguments about this, isn't everything better in a VM?
It's really easy to manage a VM. You can pause it, you can vMotion it. It's very flexible, and there are a lot of advantages to that. And I think if the SLAs and the quality of service you could deliver with a VM were enough, that's a good direction to go. But the market is still demanding physical servers. So let's talk about some of the reasons, and everybody has different reasons. Number one is the hypervisor overhead. By provisioning a physical machine, you eliminate the overhead of that hypervisor. If you go back five years, that overhead was much greater, and it has diminished since, but there still is overhead in having a hypervisor there, and it hits your compute, your networking, and your storage. SLAs: sometimes you just need to provide very high SLAs to your end user, and it's much easier, when they need that capacity, to give them a whole machine to themselves rather than having to manage different workloads and different tenants on the same physical host. Security: some markets have really strict requirements; they're very paranoid about having data and compute coexisting on the same machine. CapEx: in the open source model with KVM and Xen, the cost is not there, but in some areas you still have to pay licensing costs for the hypervisor. So if you're going to give somebody the whole machine, why pay the overhead for that hypervisor? Dedicated hosting: quite frankly, there's still a very big market for just providing dedicated servers to your customers. And another one is microservers. We're coming into an era where Atom and ARM processors are just as important as Xeon and Opteron processors. That's an emerging market, and I think it pertains here, because on microservers the overhead of the hypervisor is going to be even greater. So let's talk about Metal as a Service and OpenStack. Raw server provisioning is not new.
Data center management tools have been doing this for over a decade. So what is it that we're introducing here? Well, we want to bring these advantages to OpenStack. If you bring this into OpenStack, what do you get? A single pane of glass. Imagine that through Horizon you could manage all your physical and virtual servers in a single window, and provide the same management capabilities to both your physical and virtual hosts. You could provide environments for your tenants where their VMs and their physical servers coexist. And you can also take advantage of new infrastructure that's out there, and this is one of the things that pertains to us. I won't talk too much about SeaMicro servers, but the use case is important, because there are some new technologies coming to market, with advanced blade solutions and fabric compute servers, that can really take advantage of raw server provisioning. So just to frame things a little before we get to the technical details, let me state the problem statement that we're dealing with when we're provisioning Metal as a Service. The server that we have today is called the SeaMicro SM15K. It's a 10U system, and we call it a fabric compute system. It integrates compute, networking, and storage in one platform, but what we do is disaggregate compute, storage, and network into separate resource pools. At the core of the system there's a fabric that ties these pools together. We have a compute pool; in our case today, it consists of either 64 Opteron or Xeon servers or 256 Atom servers. But this is not the whole server. That compute pool is simply the servers and the memory; there are no storage or networking resources with it. When you actually build a server, you build it by combining these pools together. So we have a shared storage pool that you can scale up to four petabytes.
What you do is carve out volumes, essentially virtual disks, that you assign to the servers. And then we have a networking pool; in our case, it's 160 gigabits of uplink bandwidth. There's QoS, shared fabric bandwidth, and VLANs, and you have up to eight NICs that you can connect to each of the servers and apply quality of service to those NICs. So Metal as a Service, the way it pertains to us, is the way we create our servers. When you create a server, you pick the compute card: what size processor do you want, and how much memory? Those are fixed based on the configuration of the system. But then you dynamically carve out your storage and your networking. So on this platform, in very much the same way that you provision a VM, creating your virtual block device and assigning your networking resources, we can do that in hardware. What we're going to talk about is one of the techniques we've used to provision Metal as a Service on this platform. So what I'd like to do now, as we dive into the next technical layer, is introduce Gursuren, one of our lead architects for cloud automation in our company. So my name is Gursuren, and I've been working for this company, SeaMicro, for about two years. We did a lot of work to bring the SeaMicro servers to the cloud. What happened is, originally we came up with the microserver concept, and these servers were very tiny processors, with the Atom processor there. And we had different resources to pool together to finally make a configurable, unique server. So we make use of the whole set of infrastructure our company provides; it has all the components associated with our servers. We have a fabric controller, and on the other side we have, I'll go to the next slide later. OK, so I'll continue with this particular slide to give you a little more insight. What we do in this case is provision servers based on the requirements.
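The disaggregated-pool model described above, a fixed compute card combined with storage volumes and NICs carved out of shared pools, can be sketched as a toy data model. All class and field names here are illustrative stand-ins, not the SeaMicro API:

```python
from dataclasses import dataclass, field

@dataclass
class ComputeCard:
    """Fixed CPU + memory; no storage or networking of its own."""
    cpu: str
    memory_gb: int

@dataclass
class Server:
    """A provisioned server = compute card + carved-out volumes + NICs."""
    card: ComputeCard
    volumes_gb: list = field(default_factory=list)
    nics: list = field(default_factory=list)   # (vlan, qos_mbps) pairs

class Fabric:
    """Toy model of the shared storage and network pools."""
    def __init__(self, storage_gb, nic_slots_per_server=8):
        self.free_storage_gb = storage_gb
        self.nic_slots = nic_slots_per_server

    def build_server(self, card, volume_sizes_gb, nics):
        if sum(volume_sizes_gb) > self.free_storage_gb:
            raise ValueError("storage pool exhausted")
        if len(nics) > self.nic_slots:
            raise ValueError("at most 8 NICs per server")
        self.free_storage_gb -= sum(volume_sizes_gb)
        return Server(card, list(volume_sizes_gb), list(nics))

fabric = Fabric(storage_gb=4_000_000)   # up to 4 PB of shared storage
srv = fabric.build_server(ComputeCard("Opteron", 32),
                          volume_sizes_gb=[100, 500],
                          nics=[("vlan10", 1000), ("vlan20", 500)])
```

The point of the sketch is only the shape of the operation: the processor and memory are fixed per card, while storage and networking are allocated dynamically at server-creation time, exactly like creating a VM.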
So currently, in this particular presentation, we are using OpenStack software on our 10 RU system, and the next-generation system is the SM15K system. And this is the current version. So I'll go to the next slide now. Here we have the different components of OpenStack that we support. We have Nova. Nova is the one originally used for computation purposes; it makes use of servers, memory, storage, and all that together. Then we have Swift, the object service. We can configure the packages; we can configure our image based on the different packages, which can be registered with Swift and called from there. Then we have the image service itself. You have a set of images, you want to control them, and you want to use those images to boot our systems; they can either be PXE booted or put on a hard drive directly, and you have direct control of the servers. Then we have the next service, Horizon, the dashboard. Dashboards are usually very popular with current systems, because they are what the front-end user sees of OpenStack or the whole cloud infrastructure. Using a good dashboard, such as the one provided by the OpenStack community, gives us quite a bit of understanding of our resources in terms of VMs. And since the concept of a VM is pretty well known, it's easy, as Pete has mentioned, to provision servers as fast as possible. Then we have the Keystone identity service; you need to have authentication before you go into the cloud. How authentication works is, you register with a user ID and a password, and you get a token provided to you by the cloud. Using this token, you can get to the different services. So I'll go to the next slide. This is the kind of infrastructure put all over the internet, and you see how you can provision using the OpenStack services.
So you see at the top the virtual guests you can create, and they can be managed using the standard set of OpenStack APIs. The Nova APIs are pretty common for that; they can control any of those VMs underneath. And transparently, you have the messaging service, AMQP, that is RabbitMQ, being interfaced with the different blocks. Here you see the object store, then your volume services, and how these volume services are scattered throughout the internet. They can be connected from any point, from any place, transparently. So what we did is we used our infrastructure, like Metal as a Service, and we put this particular layer on libvirt, which is a virtualization layer. And we presume that all interfaces are given to this particular layer. So the OpenStack APIs are called through our nova-manage script, which calls the different OpenStack APIs underneath, and they make subsequent calls to our infrastructure APIs. So this is about how to provision an image. In our system, we have pre-configured images. You can set up a particular image at any place from a standard template, add more and more packages using Swift, and then control that. And once the image is configured, you can put it in a template or in a snapshot, and you can connect to that particular image, or you can PXE boot it from any of the servers that we have in our chassis. So these are the steps: you have to register first, and then for the image services, you need to go to the image service component and grab the appropriate configuration. Once the configuration is defined, you can configure it and get the image ready for the whole set. Now, this is an overall architecture of a typical cloud system. We have a dashboard on the left-hand side. We have the Auth Manager for authentication; these are standard protocols used, for LDAP or any other database that you can connect like that. And then you have a standard set of APIs.
And on the right, you have the Scheduler, MQ, Volume, and all these services connected. So on the right-hand side, you see virt, that is libvirt, or you can see the SeaMicro services, like the SeaMicro Xen API layer, sitting there. Through that, we have direct access to the compute on the other side. So I'll go to the next slide. In this slide, we show some of the basic services architecture. For provisioning, or for the standard cloud requirements, we need certain things to be done, and after we configure those things in terms of services, we are set to go. So let's say our customer orders a specific chassis and a specific set of hardware within the chassis. We need to configure it, and that particular set can be configured using the services that we support underneath. We have a pool of services, the SeaMicro services; you can configure any of them. You have an XML-RPC control, then you have the Xen server, and then you have all these blocks, like the CLI, persistent data, and then SNMP, syslog, the server, and all that. These are the basic services that get exposed once your system is configured. Here is the set of configuration parameters we need in order to set up our equipment. There are a few parameters you need to set, like the SeaMicro Xen API connection URL, then the username and password, and a few more flags down there. Once that is done, you have an OpenStack machine, which needs to have a patch installed at the virt layer. And once the installation is over, you need to set up those parameters. So now we are set to go after that. And OpenStack also supports lots of scripts, custom-defined scripts; you can provision any servers based on the requirements. The requirements could be the different volumes, different configurations, and all that.
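The connection parameters described above would sit in the Nova configuration. The exact SeaMicro flag names aren't shown in the transcript; the fragment below is a hypothetical sketch modeled on the Xen-driver flags Nova used in that era:

```ini
# Hypothetical nova.conf fragment. The real SeaMicro flag names are not
# given on the slide; these follow the Nova XenAPI-driver convention.
--connection_type=xenapi
--xenapi_connection_url=https://<chassis-management-ip>
--xenapi_connection_username=root
--xenapi_connection_password=<password>
```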
So users generally write custom scripts, and these scripts could be defined as here: you can create a server, you can request a specific size of memory, you can request a particular component for the networking, and then storage, and likewise. And you can request it be booted using the standard mechanism that we currently support, which is PXE boot. Your volume could be anywhere; it can be booted today using PXE boot. Similarly, you have the other scripts, like deletion. Once you have created certain VMs, you need to delete those VMs. The VMs here, I'm talking about the physical servers; we provision all physical servers like VMs. Next, these are the different applications the user can use for doing different operations. For a typical system here, you look at the bare metal configuration and how it sits on OpenStack. This figure shows how the interfacing is done from the top level. We go to the bare metal interface, and it goes underneath to the XenServer APIs that we support underneath the layer, and it gets exposed to OpenStack directly. In the lower part of it, we have the system management, our chassis, networking, storage, and pool; everything is sitting in one place. This is our next configuration: we try to put together all these services, as we did in the past for Nova compute, and we want to do this for Quantum, Cinder, and Swift. In this slide, we show a typical provisioning run, from the start, where a single request comes from the end user, and how the provisioning is done. This slide shows the different steps it has taken to complete the operation, and the total time taken: 37 seconds to provision a server. So it's pretty fast. And our own time is even much less than this; we take around 10 to 12 seconds to provision completely, from the request until the end.
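The create/delete scripts described a moment ago follow a simple pattern: ask for memory, networking, and storage, carve them from the pools, PXE boot the node, and later delete it just like a VM. A minimal sketch of that pattern, where `FabricClient` and its methods are hypothetical stand-ins for the real chassis API:

```python
class FabricClient:
    """Hypothetical stand-in for the chassis-management API."""
    def __init__(self):
        self.servers = {}
        self._next_id = 0

    def create_server(self, memory_mb, nics, volume_gb):
        # Carve resources out of the pools, then PXE boot the node.
        self._next_id += 1
        sid = f"server-{self._next_id}"
        self.servers[sid] = {"memory_mb": memory_mb, "nics": nics,
                             "volume_gb": volume_gb, "state": "pxe-booted"}
        return sid

    def delete_server(self, sid):
        # Physical servers are deleted the same way VMs are.
        del self.servers[sid]

client = FabricClient()
sid = client.create_server(memory_mb=4096, nics=2, volume_gb=100)
client.delete_server(sid)
```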
But when you put the whole thing together with the OpenStack layer, OpenStack takes its own time to run different scripts and do a lot of things underneath, so it takes more time on top, and it becomes 37 seconds at the end. So here we have a typical block diagram that shows our architecture, what services are running to support this. And this all lies underneath our management control layer. That means it has nothing to do with the outside world; the outside world doesn't know anything about it. Once the services are configured within the chassis, you're good to go. So once the end user places an order for hardware, everything is configured underneath, the OpenStack machine links with this particular chassis through the management controller, and everything is visible. Just issuing those commands makes all your resources available: servers, memory, storage, and all those components. From the top level, we support a controller like Apache; we have lighttpd, which is much faster than Apache. Then we have the XML-RPC layer underneath. Then we have the Xen plugin; we support Xen APIs underneath. We have a standard set of Xen APIs, though we do not support all the Xen APIs, based on the requirements on our side. And we have a Xen proxy service; we need to invoke that in order to bring the whole set of RPC protocols underneath it working. Then we have a sync manager that manages the sync mode. And at the bottom, you see the System Manager and all the other services. We have a fast cache mechanism that supports very large amounts of data very, very fast, so you won't feel any difference. We used to support about 752 servers underneath in one chassis; that's a typical configuration for the Atom servers. Each and every server has a huge amount of data associated with it, and you can expose a lot of data with one single call.
And those calls become slow if you don't have this cache mechanism. So using the cache mechanism, we made them extremely fast; instantaneously, you get the whole data available. So let me go through some of the basic services, which are the APIs, the broad categories of APIs one has to look at in order to put a virtualization layer over all the hardware. We have action APIs, that is, virtual machine management. We can allocate a VM, we can deploy a VM, we can take actions on a particular VM. Migration of a VM is not allowed right now, because these are physical machines. Then you can save a VM to disk; save means you can configure a physical server and save the configuration there. Then VM information: you can grab VM information, and you can get the pool information. So we support configurations where a pool of servers is sitting there, which could come from different chassis, and each and every chassis has a large number of servers underneath. So, multiple configurations: in one chassis, we support different kinds of processors, like a Xeon processor, an Atom processor, an Opteron processor, or any such processor underneath. Then we go for actions management for hosts. You can allocate a host; you have a large set of APIs available that can control the host. A host could be a single chassis host, or you can have multiple hardware chassis underneath there, so you can have multiple hosts. You can allocate a host, get information on the host, delete the host, enable or disable a host, and get the pool information. Then we go for network management. With the latest version, the 15K chassis, we have quite a bit of network management available. So we have virtual network allocation; you can grab the virtual network information, delete it, publish it, all sorts of operations that you do underneath. Then we go for user management and allocation.
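Stepping back to the fast-cache mechanism mentioned a moment ago, the idea of a single inventory call made fast by caching can be illustrated with a minimal read-through cache. The slow `fetch_from_chassis` function here is just a stand-in for querying every server:

```python
import time

class ReadThroughCache:
    """Cache chassis inventory so repeated calls avoid the slow fetch."""
    def __init__(self, loader, ttl_seconds=30):
        self.loader = loader
        self.ttl = ttl_seconds
        self._data, self._stamp = None, 0.0

    def get(self):
        if self._data is None or time.monotonic() - self._stamp > self.ttl:
            self._data = self.loader()      # slow path: query every server
            self._stamp = time.monotonic()
        return self._data                   # fast path: serve from memory

calls = []
def fetch_from_chassis():
    calls.append(1)                         # pretend this is expensive
    return {f"server-{i}": {"state": "on"} for i in range(752)}

cache = ReadThroughCache(fetch_from_chassis)
inventory = cache.get()
inventory = cache.get()                     # second call hits the cache
```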
For users, a user can log into our system with a user ID and password, and after the login is done, you get a token ID. The token ID is a standard UUID, which we generate in our system, and it's unique within our system. We provide that to you, and using that token, you can go and log in. Another thing we have done in our system, which you won't see: every 10 minutes, this token expires if you don't use it. So if the last time you used it was more than 10 minutes ago, you need to grab a token one more time. You need to be careful about whether you have used it within the last 10 minutes or not. Then we have image management. As I mentioned earlier, you do all these operations: image allocation, information, deletion, publication, and all that. Then we have cluster management. You can define different clusters, where each and every cluster could separately be a heterogeneous or homogeneous cluster. So here are the basic classes that we support; that's a physical server using the virtualized APIs, the virtual layer. We have a virtual machine that is actually a physical machine underneath. Then we have the VBD, the virtual block device, and all such classes which are defined. These are the standard classes that we use with our own APIs. Now, this is a sample of a script. You can run different scripts, defined by you based on your needs, and these scripts can be invoked directly through nova-manage, in terms of calling a script, like create and provision a VM. So it'll see, on a particular host, how many VMs are already there. It lists all the servers that it found through a search operation. So initially, it does a search on the servers: how many servers already exist? It finds the three servers there, and it will print those, and then go on to create the next one; let's say it creates server four. A VM opaque reference is given with the UUID, and that becomes unique to your VM.
And then it'll go and find any of the networks which are associated with that, so it'll create a VIF, the virtual interface. Then it'll attach the VM to the storage repository; it'll see what storage you want to attach to your VM. Then it'll create the virtual disk image and provide a reference to that. And finally, it'll create a VBD, a virtual block device, and that block device gets attached to the server. So it has done all these operations now, and at the end, the server is completely provisioned with all this hardware and the components along with it. Then it'll go and turn on the power on the server for you. What does it do in the power-on? It'll do PXE booting, and once the server is PXE booted, at that time it's ready for you to use. So you can go grab the MAC address, or you can go find out anything within the server. It takes about 21.62 seconds to do all these operations. This is a typical script. As you are probably aware, we write scripts in OpenStack, and OpenStack is mostly written in Python, so it is a good idea to use Python for your scripting also. This particular script is the same script which ran and provisioned that VM: it created a VM and looked at the resources. So the URL is given, the SMTSC14, and it connects to the management network on that and grabs the handle. And this is the class which you want to control underneath. You have defined the size of the memory; you have 1024 megabytes of memory allocated for you. And then you can define the MAC address there, and all such parameters like that, in a class. And you can go and directly call the different OpenStack APIs; it makes all these calls underneath. So this is the script which ran earlier. Here are the steps that you have to follow in order to do that. It is easy: first the request comes from the customer that you need a particular configuration of your chassis. Let's say you need maybe 64 Xeon servers there.
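The provisioning sequence just walked through, create a VM record, a VIF, a VDI, then a VBD, and finally power on and PXE boot, can be mimicked with a toy session object. The method names echo the XenAPI style the talk describes, but the implementation is purely illustrative:

```python
import uuid

class ToySession:
    """Illustrative stand-in for a XenAPI-style session to the chassis."""
    def __init__(self):
        self.objects = {}

    def _create(self, kind, **fields):
        ref = f"OpaqueRef:{uuid.uuid4()}"   # opaque reference, unique per object
        self.objects[ref] = {"kind": kind, **fields}
        return ref

    def vm_create(self, memory_mb, mac):
        return self._create("VM", memory_mb=memory_mb, mac=mac, power="off")

    def vif_create(self, vm, network):
        return self._create("VIF", vm=vm, network=network)

    def vdi_create(self, size_gb):
        return self._create("VDI", size_gb=size_gb)

    def vbd_create(self, vm, vdi):
        return self._create("VBD", vm=vm, vdi=vdi)

    def vm_start(self, vm):
        self.objects[vm]["power"] = "pxe-booting"

s = ToySession()
vm  = s.vm_create(memory_mb=1024, mac="00:16:3e:00:00:01")
vif = s.vif_create(vm, network="mgmt")
vdi = s.vdi_create(size_gb=100)
vbd = s.vbd_create(vm, vdi)
s.vm_start(vm)   # PXE boots the physical server behind the "VM"
```

Behind each of these calls, the real system is carving hardware out of the shared pools; the "VM" object is a physical server.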
You need so much memory there, you need so much storage there; all that is configured. That request comes to us at AMD; we configure the chassis for you and we ship it to you. So that's the standard chassis that we're talking about. Then install: you need to install the software that manages all the underlying hardware components. So we do the installation of the software. Then you need to configure the different services based on the customer's requirements. You want to turn on an API for provisioning; you need to turn on different services in order to communicate further with OpenStack. So you do that. Next, you install our client service package on the client side, and once that is installed, you can do the configuration by setting up the parameters, as I showed in the previous slide. Then you go and run OpenStack to provision physical servers directly. So the next step is, after step five, you are done with creating the physical server. You can now create a virtual server on the same physical server that you have just made. What you do is boot an image which runs another hypervisor, maybe a Xen hypervisor, or a VMware hypervisor, or whatever you choose to go with. Once the image is booted, the hypervisor is active automatically. So you make a request now and run those APIs, like the VMware APIs, or Xen APIs, or KVM, or whatever is underneath, and they'll create your further VMs. Those VMs, once they are available to you, can work together, all the physical machines and the VMs together. And since we define different classes in Nova and OpenStack for each one of those, we define whether it's a Citrix hypervisor, a VMware hypervisor, or a KVM hypervisor; similarly, we can say a bare metal machine, or a SeaMicro hypervisor.
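The two-level flow just described, provision a physical server first, then boot a hypervisor image on it so VMs can be created on top, can be sketched as a small toy model (all names here are illustrative):

```python
class Chassis:
    """Toy model: bare-metal nodes first, then VMs layered on top."""
    def __init__(self):
        self.machines = []

    def provision_physical(self, name):
        machine = {"name": name, "hypervisor": None, "vms": []}
        self.machines.append(machine)
        return machine

def boot_hypervisor(machine, flavor):
    # Booting e.g. a Xen or KVM image turns the bare-metal node into a VM host.
    machine["hypervisor"] = flavor

def create_vm(machine, vm_name):
    if machine["hypervisor"] is None:
        raise RuntimeError("boot a hypervisor image on this node first")
    machine["vms"].append(vm_name)

chassis = Chassis()
node = chassis.provision_physical("server-1")  # physical server is ready
boot_hypervisor(node, "xen")                   # boot a hypervisor image on it
create_vm(node, "vm-1")                        # now metal and VMs coexist
```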
So all these things go together and give you a particular set of servers, which could work together, like a physical server, the virtual machines, and all that. So I'm pretty much done here with the presentation, so you can have questions. We'll be happy to field some questions here on the work that was done and some of the things we're planning to do. Yep. That's right. Actually, what we do is, at the top layer, everything is transparent. OpenStack, as such, is a package. It comes from Rackspace, or whoever the owner of it is. You install it on a machine. Then we have our own client-side software for you; you run that, it installs there, and once that is installed, you're ready to go. You need to configure it by setting certain parameters, as you have to do in any case. If you're using the Xen APIs, or VMware APIs, or any such APIs, you configure those parameters for them. So you need to configure those parameters on a chassis if you want to communicate with our chassis. And after that, everything is hunky-dory; that means you get everything exposed to you automatically. So the requests are channeled through the OpenStack APIs, which are wrappers around our SeaMicro APIs, the Xen APIs that we call the SeaMicro Xen APIs. All these communications are passed underneath, and the underlying layer is the XML-RPC calls that we follow. Sorry, you're saying that... Yeah, I think this picture is a good view of it. So what we're doing today is on the left. We have our system management, and what we've actually implemented in the system management is the Xen API. And that existed, actually, before we got started on OpenStack. We had a customer who was using our Atom-based server; we had 384 Atoms in this box. And they said, I really don't want to provision physical machines, but I'd really like to use my existing Xen management infrastructure, XenServer management infrastructure, to provision.
So we actually created a XenServer API for this customer that would treat the whole box as a single host, and when it provisioned a VM, it was actually provisioning a server in the box. It would take those pools and say, OK, I want to create a virtual block device and a disk image. It would carve the logical volume out of our shared storage, attach it to that server, and then provision that server. And it looked like a VM to any management system. There are pros and cons with this. It was an easy way for us to go, but at the end of the day, it's still a physical server, so there are going to be some things it can do and can't do. So when we got into OpenStack, we created this sort of bare metal Nova client. It was a separate agent; it could run anywhere. It actually provided a Nova API upstream that would come through, and it would go to the XenServer API provider to then provision the machines. Where we're looking to take this next is, we want to bring the Xen API directly into our system management. So we're going to absorb the bare metal Nova client into our system. And then you could provide the whole system as a single host, or you could carve it up and say, I'm going to dedicate half of the servers in this for this, but the other half of the servers are going to be running virtual machines, and I'll pre-provision those with their own Nova client so you can provision VMs on them as you normally do. So you can break up the system. And beyond just the Nova APIs, we'd like to do Quantum APIs, Cinder APIs, and Swift APIs. Things like Cinder and Swift we can actually run on our storage controller cards and present the service. So our next stage is actually bringing these things natively into our system management, and I think that's what we're going to try to push into the community; we're looking at the right way to do that right now. No, it's a bare metal node, and we're actually putting the image on. There are multiple ways of doing that.
In our first instance, you provisioned the server and then you still had to go in and PXE boot from somewhere. What we're doing with the image service, because we actually have a separate storage system, is one of the quick ways we got that down to the 20 seconds, 37 seconds: we pre-initiated the volume. So we created volumes of the various sizes, and we pre-imaged them. We had another server sitting on the side that would attach a volume, pre-image it, and it would sit in the pool, ready to go. When we actually provisioned a host, we'd take that pre-existing volume and populate it. And then in the background, it would prep other images to be ready to go. That's the way we got those quick times. Well, right now we don't have plans. There are a lot of things that are very proprietary to the system; it's a custom platform, and we don't see a lot of architectures like that, so we don't have immediate plans for that. Into OpenStack? Yeah, we're just starting to look at that. We've actually been able to bring some more people onto the team, so we're exploring that. We'd like to work with other people in the community who are looking at this, and work with them on the right way to bring that in, so that other people can also provision servers on other platforms as well. Sorry? The implementation for the customer? Right now, I can't talk details, but we're talking. Actually, let me ask you a question. We used to have a very complex chassis, a very big chassis, which used to support about 750 physical servers underneath. So that's a pretty good configuration, the biggest configuration within a chassis, and you can have multiple chassis like that. That was the older model, the 10K chassis. Then, with the 15K chassis, we have a wider range of servers within the same chassis, so you have heterogeneous servers sitting there, like an Atom processor, a Xeon processor, or an Opteron, or any such processor, working together.
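The pre-imaging trick described above, keeping a pool of already-imaged volumes so provisioning just grabs one and the pool refills in the background, might be sketched like this (class and function names are illustrative, and refilling is done inline here rather than by a background server):

```python
from collections import deque

class PreImagedVolumePool:
    """Keep volumes imaged ahead of time; provisioning pops a ready one."""
    def __init__(self, imager, target=4):
        self.imager = imager    # slow function: create + image a volume
        self.target = target
        self.ready = deque()
        self.refill()

    def refill(self):
        # In the real system a side server would do this in the background.
        while len(self.ready) < self.target:
            self.ready.append(self.imager())

    def provision(self, server_id):
        volume = self.ready.popleft()   # fast: the volume is already imaged
        self.refill()                   # top the pool back up
        return {"server": server_id, "volume": volume}

count = [0]
def image_volume():
    count[0] += 1                       # pretend this takes a long time
    return f"vol-{count[0]}"

pool = PreImagedVolumePool(image_volume, target=2)
host = pool.provision("server-1")
```

The design point is simply moving the slow step (imaging) off the provisioning path, which is how the quoted times stayed in the tens of seconds.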
So is it kind of a mixed-match, large configuration like that? Yeah, and to answer in terms of size, we have customers who have thousands of servers under management in a single pod. Yeah. Solid production. Can't disclose, but relatively recent. You know, we're just starting to investigate that; I really can't answer that right now. We're looking at that. We implement very robust layer 2 switching throughout the fabric, so we're trying to figure out the best way to do that as we start looking at OpenFlow and Quantum. So hopefully by next summit we'll have a much more solid plan on how we integrate the storage and the networking services. Right. So that's all going to be taken care of at the OpenStack level, because right now, every system in the current implementation is treated like a host, a Nova host. So all those hosts will come under a single management at the top level. Right. So what happens is you have multiple hosts sitting there, say on different hardware, maybe a SeaMicro chassis, maybe HP hardware, or some other vendor's. To connect them, you need to connect to a particular host that has specific IP address details. So once you have that parameter I showed you earlier in the slide configured, it automatically starts talking to those hosts. What OpenStack does is it first tries to explore all the hardware underneath it, and it puts that in its own database. Once you add a system there, it reads the configuration, puts it in its system, and the next time it wants to use a particular piece of hardware that another user is requesting, it passes control directly underneath. So it's all parallel: you have multiple hosts connected, and these are management controllers underneath. Any more questions? Yeah, so one of the plays with our systems is that we're trying to bring really efficient economies of scale to the server. So we've minimized what you need on a server; things like IPMI have been centralized through the central management.
So we basically emulate IPMI. Yeah, right now we have a single IP address, but then we have a customized command, so one of your arguments has to be which of the nodes you want to send that command to. All right, well, thank you very much, everyone. If you have any more questions, feel free to come up.