Hey, thanks for joining us today. I'm Ken Savage. This is Darren Sorrentino. And we are Solution Architects at Red Hat. Thanks for joining us at the Summit here in Boston. We're glad you're here. If I can advance here, hold on. So Kubernetes, when we're not making up words, we're busy kind of reenacting bad 70s television shows. But what does this word mean, right? We're going to get into that. First thing we wanted to do, though, is you've heard that adage, know your audience. And we were hoping that we could get some insight into where you all are at with regard to bare metal, Kubernetes, and everything else. So if you don't mind, this is a live poll. You can use that URL or you can text as well. Bare metal as in Ironic. Sorry about that. Yeah. Oh, wow. Look at that. I knew there'd be a maybe eventually. Yes, it's fighting. Yes. Cool. Very cool. Looks like an equal distribution. Yeah. A lot of yeses. Wow. Very cool. All right. That was more than I thought there would be. All right, let's check out the second question that we have. We only have three of these, by the way, so don't be alarmed. Second question is this. Are you running Kubernetes on OpenStack? This is a little bit more tough. That's kind of what we figured. That's about what I figured, yeah. A lot of maybes. So you don't know if you're running it on OpenStack? That's what you're saying? I know it starts with a K, but I don't know if I'm running it. OK, so one more quick question. Have you heard of Kubernetes before this session? We thought it was a new word. The good thing about making up words is if you Google them, your presentation comes up first. Pro tip. Very cool. So the agenda is really just to justify the approach that we're looking at here: talk a little bit about the environment and how we wired it all together, some of the pitfalls of bare metal and Kubernetes together, some small performance benchmarks, some of the lessons we learned, and then questions.
If you guys have any questions midstream, please feel free to raise your hand. We'd be happy to try to answer. The disclaimer is simply that this is an advanced session, so there are a lot of eye charts here. Don't be thrown by that. We kind of put those in there so that you can reference them later if you ever wanted to, if you were crazy enough to try to do this. So the plan was for us to use Red Hat OpenStack Platform 10, our favorite, to deploy a Kubernetes master as a VM, and to use Heat to create Kubernetes nodes, otherwise known as minions, using Ironic, until we got punched in the face by Mike Tyson. So that was our original plan. And why would you do this? It's still faster to do this than deploying via sneakernet, of course. Somebody's always got to run out there and cable stuff up, right? And bare metal, we're metal freaks, so we have to marry the rich OpenStack APIs with Kubernetes and Docker. And there's a lot to do there, right? You get a lot of infrastructure APIs with OpenStack. Integration with existing enterprise services, such as identity: hooking up Keystone to AD, and then Kubernetes into Keystone, right? You can do a lot of really cool stuff that way. And of course, we have customers that want to know if and how they can do this, whether it's a viable solution. So I'll pass it over to Darren here. He's going to talk about the environment. So our lab consisted of six identically configured servers, each with two 8-core processors and 256 GB of memory. We weren't trying to run any kind of production workloads on this environment, obviously. It's just a proof of concept, just to see if we could actually get this thing to work. So what we had is a single KVM server. We leveraged BIND 9 on that.
One of the key points we wanted to make in doing this proof of concept is that everyone has their own DNS servers, unless you're like the Flat Earth Society, running around with /etc/hosts files everywhere. So we wanted to integrate with a back-end BIND 9 server. This way it actually replicated something that you might actually do in your environment. We had two OpenStack controllers. The OpenStack controllers were running Designate and Ironic. And then we had a single compute node that we could launch our master VM on, and then two additional bare metal nodes that were provisioned by Ironic as the Kubernetes workers, or minions. We leveraged Designate in order to provide the DNS entries for the master as well as the worker nodes. This way they can find each other when they first come up. And obviously, Ironic to deploy the worker nodes. So, a quick rundown of the versions we used. We used Red Hat OpenStack Platform 10 on RHEL 7.3. We did pull the Kubernetes code right from kubernetes.io, just because a lot of developers are pulling it directly from the source. The Docker we used was from the RHEL repository; we baked it into the image, but I'll get into that a bit later. So Designate, as you heard earlier today, like all of the OpenStack projects, can stand on its own, in this case as DNS as a service. So in here, as Ken mentioned, is a little bit of an eye chart as far as the configuration that we used in order to set it up with a BIND 9 back end. At the bottom, there's a reference link to the openstack.org documentation on Designate. So you can actually look at our configuration, take a look at the recommended installation procedures, and make a little bit of sense of this after the presentation. The slides will be available on the Summit site at the end of the summit. So having Designate up as a service is not enough.
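As a rough sketch of what that eye chart covers, a Designate pool definition for a BIND 9 back end looks something like the following. Every hostname, address, and file path here is a placeholder, and the exact options vary by release, so treat this as illustrative and check the referenced documentation.

```yaml
# Illustrative Designate pools.yaml for a BIND 9 back end.
# All hostnames, IPs, and paths below are placeholders.
- name: default
  description: Pool backed by the lab's BIND 9 server
  ns_records:
    # Name server record published in the zone (note the trailing dot)
    - hostname: ns1.example.com.
      priority: 1
  nameservers:
    # Where Designate verifies that zone changes have landed
    - host: 192.0.2.10
      port: 53
  targets:
    - type: bind9
      masters:
        # Designate's mini-DNS, which BIND pulls zone transfers from
        - host: 192.0.2.5
          port: 5354
      options:
        host: 192.0.2.10
        port: 53
        rndc_host: 192.0.2.10
        rndc_port: 953
        rndc_key_file: /etc/designate/rndc.key
```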
You actually have to integrate it with OpenStack. This way, when you launch an instance, it'll actually populate the DNS entry with the IP address. In order to do that, there are two integration points. There's one with Neutron. These are the settings in neutron.conf that we had to change in order to get this to actually integrate with Designate. And in addition to that, we also had to make some changes to the OVS ML2 plugin to add DNS as an extension driver for OVS. So as a high-level overview: you launch an instance within OpenStack. Neutron would actually create the port for that instance. It would populate a property within that port that would have the IP address and the fully qualified host name. And then it would call out to Designate, which would then generate a DNS record and update the BIND 9 server in the back end. So the Ironic deployment we did as part of our initial deployment, using the TripleO process. TripleO makes it really easy to install Ironic in your cloud; it's just a matter of configuring two YAML files and doing your deployment. The one thing that you'll notice at the bottom there is a cleaning network UUID. That UUID actually doesn't get populated until post-deployment. The purpose of populating it is that when you do a future upgrade using TripleO, because TripleO does support in-place upgrades, it doesn't overwrite the cleaning UUID in your neutron.conf later on. So this is the Ironic YAML, basically calling in the Puppet scripts in order to actually instantiate Ironic on the controllers. Again, reference information down at the bottom for the integration. So there's more than one way to skin a cat. What we decided to do in order to get this deployment to work is make a purpose-built image for Kubernetes. The image can actually be launched on either virtual or bare metal, the same image for both. Within the image, we baked in Docker and NTP.
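As an aside, the Neutron-to-Designate wiring just described boils down to a handful of settings, roughly like the sketch below. The hostnames, domain, and credentials are placeholders, and the exact [designate] auth options differ between OpenStack releases, so take this as a shape rather than a drop-in config.

```ini
# /etc/neutron/neutron.conf (illustrative values only)
[DEFAULT]
# Hand DNS record management off to Designate
external_dns_driver = designate
# Domain appended to instance host names to build the FQDN
dns_domain = lab.example.com.

[designate]
url = http://controller.lab.example.com:9001/v2
auth_url = http://controller.lab.example.com:5000
username = neutron
password = REDACTED
project_name = service
# Also create PTR records for allocated addresses
allow_reverse_dns_lookup = True

# /etc/neutron/plugins/ml2/ml2_conf.ini
[ml2]
# Add the dns extension driver so ports carry dns_name/dns_assignment
extension_drivers = port_security,dns
```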
NTP is very important to have in your Kubernetes cluster, to ensure the nodes are all using the same time sync. So basically what happens is, when a node boots, there's a systemd integration script that we wrote. It pulls down the bits from Kubernetes. It looks at the host name. If the host name actually has kube-master in it (so you can call it whatever you want, as long as kube-master is in there), it knows that it's the master node. Everything else that you launch thinks it's a worker node. So what happens is a node comes up. It registers with DNS if it's the master node. If it's a worker node, it starts querying DNS for the kube-master node. And then, once the kube-master node comes up in DNS and the worker finds it, the worker starts making a TCP connection out to Kubernetes on the master node and waits for the service process to come up. Once the service process has come up, it uses a static token to tie itself into that Kubernetes master server. So we ran into an issue with bare metal: for whatever reason, a script passed through user data would run sometimes and not other times. We wanted to dig into that a little bit to debug it, but due to time restrictions, we didn't really have a chance to figure out what that issue was. So our workaround was to bake this script into systemd, actually mount the config drive, and then pull the user data information out as variables for the process. So again, a real eye chart here. It'll look much better when you're back home, sitting back, relaxing, and looking at it on screen. What we're also going to do when we publish this is update it with a GitHub link, so you'll be able to pull the YAML files down right from GitHub. But this is the Heat template we leveraged to actually deploy the master node and the minion nodes.
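The boot script itself isn't in the slides, but its role-detection logic can be sketched roughly like this. This is a simplified illustration, not the actual script: the real one also pulls down the Kubernetes bits, mounts the config drive for user data, and joins with the static token; the master FQDN and port here are placeholders.

```shell
#!/bin/sh
# Sketch of the boot-time logic described above (simplified illustration).

# Decide this node's role from its host name: anything containing
# "kube-master" becomes the master, everything else a worker/minion.
detect_role() {
  case "$1" in
    *kube-master*) echo master ;;
    *)             echo worker ;;
  esac
}

# Workers poll DNS until the master's record shows up, then wait for a
# TCP listener on the API port before joining with the static token.
# (Defined but not invoked here; hostname and port are placeholders.)
wait_for_master() {
  until getent hosts "$1" >/dev/null 2>&1; do sleep 5; done
  until nc -z "$1" 6443 2>/dev/null; do sleep 5; done
}

detect_role "$(hostname)"   # prints either "master" or "worker"
```

The key design point is that one image serves every node; the hostname, assigned by Heat and resolvable via Designate, is the only thing that differentiates master from minion.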
So the key on this here is that it's all driven off of the flavors, as far as where it's going to be deployed. So I'm going to turn it back over to Ken to talk a little bit about the performance metrics that we saw in doing the deployment. Yeah, so what happens here is you can kind of see by these benchmarks that bare metal's really slow. And this is the punch in the face by Mike Tyson, really. We had planned to demo this, but we found that with the gear that we were using, which is pretty good gear, you're talking about 15 minutes to pull a node up. So our thinking was, how would you do this? In a production environment, maybe you'd set a threshold below which you'd want to start getting more infrastructure happening, right? And then allow for that time to spin itself out, and then you'll have what you need when you need it, hopefully. The downside is that other things can happen that could skew that. And you can see Kubernetes up is definitely taking a lot longer with bare metal. Cluster ready is definitely taking a lot longer. Now, we've also looked into some tweaks with Ironic. What it does is it PXE boots and then reboots the server, and that's a lot of what takes so long. And we know there's a way to PXE boot and kind of chroot into that kernel and do it without rebooting. We didn't get a chance to try that, though. Some of the bare metal pitfalls, right? Slow to provision: most servers take several minutes, in our case 15 minutes just to pull one server up, right? So it'd be difficult to do this in an autoscale environment. And as Darren mentioned, unreliable user data execution with cloud-init on the bare metal nodes. It really wasn't doing the user data stuff for us at all. And then finally, no easy autoscaling, due to the lack of agents on the bare metal end of things. So if you think about it, when you create a VM, you have a hypervisor. Ceilometer runs and gives you telemetry.
You can set alarms and do all kinds of cool stuff based on that with Heat and autoscaling, right? When you're doing bare metal, there's nothing there. The other option you have is SNMP. You can do SNMP, but in our experience, the default refresh is something like 10 minutes. You may not know that you need capacity until way after you need it. You've got to do a lot of tweaking to get SNMP to the point where you're going to autoscale with bare metal nodes, right? Some of the Kubernetes pitfalls: you've got to always maintain a unified time source. We always do that with Red Hat OpenStack anyway. If you don't do that with OpenStack, things go wonky, especially on the controller end of things. And the same is true of Kubernetes. Networking gotchas. And I have to thank Darren for that mousetrap with the Kubernetes logo on it. I love that. You're maintaining all the complexity of OpenStack network isolation along with all the Kubernetes networking. You've got two things going there, and when things go wrong, you don't necessarily know which of them is at fault. That's tough to deal with, and we experienced it big time. And then finally, node names can't be greater than 63 characters. And what Heat likes to do is create these huge host names. We ran into that. One of the first ones we did was 65 characters, I think, and Kubernetes has a limit of 63. Yeah. So we did find a workaround for that. You want to tell them? Sure. So what happens is your host name winds up being a combination of your stack name and your resource name. So what we actually did, and what you'll see in the Heat template when you go pull it down, is we used the random string resource type within Heat to come up with a 10-character name. We basically took minion- and appended a random string that gets generated when the Heat stack runs. By doing that, we were able to ensure that our host names were less than 63 characters.
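That random-suffix workaround might look something like this in a Heat template. The resource names, image, and flavor here are placeholders, not the exact template from the talk:

```yaml
# Illustrative Heat fragment; resource names, image, and flavor are
# placeholders, not the exact template from the talk.
resources:
  minion_suffix:
    type: OS::Heat::RandomString
    properties:
      length: 10
      character_classes:
        # Keep it DNS-friendly: lowercase letters and digits only
        - class: lowercase
        - class: digits

  minion:
    type: OS::Nova::Server
    properties:
      # Hard-code the "minion-" prefix and append the generated suffix,
      # keeping the name (and resulting FQDN) safely under 63 characters.
      name:
        str_replace:
          template: minion-SUFFIX
          params:
            SUFFIX: { get_attr: [minion_suffix, value] }
      image: kube-purpose-built
      flavor: baremetal
```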
The 63 characters also need to include the fully qualified domain name. So the whole thing needs to be under 63 characters; otherwise, it fails to join the cluster. And Heat's limit, by the way, is 128, so Heat just happily goes along creating these huge host names. Some of the lessons we learned, right? Is it worth the effort? Yeah. I mean, as we've been saying here, bare metal is slow, but Kubernetes definitely has a lot of functionality that works well from an app-centric point of view in this scenario. Integration is key, and that's the kind of thing you have to work at when you take this approach. And as I mentioned, testing is really difficult. When things go haywire, you don't always know where it's happening. You have to really have your stuff together to figure that out, right? What we would do going forward: we think VMs are typically good enough. You can get a lot of density out of VMs. KVM is a level one hypervisor; you can get a lot of performance and density out of it, right? Fast autoscaling probably has to be VMs. You're not going to get that out of bare metal at this point in time. Create extensible Heat templates. You want to do that anyway; that should be your MO no matter what. And it's especially true here, because you'll be tweaking them quite a bit. Use OpenStack where it excels. In this case, we used it, or at least tried to use it, where it excels with regard to integration and infrastructure, and especially Ironic, right? And then finally, use OpenShift. And that's not just a plug for OpenShift. We did experience some of the rough edges of just pulling Kubernetes down from upstream. We ran into a bug, for instance: we used kubeadm to deploy, which broke the UI. And we had all kinds of issues like that, where we had to go bug hunting to figure out why things weren't working. That probably wouldn't have happened with OpenShift. Any questions? Suggestions?
Anybody have any ideas of how to do this better than we tried? Anyone? Everyone awake? I know there's a happy hour going on right now; you're all probably thinking about that. Yeah, so there are a lot more Red Hat talks today and tomorrow. Just wanted to throw a few of these up here for you, in case you wanted to check any of them out. These are all Red Hat talks. And if you didn't want to go up to the microphone in front of everybody, feel free to come up afterwards and ask questions.