All right, am I good to start? Awesome. Hi, thanks for coming. We're going to talk about how to learn OpenStack. And there's a couple reasons I decided to talk about this today. But first, understand that this is the first part of an unofficial two-part presentation. There's a little bit more tomorrow. Today is about learning OpenStack. Tomorrow I'm going to talk about how to make OpenStack a reality in your company. This is something that I wish existed when I first got into the OpenStack community a bunch of years ago, because it was really hard to find a roadmap to navigate how to get started learning this stuff. So I figured I'd make a presentation. And it turns out that it's really hard to compress weeks of stuff into a 40-minute presentation, so I had to take a lot of stuff out. This is a really high-level roadmap. You're going to find a lot of things on your way as you go through this journey, and this is just one recommendation that I have. It was helpful to me, but by no means is it the only way to learn OpenStack. So without further ado: one does not simply learn OpenStack. Tons of progress has been made in this area. There are classes, and the Certified OpenStack Administrator designation is now available. So there's been a lot of progress, but it's still really complex. And if you show up at the summit thinking, hey, I'm going to learn everything there is about OpenStack, and try to pick out your sessions, it's really easy to get lost. And the reality is that real-world production environments are totally different from a proof of concept, and that's totally different from spinning some VMs up on your laptop. Distributions make this a lot easier. But the easier it is to install, the harder it is to learn, because you're not putting the pieces together yourself. So that's what this is all about. And again, this is just one way to do it. 
You're going to figure out a lot of details on your own as you go through this process. So, quick survey. If you think of the process of learning OpenStack as starting with a how-to, then a proof of concept, and then a production environment, those are the stages of what I would call learning by building. Three stages — plus a fourth, which is "I haven't started yet." So here's the quick survey. Who in here hasn't even started? Who hasn't touched OpenStack yet? Okay, we got one. And then how many would say they're at the how-to stage? Okay, perfect. And then anybody doing POCs in their business currently? So a couple. And then anybody running production OpenStack? Okay, awesome. That's really helpful. So personally, I think the best way to learn this stuff is by building stuff and then breaking it. And this is how we encourage our teams at our company to go and learn this stuff. So this is going to be the roadmap that takes you from a beginner spinning up virtual machines on a laptop to running a production-grade cloud. As I said, I can't possibly compress weeks of stuff into a 40-minute presentation, so I'm going to use visual aids and also some temporal displacement technology, which we'll get to later. Step one is really dipping our toes in the water with DevStack. So who in here has used DevStack, tried it? Anybody? Okay, cool. So this was one of those places where I had to take a little content out — I was going to actually walk you through it, and it turns out it takes more than just a couple of minutes. What it is, basically, is a scripted installation of OpenStack that runs on a single virtual machine. It does it all automatically for you. And at the end of a 20-minute process, you've got this environment up and running that you can go play around in. You don't really have to do very much to get it working. So here's what you need to actually do it. A laptop — everybody's got a laptop. 
VirtualBox, a download of the Ubuntu ISO, and DevStack. That's literally all you need to start playing with DevStack. It's super easy. Here's all you have to download — two things. And there's a page at the end that'll give you the opportunity to take a picture if you want this, but it's easy to find. It's really simple. You're going to install VirtualBox on your laptop. You're going to create an Ubuntu VM using almost all the standard settings, with the exception of setting up two network interfaces. And then you're going to install DevStack on the virtual machine. So I had a whole bunch of step-by-steps and screenshots in here that I took out for time. This link here at the bottom — I don't know this person, but he's got a really good step-by-step on how to actually do this. And I did test it a couple nights ago to make sure that it still works, and it does. Here's what it looks like. It's really easy to follow. The one thing I'll note is that he gives you two choices: you can do it using a virtual machine inside of VirtualBox, or you can do it using Vagrant. For the purpose of learning for the first time, I would suggest setting it up in VirtualBox, because you can do some networking stuff that is more challenging with Vagrant. And again, thanks, Ronald Bradford, whoever you are. Really appreciate it. All right, so once that's up and running, what you have, again, is an operating OpenStack environment. Now, it's built as an all-in-one. The whole thing is running on one machine, which you would never do in production, but it's fun to play with. So, a couple things you get out of this. You get really comfortable with the CLI. You can go and spin up virtual machines and create networks and do all the stuff that you would do in a production environment, and get comfortable with the command-line interface. You've also got a working Horizon dashboard. Hopefully everybody knows what that is. 
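To give a flavor of what that scripted install involves, here's a sketch — not the official quick start. The password values and the HOST_IP are placeholders you'd set for your own VM:

```shell
# Grab DevStack (the clone step is a comment here since it needs network access):
#   git clone https://opendev.org/openstack/devstack && cd devstack
# Give it a minimal local.conf -- these values are examples, not defaults:
cat > local.conf <<'EOF'
[[local|localrc]]
ADMIN_PASSWORD=secret
DATABASE_PASSWORD=secret
RABBIT_PASSWORD=secret
SERVICE_PASSWORD=secret
HOST_IP=10.0.2.15
EOF
# ./stack.sh   # the ~20-minute part: installs and wires up everything on this one VM
grep -c '=' local.conf   # sanity check: five settings written
```

After `stack.sh` finishes you get the Horizon URL and credentials printed at the end, and everything described above is running on the one VM.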
And this is important not because it's complex to use, but because you can see how it's supposed to work when everything is actually working. This becomes really important later, when you build it by hand and go through the step-by-step process and you get an error and you don't know whether you're supposed to get that error. So it's important to get comfortable with it in an environment that you know is working correctly. All right. Step two, then, is the how-to. I don't know if everybody can see that. Who's assembled IKEA furniture, right? This looks familiar. This is where the temporal displacement technology is going to come into play, by the way. So you're now comfortable with what it's supposed to look like. But again, you're not using a DevStack installation for anything in production. So you're going to leave that running as a VM on your machine and go on to the next step and build it somewhere else. You can refer back to it; it's convenient to have there. And when it's running in VirtualBox, you can sit on a plane on the flight home and keep working on it — you don't need the internet once it's installed. OK. So the how-to I'm talking about is literally the step-by-step walkthrough that's on the OpenStack.org website, under the tutorial section. First of all, it's based on the vanilla or trunk base of code. So it's not a distribution from Canonical or Mirantis or Red Hat — it's literally just the plain vanilla code that OpenStack distributes. This is important because you're not trying to learn a specific vendor's technology. We're going to get to that in the production piece. It takes a lot more than just a few minutes — this can take a week or more if you're doing it for the first time. That's the bad news. The good news is that you're learning how to build the relationships between all of the services and underlying applications as you go along. 
So you're actually learning a lot more by doing this. Here's what you need for this. Ideally, you've got some hardware that you're building it on. You don't have to do it this way — you can set up a set of VMs on VirtualBox the same way that you did with DevStack if you really want to — but I always recommend doing it on hardware. These are Intel NUC boxes. You've probably seen some out in the marketplace if you've walked around and gotten a few t-shirts. They're great for development purposes, they're great for testing, and they're super cheap. You need those, and ideally a VLAN-capable switch. It's not a requirement, but I recommend it, because, again, the aim here is to learn how to build stuff for production. Most importantly, you need two NICs on them. Some of these have two NICs on the tiny little motherboard that's in them; otherwise, you can put in an Ethernet dongle that's supported by Ubuntu. The last thing you need is a lot of patience, because especially the first time, this is going to take a lot of time. So, same things: you already downloaded that Ubuntu image. For the documentation, there is a PDF version, so you can do this stuff all offline. And then a good text editor. I put my favorite in here: Atom. It's cross-platform and runs on anything. Why do I say that? Atom gives you the ability to create nested file structures, code-level commenting, and colors. Super, super fun to use. And you're going to document a ton of stuff as you go through this process. This is super, super important. You don't want to end up chasing bugs that are really just something you did wrong — more on that later. And a note on versions: you'll note that I've recommended Ubuntu 14.04 instead of 16.04. We're probably at the point where you could do this on 16.04, but again, I don't want anybody to chase bugs that just haven't been thoroughly worked out of the system yet. So here's what that how-to actually looks like. It's really easy to follow. 
Now, again, you're doing everything manually, right? So you're going to go through this step by step by step and run through the configuration options. And you're actually going to create new relationships that in a normal production environment you would have done all at once. So you're actually going back and redoing some of the work that you did earlier in the process. And that is by design — they set it up that way so that you can actually learn how those relationships all work together. But it's really easy to follow. Now, I mentioned the documenting that you're going to spend time on as you go through this process. What I like to do is basically create a log of what I'm installing: the commands I'm putting in, the changes I'm making to the configuration files, the services that I'm restarting — all the things that are in that how-to document. The reason you're going to document that and make notes to yourself is because you're going to come back and refer to it when you start to change this into a highly available environment and a proof of concept. When you've got it all laid out in one place, you can refer back to it. You'll leave notes to yourself saying, go back and change this. So what does that look like? This, by the way, is what Atom looks like. You can see it highlights things and auto-indents and all that kind of fun stuff. But again, you're going to need notes. I think I left a note in there for myself right up at the top, noting that the location of those files changed in a subsequent release. That saves you a lot of grief in the future. So, a couple notes on the OpenStack tutorial. One of the things that it does well is it gives you lots of options for how you want to configure your network. One of the things it does poorly is it gives you a lot of options on how you want to configure your network. 
So as a first-time person going through that tutorial, it's sometimes hard to translate that into your own environment. At a really simple level, you've got an internet connection, two inside networks, a switch, and at least three servers. You can use those little Intel NUCs that we were talking about. One of the decisions it's going to ask you to make is which networking model you want to use. You're going to use provider networks with the self-service option — I think it's options 2 plus 3 in the official documentation. So then that's going to refer to this diagram. Essentially what you see here is that the red network is an internal management network; they have it on a 10.0 address space. And the green one is a public-facing network — what they refer to as a provider network. These are addresses on the open internet. So what I've done is translate this into an environment that you might have at your office or at home, because if you're not really comfortable with this, it can be a little bit of trial and error when you're actually setting up the services. So the red and green here translate to those two networks. I've got internet, I've got my router providing NAT services to the inside, and then I've carved out two different networks. I used two different address blocks just to make it clear. But essentially, the red and green correspond on this to the diagram that I just showed you. So that red network that's connecting the three machines is the internal management network. And the green, even though it's still using private IP space, is what you're treating as the public address space. And that will be accessible to your router and the rest of your network. So this is the list of all of the services that you can install in the tutorial. They don't take you through everything, but they take you through most. You don't need them all to get a cloud up and running, though. 
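On Ubuntu 14.04 that two-NIC split ends up in /etc/network/interfaces on each node. This is a sketch using example addresses for the red/green split described above — your interface names and address blocks will differ:

```
# /etc/network/interfaces (Ubuntu 14.04) -- example addressing
# "Red" network: internal management, one static IP per node.
auto eth0
iface eth0 inet static
    address 10.0.0.11
    netmask 255.255.255.0

# "Green" network: the provider/public-facing side. The host itself gets
# no IP here; Neutron attaches instances to it, so the NIC is just brought up.
auto eth1
iface eth1 inet manual
    up ip link set dev $IFACE up
    down ip link set dev $IFACE down
```

The unnumbered second interface is the part that tends to surprise people the first time: the provider network belongs to the instances, not the host.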
And in fact, I would advocate not trying to install them all on your first try. What's probably going to happen is you'll get the basic environment running, you'll find some things that you didn't really want to do the way that you did, you'll leave some notes in that documentation, and you'll go back and do it again the next time a little differently. So, going through all of these, you can spend a day on each one. It really will take a lot of time, again, the first time. Here comes that temporal displacement technology, since we can't fit all of that into a 40-minute presentation. We're going to hop in our OpenStack time machine, Marty McFly style, skip through a bunch of work, and then come back down to launch our first virtual machine. So I skipped some stuff: the shared file service, object storage, orchestration, telemetry, database. Those are all things that you can go back and add once you're up and running if you want to. There's a reason I'm being selective, and it gets to what we're trying to accomplish in a proof of concept when we're building this in our business as a new installation. So when we go to launch an instance, there's a pretty good chance it's not going to work on the first go. And then you get to play detective. Look in /var/log and start tailing all the files that are in there, and start figuring out what the error messages mean. OpenStack is super, super chatty in a cloud of any size. You're going to get hundreds of messages per second. There are ways to adjust the logging level — in all of the service configuration files, you're going to set it to verbose if you follow that tutorial, which means you're going to get tons of data. If you tail all of them in real time, it's just going to fly past you. So it does take some detective work to see what's actually happening. And again, remember the DevStack that you still have running on your laptop VM. You can go back to that. 
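To make that detective work concrete, here's a sketch of the kind of triage I mean. The log line is a fabricated example of a common scheduler failure, and the paths follow the Ubuntu packaging used by the tutorial:

```shell
# Simulate a nova log so the filtering step below is reproducible here;
# on a real controller you'd skip this and read /var/log/nova/ directly.
mkdir -p varlog/nova
cat > varlog/nova/nova-conductor.log <<'EOF'
2016-10-25 12:00:01.123 4242 INFO nova.osapi_compute [-] 10.0.0.31 "POST /servers" status: 202
2016-10-25 12:00:02.456 4242 ERROR nova.scheduler.utils [-] [instance: abc123] Error: No valid host was found.
EOF

# Verbose logging means hundreds of lines per second flying past, so don't
# just tail everything -- filter the firehose down to actual failures:
grep -h 'ERROR' varlog/nova/*.log
# On a live system: grep -h ERROR /var/log/nova/*.log /var/log/neutron/*.log | tail -n 20
```

"No valid host was found" is the classic first-launch failure; working backward from the ERROR lines to the service that emitted them is exactly the skill this stage teaches.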
Those log files are in the same place, and you can compare what you're seeing in your built environment to what came out of the box with DevStack. So eventually, you get that first virtual machine launched. It looks like this when you hop on that CirrOS console. But after the week of work you spent on it, this is what it feels like. Thank you, Google Images. Next. So now, what have we learned? Because we built all of those services one by one, independently, we have a good understanding of how the services work together and of the back-end applications that are supporting them. We know where to look for clues when something goes wrong. And we've learned a lot more about the CLI, because the tutorial takes you through the whole process of building these all on the command line. So we know a whole bunch more stuff now. Now, there are some downsides to this. You might think, well, we've got this environment — instead of building it on those little Intel NUCs, maybe you built it on a rack of servers that you've got sitting in your data center. And so you think, yes, all right, I've got something I can put my business users on. Don't do it. Resist the urge. That how-to is not designed to be a scalable environment. It's not built for performance, and there is no redundancy in it at all. It's not what you would use in a production environment. So, next steps: take it down, start it over, re-document it, and try to work off of your notes instead of referring to the OpenStack.org how-to pages. You're going to learn more, and you're going to make more comments in there and note things that you'd do differently on the second go. Again, make sure the documentation is fully complete — I'll show you what that looks like in a minute, because we're going to need it. Now, a proof of concept really is all about addressing a certain business need within your company. And it's got to be built with that purpose in mind. 
So we've got a bunch of stuff to learn before we can actually turn users loose on this. The first thing that we need to learn is Ceph storage. And again, this doesn't mean you have to — this isn't the only way to do it — but when we build production-grade OpenStack environments, we use Ceph as our back-end storage. And again, this is one of those areas where I can't have a bunch of slides in here. But what you want to start by doing is changing out Glance to be backed by Ceph. If you try to put Ceph in as the volume service from day one, it's going to be a bit of a struggle. Ceph is much easier to install than OpenStack, but if it's your first time, start with something simple. Glance is really easy to get up and running against Ceph. Once you do that, you know that OpenStack and Ceph are talking, and then you can start moving on to the volume services. So then we need to address the scalability and reliability components. To make services highly available in OpenStack, we're going to wrap those services in other services — what I call service burritos. And I'm going to talk about those components and where they fit. Now, you can't really build this on top of the existing configuration that you already built, and this is why that documentation is going to come in really handy. So probably everybody has seen this chart at one point. It's not an eye doctor chart — it's the chart that shows all of the OpenStack services, the core components, and how they work together. Now, what you've just built in that how-to is essentially this, with a compute node and a storage node off the bottom. This is essentially the control services. So to make it highly available, as I said, we're going to wrap this in other things. The first thing you need to understand is that it's backed by two applications that are outside of OpenStack. The first is a database. We use MariaDB. 
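As a sketch of what "Glance backed by Ceph" means in practice — these are the stock glance_store options; the pool and cephx user names are examples you'd have created on the Ceph side first:

```
# /etc/glance/glance-api.conf -- rbd back-end sketch (example names)
[glance_store]
stores = rbd
default_store = rbd
rbd_store_pool = images          # example pool, created beforehand on the Ceph side
rbd_store_user = glance          # example cephx user with rwx on that pool
rbd_store_ceph_conf = /etc/ceph/ceph.conf
```

Restart glance-api afterwards and upload a test image; if `rbd -p images ls` on the Ceph side shows the new image, the two systems are talking, and you can move on to Cinder volumes with more confidence.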
There are other options, but MariaDB is common and well supported, and it's what you're going to install following the tutorial. The second is RabbitMQ. Again, there are other choices, but Rabbit is what they use in the tutorial. Rabbit is the messaging back end — the way that all of the services pass messages to each other, in a really simplistic way. And then that sits on a physical server. So the first thing that we need to do is add some machines, and we have to have three. The reason we have to have three is because the first thing we're going to replicate is the database. MariaDB has an extension called Galera — there's a software vendor of the same name. It's an open source product, a patch that you apply to MariaDB, and then you can create a cluster. You'll note that up here at the top, I said private VLAN for MariaDB. This is where those VLAN-capable switches come into play. You're going to want to separate your database traffic and your messaging traffic onto separate VLANs from management and the actual data plane for the virtual machines. The second thing, then, is Rabbit — again, same thing, a separate VLAN for it. Now, MariaDB and Rabbit cluster in totally different ways. MariaDB through Galera has to have three nodes because there needs to be a quorum. You don't want just two, because if there's a network partition and the nodes come back, neither one knows it's in charge — unless one actually died and was smart enough to save its state. So that's why most controller designs have three. Rabbit doesn't have that requirement. There are different ways to cluster Rabbit, but you're going to use the autoheal cluster-recovery mode. So that handles the back-end database and messaging for all of the OpenStack services. Now, we've got incoming requests coming from the machines and from users, and we have to load balance those. 
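A hedged sketch of what those two clustering changes look like in the config files — the node addresses and cluster name are examples, and the Galera library path varies by distro:

```
# /etc/mysql/conf.d/galera.cnf -- three-node Galera cluster (example values)
[mysqld]
binlog_format = ROW
default_storage_engine = InnoDB
innodb_autoinc_lock_mode = 2
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name = "openstack"
wsrep_cluster_address = "gcomm://10.0.1.11,10.0.1.12,10.0.1.13"

# /etc/rabbitmq/rabbitmq.config -- recover automatically after a partition
[{rabbit, [{cluster_partition_handling, autoheal}]}].
```

The three addresses in `wsrep_cluster_address` are exactly why the private database VLAN exists: replication traffic stays on its own segment.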
First things first: all the services on the three machines are bound to local IP addresses on the machines. So we use HAProxy as the vehicle to load balance between the three machines. HAProxy is very robust, but it's simple to get up and running. Basically, you're creating pools of services, with a front-end interface and multiple back-end machines. You're going to front-end everything on there with HAProxy, with the exception of Rabbit. Rabbit does not support having a load balancer sitting in front of it, so in your service configuration files, you're going to specify all three Rabbit servers. And then in front of HAProxy, we're going to use Keepalived. Keepalived is an open source implementation of VRRP, the Virtual Router Redundancy Protocol. Basically, you're going to attach a virtual IP to Keepalived that then floats between the three machines. So from the perspective of your other services, you're only seeing one IP address and one service endpoint, and the combination of Keepalived and HAProxy is redirecting it to the three different servers based on response time and availability. Lastly, that virtual IP is tied to the Linux bridge networking in the kernel. In the tutorial, they actually changed this between Kilo and Liberty — it's no longer using Open vSwitch by default. It's using Linux bridge, which I recommend for performance reasons. And now that goes into the third private VLAN, which is your OpenStack network, where all of your machines are talking to each other and where the service API endpoints live. There is another way of doing this using Corosync and Pacemaker. It's fundamentally similar; I'm not going to touch on it right now, because this is simpler to get up and running. So that covers the controllers, and we know how it all fits together. So you're going to go back to your build notes and start looking for places where things have changed or need to change in an HA environment. 
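Sketched out, the service burrito for one API (Keystone here) might look like this — the VIP, back-end addresses, and names are examples, not values from the tutorial:

```
# /etc/haproxy/haproxy.cfg -- one pool per OpenStack API (example: Keystone)
listen keystone-ha
    bind 10.0.0.10:5000            # the Keepalived VIP
    balance roundrobin
    server ctl1 10.0.0.11:5000 check inter 2000 rise 2 fall 3
    server ctl2 10.0.0.12:5000 check inter 2000 rise 2 fall 3
    server ctl3 10.0.0.13:5000 check inter 2000 rise 2 fall 3

# /etc/keepalived/keepalived.conf -- float the VIP between controllers
vrrp_instance openstack_vip {
    state MASTER                   # BACKUP, with lower priority, on the other two
    interface eth0
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        10.0.0.10
    }
}
```

Every other service gets a `listen` stanza like the Keystone one; Rabbit is the exception, since clients list all three brokers directly instead of going through the VIP.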
For example, things like the service endpoints in the configuration files are now changing to that virtual IP that sits out in front. You're going to update the hosts file where you've specified the host names for the different services. And you're going to update HAProxy as you go along with these. So here's an example of a documentation note that I made as I went through this process. The OpenStack tutorial will tell you to create your endpoint and call it just "keystone" on port 5000. So what I said is: OK, I'm going to call my HAProxy front-ended version of Keystone "keystone-ha". Then I went through my build notes, found all the places where I made reference to that service API endpoint, and changed it. Once you go through that process, you're going to start over and install it from scratch, now using your notes, with all of those services built in. It's not going to go smoothly, but the better notes you take on the front end, the easier it will go when you actually build this in a highly available configuration. Then you're going to do some testing. One of our customers put us through the most exhaustive testing I've ever seen. We would create a looping script that was instantiating something like ten virtual machines per second, and then we would literally pull the power on a controller, or reboot a switch, or something like that in the middle of it, and measure how many instantiations failed. So you can do things like that. There are also some test suites, like Tempest and Rally, that will help you with this. I'm not going to go too much into that. But HAProxy, when you install it, has this nice stats screen that's actually kind of handy. It's built in — you just have to enable it — and it updates every time there's a service change. So in this one, you can see that the database is down on node 2; it lights up red. It's actually pretty handy for seeing what's going on within your controller stack. 
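That endpoint-rename note can be sketched as a couple of concrete steps. The host names, VIP, and URL here are all made-up examples; the actual `openstack endpoint create` line is shown as a comment because it needs a live cloud:

```shell
# Before: a hosts entry pointing the service name at a single controller.
cat > hosts.sample <<'EOF'
10.0.0.11 controller keystone
EOF

# After: point a new HA name at the Keepalived VIP instead.
sed -i 's/^10\.0\.0\.11 controller keystone$/10.0.0.10 keystone-ha/' hosts.sample
cat hosts.sample

# And in Keystone itself, the endpoint gets re-registered against that name:
#   openstack endpoint create --region RegionOne identity public http://keystone-ha:5000/v3
```

The point of the note-taking is that this same one-line change has to be chased through every service's configuration file, which is painless with a log and miserable without one.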
And it's built in. OK, so now we're ready for the proof of concept. Step three. So now we know how all this stuff works — but I'm going to ask you to throw that all out the window. We just learned how all the services are built and work together and how to make them highly available, and you should at this point have a really good understanding of how this stuff works. But the reality is, we probably don't want to build it this way for a production environment. And I said this is the proof of concept — so why is it important to build a proof of concept as if it were production? First, there are a couple of questions we have to ask ourselves. The first is: is your company in the business of building and operating infrastructure? There's a good chance it's not. There are a lot of companies that are inherently tech companies but not infrastructure companies — Uber and Airbnb, and I could think of others. So unless you're a hosting company, you're probably not in the infrastructure space specifically. That's important. So therefore, we're assuming we're building for a business requirement, and we're assuming we're not in the business of supporting technology infrastructure. Therefore, our proof of concept is going to be used — we're going to invite actual users to try it out. Ultimately, we want to drive adoption. So we're building it as if it were production. And again, the whole experience that we just went through to build this thing by hand is not the way you want to actually do it. So now we're going to look at different deployment methods. Mirantis, Red Hat, SUSE, Canonical — apologies to anybody I'm forgetting. There are lots of different ways of doing it. We happen to really like the Canonical Juju charms. Mirantis has Fuel. There are lots of different ways to do it. Again, the point is, you don't want to build this stuff by hand. 
You want automated deployment and operations in any kind of production environment. And since we're building a proof of concept to eventually be production, that's how we're building it. The reason we went through the process of building it by hand is so that we understand, under the covers, how those distributions are actually integrating the different components. And then next is the hardware question. Again, because we're building this as a production environment, we're going to assume that we're no longer using those little Intel NUCs, and we're going to use hardware that we would actually use in our environment. I would encourage you to take a look — both Canonical and Red Hat have hardware reference architectures that talk a lot about this. And I'm happy to share ours as well, if anybody wants that after the presentation. So as we think about what we can build into the proof of concept, you'll remember back to that list of all the services that you can deploy — I said, start with the basics. Don't add on a bunch of stuff. There's a reason. Every time you add a new service, whether it's used or not, it means there's more complexity and you have to support it. When you go to upgrade down the line, that means there's one more service that you have to think about upgrading, even if people aren't really using it, or are just kind of dabbling with it. So I really encourage, from a proof-of-concept standpoint: make it simple, and then conduct user surveys. Let your users know that there's going to be a process by which they can request new features and new services. And then, when you see sufficient demand for something, implement it — but do it thoughtfully. OK, so we're going to get started by building a small environment, again, using a distribution. 
I'm not going to go into the pros and cons of all the distributions. But it's important, once you've got that initial proof-of-concept environment up and running, to test some of your own business's applications in the environment and make sure that they're working the way they should and that the constituents are happy. All right, so the last thing we need to think about from the proof-of-concept standpoint is updates and upgrades. This is another place where things have gotten a lot better in the most recent releases of OpenStack. It used to be super, super painful to do even basic system-level updates, never mind whole release upgrades of OpenStack. Going from Juno to Kilo was just excruciatingly painful, and every release before that was more painful. So, getting back to why we're using a distribution for a proof of concept: all of the major vendors have largely solved this problem and made it so that you can deploy new updates in an automated fashion. So, last step: now we're going to mission-critical production. We're now supporting business-critical workloads; downtime is not an option. And ultimately, what we're doing is tied to our jobs, because it's supporting revenue. So now things start to get important. A couple other things we need to think about now that we're in that operational cloud environment. First is monitoring — you have to make sure that things are healthy and working the way they should be. Tracking usage — what people are actually consuming on the cloud. Measuring performance — you want to make sure that things are performing the way they should. And then you have to continue updating and upgrading. And lastly, on automation — again, super important — a lot of what we did in the how-to has now been automated by the major distributions. It's important to understand how it works, because it serves as the foundation for the operational-grade production environment. 
So I promised a slide you can take a picture of to get the summary points. Here it is. This covers everything I've talked about. So with that, I will open it up to any questions. [Audience question about storage in the NUCs] Oh, no, no — they typically have an SSD slot inside that they come with. Some of them have spinning disks. But yeah — really? OK. Yeah, I mean, maybe they don't all, right? But I think most of them have a slot that you can put one of the small SSD modules in, and that's how I've done it. But you don't have to. You could do it on a USB thumb drive, I suppose. [Audience question about other distros] No. So I've found, for whatever reason — there are separate branches of the tutorial for the different releases: there's a Debian, there's a SUSE, there's a Red Hat/CentOS. I've generally found that the Ubuntu one is the most current and accurate — not always. I know that the Canonical people spend a lot of time contributing back in the form of the tutorial. I think — that's my theory. So no, it's a different process. You'll use certain packages; it'll be different on Debian. But fundamentally, it's the same thing. As you think about wrapping the services in services, you're going to use the same applications for it. But the tutorial itself is a different variant. Yeah. [Audience question about scripting your own installation] So, you can make your own scripts. I wouldn't recommend it. A couple of years ago, I would have had a different answer. But again — Mirantis, Red Hat, Canonical — they've all got automated installation scripts that work really well now. For example, for a relatively small cloud, Canonical has their Autopilot product, which is essentially plug and play: boot a system, and it's going to do everything else. It's GUI-based. It uses the same software sets and services that I've described throughout this process, but it's plug-and-play automated. When it comes time to update the cloud, you're going to say, all right, I'm updating — I'm going from Liberty to Mitaka. It should be fairly, fairly smooth to do that. 
But not all of it is. That's partly why you want to go through this process and learn where to look for problems. Yeah, totally. So again, each distribution has their own way of doing this. But Canonical, for example, has the Landscape service. Again, it's a GUI-based thing where you say, OK, go and apply updates to all my machines, and it goes and does it. Hopefully everything goes the way it should. Yeah — and that's just stuff that you don't want to be doing by hand anymore. It's just super labor-intensive. Did you ask, can you use Ceph on the same nodes as the controllers? Yeah, so you can. You probably wouldn't do that in production, because you'd want to scale out Ceph as the back-end storage cluster grows. But, for example, in one of our reference architectures, we have what we call our converged reference architecture — Canonical has something similar — where basically you have a single box that has three different types of drives in it. It's got some SSDs for the host OS, it has some SSDs in a RAID 5 for ephemeral storage for the VMs, and then there's a set of big, slow disks for Ceph OSDs. And that all sits in one box. So the box has controllers, all the API services, hypervisor, and storage, all in one. And you can scale that out — you could scale out anything you want, really. But generally, we do that in five-to-ten-node clouds, at which point we bifurcate the controllers and separate storage and compute. Any other questions? So, some contact information here. If you have more questions, or if you want a copy of this presentation, feel free to reach out. And if you enjoyed this, give me some stars on the app. I hear that's the thing now. So thanks for coming.