Hello everyone. I'll start with a question, actually: why? We all keep hearing about containers and Docker images. It's all great, it's all cloud scale, it's all microservices. But I think a lot of people forget to ask a pretty important question: why do we even bother with it? Everyone talks about it, but why?

My name is Michał, this is Ken, and this is Serguei. We are all team Kolla, and we'll start by trying to answer that question for you.

First of all, deployment with containers is quite a reliable thing. Deploying with containers consists of two distinct steps: building the containers, and deploying them. What's nice about building containers is that it pulls the packages and installs everything into the operating system up front, and that is the most volatile part of any deployment, because it's the part where we depend on our uplink, on our networking, and on other people's repositories. With containers we can do all of that before we even start deploying, and we can make sure everything installed and all the repositories were reachable.

That also matters for consistency. Say you deploy a hundred nodes, and someone somewhere in some repository happens to switch the version of a particular package in the middle of your deployment. That leaves you in a strange situation where half of your nodes have one version of the package and the other half have another, which is bad. If you pre-build your containers, you are pretty much guaranteed to have exactly the same versions on every node, and every node will look exactly the way you want. Moreover, if you want to check which versions are out there, you just check which containers are running, and that's it: all the information you need about your node is in one single place.

It also means that if you pre-build your containers and, like any good enterprise should, you have a staging environment where you test things out, it will look exactly the same as production. The versions will be the same, the setups will be the same, the directories will be the same. That's really great for keeping staging and production aligned.

Upgrades are so much easier with containers too. As I said, we can build the containers ahead of time, and we can also download those containers to the nodes before we even touch the running environment; we can prepare all the packages on the nodes themselves before touching anything that's live. That's great.

The other thing that's awesome about containers is dependencies. Every package has dependencies, and that's especially true for OpenStack: if you've ever seen a requirements.txt, it's probably over a hundred lines. If you run, for example, the Nova and Neutron services on the same node, they will usually share the same set of dependencies, which means upgrading just Nova isn't really possible, because Neutron will get its dependencies mixed up. With containers, all the dependencies are separated.
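To make that concrete, here is a hedged sketch of what working with pre-built, version-pinned images looks like. The image names and tags follow Kolla's Docker Hub naming but are illustrative rather than an exact recipe:

```sh
# Pull pre-built images ahead of the deployment, one image per service,
# each carrying its own dependency tree (names and tags are illustrative).
docker pull kolla/centos-binary-nova-api:3.0.0
docker pull kolla/centos-binary-neutron-server:3.0.0

# "Which versions am I actually running?" is just a matter of listing
# the containers and their images on the node.
docker ps --format '{{.Names}}\t{{.Image}}'
```

Because each service ships its dependencies inside its own image, touching one image never disturbs the other.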
So you can upgrade just Nova, validate that it's working, then upgrade Neutron, validate that, and so on and so forth. That makes automation of upgrades much easier, arguably possible at all, because without that separation you're asking for problems.

It's also fast, and by fast I mean you'll see it if you try it. We have data from our deployment on the OSIC cluster. You may have heard of OSIC: it's this thousand-node cluster made available to the community. Around August we asked for 130 nodes, and it took us 20 minutes to deploy a full OpenStack onto them: Ceph, HA, everything you'd want in a production OpenStack.

Speed of deployment actually matters. Imagine your deployment normally takes three hours. You start it, go do something else, and after two hours and fifteen minutes it breaks because of some small misconfiguration. You make a small adjustment to the config and try again; six hours have passed, and that's if you manage to get to the end of the deployment on the next attempt. It means that deployments, even if the actual process only takes a couple of hours, add up to days or even weeks. If it's 20 minutes, you can just sit there watching it go, and before you finish your coffee either it's done or you know about your misconfiguration; either way you're done within the hour. So that's actually an important thing to have.

Kolla is an OpenStack project whose mission is to give everyone Dockerfiles with OpenStack in them. You can build your own containers, or you can download our pre-built containers from Docker Hub, where we have an account. The containers come in a couple of different variants: there's Ubuntu, there's CentOS, there's Oracle Linux, there's even RHEL. So there are several variants you can choose from, or you can build them yourself from our Kolla repository. We also added a new feature in Newton which lets you customize pretty much everything in a container: versions of packages, repositories, keys, everything.

But containers are useless unless we deploy them, and that's where things get more complicated. It's nice to build them, but we need to deploy them: we need to configure each container correctly, generate the configuration file for the Nova service, know how to actually run it, which directories we need to bind-mount, and so on and so forth. In Kolla we call these orchestration tools, and we have a few already.

Ironically, we started with Kubernetes back in Kilo. Initially Kolla was meant to run on Kubernetes from day one. However, back then Kubernetes was not exactly the most stable of solutions, and it also lacked a couple of critical features we needed, namely privileged containers and net=host, so we had to drop it.

Next was Compose. docker-compose is a nice, nifty little thing where you specify how you want to run a Docker container with a YAML file. However, there's no easy way to orchestrate ordering, to say this container has to run only after that container has finished.
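And ordering is only half of it; the other problem is configuration. A fully compose-driven setup would have to funnel every option in as an environment variable. A hedged, purely hypothetical sketch of what that starts to look like for a single service (the variable names are invented for illustration and are not how Kolla images are actually configured):

```sh
# Hypothetical: one environment variable per nova.conf option.
docker run -d --name nova-api \
  -e NOVA_TRANSPORT_URL="rabbit://openstack:secret@rabbitmq:5672/" \
  -e NOVA_DATABASE_CONNECTION="mysql+pymysql://nova:secret@mariadb/nova" \
  -e NOVA_OSAPI_COMPUTE_LISTEN="0.0.0.0" \
  kolla/centos-binary-nova-api:3.0.0
# ...and so on, for every option anyone might ever want to change.
```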
Nova has over a hundred configuration options, which means that to build a fully customizable Nova this way we would need to support hundreds of environment variables, and we'd need to keep up with them: when a default changes in Nova, we change our default; when Nova adds or removes an option, we do the same. That's impossible to maintain. So Compose goes bye-bye.

The next one was Ansible, and that one worked. That's the first real orchestration tool, and currently it's production ready. We started Ansible support in Liberty, which was our first release with it; Mitaka was already pretty good; and Newton, well, try it for yourself. If I were running a production OpenStack today, I would run it with kolla-ansible. So Ansible is there. But hey, why bother with anything else if we already have one?

Well, what if we could make the whole OpenStack experience even better? If you've actually operated an OpenStack cluster, you know that if you lose, say, a database node in the middle of the night, someone is going to wake me up, and I'm going to have to push a bunch of configuration files, find a replacement server, run a bunch of processes, and finally get back to the original state. There is some automation for this, but it's still a lot of trouble to set up. Furthermore, I'm wasting a lot of resources with a standard OpenStack control plane, because I'm dedicating nodes to running Horizon; I have very fixed capacity for services like Horizon, and I don't really need that.

So we returned to Kubernetes. At this point in time the Kubernetes community feels a lot like OpenStack did a couple of years ago: there's a huge pile of installers, there's CI tooling, there's documentation, there are utilities. There's a lot of stuff that lives on top of Kubernetes, a whole infrastructure being built around it, and people are using Kubernetes right now to deploy increasingly complex and sophisticated microservice applications. And here's the secret: OpenStack started out as kind of a giant single blob, but over time it has pretty much turned into this big microservice model, and Kubernetes is going to be the way to make that a lot less painful, a lot friendlier.

So what does Kubernetes give us? Why is this an advantage? The first big thing I'd like to highlight is that Kubernetes lets you set up autoscaling. There's a bunch of services inside an OpenStack cluster today that, without any code changes, are perfectly happy to be scaled up and down, with new nodes getting added and old nodes getting replaced. Horizon, for example, but not just Horizon: the API servers too.
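As a rough illustration, the simplest built-in form of this is a CPU-based horizontal autoscaler attached to a deployment. The deployment name and thresholds here are assumed; scaling on better metrics such as request latency, as discussed next, needs a fuller HorizontalPodAutoscaler setup plus a metrics pipeline:

```sh
# Scale the horizon deployment between 1 and 5 replicas based on CPU usage
# (deployment name and thresholds are illustrative).
kubectl autoscale deployment horizon --min=1 --max=5 --cpu-percent=70
```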
What I can do is set up, say, a single Horizon instance, because it's the middle of the night and probably no one is touching my cluster; then, as I start to get more load, I can add more Horizon instances. You can set this up so that it actually scales on the right metric, meaning request latency or observed load, instead of just something like CPU. Furthermore, if a whole lot of people really are hammering on my Horizon and I need the capacity, I can use all the spare capacity in the cluster. I'll note here that you might end up with two Horizon instances on the same node, and that's okay: Kubernetes has the tooling to treat these as interchangeable blocks that come and go, instead of me thinking it might be easier to scale vertically and just make larger nodes.

So let's talk about deployments in Kubernetes. This is actually really powerful, because the Kubernetes developers realized there are a couple of different ways you usually want to run an application, depending on how it actually works. It's easiest to show someone a Kubernetes demo with a simple web head that has nothing special attached to it, but this is OpenStack; it's complicated. So I drew both of the popular ways to deploy something.

First, you can have a Deployment, where you just say: I want about three servers at this version. Whenever you upgrade the version, it winds down the pool at the old version and winds up a pool at the new version in a controlled fashion. You can autoscale these, add them, and delete them as necessary.

You can also create PetSets; they added these in a fairly recent release. A PetSet is for something like schedulers or RabbitMQ, where you don't want to be repartitioning your RabbitMQ all day: you want to say "I've got three RabbitMQs, and that's it." The way PetSets work is that you define your slots, Kubernetes keeps track of those slots, and any time a slot disappears and gets rescheduled, it comes back with exactly the same hostname, the same storage, and so on that it had before. And when it rolls out an upgrade, it always takes one slot out, upgrades it, and puts that one back.

Okay, so to make all this work, Kubernetes contains its own service registry. The way the service registry inside Kubernetes works is that it creates a DNS alias, so the config file you create for Nova just says "I want to talk to this service at this DNS alias." The DNS alias points to a VIP, because frankly I've rarely seen anyone correctly implement round-robin DNS and the like; you need a VIP in front of it. Horizon doesn't really know what it's talking to; it just hits one of the nova-api endpoints and does its work. And it's not just REST web services that you can use the Kubernetes service registry for. Even something like a MariaDB connection, which stays open for a while, is still supported in the Kubernetes network model. Pretty much every way one process is going to talk to another, Kubernetes makes work pretty easily.
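A hedged sketch of that pattern: expose an API deployment as a cluster service, and point configuration at its DNS name. The names and port here are illustrative, not the project's actual manifests:

```sh
# Give the nova-api deployment a stable in-cluster address (ClusterIP service);
# inside the cluster the service name becomes a resolvable DNS name.
kubectl expose deployment nova-api --port=8774 --name=nova-api

# Config files can then reference the alias rather than any particular pod,
# e.g. endpoint settings pointing at http://nova-api:8774/
```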
Kubernetes also does storage management, and this is fairly important. Say I've got a cluster with some spare capacity, and some services, say MariaDB or others, that have disk. What happens if one of those nodes disappears? Well, as long as you have Ceph or another sort of distributed storage cluster, Kubernetes will realize, within a configurable timeout, that one of those nodes went away and that it has some containers it needs to reschedule. What's going to happen is that it will create a new container for each of those processes on a host with excess capacity and then map the storage back in, so it's as if nothing ever went wrong. You can move them around very easily.

Okay, so what about my existing tools? A lot of people in the OpenStack world have already spent a lot of time getting to the point where they can deliver an HA cluster of OpenStack. The biggest advantage here is that this unifies a lot of your tooling. Instead of thinking "this is how this service scales, this is how this service does heartbeats, keepalived and so on, this is how this service gets registered with HAProxy or a service registry," you have one set of tooling. It's built into Kubernetes, it's really good, and it's being developed by a large open-source community.

And why bother with Kolla, why didn't we start from scratch? Because for a lot of our services, pretty much all we had to do was take the existing Kolla containers, tweak a few things so they'd work in the Kubernetes environment, and we were done. It wasn't like we were reinventing the world to make kolla-kubernetes work.

So now I'll hand over to Sergey, who is the person who finally got the last pieces of Nova and Neutron and everything else working, and he's going to give a live demo of kolla-kubernetes.

For the demo, we had a discussion internally about what could impress you. This is a very highly technical crowd; you've seen it all, tried it all, done it all. After some consideration we decided to go with powering off one of the Kubernetes compute nodes, and to make it even more interesting, that Kubernetes compute node will be hosting some essential services, MariaDB being one of them. Probably everyone who has operated OpenStack has, at least once in their professional lifetime, been through a power outage where the server running the database went off the air and they had to deal with recovering MariaDB. With this demonstration we want to show you that your sleep efficiency will increase if you start using Kubernetes for these services.

Now I'll need just a few seconds to bring my setup back up and start showing things. Okay. That's kind of small. How do I control it, though? Oh, okay, so I just have to watch it. Okay, great.
All right, so just a couple of words about the test bed. Our original plan was to use Intel NUCs and have the live demonstration running here, but then we thought that talking about "nukes" on IRC and smuggling "nukes" from the States to Spain and back to the States could be a very serious offense, the kind that gets you on an FBI watch list right away. So we decided to go with plain and simple Cisco UCS servers, and thanks to Steven Dake, who is in the audience, for giving us the opportunity to use his hardware; otherwise we would be in trouble.

So we have five nodes, five UCS boxes, running CentOS 7 with Kubernetes version 1.4.3, and on top of that we are running kolla-kubernetes, using Kolla-built images version 3.0.0, basically Newton. If you look at the screen you'll see a bunch of pods, and you'll definitely recognize some of those names. Some of them will not be known to you, because we had to develop them for specific functionality during the development cycle, and I'll try to mention some of them as we go. If I'd known, I would have brought different glasses, because it's kind of hard to see. Okay.

The way our test bed is set up, we have the API server, a kind of master Kubernetes server, on control01. It runs the API server, the controller manager, and the scheduler. All four other nodes are Kubernetes compute nodes. You shouldn't mix these up with OpenStack compute nodes; they are two completely different types of animals.

Now let me show you. Okay, great, thanks. As you can see, all five nodes are up and running with status Ready. Now I need to find the Kubernetes compute node which is hosting the MariaDB service; just give me a second.

I want to spend one more second on something. When you looked before, you saw some random alphanumeric suffixes added to the pod names, and that tells you the pod was deployed either with a replication controller or with a deployment. But some pods have names ending in -0, and that's how you can tell the pod was deployed using a PetSet. It was very important for us to use PetSets, because otherwise, when you do, for example, nova service-list, you can end up with a pile of schedulers: every time the container restarts, a new name gets generated, the new name gets registered via RPC with Nova, and the list just grows enormously. Going with PetSets solved that naming problem.

Okay, so the pod name is mariadb-0. Now I describe that pod and look at which node it's on. As you can see, MariaDB is currently running on control03.
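For reference, these are roughly the commands being run in the demo at this point; pod and node names are as narrated, and the exact flags used on stage may have differed:

```sh
kubectl get nodes                                  # all five nodes should show Ready
kubectl get pods                                   # note the PetSet-style name mariadb-0
kubectl describe pod mariadb-0 | grep -i '^node'   # which node is hosting MariaDB
```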
Before I go and power off that node, I'd like to show you that we really do have OpenStack running. Unfortunately, during this cycle our focus was on developing and adding features and important components, and we didn't have enough cycles to build a nice GUI, so everything here is CLI-based. In the next cycle we'll definitely add some nice GUI gizmos to be more attractive, but I hope most of you are developers who are accustomed to using the CLI, so you'll forgive me for that.

So, I'll show you OpenStack. By running this very simple command we touch a couple of the most important components, Keystone and MariaDB, so we've proved that our OpenStack installation is running. Now I need to go to my KVM; we're using the KVM console. Okay, let me see this one. Our KVM got disconnected, let me reconnect it. No, no, don't do a reboot, do a power off. I mean, we could definitely do the reboot and it would work, but power off is easier: I don't want to deal with the node coming back up and interfering with the migration process, even though we've tested that case as well. Okay.

So, that node is gone. We go to the pod monitor and watch live what's going to happen to those pods. Kubernetes employs a keep-alive mechanism, controlled by the controller manager, which expects a keep-alive from every compute node depending on the settings. To decrease the failover time we tuned that keep-alive down a little: right now we have a 10-second keep-alive with a 30-second dead timer. So somewhere around now we'll see some activity. As you can see, some pods are going into the Terminating state: that's when the system detects something went wrong and those services need to be evacuated from the failed compute node to a new one. The second stage is pod initialization.

Currently Kubernetes doesn't have native fencing support for these purposes, so we developed our own pod, which runs on top of kolla-kubernetes and checks the state of every node; if it detects a node in the NotReady state, it goes and kills all the pods that were hosted by the failed node. With that we improved the failover time. It's still not ideal, but we managed to shave off quite a few seconds.

As you can see, some pods are in ContainerCreating and some are still in the initialization state. The difference is that some pods use persistent storage: MariaDB, RabbitMQ, Glance, and Elasticsearch all use Ceph-backed volumes, and it takes Kubernetes time to go and clear the lock that was created for the previously failed node. That's why pods that don't use persistent storage are already in the Running state, while pods that use the Ceph storage take a little longer.

So, hopefully, if everything goes as planned... MariaDB is still in ContainerCreating. No, it will come up, you're right. There, it's running now. But we have to wait a little bit until all pods are back up. As soon as all pods are in the Running state, from the Kubernetes perspective the failover has happened successfully. On top of that we'll need some extra time for OpenStack to re-establish its TCP sessions.
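For reference, the failover window described here is governed by the node-monitoring and eviction settings on the kubelet and the controller manager. A hedged sketch of the knobs involved: the flag names are the upstream Kubernetes ones, but the values only approximate the 10-second keep-alive and roughly 30-second window mentioned above, and the exact tuning used in the demo isn't shown:

```sh
# kubelet: how often each node posts its status (the "keep-alive")
#   --node-status-update-frequency=10s
# kube-controller-manager: how long before a silent node is marked NotReady,
# and how long before its pods are evicted and rescheduled elsewhere
#   --node-monitor-grace-period=30s
#   --pod-eviction-timeout=30s
```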
The failover definitely took longer than a normal TCP session would stay up, and that's most likely why there are a couple of pending ones. We could certainly check those, but the most important things for us are MariaDB and RabbitMQ, and they are... why don't I see the whole screen? It's just too small; this is watch, so I don't think we're going to... okay, that's fine. Yeah, let me do it, it's my keyboard. I know you're a Mac guy; I'm on a PC. And you have a French keyboard. Sorry, I'm from Montreal, and the French keyboard is actually mandatory by law. Just do it. No, I'm serious.

Okay, so MariaDB is running, perfect. Now let's do a quick check: kubectl get nodes. We should see that we still have one node in NotReady, because we powered it off. Then let's check where MariaDB is running now. Okay, so now it's running on control02. And the last check I need to do... there you go, it just takes a little bit of time.

Well, thank you. What you saw is definitely not ideal, and the Kubernetes community is aware that there is a lot of room for improvement; there are initiatives that will definitely reduce this time, if not completely eliminate it, in the near future, and on our side we are also trying to optimize this as much as possible. We strongly believe that Kubernetes is one of the more promising orchestration engines, so please join us. Thank you, Sergey, that was awesome.

As Sergey mentioned, Kubernetes needs some improving, and kolla-kubernetes also needs some improving, and we cannot do it with just the three of us plus the couple more people currently involved in Kolla. So everyone here, please join us on IRC: you can find us in #openstack-kolla on freenode. I am inc0, this is wirehead, and this is sbezverk. Feel free to catch up with us if you'd like to contribute. We are proud of the diversity of companies involved: there isn't any corporation in our community with more than 20 percent of the commits and reviews. That means this is a safe project to commit to; if any one company figures out it doesn't want to do OpenStack anymore, the project carries on. You'll find our community very welcoming and very open, and I hope to see you there. Thank you. Any questions?

Could you please come over to the microphone?

You were talking about some containers using Ceph; is there a driver, some special driver you're using?

Kubernetes on its own has an RBD driver in its repository. It needs work, to be honest; there are pieces, like the fencing pod that Sergey created, that we needed to make Ceph work well with it, but there is native Ceph support in Kubernetes. There are also a couple of other storage engines, like Gluster, iSCSI, NFS, and so on. We chose Ceph because that's the way we do things in Kolla.

Okay, and another question regarding MariaDB: did you use a Galera cluster here, or how did it cope with the failure?

No, this is single-node MariaDB with external storage. Running transactional databases in Kubernetes is still an open question.
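"External storage" here means a Ceph RBD volume handed to the pod through the native RBD support just mentioned. A hedged sketch of the kind of volume definition involved; the pool, image, monitor address, secret name, and sizes are placeholders rather than the project's actual manifests:

```sh
kubectl create -f - <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mariadb-pv
spec:
  capacity:
    storage: 10Gi
  accessModes: ["ReadWriteOnce"]
  rbd:
    monitors: ["10.0.0.1:6789"]   # Ceph monitor address (placeholder)
    pool: kube
    image: mariadb
    user: admin
    secretRef:
      name: ceph-secret
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mariadb-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
EOF
```

Because the data lives in Ceph rather than on the node's local disk, the rescheduled mariadb-0 pod can re-attach the same volume on whichever node it lands on.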
I'm aware of a couple of different solutions. We just picked the one we have: we run single-node MariaDB, and when it gets killed, it gets migrated. As you can see it's not without downtime, there are a couple of minutes of downtime, but thanks to that we don't deal with network partitions or with a whole cluster going down. It's our trade-off; there's no really good way to do it yet, and people are working on it.

To add to her question: are you using Kubernetes for scheduling and placing things that have a lot of local state, like nova-compute and the L3 agent, things along those lines? And if you are, how do you handle failover and recovery operations?

We don't. Those kinds of services are pinned to their nodes. When a compute node dies, it dies; Kubernetes will not migrate VMs.

I just want to add something about the L3 agent, the DHCP agent, and the metadata agent: for those we are actually using Kubernetes DaemonSets (see the sketch a little further down). Every compute node automatically gets a set of pods, and these three and some others get launched automatically when you identify a specific Kubernetes compute node as an OpenStack compute node. In this case we don't have to maintain any state; they run locally, they have no visibility of the others, and they just deal with the local VMs. So Kubernetes-style recovery doesn't really apply in those situations. Okay, thank you.

Any more questions? Does this open up being able to run OpenStack on things like OpenShift, since OpenShift 3 is a similar sort of thing?

I'm not a Red Hatter, and there are probably much better qualified people in this room to talk about it, but as far as I know OpenShift is a Kubernetes appliance by Red Hat. We're not deploying Kubernetes, we're deploying OpenStack on Kubernetes, so deploying OpenStack with kolla-kubernetes on top of OpenShift might work, though technically I'm not sure. We have done experiments with firing up a Kubernetes cluster in Google Container Engine and running OpenStack inside Google Container Engine instead of just on our physical nodes.

I would like to know: are there any prerequisites, or can I really just take any Kubernetes cluster and deploy onto it?

The only prerequisite is to have persistent storage. It has to be a Ceph cluster or whatever else, anywhere you can create a PV and a PVC, a persistent volume and a persistent volume claim; as long as you can provide that, you're fine. There's also the question of version: we're going with pretty much bleeding-edge Kubernetes right now, because it keeps adding features we need and we still don't have all the features we need, so we'll stay bleeding edge. I would say go with the latest; that's how we do it.

Okay, so all I need is basically to deploy one YAML, an all-in-one YAML, and I have it running?

I suggest you look at our all-in-one guide; we have a step-by-step guide for how to deploy. Actually, it ends up being multi-node, because it spawns a couple of virtual machines, deploys Kubernetes on those virtual machines, and then deploys OpenStack, I mean kolla-kubernetes, on top of them. It's called the Hyperkube guide and it's in our documentation.
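Here is the DaemonSet shape referred to above, as a hedged sketch: a Neutron L3 agent pinned to nodes labelled as OpenStack compute nodes and running with host networking. The label, image name, and API version are illustrative (PetSet-era clusters used extensions/v1beta1 for DaemonSets), not the project's actual manifests:

```sh
kubectl create -f - <<'EOF'
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: neutron-l3-agent
spec:
  template:
    metadata:
      labels:
        app: neutron-l3-agent
    spec:
      nodeSelector:
        openstack-compute-node: "true"   # only nodes flagged as OpenStack computes
      hostNetwork: true                   # the agent sees the node's real interfaces
      containers:
      - name: neutron-l3-agent
        image: kolla/centos-binary-neutron-l3-agent:3.0.0
        securityContext:
          privileged: true
EOF
```

Host networking here is what the next answer refers to: the agent pods use the node's own network namespace, while everything else rides on the Kubernetes network fabric.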
Please try it out. Anything else?

How do you manage complex networking setups? If you want to have multiple networks like we do in OpenStack: a network for API, for storage, for the applications.

It's done exactly the same way. You create your networks in OpenStack exactly the way you do today, but Kubernetes runs, or rather has, its own networking which connects the compute nodes, so you run on top of the Kubernetes-provided network. The Neutron services that need host networking run with host networking, but otherwise you don't need to set up a kolla-kubernetes cluster with double NICs; you can rely on the underlying Kubernetes network fabric to keep the external traffic and the internal traffic separate.

So everything is part of the same network? You don't separate storage from API?

Inside Kubernetes there is no separate storage and API network, but in OpenStack itself you can still do that.

So you don't use the overlay network for Kubernetes for that, right?

I'm not quite sure we can set up separate networking like that in kolla-kubernetes today. You could, for example, connect Ceph over a dedicated network: when you specify the Ceph connection you specify the monitor IPs, and you can have those IPs go over a dedicated network. When you run a container like an API service, it will just have the control-plane IP; if you run a storage-backed service like MariaDB, it will expose its control-plane API and also have the Ceph volume attached using the native Kubernetes support for storage volumes, and that's where you can specify which network you would like to use. From the Neutron perspective, Neutron runs with net=host for the services that need to run with net=host, which means whatever you have on the host is what you're going to get. There's no magic there.

Does Kolla provide whatever provisioning is necessary? I heard Bifrost mentioned. Do you have a bootstrap phase where Bifrost will set up bare metal?

In Newton we created Bifrost support, but we don't have an equivalent for kolla-kubernetes; that's for kolla-ansible. So we have Bifrost, and we have playbooks that will install everything on the provisioned servers, like Docker and all the prerequisites for kolla-ansible. We don't have something like that in kolla-kubernetes; you pretty much need to deploy Kubernetes yourself. On the bright side, there are a couple of tools out there, outside of OpenStack, that just deploy Kubernetes, and then you can deploy the rest with kolla-ansible or however you normally deploy things, and that's pretty much all the dependencies you need.

I think we're running out of time.

Hello. Is there any possibility to deploy kolla-kubernetes with a shared Keystone and several regions?
It wasn't really a supported scenario before, but actually it is possible with Kolla, because with Kolla you can override the configs. We don't have, you know, single-click deployment of regions, but it's possible to deploy multi-region with kolla-ansible; it's just a few more steps, and it's not hard. With kolla-kubernetes we haven't looked at it yet. It's a pretty young project: the first commits went in after the Austin Summit, so this is what we've got with a couple of people during a single release cycle. We just haven't done it yet, and if you'd like to implement it, you're welcome.

Okay, thank you very much. Oh, one more thing: our design sessions start today, so if anyone would like to join our community, that would be a great place to be. Thank you.