Okay, so let's get started. Oh, the URL again? Totally. The URL is HTTP, tinyurl dot com, Octavia workshop minus tar minus BC. But it's on a very slow server, this is a very slow network, and it's huge. Yeah, it's like six or seven gigs.

Okay, let's get started. I just want to go quickly over who is here. I'm German Eichberger with HP, and we have Adam Harwell with Rackspace back there, with Carlos also somewhere back there, and Franklin somewhere back there. There are also other people we didn't write down, like Brandon and Doug, but you can ask both of them too. And here is Michael with HP, and Al sitting here if you need help. We have Suzanne with Intel here to help, and also Stephen with Blue Box, and he told me it's very important to say: an IBM company.

Okay, so everybody, I know we didn't have enough USB sticks, so you can also download from the net, but not very fast. So better you guys share USB sticks: when somebody is finished, hands up, and we will run them around.

Good. So then you need a really big computer. You need at least eight gigabytes of RAM because that's how big the image is, so we recommend like 16 to have something performant. We put VMware Workstation stuff on it, and the reason we use VMware is because we need virtualization inside virtualization, which the other ones often don't have. And, yeah, I mean, I like SSH clients.

So basically you copy the thing from the USB drive, or you extract it from the USB drive, and you can unzip it. You'll probably figure that out: tar xjvf. It's in bzip2 format, or 7-Zip. And then make sure that the VM has VT-x, which it should if you do it on your own. And if you have a Mac, you've got to make sure that you click that thing there where I put the arrow, so that you can run VMs inside VMs.

And what's our agenda today?
We want to talk about architecture and stuff, which Michael will do while you guys are still extracting. Then we show you a little bit of operations, something about troubleshooting, and then how you guys can all contribute and make Octavia even better than it is. You can ask questions throughout the whole session, but we have an extra slot at the end in case you want to wait. Okay, I will turn it over to Michael with the introduction to the architecture.

Great. Thanks, German. Yeah, sorry about the USB sticks. Last week there were only 14 people signed up for this hands-on, so we didn't have quite enough.

So this is a preview of what you'll see tomorrow. We have an Octavia session, a little LBaaS session, tomorrow afternoon, and we're going to cover the same architecture and then go into some demos for active-standby, a capability that's coming in Octavia.

So just to review: Octavia is an LBaaS v2 driver. That means we plug into Neutron LBaaS, which is part of the Neutron code base. As you can see on the diagram, Neutron is that, I don't know what color it's coming out as on these screens, kind of pinkish color, and LBaaS is a plug-in to that API server, and then inside of that we have the Octavia driver. The difference with Octavia is we have what we call amphorae, and in the current implementation those are service VMs, and that's where the actual load balancing happens in Octavia. Under the old reference driver in LBaaS v2, that was HAProxy running in namespaces on your network service nodes. So that's the difference here: we spin up these service VMs that actually do the load balancing, so you can scale much, much wider.

So just going through the components: the driver that plugs into Neutron LBaaS communicates with the Octavia API server. Each of the components that has a gear is a standalone process that can be run on what we call the controller, but you can also run them on separate hosts if that's how you want to deploy the environment. The API uses Oslo
messaging to pass configuration commands back to the Octavia worker, which uses OpenStack TaskFlow to do the automation and provisioning of all the components inside Octavia, including the amphorae. Of course, we have a database that stores state and configuration information.

Going across the other processes, we have the health manager. This component monitors the amphorae and the load balancers that are running inside them and makes sure that they're all healthy and running properly. If a component fails inside the amphora, this will do a failover to a new amphora. It'll build a new amphora or pull an amphora from your spares pool if you have that enabled.

Moving along: housekeeping. Housekeeping is just the periodic processes that we need to run to maintain this environment. So if you do have a spares pool turned on, housekeeping maintains that spares pool and makes sure that there are enough amphorae running, ready to be allocated to users. It also does some other background tasks, like cleaning up deleted records from the database after a given period, and, coming in M1, it will also do certificate rotation for the amphorae. We use secure communication between the controller and the amphorae themselves, and housekeeping will do rotation on those certificates automatically.

Underlying all these components is the controller worker driver. Octavia is built in a modular fashion. There are a lot of different drivers that allow operators to customize if they need to, so you can swap in and out various components of this architecture to adapt to your environment. Right now the controller worker is one driver that does service VM provisioning. There are patches out there, which will land sometime in M, that replace that driver to facilitate containers. So we're moving towards having containers for the amphorae. One of the issues with containers is hot-plugging the networks. Octavia allows you to hot-plug your member networks, or your back-end node networks, and with containers right now
you pretty much have to rebuild that container to add additional network interfaces to it. So that's why there's a swap-in driver for that.

Along the same lines, we have an amphora driver. That's what actually implements the communication to these individual amphorae, and we have a management network that's provisioned when we boot those. We have two drivers today in the code base: one is a REST-based API, the other is an SSH implementation. We will probably be deprecating the SSH implementation in favor of the REST API going forward.

Certificate driver: we do have TLS in here, so TLS offloading is a capability inside the amphora. And so we have interfaces that go out to Barbican, which is our secure store for those certificates and keys. And the compute driver: right now we only have one compute driver, and that interfaces to Nova to spin up the service VMs for the amphorae. And again, we currently have one implementation. We have two... well, the container one, yeah, it's work in progress. And then a network driver that interfaces to Neutron.

That's kind of a high-level overview. Any questions I can answer before we get started? Anybody else need a USB stick? Okay.

Question? It is all implemented inside that controller worker driver. We're using TaskFlow. There are flows that do the provisioning of the service VMs and do the failover. Yeah, it all works out of the box. It's just that we made it very modular, so if you need to change something because your cloud is different, you can. But you don't have to manually provision your service VMs. That's part of the process.

Other questions? Which layers does it load balance, was the question. Yeah, so it's layer three right now, and four. There is a patch that's in progress; it didn't quite make Liberty for full layer seven. Yeah, layer seven is coming in Mitaka. So there's a patch out there. It's work in progress. Yeah, it's Stephen. He claims it's weeks out. So you heard it there.
He will do it. Yeah. Other questions? Awesome, let's get started. Okay, then. Yeah, we'll show the slides and then you guys have to try to do that stuff.

Okay, so, who actually has the VM up and running now? Hands up. So I guess we should entertain people. Yeah, let's hand out our stickers.

So basically, the way we set up the image, you can log in with user ubuntu, password ubuntu. We are big Ubuntu people; we do a lot of stuff with Ubuntu. We put everything under the stack user. Did we? Yeah, I think we did. And basically we use the admin; we only set up the admin tenant, so you do everything as the admin tenant. And once the thing comes up and you log in, the first thing is you become the stack user. Then you have to run the script command so you can get to things, and then I think you have to run rejoin, and then everything should be good. Did we add that? Yeah, I think so.

Did everybody get a USB stick? They're just busy with extracting and starting. Okay, very good. And everybody gets a sticker; put it on your laptop so we can see it.

Active-passive is coming with M1. We have a patch which works, but it didn't make it for Liberty. We wanted to have Octavia be the reference implementation, and that meant that every week, as we got closer to the deadline, we had to throw stuff overboard to make the date. So yeah, we have it all written, and there are even active-active proposals out there.

Oh, the URL: I still have to upload it. I will tell you guys the URL. The slides, do you know... I don't have the slides. I only have the... That's the wrong direction. Want to go back to that one? That slide? Okay. So where are we at? Some have it running now, so we should get started. The slides?
Yeah, I need to upload them first. Yeah, I will. They're not uploaded yet, so if I learn that something is wrong, I can still fix it. So they will come up after this session.

Okay, so where are we at? Do people have it running now? Let's move to the next slide. So once you are the admin user, then we can do stuff. I'm too fast. I need to go one back. Okay, no problem. Yeah, it actually takes 10 minutes to unpack; we tested it out.

Okay, we're going to give it another two or three minutes for people to get their images extracted and running. If you haven't got it running by then, wave your hand if you're having problems, and we'll try and come and help you. We have, again, several people here who are just here for tech support to help you get this demo working. But otherwise we're going to have to proceed, or we're not going to be able to finish this before the end of the session. So another two minutes.

Oh, that's true. Okay. I don't know that you want me to have a microphone. Normally not.

Yes, it's part of the official Liberty release, though we are not really in the integrated release. They changed everything when they went to the big tent, but we released it for Liberty. And if you use Neutron LBaaS, you actually have to use Octavia, because it's now the reference implementation. Yeah, I think it's 0.5 something; 0.53 or something like that is the Octavia version. And it's up on PyPI; you can pip install it or whatever, and I think the DevStack scripts automatically pull it in when you do the LBaaS v2. Yes. Well, since you're here, you can use the DevStack we made for you. That's part of the image.
Yeah, the whole reason for the image is that it normally takes two hours to stand up a DevStack like this. We knew we didn't have two hours for you guys to wait, so that's why we have a VMware image for you with a frozen, or saved, state. So, yeah, let's go ahead and proceed. It's been two minutes. Next slide. Yep.

Okay, so the first thing you do is nova list, because we put two web servers on there which listen on port 80. And the thing that says user demo, that's wrong. It should be user admin.

These web servers are extremely basic. If you want to go ahead and curl one of those IPs, you should feel free to do so, and what's going to be output is basically "Hello, I am this IP", or "Welcome to this IP", I think, is what it says. But the whole point of that is, these web servers are simulating your application environment. A load balancer, of course, is load balancing stuff, so this is the back end. Those two IPs are the back end we've already set up for you. And now we'll go ahead and do the load balancing services.

So for whom did nova list work? Working? Okay, cool. Awesome. Yeah, so you need to remember those two IPs. And then we start creating a load balancer. The command to do that is neutron lbaas, which indicates it's an LBaaS v2 thing, lbaas-loadbalancer-create. Then we give it a name, lb1, and private-subnet. You can specify any subnet, but there's only a private subnet there. Yeah, this is a very basic DevStack.

Then there's one trick: you have to wait until it becomes active before you can do the next thing. Otherwise, it will give you errors.
Otherwise, it will give you arrows If it doesn't become active race, yeah, shout out and we send one of our lab tags If it takes a while, yeah, so so basically what what he does It's starting a nova VM inside Dev stack which takes takes some time So it's firing up an OVM and it's not a tiny one like zeros is it's a full of boondoo thing Yeah, really straight now work. We are always thinking making it smaller, but it's difficult I Yeah, that's that's why these yeah, yeah, I try to send out and I try to get an email send out and tell people don't come with Only five gigs of memory Yeah, we recommend 16 or more Just this test before yeah, that's when we develop, but I think it should should work Okay We will post a presentation so right now the presentation is not online right now But German is going to do that after this presentations over if you if you want to make sure that you get a copy of it Just take a picture of the image, but okay, we put it online Yeah, we put it online that there's a link for us to post it in the schedule app and we will put it there of the presentation Yeah, we should have probably done that before but then we couldn't fix any errors like that says demo here Yeah, which supposed to be admin Okay So this is a gift in the Google thing you can do it in your computer We'll give that for you guys in a minute and make sure whatever you post that you fix this thing. That's as demo and Coming yeah Okay, so somebody has an active does it turn active for somebody so so check is basically you do this with Neutron albath load balancer list or something Somebody has an active load balancer. 
Yay. Good. Then you have to create a listener, as written here, then a pool, then a member. That's one of the reasons why in Vancouver people said we should implement a function which says "give me a load balancer". Yeah, Brandon wanted to do that.

What you're seeing here is actually the hierarchy of how the objects fit together within the load balancing version 2 service. The load balancer contains the IP. The listener is basically the port, the pool is where we're going to load balance to, and members are members of the pool. So those are the machines that actually make up your load-balanced service.

Yes, and if you're familiar with what the load balancing API v1 was, you couldn't put multiple ports on the same IP, and we changed that for v2. So you can now have something which listens on 80 and 443, which is very common, so you can redirect people or have protected content and non-protected content under the same IP address, which is very useful. Well, there are lots of other hooks, but we'll talk about that tomorrow in our talk, as to why it's so much better. Yeah, no, it's awesome.

If you're getting stuck on anything, raise your hand. We'll send a runner over there to help you out. Carlos? There were some hands up there in the back. Could you raise your hand again if you're stuck?
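The object hierarchy just described (load balancer, then listener, then pool, then members) can be pictured as a few nested data classes. This is a minimal sketch for illustration only: the class and field names are made up for this example and are not the LBaaS v2 or Octavia data model, and the IP addresses are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Member:
    address: str  # back-end web server IP
    port: int


@dataclass
class Pool:
    name: str
    members: List[Member] = field(default_factory=list)


@dataclass
class Listener:
    name: str
    port: int  # front-end port on the load balancer IP
    pool: Optional[Pool] = None


@dataclass
class LoadBalancer:
    name: str
    vip_address: str
    listeners: List[Listener] = field(default_factory=list)

    def add_listener(self, name: str, port: int) -> Listener:
        # One IP can carry several listeners, but each port only once.
        if any(l.port == port for l in self.listeners):
            raise ValueError(f"port {port} already has a listener")
        listener = Listener(name=name, port=port)
        self.listeners.append(listener)
        return listener


def backends(lb: LoadBalancer, port: int) -> List[str]:
    """Return 'address:port' for every member behind the given front-end port."""
    for listener in lb.listeners:
        if listener.port == port and listener.pool is not None:
            return [f"{m.address}:{m.port}" for m in listener.pool.members]
    return []


# Build the objects top-down, in the same order as the workshop commands.
lb = LoadBalancer(name="lb1", vip_address="10.0.0.10")  # placeholder VIP
http = lb.add_listener("listener1", 80)
http.pool = Pool("pool1", [Member("10.0.0.3", 80), Member("10.0.0.4", 80)])
https = lb.add_listener("listener2", 443)  # second port on the same IP
```

The top-down construction mirrors why the commands have to run in order: a member needs a pool, a pool needs a listener, and a listener needs a load balancer.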
Okay, I'm trying to get some people out there. Maybe we need to send Brandon forward. Yeah, if you're having problems, the first thing to do is to check that you're running the commands in the right order, because it is a hierarchical data structure. You can't really start with making a member before you have a load balancer or something like that. So it's really important to keep this order. Once you have stuff, you can do whatever you want.

Question? Good. So, a good question. His question is: can you do this without Octavia? And yes, the LBaaS v2 from the prior release uses the HAProxy driver. What Octavia gets you is scalability and the potential for active-passive, active-active, failover, those sorts of things. What's happening here is, for every load balancer you create, for the listener you create, you're getting a VM that is running its own HAProxy, rather than sharing HAProxys and just adding configuration data to them. Yeah, and you can of course use the same stuff and have a hardware load balancer behind it, if you buy an A10 from Doug in the back. Oh, he already left. So yeah, then go buy a NetScaler. Yeah, okay.

Has somebody had a chance to curl the VIP, and something came back? Was somebody able to curl the VIP? No? It takes a while. PENDING_CREATE? It's stuck in PENDING_CREATE. Oh, maybe it took too long.

That's usually my specialty. So it can happen, when you have a very slow computer or not enough RAM, that it gets stuck in PENDING_CREATE. What the system does is it fires up the VM, then starts to talk to it, but there's a timeout, which we made kind of generous, but for small machines not generous enough. So then you can get stuck in PENDING_CREATE.

Hey, two people survived. It's like Survivor. That's not a good sign, German, it's not a good sign. Yeah, we need to get better. The previous page again? Sure. I don't care. It doesn't matter.
It doesn't matter at all. We're working on that. We're getting you the slides. Yes, I know, we know. Okay, Adam was kind enough to put the slides up, and they are now at... put it at the top. The slides are up there. So it's goo.gl and O9C5J, and I hope it's a big O. Okay, that's not a zero. That's a large lowercase... sorry, uppercase O.

Okay, so we keep going. When are we done, actually? Soon? Okay, good.

So we created a load balancer, if that worked for you. So here are some things about how to get information about the amphorae. Raise your hands up high so we can see them. There are only five people who got it done. Yeah, good. You want to go to the previous slide again? Okay, sure. If you're stuck on something, can you raise your hand? Okay. Yeah, if you need some help.

We have a request for the first page again. Okay, first page again. You want the slide there, is that what you're after? Okay, good.

Yeah, that's a little bit beyond what we're doing here, so I think we're going to have to go on, unless you want to get on there and fix it for him. Okay, move it back now.

Okay, so let's move on to how you guys can get information about a load balancer or about the amphorae. The amphorae are normal Nova VMs here, so I can do a nova list, and you can say minus minus name amphora, and it will only show you the amphorae, because that's how we name them. Then you just see those. And that's a way for you to control what's going on in your system.
How many amphorae have been created by users, whatever. These commands here, normally as a regular user you're not going to be doing any of these. This is actually hitting an API that lives on the amphora itself, and this is how it's accessed. But for the most part you're not going to be talking directly to the amphorae; you're going to be talking to the Neutron LBaaS version 2 API. This is basically for when you need to troubleshoot your system and want to get information about what's going on there. In each amphora we run a little agent, and those curl commands allow you to talk to this agent and find out what's running on there: how many load balancers, how many listeners, what the details are. And since we want to get you guys up to become developers, we show you all the... yeah.

This is something you might run as a cloud administrator or operator. As a tenant you would never run these commands; you wouldn't have access to them. Yeah, you've got to be on the management network to do this.
These are for troubleshooting only. So you should try them out. I guess we move on to the next one. Okay.

So since they are normal Nova VMs, you can SSH into those and figure stuff out. In the DevStack we have SSH enabled so you can actually do troubleshooting and development. If you deploy in production, you might want to switch that off, to have one less thing for people to hack you with. But here you can basically SSH in there. On each amphora we fire up something we call an agent, and you can figure out its status by just running a normal service amphora-agent status. You can also check on HAProxy. For each listener you create, we start a separate HAProxy process. So if you put one listener on it, like we did in the example we've seen, you should only find one HAProxy process. And we basically name them with the listener ID, and if you put two on there, then you can check the status of those two. It's another thing for troubleshooting.
Yeah, this again is not something a tenant would normally do. Tenants will not have access to log into the amphorae at all. The amphorae are basically hidden from the tenants, in the sense that this is just part of the back-end service, and the tenant doesn't have to worry about any of this stuff. This is all for a cloud operator or a cloud administrator to troubleshoot if there's a problem. These are things that you can do to get into the amphorae and poke around and see what's going on.

Then we wanted to show you another thing. We have failover. In the current version, the failover is not instantaneous like we want it to be; it basically has to detect the failure and then schedule a new VM to bring it up. But nevertheless, if you've deployed it in a production environment and you're the cloud operator, what happens a lot is there might be a security update for the Ubuntu image we use for the amphora, and you want to swap out the images with a new amphora image, so you would have to instigate a failover.

And for the way you do that, we have to prepare a bit; we have to set up health monitoring in the VM. We still have sort of a bug in there: our health manager runs in a screen session in the DevStack, and so it can't listen that easily on the management network. We have to make that better. So you have to put, in the controller_ip_port_list, the IP it has on the network. So when you do an ifconfig, that's the IP VMware will assign to it. And since this is a baked image that you're all working with, it should probably be that IP. Right. Yeah, it's in the baked image. If you do your own stuff, you've got to figure it out.
Yeah. So we then want to restart o-cw, which is the controller worker. If you've never messed around with DevStack before, this bit will be very confusing, but if you've done work in DevStack before, then what you want to do is find the screen that has the controller worker on it and restart it after you have made this edit. Yeah, and then basically you want to restart the controller worker and also the health manager. Then you have to create another load balancer. You should have that in your shell history. Yes, bash.

Then, in order to get a failover done, you would, on the amphora, at least on DevStack, stop the agent, and then the thing will automatically fail over for you. The way our health monitoring works is, when you fire up the amphora VM, the VM sends a UDP message every couple of seconds to the health monitor, and when this UDP message doesn't arrive for some time, then we assume it's dead and get you a replacement. So that's the algorithm there.

So what this is doing is just simulating a failure of the amphora by killing the agent on the amphora that sends the health check updates to the health monitor. When that stops, the health monitor goes, "Oh, that thing must be dead," and then it schedules a replacement. Yeah, it's a heartbeat system. Exactly, heartbeat, and you can configure all those things, how long it should be, whatever. Oh yeah, it's all configurable, but for our purposes here, just read the docs if you want to know how to configure that. This is already a very complicated demo, we know.

So if all goes well, when you enable the health monitor and then simulate the failure of the amphora that you have running, the health monitor will notice that it's dead. It will kill the old amphora and start a new one. It would first start a new one. Oh, it starts a new one first, sorry. Yeah, so it starts a new one, then kills.
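The heartbeat algorithm described here (periodic messages from each amphora, a staleness threshold, then boot a replacement before removing the old one) can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not Octavia's health manager: there is no UDP, timestamps are passed in explicitly, and `HeartbeatMonitor`, `fail_over`, and the "-replacement" naming are hypothetical.

```python
from typing import Dict, List, Tuple


class HeartbeatMonitor:
    """Tracks the last heartbeat time per amphora and reports stale ones."""

    def __init__(self, stale_after_s: float) -> None:
        self.stale_after_s = stale_after_s
        self.last_seen: Dict[str, float] = {}

    def record_heartbeat(self, amphora_id: str, now: float) -> None:
        # In the real system this would be triggered by an incoming message.
        self.last_seen[amphora_id] = now

    def stale(self, now: float) -> List[str]:
        """Amphorae whose last heartbeat is older than the threshold."""
        return sorted(
            amp for amp, seen in self.last_seen.items()
            if now - seen > self.stale_after_s
        )


def fail_over(
    amphora_id: str,
    monitor: HeartbeatMonitor,
    now: float,
    log: List[Tuple[str, str]],
) -> str:
    """Replace a dead amphora: start the new one first, then remove the old one."""
    replacement = amphora_id + "-replacement"  # hypothetical naming
    log.append(("boot", replacement))
    monitor.record_heartbeat(replacement, now)
    log.append(("delete", amphora_id))
    del monitor.last_seen[amphora_id]
    return replacement


# Simulate the demo: amp-1 stops sending heartbeats, amp-2 keeps going.
monitor = HeartbeatMonitor(stale_after_s=10.0)
monitor.record_heartbeat("amp-1", now=0.0)
monitor.record_heartbeat("amp-2", now=0.0)
monitor.record_heartbeat("amp-2", now=8.0)

actions: List[Tuple[str, str]] = []
for dead in monitor.stale(now=12.0):
    fail_over(dead, monitor, now=12.0, log=actions)
```

Stopping the agent on an amphora corresponds to `amp-1` here: no more `record_heartbeat` calls arrive, so it crosses the threshold and gets replaced.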
Yeah, it starts the new one, then kills the old one. This is the very most rudimentary version of high availability for this service. There is, again, more in the works. We are very close on active-standby, which would have two amphorae running per load balancer, and those amphorae act in an active-standby configuration using Linux heartbeat, right? Is it Linux heartbeat we're using? Keepalived. You're right, I'm sorry. And what happens is, basically the amphorae will monitor each other, and if one of them dies, then the other one will take over.

There's also, again, a blueprint in the works right now to do active-active mode, which is considerably more complicated than any of this stuff. But what's really cool about active-active is it'll allow for horizontal scaling of the actual service delivery.

The question is just: those virtual machines, the load balancing virtual machines, how do they get patched or life-cycle managed? I mean, they are running for as long as the load balancer is defined, right?

So basically, if you want to patch them, then you would have to instigate this failover we are describing here. You would have to manually, or with a script, go to each of them and shut off the agent, which doesn't shut off the load balancing, and then the system will detect that and schedule a replacement, and that will allow you to roll out new amphora images. So it's a little bit pedestrian, but we show the process, and you can then use something like Fabric, or any of those tools if you have them, to automate that replacement.

So yeah, it's important to note here that the amphora image is part of the stuff we didn't show you guys when it comes to setting up Octavia, which takes forever.
That's why it takes a long time, which is why we didn't do it for this demo. But part of the process of setting up Octavia is you have to create a Nova image, which gets stored in Glance, which is the baked version of the Octavia amphora. Yeah, we store it in Glance ultimately.

Go ahead. So just really quick: the version of the slides that's uploaded, I guess, is not the newest. I don't know if you guys made changes locally or something. But anyway, the only real thing to take note of on the version that's uploaded is it says something about using demo demo as the user with openrc. Ignore that; just do admin admin. Yeah, admin admin. The whole demo is running under the admin user right now. It doesn't have to be; it's just how the image is set up. I'll see if I can find whatever version this is of the slides and reupload. Okay.

Do we have any questions from people at this time? So, yeah, that's why we showed this failover simulation.
So you guys know how to replace those images if they're problematic. Go ahead.

You mentioned that, besides virtual machines, you also support containers, or will later on?

The code is being worked on. One of the things you have to understand about Octavia right now is it basically does all the features of LBaaS version 2 without a whole lot of extras yet. For all those extras, we have code that's actually under review, but we put it all on hold in order to get this to the form that it is in now for Liberty. But I expect within the next two months we're going to see a lot of these features start landing, and in Mitaka this is going to be a much more powerful system than it is now. Among that, there is code in the works to make it work with containers, but it's not landed yet.

Well, I guess my bigger question is, why did you not focus on containers first? And my question on that is, what was the actual big limitation with the namespace stuff?

The limitation of the namespace stuff is that you don't get the high availability or the scalability. With the namespace, when the node a namespace runs on dies, all the load balancers there are gone, whereas here, when your VM dies, we schedule a new one, in case you have a redundant control plane. And as for scalability, when you put a thousand load balancers on one node, it kind of bogs down, whereas here you can just buy more computers; it gets scheduled in your cloud.

Well, maybe I'm wrong, but don't we have, for the router agent, this scheduling thing where you just have multiple router nodes, and then they just scale up multiple namespaces with the routers there? Couldn't we do the same thing with load balancing?

So the other issue you're going to run into, which is more subtle, is that when you start talking about
TLS, transport layer security, that is a process that does not scale vertically very far. Typically the TLS termination is done with the OpenSSL library, and the process that does that is mostly single-threaded. It's supposed to be multi-threaded, but it's mostly single-threaded, and since processor speeds are not increasing, you'll find that with the modern standard of 2048-bit keys as the standard SSL key, you can do approximately between 120 and 160 new connections per second on a modern processor, and it doesn't matter how many cores it has. That's going to be a hard limit, and that has to do with how it does the SSL decryption when you're doing TLS termination.

So the idea behind what Octavia ultimately is going to be delivering, when we get to the active-active mode, is you can have multiple amphorae that are all servicing the same load balancer IP, so you can then have truly horizontally scalable service delivery, and then you can actually have an SSL site which can do thousands of new connections per second. You're not going to be able to do that using the namespace driver, ever. It will never happen.

Because TLS is broken? Yeah, that sounds like it. Well, but it's been broken for years and years, and that's kind of by design. You don't want TLS to be too easy to break. The whole point behind TLS, and the reason why they keep on increasing the key sizes that are standard, is you want to keep it somewhat difficult to break. So the trend is, I mean, what is it, four years ago?
1024-bit keys were standard, and then all of the SSL certificate authorities decided, nope, we're all going to do 2048-bit keys now. Well, we had a five-fold decrease in performance overnight when that happened. So when these authorities decide that 4096-bit keys are the standard everyone has to use, you can expect another five-fold decrease. And if you think that 120 new connections per second is bad, imagine when it's down to 20, and that's what you can do on a single core. So the idea was that you need to be able to scale this horizontally. Correct. There's also the fact that the namespace driver is a shared environment, so you end up with things like the noisy neighbor effect with your load balancers, which this eliminates to some degree because you're using these virtual machines or containers or whatever that have their own resources. And they run on compute, right? They run on compute, and compute is designed to handle this. Yeah, to limit the effect of noisy neighbors on a host. So you don't run into a case where you happen to be provisioned to the same namespace agent that's running five other people who are doing a ton of TLS, and now you have, like, no connections per second because there are other people using your resources. That's really the reason why Octavia was created the way it was. Okay, and then as to the question about containers and why we didn't do that first: well, the companies sponsoring this have certain roadmaps. So VMs were just the thing we wanted to do first, and containers second. No, no, no, we can use containers. It was TLS that prevents us from using the namespace driver, but containers we can also schedule on different hardware. If you do active-active, you can schedule them on different hardware and scale out that way. Yeah, we will schedule them with whatever.
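To make the TLS arithmetic above concrete, here is a rough back-of-the-envelope sketch in Python. It only restates the speakers' estimates (about 120 new connections per second per core at 2048-bit keys, roughly a five-fold slowdown each time the key size doubles); these are not benchmarks. The function names are hypothetical, and the sketch assumes each amphora contributes one core's worth of TLS handshakes, which is a simplification of the planned active-active mode.

```python
import math

# Rough figures quoted in the talk, not measured values.
HANDSHAKES_PER_SEC_2048 = 120    # low end of the quoted 120-160 range
SLOWDOWN_PER_KEY_DOUBLING = 5    # quoted "five-fold decrease"


def handshakes_per_core(key_bits):
    """Estimate new TLS connections per second on one core for a key size."""
    doublings = math.log2(key_bits / 2048)
    return HANDSHAKES_PER_SEC_2048 / (SLOWDOWN_PER_KEY_DOUBLING ** doublings)


def amphorae_needed(target_per_sec, key_bits):
    """Estimate how many amphorae would have to share one load balancer IP.

    Assumes (hypothetically) one core's worth of handshakes per amphora.
    """
    return math.ceil(target_per_sec / handshakes_per_core(key_bits))


if __name__ == "__main__":
    for bits in (2048, 4096):
        print(bits, handshakes_per_core(bits), amphorae_needed(2000, bits))
```

With these assumed figures, a site that wants 2000 new connections per second cannot be served by a single process, which is the argument for scaling horizontally rather than vertically.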
Yeah, we will do it once we get there. We just had to do something first, because we wanted to do this active-active, and all of us have VMs. Now the people sponsoring, HP and Rackspace, are trying to add the container thing. We're pretty good as a development team, but we can't do this overnight. It takes a while to get these things developed, and we're also trying to get stability here. You'll notice that this is a process that can go wrong in many ways. So if you look at it, we started like... Yes, thank you, Suzanne. Yes, help us out, please. And also, for an OpenStack project, we developed it all pretty rapidly. How long have we been doing it now, one and a half years maybe? And we got a new API version, we got the whole Nova stuff in. Okay. It's part of the OpenStack ecosystem, so as you contribute to Octavia, you do get ATC status, which we want. So the other thing is, I didn't catch your name, I'm sorry, but I've been asked to tell how people could potentially contribute, so I'm going to show that one. There's a lot of stuff in these slides that we're not getting to right now. The first thing you can do if you want to contribute: I would recommend you start attending our weekly IRC meetings. They happen on, what is it, Wednesdays at 2000 UTC, which is 1 p.m. Pacific. You've got it there? I've got it all here. So 2100 UTC, it's 1 p.m. Pacific. About half of us are in Pacific, but we are trying to get more people. We have an IRC channel; you can find us on #openstack-lbaas. We try to be pretty active there and hang out there a lot. There's a design session for Neutron LBaaS version 2 and Octavia, and firewall as a service and a couple of other things, tomorrow morning at 11 a.m.
So come to that. We are also giving a more extensive talk on Neutron LBaaS version 2: what is in Liberty and what we're planning on doing after Liberty. That's tomorrow in, I believe, well, one of these rooms here. You can check the schedule, but it's tomorrow, Thursday afternoon. Okay, you're good. Other than that, what we specifically need help with is, of course, reviews and code. And you can best learn how to contribute in an effective way by making sure you engage with other community members here, the people you see walking around who have been doing this for a while. And again, the IRC channel and the IRC meetings are probably the best ways to get started. Did you want to go back to some of the other troubleshooting slides, other tips and tricks? I should probably do some more tips and tricks. Sweet, we have plenty of time, so yeah, we can show you guys more stuff. So another troubleshooting tip is the log files. By default we basically only log at info level and up, or not even that, just around error and fatal. You can uncomment the debug = False line and set it to True, and then you get all the debug logs, which are huge. Or you can set verbose, and then you get the info logs. For any change to the Octavia config to become active, you have to restart all our services, all the services starting with "o", like o-cw, the controller worker, and so on. Did you want to go back to the architecture screen and show them what they're working with here? Yeah, I realize that some of these things, like o-cw and o-api and whatever, might not be that easy to understand. If you look at the architecture slide, which we can pull up again here, this is something Michael created, which is pretty good. All of those things usually
correspond to one of these things. Anything that has a gear basically has a daemon that goes with it. So the Octavia API, sorry, these are the daemons here: the API is its own thing. There's the worker, and most of the work in the controller gets done here. We have our health manager, which is of course watching for any dead amphorae or problems with the amphorae. And then we have a housekeeping manager, which handles the cleanup and scheduling of dead amphorae and manages the spares pool, if you have configured a spares pool. I guess we can go back to the other slide now. Sorry. Yeah. Then we have the separate log files. They are all in /opt/stack/logs, all starting with "o". We have o-api, which is our API server; o-cw, the controller worker; o-hk, which is housekeeping; and o-hm, the health manager. And here is what the logs contain. o-api is the API: you see the requests coming in when you do a neutron LBaaS command, and for each new request you see whether it's coming in. If they don't come in, then you know where to troubleshoot. o-cw does the most work: it starts the amphorae and talks to everything. CW stands for controller worker. The housekeeping thing will basically delete amphorae out of the database and later do certificate rotation, in M1. Basically, we secure the communication between amphora and controller worker with certificates, and one design principle coming from the Anchor people is that we should rotate those certificates very rapidly. So when people own an amphora, they can't really get much further, because you just rotate the certificate. You're talking about if an amphora has a security issue and somebody breaks into one.
Yeah, or if the certificates get exposed, you just rotate them out. Yeah, we limit the exposure by doing that, so we want to automate that. And then the health manager is the part that basically gets the heartbeats. Each amphora sends out heartbeats, and we store them in a database and act on them. The other thing that gets sent over to the health manager is the status of your members or your listeners, which gets recorded and put in our database. Yes, and that's one thing we need to think about: how to push that further upstream so people can monitor it. Yep, there's a lot of work still to do on this. But then we also have log files on the amphora. So if you develop things and want to develop against the agent that is running there, you can also SSH into the amphora and look at the agent logs to see what's going on. You can see whether the amphora is sending heartbeats, that will be logged, and the API will log the API commands and the error messages and everything. So if you want to develop on that, you probably need to log in there. We also run a database, which is MySQL right now. We might rethink that. Anyway, we have our own database, the Octavia database, with lots of tables in there. The most important one is the amphora table, which has all the amphora information in it, with IP address and everything. Then of course we have load balancer, listener, member, pool, health monitor, what you would expect. That mirrors to some degree what we have on the LBaaS v2 side, on the Neutron side, in the database. Yes. Another important table is the amphora health table. As I said, every time we receive a heartbeat from the amphora agent, we store it in there.
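As a minimal sketch of the heartbeat idea just described, assuming the last heartbeat time per amphora has already been read out of the health table into a dictionary: the function name, the dictionary layout and the 60-second timeout are hypothetical illustrations, not Octavia's actual schema, code or configured value.

```python
from datetime import datetime, timedelta


def stale_amphorae(last_heartbeats, now, timeout=timedelta(seconds=60)):
    """Return the ids of amphorae whose last heartbeat is older than timeout.

    last_heartbeats maps a (hypothetical) amphora id to the datetime of the
    most recent heartbeat recorded for it.
    """
    return sorted(
        amp_id
        for amp_id, seen in last_heartbeats.items()
        if now - seen > timeout
    )


if __name__ == "__main__":
    now = datetime(2015, 10, 28, 12, 0, 0)
    heartbeats = {
        "amp-a": now - timedelta(seconds=5),    # reported recently
        "amp-b": now - timedelta(seconds=90),   # silent for a while
    }
    print(stale_amphorae(heartbeats, now))
```

When troubleshooting, the same comparison can be done by eye: look at the last heartbeat time for an amphora and check how long ago it was.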
And so it will tell you the last time you got a heartbeat from that amphora. Correct. So if you need to troubleshoot the system you're operating and suddenly everything goes crazy, you can see if there's something going on there. You talked about how to contribute; any more questions? Yeah. So right now the failover isn't that fast, because basically we record a heartbeat, and then we have to wait a little bit until we notice we didn't see a heartbeat. So it's in the minute range, maybe a minute or so. That's our current failover strategy, and that's why we want to do active-passive, so we can fail over much, much faster, in something like seconds. Yeah, so right now it has to be dead for long enough for us to notice, and then there's the amount of time it takes to either take over an amphora from your spares pool, if you have one, or to spin up a new one and then configure it. Configuring it once it's up is actually very fast; it's a matter of waiting for Nova. But with active-standby, with the amphorae monitoring each other, we expect to have sub-five-second failover. And it all depends. If you're, like, a cloud service provider... it's probably okay to wait a minute for failover if it's a dev environment or something like that, so you might not want to spend the money on having two VMs running. So we give you the flexibility: what do you want to do? So you're talking about an active-standby scenario where both of them go active at the same time? Yeah. The nice part about load balancers is that load balancers are not that stateful, so recovery from that is easy. You just kill one of them, and then you haven't really lost any data, you just have a service.
It's very, very... well, it won't work when they're both out. So the way active-passive works is that we have two load balancers running. They talk to each other, they share some sticky connections, and we'll do a demo at the talk tomorrow, or Michael will do the demo, where it works. But if both become active, you still have the fact that there is an IP address which only points to one of them. So it doesn't really matter if the other one is active too. They might just get confused a bit, but you only have one which actually serves requests. I mean, it's obviously not a good state to be in, but in all honesty, the router is going to send the packets one way or the other. Having said that, there is a blueprint in place to try to do active-active, because that's truly where we need to go with Octavia in order to deliver horizontal service scalability in terms of the actual TLS-terminated service. That is going to be quite a bit more complicated. But once that's done, in that case it's actually safe, it's desirable, for multiple amphorae to be serving the same IP. Yeah, and Stephen is very excited about it, since he's... No, it's actually a blueprint that's basically authored by the IBM research team from Haifa, and they are working on it right now. In fact, they're doing proof-of-concept code. Yeah, I've already told you guys that you should be reviewing that blueprint. But it's coming. There's a lot of stuff we have slated for Mitaka. The good news is that a lot of the stuff we wanted to get into Liberty, which we put off so that we could get Octavia in as a reference driver for LBaaS v2, is really close. It's seriously just weeks away, and then that's going to land. So I know at Blue Box, once that stuff is in and stable...
Well, stable enough, we are almost certainly going to be running off of head until it's actually in the official Mitaka release. From our perspective, a lot of our customers really need features like active-standby, TLS termination, and layer 7 switching support. Those are all things that were a high priority for us, and they're happening because we put engineering resources behind them. If there's a particular feature your company needs that isn't offered and isn't on one of our roadmaps, then come to these meetings and start contributing. We're not here to obstruct anyone from developing new features at all. It's just that, obviously, we all have employers, we know who's paying us and we know what they want, and that's what we're going to be working on first. Yeah, the other thing to keep in mind is that if you go with the Liberty release, we just put it in, so there is a lot of hardening that still needs to be done. We are all very excited and trying to put it into production, but we haven't done that yet, so we don't really know if it works the way we are envisioning it. Did you have something there, Suzanne? What Suzanne is saying is that we're awesome, which is completely true about our entire team except for me. I'm a jerk. So I will be sure to -1 your commits initially, but that's just how I show my appreciation. No, it's all good. So, more questions? Okay, that's actually not part of this demo, but I'm glad you asked. The problem with TLS termination is that there's one tiny bug right now, so it won't work. But we have a website which explains exactly how it works, and I was just working on that last week. It's on our wiki page; when you Google "Octavia SSL" or "LBaaS v2 SSL", that should come up. I want to see if it loads again.
The internet here is a little bit spotty. Yeah. So, "How to create a TLS load balancer". I was on that last week and tried to fix it. By the way, we are certain that you're going to be able to break this in interesting ways that we haven't thought of. Please let us know when you do, because we want to fix those bugs. One of our biggest goals, which is again part of the reason why we put other features on hold, was to get this into Liberty so that people could start using it. It's really important for us; we want to see people adopting this. So if you have any trouble with it, please let us know what it is. We really appreciate any feedback you can give us. Yeah, the other thing to keep in mind when you want to do the TLS stuff is that you have to add Barbican. You have to enable Barbican, which we haven't done in your image, so your image isn't able to do the TLS stuff. We didn't make that part of this demo. But all you have to do is add that and build your environment, and then you can do all the commands there. They walk you through it. I went through that last week and found the bug, which we still have to fix. Okay, more questions? How many members can we do? There probably is a limit, because there has to be, but we haven't explored that. We're using HAProxy as the underlying thing, and I would guess you run out of RAM or something first. Sorry, what was the question there?
It was whether there is a maximum number of members you can configure. No, the members hardly take up any more resources. I've configured, not in this particular setup but in previous ones... it's all HAProxy based, and HAProxy can handle literally hundreds of members and be just fine. There aren't that many installations that are using hundreds, and if you are using hundreds, you might want to consider having more than one load balancer. I'm just saying. Oh, yeah, so the amphorae that you see here are not shared between tenants, and that's for security reasons. When you configure the members, on the screen where it talks about configuring the members, you can specify a subnet you're going to connect to, and the amphora will actually create an interface on that subnet. Since that can be, and usually is going to be, a private tenant subnet, we don't want that exposed to other tenants. So that's why the amphorae are not shared. I mean, if somebody wants to try to create an architecture that works with multi-tenant amphorae, they're certainly welcome to write that blueprint.
We will absolutely tear it apart, but that's a good thing. You might actually have something that we didn't think of, like, oh, we could do it that way, running network namespaces within the amphora or something crazy like that to keep things separate. Mostly, though, we have to try to keep all the different tenants in mind with this stuff. If you're talking about something that's going to be very large, which again is what Octavia is aimed at, making sure we can definitely meet the very large use case, they're never going to want to be shared. But if you're talking about something very small, then there might be a good argument for it; very small tenants might want to share. Come on, you should talk into the microphone, for the people who are listening to the recording. Okay, so basically in Vancouver we talked with the Nova team about localization hints. One idea there is that you can put your load balancer on the same rack or the same server as your members, so they are close together. That's something we want to do when Nova has that and we have time to look into it. But that opens up the topology thing. Yeah, the topology thing. I don't know if it came in Liberty. Because we were so busy getting Octavia finished, we didn't do anything else. We could definitely use more engineering resources, if you're not getting that impression. Let me just say that one more time: if you want to contribute, we would absolutely appreciate it. Do other people have any questions they would like to ask before we're done here? You specify the network the IP is on. When you do a load balancer create, you give it a subnet, and it will create an IP on that subnet. In theory it should work with the public subnet, but I don't think we support it.
It's not planned. What we have planned is this: since there are different needs for load balancers, if you put a load balancer in front of your database, then you don't want it to be public, so you put it on that private network. If you need it public, you put a floating IP in front of it. But in theory, if you really wanted to, you could specify a public subnet, and then it will plug into it and maybe work, maybe not. Well, I guess we'll stick around here for another 20 minutes, or 15 minutes, until this session is over, if anyone has any other questions. Otherwise, thanks for coming. Yeah, thanks for coming, guys. And we should have made it clearer that you need bigger computers, so I'm sorry about that. If we had known, we would have sent it to you a few months ahead, so you could ask your manager to buy you one.