Okay, let's do this. [In German:] Good day, ladies and gentlemen. I warmly welcome you to lecture hall V8 for my presentation on the lessons I have learned in my position as an OpenStack administrator at IBM, in our Spectrum Scale department. Or, if you prefer: welcome, honored guests, ladies and gentlemen, to room V8 and my presentation on lessons learned from running a multi-architecture OpenStack environment at IBM, in our Spectrum Scale division.

So, I'm easily distracted. I don't know about you. So, for both of our sakes, let's talk format. In this talk I will be presenting four lessons that I've learned over the past year. They certainly aren't the only ones I've learned, nor are they necessarily the biggest ones, but they are perhaps the four I could use to create the best story. In fact, you may already know all these lessons.

Each lesson should have a distinct story arc. We'll start with a backstory to identify a problem, then we'll name the associated lesson and how it was learned, and conclude with a solution or a resolution to the issue. Hopefully we can follow this format. Spoiler alert: we probably won't, exactly. The lessons are essentially in chronological order, but by definition a backstory is going to relate some events from the past. Hopefully it'll turn out to be one big story that you can follow along with. There'll be audience participation, we'll raise hands. It'll be great fun.
Trust me.

Let's start at the beginning: lesson one, backstory. A little over a year ago I landed in a department at IBM Systems, specifically in the storage business unit, where I inherited an OpenStack setup that was just about to go live. They only had a week to go before they pressed the go button. One of the first things to go wrong was that you pressed the launch-VM button and VMs refused to launch. This turned out to be a problem with the Linux bridge agent falling over on a compute node and giving this error: "Too many open files". Anyone here using Linux bridge? Yeah. Anyway, that was the promised audience participation.

So that was our first lesson: Linux bridge needs some more files. How do we resolve that? We increase the ulimit for open files allowed for a particular user and/or computer. That was a quick lesson to get started, and you see how these go. This problem only showed itself on our x86 machines, and that is my segue into the next lesson.

All right, the backstory here is that I didn't see that problem at all on any of our Power servers. Anyone here heard of the Power architecture? A couple. Anyone using it? Well, there you go. This talk is part of the hardware enablement track of the summit, and at IBM, or at least where I work, a good bit of the hardware enablement that we do is around architecture, that being Power. Power is the instruction set architecture IBM introduced in 2006; you can read it yourself. A little side lesson I learned: I had no idea IBM actually open sourced this architecture. It's part of a foundation, the OpenPOWER Foundation, and they announced a few years ago that it's going to be part of the Linux Foundation.

So to implement the Power architecture you need a Power processor and a machine that it goes in. Power8 processors came out in 2014; that's what I work with. Power9s supposedly came out in 2017. They haven't made it to any of my racks yet, let alone Power10.
Power10 theoretically exists. It's like a unicorn, or not quite, but I guess my department doesn't have one. Anyway, I haven't seen one yet.

One of the reasons we need Power is that we need to test AIX. AIX stands for Advanced Interactive eXecutive, which is another operating system that was put out by IBM way back in 1986. It's based on Unix, and GPFS was released for it. GPFS is the General Parallel File System, which got renamed to Spectrum Scale, which is where I work and why we still need to support AIX.

So lesson two, which I did not know when I got there, was: hey, OpenStack runs on all kinds of stuff. It runs on x86, it works on this crazy Power stuff. AIX is a platform too, and exactly how we get to AIX comes in a minute. Z, IBM's mainframes: I don't have one and I haven't done it myself, but I have it on good authority that OpenStack will run there too. Our lesson learned is that there are multiple architectures.

To get to AIX, what we're going to need to do is... oh, I forgot my other joke. I was going to say that this was a picture of the first machine that ever ran Power, which it isn't at all. I don't know what this is a picture of. IBM has this huge slide deck with all these images that are okay to use, and everything's tailored to the way you should do your presentation, and I think some of them are pretty hilarious, like this one. So I was going to tell you that this is the guy that wrote PowerVC. He's not, but he looks like a VC, a venture capitalist. Anyway, haha.
Yeah, so PowerVC is a version of OpenStack. I don't know if "version" is the right word. Anyway, IBM took OpenStack and added a whole bunch to it, and calls it PowerVC. One of the main things it uses is PowerVM, as opposed to KVM or any of the other hypervisors that you would normally use, and what that's going to do is let you get AIX running in an OpenStack environment, because it won't as it stands with vanilla OpenStack. I don't know how you can call anything in OpenStack vanilla, but with plain old OpenStack you can run x86 just fine, and Power will run Linux, and Linux on Power runs fine in OpenStack. The other OSes I forgot to mention that Power will run are AIX and IBM i, and that's why we need PowerVC. It's not the solution; it is a solution if you need to run AIX.

And I turned my phone off. I was going to do an intermission now and take a picture with everybody. We'll come back to that. Let's go ahead and skip to lesson three.

Lesson three's backstory goes back into Power a bit more, and I'm sorry, I realize this isn't the place to go into how IBM systems are set up, but bear with me. Really all I want to point out is the fact that it's rather similar to OpenStack. System one, system two, system three: those are your compute nodes. The little colors are supposed to be virtual machines, running either Linux or AIX. And then you see HMC one, HMC two. Those are Hardware Management Consoles, which are essentially controllers. You're not supposed to log into the systems; you're supposed to log into the controller and use it to carve up the systems into VMs.

In this OpenStack environment that I inherited, we wanted OpenStack to do the VMs, not the HMC. So someone had gone through and set each entire system up as one big VM, but what that essentially was doing was trying to get OpenStack to launch a VM within a VM. Does that sound like a good idea to you guys?
Well, it didn't work in this case. It didn't work for me on these Power machines. So what I had to do was go through and rip out the HMCs and get VMs on there directly, put there by OpenStack. So, yeah, if you want to hear more about that, I'll say this: that lesson was supposed to be that nesting is not necessarily a good idea.

I put the caveat "necessarily" in there because we do do nesting. GPFS, or Spectrum Scale, is clustered file system software. I don't know that you necessarily want to compare it to Ceph, but you can think of it in the same terms. We wanted to eat our own dog food, so this OpenStack environment uses GPFS as its storage layer, and then we test GPFS on top of GPFS. Which does work. We have seen some instances of a bit of corruption, but I haven't nailed down where it might be coming from, or whether the nesting is even the cause, because I'm still working on getting a second OpenStack setup without it, so I have something to compare it to.

So this is only my second time doing a presentation at an OpenInfra event, and I love doing this. Everybody say "OpenInfra"! Thank you.

So do we have a resolution for that one? Yeah: that sometimes works.

All right, last one for the day: lesson four, backstory.
This one is a real tear-jerker. All the machines in our OpenStack installation had dedicated cards and switches for all the provider networks, but all the connections to the underlying OS, to get to the machine itself, were attached to a wider network, and they were supposedly on a private subnet VLAN. I've already given it away with "supposedly": that private VLAN was not so private. People showed up and started setting up things, registering their IPs with DNS, and the machines I had only had their IPs listed in the /etc/hosts file on a few machines. So I don't think you can get me on this one: anyone had any luck getting two machines to consistently answer to the same IP? Yeah, me neither.

So our lesson this time is about network isolation. The solution in this case is basically a bunch of grumpy people who get to re-IP their machines, and then we get some kind of order going on.

Luckily my colleagues Christof Kyle and Juergen Weist were smart enough to avoid this predicament in our OpenStack environment itself. On the right, you can't tell from looking at them, but those are all GPFS nodes, and depending on its role a GPFS node is going to be looking for up to three networks. So they wrote some setup scripts that create three private networks for each user and project; those are one-to-one as we have it set up. The jump host down at the bottom is set up so it's got an outside line to the world, and on it is a cluster tool. From there, what we do is launch cluster nodes on the back end. That way, not only is one user's network traffic not interfering with any other user's stuff, but it keeps our paperwork way down, because corporate IT security wants to know about every machine that gets going and whether it has a public address.
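The per-project layout described here can be sketched in a few lines of Python. This is only an illustration of the idea, not the team's actual setup scripts: the role names, the address pool, and the /24 size are all assumptions, and a real deployment would create the networks through OpenStack rather than just plan address ranges.

```python
import ipaddress
from itertools import islice

# Hypothetical role names; the talk only says a node needs up to three networks.
ROLES = ("net-a", "net-b", "net-c")


def plan_project_networks(projects, pool="10.0.0.0/16", prefix=24):
    """Give every project its own subnet per role, carved from one pool."""
    subnets = ipaddress.ip_network(pool).subnets(new_prefix=prefix)
    plan = {}
    for project in projects:
        # Take the next len(ROLES) unused subnets from the pool.
        chunk = list(islice(subnets, len(ROLES)))
        if len(chunk) < len(ROLES):
            raise ValueError(f"pool {pool} exhausted at project {project!r}")
        plan[project] = dict(zip(ROLES, chunk))
    return plan


def overlaps(plan):
    """Return pairs of (project, role) keys whose subnets overlap."""
    flat = [((p, r), net) for p, nets in plan.items() for r, net in nets.items()]
    return [
        (a, b)
        for i, (a, net_a) in enumerate(flat)
        for b, net_b in flat[i + 1:]
        if net_a.overlaps(net_b)
    ]


if __name__ == "__main__":
    plan = plan_project_networks(["alice", "bob"])
    for project, nets in plan.items():
        print(project, {role: str(net) for role, net in nets.items()})
    print("overlaps:", overlaps(plan))
```

An empty list from `overlaps` means no two projects share an address range, which is the bookkeeping equivalent of the isolation the setup scripts provide.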
So public addresses are what corporate security wants to know about.

We're not quite done. Going forward, we still do a lot of physical testing: all of our network card testing, Fibre Channel, NVMe, anything like that that we want to test out is all done physically. I'd like to get that into our OpenStack. I assume that is going to need some more network isolation to keep everything safe and sane, with separate storage for those too, so you're not trying to compare different kinds of cards accessing the exact same kind of traffic, because they wouldn't match.

So now... yeah. I missed some of my cues and blew through that a lot faster than I thought I would. I don't know if you guys have any questions. If not... no, we've got one.

So NovaLink, I think, is what PowerVC uses, and I can imagine that they might steer their own course on that one. GPFS, of course, is storage. I forgot to mention it, I blew right through that note: there's a Spectrum Scale GPFS driver that OpenStack can consume to use GPFS. But as for IBM's grand plan for Nova, I'm sorry, I don't know. I'm afraid that is above my pay grade.

Thanks, everyone.
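For reference, the fix from lesson one (raising the open-files limit) can be checked from Python with the standard `resource` module. This is a minimal sketch under stated assumptions: it only adjusts the limit for the current process on a Unix system, and the target value of 4096 is an arbitrary example. The talk's actual fix raised the limit for a user or machine, which is normally done in system configuration rather than in code.

```python
import resource


def raise_nofile_soft_limit(target):
    """Raise this process's soft open-files limit toward target.

    Returns the (soft, hard) limits in effect afterwards. The soft limit
    is never raised above the hard limit, which needs privileges to change.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Nothing to do if the soft limit is unlimited or already high enough.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft, hard
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("before:", resource.getrlimit(resource.RLIMIT_NOFILE))
    # 4096 is an assumed example value, not a figure from the talk.
    print("after: ", raise_nofile_soft_limit(4096))
```

Running it on a compute node shows the limit a process actually inherits, which is a quick way to confirm whether a system-wide change has taken effect.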