Projection working, right? Absolutely cool. So let's go ahead. So today we're gonna... you'll have to excuse me, I absolutely hate hearing my voice over speakers.

So today we're gonna cover, just real quick, who we are. I'm Randy Perriman from Dell; I am the solution architect for the Dell Red Hat OpenStack solution. And I'm Nick Barcet, director of product management for OpenStack at Red Hat. We're going to be covering why Dell and Red Hat have an OpenStack partnership, why Dell and Red Hat do an OpenStack solution, what our reference architecture and solution are, a quick demo of instance HA, and then we'll take your questions.

So why do Dell and Red Hat have a cloud solution? We have a unique co-engineered solution because it's comprehensive. We use Red Hat, of course, as our Linux; sorry, Red Hat Enterprise Linux OpenStack Platform. It's extended by enhancing multiple OpenStack projects, and when we do that work, we do it all as code upstream. Finally, one of the most important pieces is that we have a tested, validated solution that people can deploy with confidence, and repeat. The two companies together have over 15 years of partnering, making enterprises successful with practical scale-out configurations. We have proven platforms, hardened secure code, and we streamline OpenStack's moving parts.

What does it take to make the solution? It takes a whole bunch of people: Dell OpenStack solution engineering, Red Hat systems engineering, the combined Dell and Red Hat quality assurance teams, Dell storage engineering, the Red Hat cloud practice teams, the Red Hat OpenStack team, and the Red Hat tools team. A lot of people come together to make this solution. It's not just two or three people; it's a series of teams, and it takes both companies working together.

OpenStack for the enterprise: what we are trying to do is come up with a way for a stable, reliable, and repeatable business. We validate the product from end to end. It's fully supported software.
It's also scalable; it's a solution that allows you to become agile. We have performance-tuned configurations, extended life-cycle support, and the expertise to complement your in-house expertise.

One thing that is very important to understand is that deploying OpenStack always starts by configuring your hardware, and I would say this is generally the most critical part of any OpenStack deployment: how well are you going to get your hardware ready before you start deploying? The great thing we have been doing with Dell over these four versions is learning how to define an architecture that makes sense and teach our customers how to implement it. A reference architecture is not just a piece of paper that is nice to display on the web, right? Well, paper on the web doesn't work; a PDF, anyway. It's actually a shortcut that we are offering together, and it's more than a shortcut. Over these four versions we've been adding features every time, and right now, in version four of JetStream, we have a version that is very flexible, very modular, and we've added lots of choices in the deployment. This work is ongoing; you'll see in the next few slides what we are going to be adding to it.

What we've been adding in version four is three very key things. First, what we call guest or VM high availability: what do you do when a host goes down? Second, OpenStack services: what is the API going to do if one of the controller nodes disappears? And the most important thing: deployment automation, along with the patching of the deployment and the whole maintenance of the environment, because it's not just important to set up the initial thing; it's also important to maintain it over time.

So, as Nick mentioned, we're up to version four. We started out at version three, where we brought in a 10-node architecture with 10 Gb networking.
We made it into a turnkey system, all based on the Dell PowerEdge R630s and R730xds with Dell networking gear. It was based at that time on Red Hat Enterprise Linux OpenStack Platform 6. We also introduced Neutron VRRP, active-active high availability of our controllers, and support for multiple concurrent storage back ends.

In version four we brought in additional Dell networking components: we revved up to the S4048s and S3048s, added the S6000 series, and at the same time revved the OpenStack platform up to version 7. We also brought in instance live migration, and we've added some optional servers, the Dell PowerEdge R430s and R730s. Finally, as Nick mentioned, in the current release we're adding instance HA, and we are going to be adding Midokura Enterprise MidoNet.

This is the taxonomy that we base everything on. We build up from the very bottom, at the physical infrastructure: we start with the Dell PowerEdge R630s and R730xds along with Dell networking. On top of that we build Red Hat Enterprise Linux, and then we bring in all the OpenStack components. One of the pieces we bring in as part of our solution, which is really interesting, is Ceph storage. That gives you the ability to have object storage or block storage across all the different tool sets and across your virtual machines.

So, who doesn't know Ceph in this room? Not that many people, as far as I can tell. But really, why are people so attracted by Ceph?
Well, the first thing is cost efficiency: you can use standard hardware and deploy as much storage as you want. You can select the type of drives you're going to deploy, and you're going to get the performance, the resilience, and the scalability that you need, based on your growth. You don't have to prepay for a given size of equipment to get there. In terms of resiliency, you can have Ceph be fully resilient on one site or across multiple sites. You are in a fully software-defined environment, so you benefit from a very rapid innovation cycle and community-driven innovation; that means that if you really care about a specific feature, you can inject it and get it into the software later on. And really, there is no scaling limit to the capacity of Ceph.

So when we went to build version 4.0, we used certain target use cases: we looked at today's apps, developer self-service, and storage as a service. Like I said before, we based it on components from Dell: PowerEdge servers and Dell networking. We also have optional Dell storage: the PS Series, which is the EqualLogic arrays, or the SC Series, which is the Compellent arrays. We have Dell and Red Hat engineering services as part of it, and Dell Red Hat OpenStack Platform 7. The features in there include host maintenance mode and a whole bunch of networking, co-engineered, meaning that we work together to make that happen. We validate this solution before it comes into the customer's hands. And then, finally, instance high availability, which is new for us.

So how does instance high availability work? Before this, what we had was the ability to migrate machines away from a host. But if the host dies suddenly, live migration doesn't work, right? You need something that is going to discover that your host has disappeared, so that your VM can be restarted elsewhere. We had this functionality called evacuate, which is a very bad name, inside of OpenStack.
I would love for it to have been called resurrect, because that is really what it is: what existed allowed you to resurrect VMs from a failed host and reschedule them elsewhere. It worked so-so, but with Kilo we were able to enhance it and do a bit more. We were able to drive this in an automated fashion by adding pacemaker_remote, which is a massively scalable version of Pacemaker; nothing to do with your grandfather's traditional pacemaker. With it we're able to monitor hundreds of instances, or actually physical nodes, detect within less than 30 seconds the disappearance of a node, and force the call to the evacuate function so that all the VMs that were running on the host that just died get respawned. As you're going to see, this respawning is amazingly fast.

There is one more thing, delivered in Liberty, which is going to make that even a little bit faster: the ability to mark a host down in the Nova database, so that we are really sure the rescheduler is not going to reschedule onto the host that has died. Nova doesn't know about the failure right away; today we have to wait for a full watchdog cycle before we can do the respawning. Once we've got this mark-host-down functionality, it will be instantaneous.

I already covered all of it a little quicker than the slide, sorry. So, moving on to additional pieces of high availability: we built the hardware around the idea of having it highly available.
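The recovery flow just described can be sketched as two client calls. This is a dry-run sketch, not the automated pacemaker_remote path: a stub function stands in for the real `nova` client so the sequence can be read end to end, and the hostname and instance name are placeholders.

```shell
# Stub standing in for the real python-novaclient on a live cloud:
nova() { echo "nova $*"; }

# 1. Liberty's "mark host down" call, so the scheduler skips the dead
#    host immediately instead of waiting a full watchdog cycle:
nova service-force-down nova1.example.com nova-compute

# 2. Evacuate (really: resurrect) an instance that was on the dead host;
#    with shared storage the disk contents survive the move:
nova evacuate --on-shared-storage my-instance
```

In the solution itself, pacemaker_remote detects the failure and drives the evacuate call automatically; the commands above just show what is happening underneath.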
We've built all the PowerEdge servers with RAID drives for the OS, dual network cards with bonds that span the cards, and dual power supplies. Networking is built in a leaf-and-spine architecture, so that no single switch failure will cause loss of a data flow, with VLT and VRRP across layer two and layer three. The network is highly available, and every node has all the networks required for its functionality.

We've separated the solution infrastructure networks into various categories, such as the solution private network, the public external network, and the internal network for management. Each category shares NICs across the nodes, with VLANs carrying each networking flow. It's extensible: no need to add additional NICs; you just add additional VLANs as needed. Every category gets two bonded NICs for performance. The implication, though, is that if you're using larger MTU sizes, they apply per category, since the NICs are shared. Finally, we lock OpenStack Neutron into VLAN mode for this release.

So I think we can say that what we're delivering is a very stable, highly available hardware platform, on which we are delivering virtual appliances that are themselves highly available. Highly available on highly available: that's good enterprise grade, right?

This is the logical network of the current release. As you can see, everything goes across bonds, using VLT and VLANs. If you want more information, see us; we can get you a copy of the reference architecture, which is online.

So, when you look at RHEL OSP 7, whose real name is Red Hat Enterprise Linux OpenStack Platform 7 (it's a mouthful, isn't it?), it's based on Kilo. We have done a lot of work on its deployment, making it easier to deploy, and this is ongoing work that we've been doing with Dell. We've already mentioned that compute host high availability is now fully there. We've got Neutron and Open vSwitch security mechanisms that have been enhanced.
We have the first version allowing a full IPv6 experience. We support incremental backup. We are integrating with SELinux; that's been there for quite a while, but we are making it even more robust and covering container technologies. We have great alignment with the OpenStack APIs, so that the security patching process covers all of them. We have Red Hat Enterprise Linux 7 as the host OS, 7.1 to be more precise, and soon 7.2. We've got enhanced HA at the database layer for MariaDB, using Galera with HAProxy as the front end. As with every release, we've been hardening everything so that it benefits from our experience in delivering a secure and reliable environment. And as with all of our versions for the past three cycles, we are supporting it for three years.

So the reference architecture combines what each of our companies brings to the table. From Dell, we bring the best hardware for OpenStack: our servers, our switches, our storage. Like I said before, our servers are PowerEdge: R630s for the computes and controllers, R730xds for your storage. We also bring in the Dell EqualLogic (PS Series) arrays and the Dell Storage Center (Compellent) arrays. We base it all on Dell networking, the S4048s and S3048s, which are the latest revisions. Then Red Hat brings the best software: RHEL with the high availability and load balancing tools for OpenStack. And jointly we engineer this together: we've architected, designed, integrated, and optimized it. We have a reference architecture with a balanced design, balanced for performance per dollar; it's scalable, it's extensible, and we have support for it. We also use automation: the OpenStack Foreman installer at this time, and in the future we're going to be using the RHEL OSP director. All of this brings a great solution for you.

So why is this industry leading?
Well, first of all, because it's an open solution from end to end. There is not a single piece of the solution that is proprietary in any way. Of course, we have to customize the deployment of OpenStack for the given hardware, but that is just configuration work; we are not adding any specific drivers that we are not contributing back upstream.

It's also an environment that is very agile, agile in the sense that you can grow it and scale it back down based on your demand; it allows you to become agile. It's also an infrastructure that delivers the control that you need, because when you deploy such a solution, understanding what is happening with it is a very key element. And it's very innovative, in the sense that every six months, or even more often than that, because we are having minor releases that are feature-bearing, you are directly benefiting from the features we've been working on, which allows you to deliver even more features to your users.

These are just various design components that we've taken into consideration in designing an enterprise-grade solution for you, with Red Hat certified solutions that are end to end. It's enterprise grade in that it's scalable, secure, open, and one-stop. We take our experience and our innovation, and that allows you to have innovation without risk. It also addresses many enterprise use cases, and addresses the gaps that you are looking to have taken care of.

The components, once again: we start with the Dell hardware. We have a Dell reference architecture, which outlines the basics of how this is all put together using Red Hat solutions, and we also bring in the experience of Dell professional services along with Red Hat professional services, side by side, to make your solution work.

So what's coming up ahead?
Well, as Randy mentioned a little bit earlier: integration with the TripleO implementation that we use inside of RHEL OSP, called director, which uses Ironic for the discovery and deployment of the nodes, and Heat for the automated deployment and orchestration of all the nodes, whether they are controllers, storage nodes, or compute nodes.

And we are adding a lot of validation. Validation in pre-flight mode, before we start a deployment: we want to make sure that the environment is set up exactly the way we want before we go into a messy process. Validation in flight: as we are moving forward step by step through the deployment, we want to make sure that everything has happened as it should, by the book, so we run tests, such as serverspec, to validate this. And at the end, we want to be able to guarantee to the customer that the configuration being delivered does integrate completely and seamlessly, and does provide the performance they are expecting, so we have a full validation suite at the end. Of course, if you want to extend your cloud or retract it, you'll be able to do that as part of the same tooling.
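The in-flight checks described above can be pictured as a small report-style runner: each check prints PASS or FAIL rather than aborting, so one run gives a full picture. This is a toy sketch; the two sample checks are placeholders, not the real JetStream or serverspec test suite.

```shell
#!/bin/sh
# Minimal PASS/FAIL check runner in the spirit of the deployment
# validation described in the talk.
check() {
  desc=$1; shift
  if "$@" >/dev/null 2>&1
  then echo "PASS: $desc"
  else echo "FAIL: $desc"
  fi
}

# Placeholder checks; a real suite would probe services, ports, configs:
check "scratch directory exists" test -d /tmp
check "placeholder failing check" false
```

A real suite would chain dozens of such probes per node class (controller, compute, storage) and only hand the cloud over when everything passes.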
So that's what's coming up ahead. In RHEL OSP 8 we are going to allow automatic, in-place migration, meaning people running RHEL OSP 7 will be able to get to version 8 without having to set up another cloud or do any kind of magic. This is going to apply to minor releases as well, but that's a lot less impressive in general, because there are no API version changes.

We are going to improve the instance high availability that I described earlier with the mark-host-down feature arriving with Liberty. There is network quality of service that is going to be delivered; that's a key request from many enterprise customers, to be able to throttle which customer is using what on their cloud. The ability to do non-disruptive backups: the ability to say, hey, I want to do a backup, but let's not interrupt the work that is happening in this VM in order to do so. This is also a key element for many enterprises that want to guarantee the security of their data. A replication API is going to allow block replication to happen in a way that makes disaster recovery scenarios much simpler. And finally, new image signing and encryption mechanisms, ensuring there is less chance that someone comes along and messes with the images that you've prepared for your tenants; that's very similar to what we've been doing for ages with our RPM packages, for example.

That's just a few of the features. Before we get to the Q&A, we have this impressive demonstration that you all can see up here. What we have up here is four nodes: the large one on the left-hand side is a controller node, and then three Nova computes. This cluster was built up literally in the last 24 hours; it's got the latest patches for us to do high availability. So the goal here is that I'm going to throw a panic onto the Nova compute that has the one instance we're running right now.
So if I do a nova list, as you can see, I have a single instance running right now, and it is running on nova1. So let's go over to nova1 and do an echo, and... we just panicked nova1. By the way, this is back in Austin, Texas that we are connected to, so it's going to be slow here and there.

Within the next minute, Pacemaker will notice that nova1, as you can see, has gone down; it's now showing nova1 offline. We should see some KVM activity on either nova2 or nova3 in a second, or I should say within the minute rather than a second: we've got to wait for that watchdog cycle so that Nova doesn't reschedule on the wrong host. Soon we'll be able to get rid of that wait. And there: it moved over to nova3.

So what happened is that Pacemaker noticed nova1 was down and evacuated all of its instances, which is the one, and now, as you can see down here in the bottom right-hand side, it's running on nova3. In the meanwhile, Pacemaker is restarting nova1. Let's see... anyway, we'll wait here, because it will take about five minutes for the compute node to come back online. What's happening now is that Pacemaker is restarting nova1, bringing it back up, and it will then bring it back into the pool so it can be used again. I can do a pcs status, and you'll see down here at the very bottom that nova1 is stopped. We'll leave it sitting here for a few minutes and check it later. In the meanwhile, are there any questions?
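For reference, the demo boils down to a handful of commands. Stub functions stand in for the real clients so the sequence can be read (and dry-run) outside the demo cluster; on the real nodes you would use the actual `nova` and `pcs` binaries, and the sysrq panic line is shown only as a comment because it crashes the machine it runs on.

```shell
# Stubs standing in for the real clients in this dry-run sketch:
nova() { echo "nova $*"; }
pcs()  { echo "pcs $*"; }

nova list        # shows the single instance, hosted on nova1
# On nova1 itself, force an immediate kernel panic via sysrq
# (do NOT run this on a machine you care about):
#   echo c > /proc/sysrq-trigger
pcs status       # pacemaker_remote reports nova1 offline, then stopped
nova list        # after the watchdog cycle, the instance is on nova2/nova3
```

The `echo c > /proc/sysrq-trigger` trick is the standard kernel facility for forcing a crash, which is why it makes a convenient stand-in for a real hardware failure in demos like this one.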
Yes?

Okay, so part of the configuration does use shared storage, and that's actually one of the requirements for this to work properly, because when the host goes offline, we have to be able to bring up that same virtual machine that was running and reconnect it. Actually, it will work even if you don't have any shared storage, but the VM will restart with no data, so it's a lot less interesting. Yes?

Yes, it would restart from the image: whatever image you used to create that particular virtual machine, it would recreate it on one of the other nodes for you, but of course you would have lost any data that was there. With shared storage, you don't lose the data. The same goes if you're booting from Cinder as well.

Yes. Yes. Yes. Yes, you can. Yes. At the moment, yes.

So for the external user, they will lose connectivity, potentially for up to about 90 seconds at this time. As we perfect this, it will hopefully get shorter; we're hoping to get within 30 seconds. In fact, one of the processes you see keep popping up here is the Neutron Open vSwitch agent, and one of the reasons it became very busy is that it picked up the new networking when the node moved over. But don't confuse this with live migration: a customer using a service will lose their connection. It's not going to be reestablished automatically; they will need to recreate the TCP connection to the new host. It's not like a live migration, where we maintain the full TCP flow; you've lost a host in this case. The only thing that is being restored is what was saved to disk, and TCP connection state is a memory-only thing. We don't do memory copy yet; well, I don't know if we'll ever do that. Any other questions?
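The boot-from-Cinder point above is worth spelling out: if the instance boots from a volume, its root disk lives in Cinder rather than on the compute host, so the data survives evacuation even without shared ephemeral storage. A hedged sketch, again with a stub for the real client and a placeholder image UUID, using the Kilo-era `--block-device` syntax:

```shell
# Stub standing in for the real nova client in this sketch:
nova() { echo "nova $*"; }

# Create a 10 GB bootable volume from an image and boot from it; the
# root disk now lives in Cinder, not on the compute host's local disk:
nova boot --flavor m1.small \
  --block-device source=image,id=IMAGE_UUID,dest=volume,size=10,shutdown=preserve,bootindex=0 \
  demo-instance
```

With this layout, an evacuated instance reattaches the same volume on its new host and comes back with its data intact.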
This is the node rebooting; I was wondering what was taking so long. So, I found it.

So, the reference architecture is built on one server platform: the R630 for all the computes, the controllers, and what we call the solution admin host, which is what's used to deploy the solution. The Ceph nodes are PowerEdge R730xds. We do allow PowerEdge R430s and/or PowerEdge R730s as optional compute nodes, but not as controllers, at this time. The maximum at this time is three racks' worth, which is approximately 60 nodes. Which is a lot of VMs already.

And let's see: nova1 is now online. So, theoretically... So again, for this particular function it's not Pacemaker that we're using but pacemaker_remote. It starts with the same name and the same philosophy, but works very differently, and it's a much more scalable design for a cloud project. Go check the differences; they are quite significant. We are using Pacemaker for the high availability of the controllers, which are a small group of servers within the scalability limits of old-style Pacemaker; in order to support 60 nodes or even more than that, we needed to use another project than Pacemaker proper, which is called pacemaker_remote. And as you can see, now that nova1 is back online, Pacemaker has noticed it and everything is being used again.

Hybrid cloud: we do not have support for that at this time. Is it possible? Yes. If, as somebody else mentioned earlier, you are adding a tool such as CloudForms to manage your environment, then you have this hybridity enabled with the current solution, because CloudForms is a tool that lets you use multiple providers, one for OpenStack, one for AWS, one for VMware, at the same time, and launch the same operation on any of the providers, simultaneously, sequentially, whatever workflow you wish to use. However, you have to make sure that you design the deployment of your application in a way that is reusable across multiple clouds, and for this, well, I think that's one of the reasons why we acquired Ansible last week.

You can change everything you want; we just guarantee simplicity if you stick with the design. The goal of the reference architecture is to allow you to come up quickly and effectively. We've spent many hours on it. Like I said, this demo was literally put together in the last 24 hours: we had it running last week, somebody hopped on there and erased one of the servers, and one of my engineers rebuilt it. The only reason we were able to do that is that we can repeat the installation on this hardware consistently. The goal is to let you get up and running and start working on your applications quickly and effectively, in a known-good OpenStack environment.

When you say someone: it was actually someone from Red Hat who, not intentionally, decided to install software that reset the environment. It happens just before the demo, you know, the special demo effect. Fortunately, someone noticed that yesterday. Yes? Okay.

So the very next thing we're going to benefit from is the capacity to do rolling upgrades. This is joint work from Dell, Intel, and Red Hat, as part of the on-ramp for the enterprise. Basically, the problem we have is that when you do an upgrade right now, you need to upgrade all of the nodes and the controllers at the same time. Why? Because we don't support version mismatch, right? And this is not acceptable in real life.
You don't want to have to bring everything down and back up. Well, you don't have a choice at this point, but if you could avoid it, that would be a lot better: if you could, you know, upgrade node one, and when that's done, do node two, or do it in little groups. The work we are doing upstream is introducing versioned objects into all the conversations that can happen between the various components of OpenStack. That will allow us to tolerate a version mismatch, one version of difference between the controller and the nodes, and other things like that. That's mostly landed in Liberty; we still need a few more things in Mitaka for it to be fully operational. And we still have quite a few ideas for the future, but let's only talk about what is underway already.

So I do have the QR code up on the screen; if you're interested in the reference architecture that we have out, you can hit that, and it'll let you download it quickly and easily. Otherwise, just search for "Dell Red Hat OpenStack reference architecture" in Google and you'll find it on the Dell website. No other questions? Thank you very much.