Test, one. Thank you for coming. It's pretty late in the conference, and it's a pretty good turnout for this topic. My name is Keith Basil. I'm a product manager with Red Hat. I cover our new installer for OpenStack, which is called OSP director; I also cover Sahara as a component within OpenStack, and I also cover Ironic as a component in OpenStack.

Internal to Red Hat, we've seen a lot of demand for Hadoop on bare metal within the context of OpenStack, so we put together a very esteemed panel of experts in various fields, and we are going to discuss various aspects of Hadoop on bare metal within OpenStack. So with that, we'll get started.

The agenda is as follows. We're going to do maybe a one-slide introduction to Sahara and Ironic, to level-set where we are so that we're all on the same page. We're going to talk briefly about the holy grail of elasticity: basically, using one framework for both use cases. Then we're going to talk about who is doing the heavy lifting upstream, and then immediately go into the panel. The first part will be about five or six minutes, and I want to spend most of the talk on implementation, so you guys can get some good feedback and ask a lot of questions.

So what are we talking about here? Conceptually, it looks something like this: we've got the Sahara elephant (sorry, the elephant is the logo for Hadoop) standing on top of OpenStack, standing on top of bare metal. That's the thing we're doing here.

So why do we need this? The answer is data, data, data. We all know that we're generating a very large amount of data: social media, financial transactions, instrumentation. IoT is becoming a thing now, and we've got to capture and manage all the data related to that. And those use cases are spread across multiple industries.
Finance, healthcare, telecom, energy, retail, and so on. We've got an actual customer on the panel as well who can speak to probably two or three of those areas. So that's the driver.

All right, Sahara and Ironic, just to level-set here. Sahara is a component within OpenStack. It provides a framework, basically an API-driven framework, to deploy various distributions of Hadoop. If you look at the middle tier there, you'll see Hadoop: that's the upstream, Apache-based, kind of generic vanilla Hadoop. Sahara can deploy that today, so think of it as a one-click install for a Hadoop cluster on virtual machines. That's already baked; we already have that today in OpenStack, thanks to some of the guys here on the panel. We've got commercial plug-in support, so if you need a distribution of Hadoop with vendor support, you can use HDP, which is the Hortonworks distribution, or Cloudera, and we're working with MapR upstream as well. Essentially, the way Sahara works is you pick one of these plugins you want to deploy, and it creates basically a Heat stack and deploys all of the services, using Nova, Heat, Cinder, and Glance, to stand that cluster up for you. So again, today we do that on virtual machines, which is on the left; but tomorrow, or in the future, some very early work is being done via Ironic and bare metal, on the right-hand side. That's what we're talking about here today with this topic.

This was a thing I saw in the Cloudera Hadoop training. For those who have a Unix background, once I saw this command line, I totally got what Hadoop was all about. Basically, you have a data source, you're streaming that into something, you're doing search patterns on it, you sort it, and then you unique it, and you bring out a data set that means something to you. I just threw that in so you could get a one-line understanding of Hadoop, okay?
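The pipeline he's describing is the classic Unix one-liner often used to explain MapReduce. The exact command on the slide isn't captured in the transcript, so the version below is an illustrative reconstruction with made-up sample data: grep plays the map step, sort the shuffle, and uniq -c the reduce.

```shell
# Illustrative "MapReduce in one line" pipeline (not the exact slide):
# stream a data source, filter it, sort it, and reduce to unique counts.
printf 'GET /a\nGET /b\nGET /a\nPOST /c\nGET /a\n' \
  | grep 'GET' \
  | sort \
  | uniq -c \
  | sort -rn
# top line of output ends with "3 GET /a"
```

Hadoop's contribution is doing exactly this kind of filter/sort/count work in parallel across many machines and disks instead of through a single pipe.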
Ironic is a very similar framework. Ironic has an API; it has things called conductors, which manage the bare metal; it puts the data for all the nodes into a database; and it has a pluggable interface, so you can have Ironic drivers for each one of those vendors up there: Quanta, Open Compute, Cisco, Dell, HP, and so on. Those are the drivers that speak to the hardware and basically provide command and control for the bare metal. So we're talking about Sahara and big data, and we're going to use the framework of OpenStack to deploy on bare metal.

At a very high level, this is the holy grail of elasticity, because if you look at OpenStack as the best open-source platform for infrastructure as a service, it's built upon elasticity, right? So we've got OpenStack on the left; we've got the compute nodes; we've got OpenStack as a framework, a set of known and common APIs; and now we are going to look at racks of gear where we can actually deploy Hadoop onto bare metal. So we've got one set of APIs to drive both use cases.

In terms of network optimization, these guys love east-west traffic. Typically, a scaled-out OpenStack cloud will be deployed with a spine-leaf topology, optimized for east-west traffic, and Hadoop has the same requirement. And both expect failure: if one hardware node goes down, you're okay, provided your application has been built as cattle, and Hadoop has that natively built in with HDFS and redundancy and replication and things like that. So they're very, very similar, almost cousins, if you will, in terms of their underlying requirements. So with that, we'll talk about who's involved.
On the Sahara side, you've got Mirantis leading the charge; you've got the PTL of Sahara on the panel with us today. Red Hat is a contributor as well. On Ironic, you've got HP, Red Hat, and Rackspace. Rackspace has done a tremendous amount of maturation work to bring it up to speed, and we've got Jim from Rackspace to talk about implementation and all the nuances and details related to what Rackspace has done there. So we're very privileged and lucky to have these two guys on the panel today.

So let's talk about the panelists. We've got Henrik (Henrik, can you raise your hand?), a senior product manager from HP focusing on Nova and Ironic. We've got Sergey, as I said, the PTL of Sahara today. Ethan Gafford is an engineer at Red Hat working on Sahara. Blake Caldwell is actually an end user doing Ironic today on an OpenStack in his facility. Jim is the Rackspace guy I mentioned, and Dave Eason is from Cloudera, so he's directly from the Hadoop community.

So with that, we'll get started. We've broken up the questions: one quick section on the market side, then we're going to get into implementation, and then we're going to have an open-mic session. So if you guys have questions, feel free to walk to the mic and bring your questions to the table.

In terms of market drivers: Dave, as a member of the Hadoop community, where do you see the demand coming from in terms of growth for big data?

Yeah, so, it's a good question. Basically, the demand is right across all industries, and for us as a company it's incredibly strong. It's no secret that at Cloudera we've been growing 100% year over year, both in terms of number of customers and revenue, and that's been the same for each of our competitors in the space: for Hortonworks, for MapR, and for the ecosystem as well.
It's been growing incredibly strongly. Going back 18 months, it was all predominantly bare metal deployments, because that's what Hadoop in its early stages was designed for, but in the last 18 months that has transitioned to a very significant proportion of both public and private cloud deployments.

Are you seeing OpenStack as a driver there as well, or is it purely the big data side?

In private cloud deployments, OpenStack is the clear leader from our customers' perspective, in the base that we support, so that's definitely driving it. But the biggest thing we see there is just customer data volumes totally outstripping even our potential to really satisfy the market at the moment.

So, Blake, thanks for joining us. I took a look at the Oak Ridge National Laboratory website, and it seems like you guys cover a lot of areas: clean energy, national security obviously, and some other nuclear-related sciences. I understand you guys are using OpenStack and Ironic. Can you talk to us about those particular use cases, in an unclassified way, I guess?

Yeah, so there's one main reason that typically drives customers, in our case other divisions at our lab, to use our common OpenStack infrastructure. A lot of times it's that they have an immediate need: they have a project deadline and they have a budget. So they want to have hardware deployed and they want to be able to use it quickly, instead of a conventional hardware deployment lifecycle, so they end up coming to us: what can we do with OpenStack? And two projects come to mind. One is IFIM, the Institute for Functional Imaging of Materials. It's a data-driven need where the imaging has advanced at such a rate that, to speak to this project:
It's not just the positional identification of atoms within a material, but also dynamic elements: the angles of the bonds, the molecular-dynamics pieces of it. So there's a multidimensional expansion in the amount of data generated by these imaging systems, and then they have a requirement to analyze it. So the request was a Hadoop cluster: how can we make use of the infrastructure to, you know, stand up a cluster to run analysis on this information?

Okay, great. So I want to switch to implementation details, which is probably the reason most of you guys came here today. Hadoop can be deployed on VMs today using Sahara, obviously. At what point do we need to consider going to hardware? Maybe that's a question for Sergey or Dave, in terms of where that trade-off happens.

Okay, so on the Sahara side: we already use Heat for the provisioning, and Ironic's integration with Nova as a hypervisor driver gives any Nova user the ability to spin up bare metal machines. It means that for Sahara we only need to prepare specific images for bare metal, and probably add some additional testing for it. So it means that right now it's possible to deploy bare metal Hadoop clusters using OpenStack with Ironic, and it's done transparently, underneath the Nova and Ironic implementations.

Yeah, and just to add, at least from our perspective: the question quite often comes as virtualization versus bare metal, where we, as a vendor that supports different use cases, have a slightly different perspective.
We actually see three tiers of use cases. Those with real-time, interactive SLAs, applications like HBase and search, actually still require the sort of performance that a bare metal deployment would give you; they're pretty much exclusively on bare metal today. But there are two other tiers. There are interactive analytic use cases where there is actually a legitimate place for virtualization, and even for network-attached storage and object stores; there are valid use cases for that, depending on your SLAs. And then where we've actually seen the most adoption for virtualization is in batch data processing, data cleansing, the traditional kind of MapReduce workloads, which, depending on your SLAs, run very well in virtualized instances. But it's predominantly that first use case, the interactive, real-time workloads, things like HBase and search from an application perspective, that is going to be driving the need for engines like Ironic and access to bare metal, to get the performance we need to support those use cases.

So, in terms of deployment, on that note: I know Cloudera has published a reference architecture for deploying Hadoop (I said OpenStack, sorry, I had OpenStack on the brain). Is there anything we need to know? Are there any nuances? How does that apply inside an OpenStack framework? Maybe Ethan, you can add to that, or Dave, if you want to start with an answer, that would be great.

Yeah, so the simple thing that's usually referenced in our traditional reference architectures was always about getting access to direct-attached storage, just a bunch of disks configured so you can get that access in the virtualized context.
We've had to adapt that slightly. Many who follow the Hadoop community will have seen recent things like the Hadoop Virtualization Extensions, which allow a configuration file for layered network topologies, in addition to the sort of more sophisticated layered VM topologies which exist with some of our virtualization vendors today.

Is that where the data locality comes from?

Exactly, in a virtual environment, to pass to Hadoop, yeah, from the framework. A key part of the HDFS system is data replication, for purposes of redundancy and performance, and that configuration allows you to deploy in a way where you ensure that you've got isolation where your data is replicated; and for balancing, the balancer policies are impacted by that topology as well.

Yep. And on the Sahara upstream side, we've actually taken some steps just in Kilo to close that gap further. We've got instance locality through Cinder's instance-locality filter; we've got default templates that are basically executable reference architectures, and we've got those for CDH and for HDP. But that virtualization piece is still kind of the elephant in the room, especially on write-heavy workloads with short SLAs. We're not there yet. Yeah, okay.

Blake, in your environment, since you've already deployed OpenStack, is there anything you guys do differently in terms of architecture?

So, we're concerned with, well, our users are very concerned with the performance aspect and a perceived overhead from a virtualized environment. I mean, we're all aware that that gap can be narrowed, but sometimes when you have a customer that's demanding, you know, the bare metal, it's very difficult to convince them that a virtualized environment is acceptable. So that's one hurdle we've encountered. And beyond that, it's just being able to set up infrastructure with high interconnectivity.
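Circling back to the topology discussion above: the rack awareness that HDFS replication and balancing depend on is typically wired in through a topology script, referenced by net.topology.script.file.name in core-site.xml, which maps hostnames or IPs to rack paths. Here is a minimal sketch, written as a shell function for illustration; the host names and rack paths are hypothetical.

```shell
# Hypothetical rack-mapping logic; in a real cluster this body would live
# in the executable referenced by net.topology.script.file.name.
# Hadoop passes one or more hostnames/IPs and expects one rack path per line.
rack_of() {
  for host in "$@"; do
    case "$host" in
      10.0.1.*|node0[1-4]) echo "/dc1/rack1" ;;
      10.0.2.*|node0[5-8]) echo "/dc1/rack2" ;;
      *)                   echo "/default-rack" ;;
    esac
  done
}

rack_of node02 10.0.2.9   # prints /dc1/rack1 then /dc1/rack2
```

HDFS uses these rack paths to spread replicas across failure domains, which is precisely the locality information that has to survive the trip through Sahara and Ironic when the cluster lands on bare metal.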
So A lot of times they're coming from HPC background where they want an infinite band interconnect or you know, the very least ting again or connect So being able to provide that service is you know concern of ours. Okay Ethan, I know you're involved with packaging of plugins for our specific OpenStack. I am do we is there any reason to package differently if you're going to deploy the bare metal versus the VMs? So from the plugin perspective, I mean not terribly, you know happily We use cloud era, you know, the exact same repositories for cloud era for Hortonworks In a VM as you would use, you know to a bare metal deployment anyway You know the general process of spinning up a cluster is you know, you build your image your provision and then you configure those, you know Image building and configuration pieces are going to be very very transparent across, you know any Provisioning mechanism the devil's always in the details, but you know, if you see failure in our future Sergei, you know In Sahara we're based in our image building process on an open stacks product named disk image builder and it supports building images for the bare metal Deployments for ironic and for the virtual machines as well. So it's By design supports was the approach. Okay, excellent. 
All right, so let's now turn to a really important topic related to bare metal, and that's multi-tenancy and security. HP hosted a session earlier in the summit related to secure boot on bare metal. Henrik, do you see that as supporting us in the multi-tenancy use case?

Yeah, I think that's pretty much a requirement, because when you do bare metal and you allow users to get full access to the machine, you're opening up a whole different world of attack vectors that you don't have when you encapsulate it in a hypervisor. So by doing these signed boots, that bootstrapping, and having these signing steps in the boot process, you're making sure that the BIOS and the firmware haven't been tampered with by whatever tenant was on that machine before. And I think this is an absolute requirement before you go into a multi-tenancy environment.

Yeah. And Blake, on the other side of the coin: when anybody walks into your facility, they're pretty much cleared, vetted, and trusted, so your tenants are very different from, let's say, hostile public cloud environments. So what's your take on the multi-tenancy and security requirements?

I'd say our job is much easier, because we have that vetting process as the users come into the system. They've already been authenticated, to whatever standards, with two-factor or whatever mechanism the security requirements enforce, so the trust level is not as critical. And we tend to group the tenants together, so, you know, we're not dealing with the case of an unknown, hostile user coming onto the system.

And Jim, please jump in, because you guys probably run the most hostile customer-base cloud in the world.

Well, so one thing I wanted to mention is that you never have a fully trusted tenant, right? The laptop that they're SSHing from could be hacked; they could go rogue; they could be a spy; whatever, right?
So I think security is important in any environment. There are a lot of things that are really hard to do, like Henrik said, that you need to work with your vendors on to make this secure: firmware signing and that kind of thing.

So there's been talk about this kind of pseudo-bare-metal thing, where you give a client the machine, but you containerize the machine, either via a VM or a container proper. Can you talk to us about the OnMetal implementation and how you guys do that? Because it seems to me that if you give somebody root on a box, they have access to all the networks in there. So how do you compartmentalize that?

Right, so there are a couple of things we do. We don't use VMs or containers to compartmentalize things at all. All of the firmware on the box is signed, and we own that signing process; we worked really hard with our vendors to do that. And then, beyond that, there's network security. We push down two VLANs to every tenant, a public net and an internal DC service net, and we do magic with Cisco and our switches to be able to secure that: prevent ARP spoofing and that kind of thing.

Okay. And when you guys introduced OnMetal, you gave a talk afterwards about how to scale out Ironic. So, related to big data, with multiple racks, multiple nodes, and a very, very large, semi-permanent Hadoop cluster: can you talk to us about scale, and what problems we may see by going that route?

So, we've mostly solved scaling problems in Ironic. You could potentially boot hundreds of machines at the same time and not have any problems with that process, right?
And beyond that, scaling Hadoop on top of bare metal is about network topology and the placement of your machines. Those are both things we're working on upstream in Ironic; the network stuff is Ironic's number one priority this cycle. And then, as far as locality of machines, I believe it works today with some scheduler hints, but we do want to look into that more and provide it, you know, as a first-class thing.

Yeah, so on that topology awareness: that's absolutely critical for Sahara. So, for the Sahara guys on the panel, any comments on how you intend to solve that? I know later today in the design summit there's a lot of work on that, and, as Jim said, upstream next cycle it's something we should tackle. Can you give us some insight into the direction there?

Okay, so from the beginning of the project, Sahara has supported a configuration file that can describe and define the topology of the OpenStack cluster. Currently it's not dynamic, but it could be improved to be dynamically specified. And Sahara uses this topology definition to configure Hadoop to know where the data is located; it supports rack-level and host-level awareness, so all the needed data can be passed to Hadoop.

Okay. So we've got Ironic, which has topology awareness, and we need to pass that to Sahara. But one thing that's missing, and Jim, maybe you can help us out here: how does the Neutron integration happen? It seems like your magic is based on Cisco UCS.

So that's based on stuff in the Cisco top-of-rack switches. Oh, sorry, okay, we use Open Compute hardware.
But to speak to the Neutron stuff: like I said, that's Ironic's number one priority this cycle. Fundamentally, we don't want bare metal to appear differently than VMs from a Nova perspective. It should be able to mount arbitrary networks; it should be able to mount block storage; all the things VMs can do that don't require a hypervisor.

Okay. So a lot of us, Rackspace, Red Hat, HP, Mirantis, all have life-cycle management tools to install and deploy OpenStack. Can you guys speak to what that tool chain looks like in relation to Sahara and Ironic? I know Rackspace is big on Ironic, obviously. At Red Hat, we're using Ironic as our deployment mechanism in the next version of our installer. And at HP, maybe this is for Henrik: you guys have been leading Ironic, so can you talk to us about the tool chain support for Sahara and Ironic?

Yeah, so we were a big proponent of, and contributor to, TripleO, and as most of you probably know, we're taking a slightly different approach. TripleO of course included Ironic, but now, in our next release, we're actually taking a slight little detour: we're going more to an Ansible-playbook way of doing it. We're still debating and figuring out whether we're going to use Ironic in that release, or whether that's going to be in the subsequent release, but we're definitely going to incorporate Ironic into our new installer and life-cycle management tool chain as well. It's just a matter of whether we can actually get it done in the first release or it will be in the next one.

And Sergey, I know it's probably not your area, but with Fuel: you're supporting Sahara today, right? Yes. Are you going to do Ironic, so that we can do bare-metal-to-tenant in the next release of Fuel?
So, as far as I know, using Ironic for the OpenStack deployment itself is being evaluated for the next releases, and likewise, as far as I know, Fuel will support Ironic as a hypervisor for Nova in the next releases, but I don't have many details about this. Okay.

We've kind of already talked about what's next upstream; one last question there, and then we'll turn it over to the audience for open mic. We talked about Ironic's plans for Kilo and Liberty, but for the Sahara guys: anything in terms of what's next for Sahara in Liberty?

So, probably a few highlights for Liberty. One of the main goals in Liberty is to support Hadoop HA deployments for both the Cloudera and Hortonworks plugins, with fully automatic, out-of-the-box configuration. And there will probably be new versions of the other plugins, and support for additional plugins. I think those are the most interesting highlights.

Yeah, you know, we've got HA work certainly within the clusters. We've also got some tightening of our HA and reliability in the service layer itself. And, you know, at this point Sahara's APIs are pretty mature and pretty full-featured; a lot of it is just tightening user flow, continuing to make that user experience more seamless and easier for users who aren't quite as expert in Hadoop.

Okay, excellent. So if we have questions, you guys are welcome to come to the mic and we can ask these folks.

Hi, Ian Coley with Red Hat. Henrik, you alluded to something that I've heard kind of scuttlebutt about, and concern about HP. You kind of tap-danced and said you're going in a different direction, but the concern has been that TripleO is being abandoned by HP. Can that happen with Ironic too? Basically, what's your commitment to the community, and to not just going off in your own proprietary Ansible playbooks?

So, there are a lot of things there, so how much time do I have?
So, first, we're definitely not abandoning TripleO. For our install experience, we're going to use it next to the Ansible playbooks, in a different way of doing it, but we're not completely abandoning TripleO; we're just lowering our level of engagement there, because, you know, we're refocusing internally. So no, we're not completely abandoning TripleO. And being the product manager for Ironic within HP, I have a very personal, strong commitment to Ironic, and like I said, we're going to base our next installer technology at some point on Ironic. So Ironic is definitely going to be there in the future as well, and we have a lot of other projects that involve Ironic. So we're as committed to Ironic as you possibly can be, and we're the number one contributor to Ironic, so I think Ironic is here to stay, both within HP and in OpenStack. So, yes.

All right, next question.

One of the key things for Hadoop is rack locality in the block placement. How does Ironic handle rack-locality affinity?

Right. So, like I said, we don't have first-class support for this today. There are scheduler hints that work: you set some properties on the Ironic node and you can schedule against those via Nova. It's a little hacky. It does work if, I guess, you also run Ironic, right? So if Ironic's transparent to you, you can't put that info there. But that's a huge thing for bare metal, right, rack locality and failure domains, so we want to give that first-class support. It's not a high priority for us, but it will be one day.

How can we help?
I work for Yahoo, and that's one of the common needs for us.

Come join us in the OpenStack Ironic IRC channel, and let's talk.

I've got one follow-up question, since we're kind of talking about installers and using an installer to install OpenStack. In the case of Helion, our product, we have what we call the undercloud, from which Ironic deploys the production cloud. So the question for the panel is: if you're using Ironic to deploy the overcloud, and then you want to expose bare metal to a tenant, are you setting up a separate Ironic instance, so you have two Ironics? What are you thinking there, in terms of how you segment the cloud operator's infrastructure view from what's exposed to the tenants?

Right. So I don't run an Ironic undercloud, but I would imagine you'd want those to be separate things, right? Ironic doesn't have any concept today of a tenant, or of linking a specific piece of hardware to a specific tenant, which you would need for that use case, right? So I think the best thing to do (we run Ironic on VMs, which is a bit ironic) would be to deploy your undercloud and then run another Ironic on top of that to expose to the tenants.

And Henrik, it seems like you wanted to add something?

Yes. So, our non-proprietary Ansible playbooks will give it more flexibility. Since we're moving away from the overcloud-and-undercloud model, we will make it easier for the customer or the user to configure that the way they see fit, rather than being stuck in the undercloud/overcloud architecture.

Gotcha. Okay. Yes, sir, question?

Yeah, I have a question regarding the HA deployment you were mentioning, with the Kilo features coming for the HA deployment on CDH clusters.
I mean, on the Hadoop cluster. My question is: that involves several pieces, like, you know, ZooKeeper synchronization, journal nodes, and things like that, and usually in production things can absolutely go wrong. You can pretty much lose an HA node, and then you have to do replacements and all that. So does the upcoming release take care of just the one-time provisioning of an HA cluster, or does it also take care of recovery? Because once an HA node is down, bringing the cluster back is a very tedious task. So I want to know: how mature is HA in the Kilo release?

So, we're going to support the HA deployment for Cloudera and Hortonworks in the Liberty release; it's now in the design stage. We'll be using the corresponding management tools, Cloudera Manager and, for Hortonworks, Ambari. It'll be configured at installation time, and I think we'll be discussing it at the design summit today and tomorrow. And in Liberty we're planning to introduce a health-checks framework that will check the health of the clusters, including the HA state, and report it to users. So for Liberty, I'd say it will be one-time configuration, with the ability to manually fix things; and for the next releases, I think we'll most probably do something like self-healing of the HA, for example.

Last question.

Can you talk about distributed Sahara deployment? You have the components of the Sahara API and the engine. I was reading the documentation, and all it says is, okay, we can have several nodes that can host the API instance of the code. So where does this node fall? Can it be in a tenant space, or does it have to be on the control side? I mean, where does the API instance in a distributed Sahara deployment fall? Can it be
on a tenant side, or does it have to be on the control plane?

So, there is no real restriction there. You could run any number of APIs and engines, and there is round-robin balancing of the operations from the APIs to the engines. So you probably could just deploy the engines on the controller side.

Yeah, I mean, only the engine, I think, really needs access to control-plane resources. And it depends what kind of operator-to-tenant relationship your cloud has, right? If you have a pure undercloud/overcloud relationship, it's perfectly okay to run Sahara in the control plane of the tenant cloud. Beyond that, though, you know, I think we do need access to control-plane resources, absolutely, yes, in order to provision resources through Nova.

So what I understand is: the engine is pretty much on the control side, and the APIs can also be on a tenant side. Yeah.

Thank you. Thank you very much.

Are there any more questions? I do want to note that later today Matt Farrellee and the guys from Intel are doing a really cool session on Hadoop performance, bare metal versus VM, so you guys should definitely check that out. Any last questions? Going once... Cool. Well, thank you for your attendance, and see you around at the conference.