Good afternoon. Thank you so much for coming. We're going to start with introductions quickly.

My name is Craig Starrant, and I'm a software architect with Intel. And for some reason my computer seems to be loading whatever it wants.

Okay, my name is Peter Meitz from Computacenter. I'm responsible for the big data portfolio at Computacenter in Germany.

And I'm Megan Rosetti. I'm with Walmart, and I work on the OpenStack operations team. And with that, we are just going to jump right in.

So we come to you from the Enterprise Work Group. We are all part of the Enterprise Work Group, and the reference architectures were built out of that work group. In fact, they were part of user feedback that we received. (There have to be technical issues.) So it was part of user feedback that we received, in which people were looking for reference architectures to use as a guideline in order to develop different applications on OpenStack. That's where this started from, and we're going to dive into some of what we've done with that.

So, just to give you a little history: the Enterprise Work Group was founded in Atlanta, at the OpenStack Summit in Atlanta a couple of years ago, and the core purpose of the group is to identify and overcome barriers to adoption for enterprise groups. We meet weekly. From that, we've built out the reference architectures. We've put together a couple of books; we're doing a third, actually, tomorrow. And what we really look at is consolidating enterprise feedback on what is and isn't working, what is seen as a barrier, and the things users are really looking for help to overcome, to make their activities a bit more seamless within their organization. Some of the things that we've worked on are user stories. Some of that is building out brand new user stories.
Some of that is updating white papers, and working with the Product Work Group on user stories, actually for deliverables within rolling upgrades and HA VMs as well. Okay, and then these are just some links to our reference architectures that we're going to dive into in a bit more detail.

And touch anything, but that's good. It's a good hint for me. Okay, so I'm going to cover what the workload reference architectures are. As she said, this work is being done by the Enterprise Working Group, and it really got spawned from requests by the community for sample workloads that they could use for training material and for guides to aid operators and architects in designing their own workloads. And I'm going to throw in our little legal disclaimer here: we're building these based on our own experiences, and they are not necessarily representative of what the corporations we work for run.

When we designed them, we set out to make sure that they only used open source components, and that they would focus on the core OpenStack components, with the fewest additional OpenStack components required to support the relevant workload. Also, that we would provide sample Heat templates and/or Murano packages so that people could go back and deploy these in their own environment. And also that these were going to be living documents, and that we would continuously review them and update them as new OpenStack projects come out, get released, and become stable.

So the reference architecture documents basically start off with a high-level overview of the workload, discussing the application layout and topology. Then there is a brief introduction to which OpenStack components are utilized by the workload, and then a deep dive into each of the OpenStack components, covering things like configuration information and pitfalls that you might run
into when setting up the workload. Then it goes into a demonstration and sample code section. In this section, since currently we only have Heat templates, it goes through the YAML files and explains in detail what each YAML file does, and discusses things like optional and required parameters. Then at the end there is a scope and assumptions section, which mostly covers what other projects you could utilize in the workload and different options that you might want to consider.

So, the current status. Currently we have two workloads that are published, and everything is linked off of... I've forgotten what the page is called. We'll have the reference pages. What's that? Yeah. So currently the web applications and the big data workloads are the two that we've released. We have an e-commerce one that should be out in the next week or two, and then we're currently working on a media transcoding and distribution one. And then there are future plans for some, but we're also looking for input on this. So far we're looking at HPC, a SaaS option, a relational, enterprise-level database, hybrid cloud, and a CI/CD option. But really we want to point out that we're looking for input on what workloads people are interested in, to try and help us prioritize what we're going to work on next. We actually have a URL up here: you can go and submit feedback and input on what you might be interested in seeing in a workload, or you can submit feedback to the Enterprise Working Group email list.

So, the web applications. This was the first workload that we worked on, and we based it on a LAMP stack, largely because LAMP still seems to come out in the user survey as one of the top stacks being run on OpenStack. We wanted it to be a standard three-tier web architecture, so it has a web layer, an application layer, and a database layer. We wanted to make sure
that it included some security, so firewalling, basically security groups filtering traffic at each of the layers. We wanted to make sure that we included load balancers, both at the web tier and the application tier, and also that it would support on-demand scaling and have persistent storage for the database.

Basically, this is how it looks when it sits on top of OpenStack. The user requests come in and pass through a load balancer, which is Neutron Load Balancing as a Service, and then pass into the web layer, which is an autoscaling group of servers with a web security group assigned to it that does port filtering to allow just web-based traffic through. From there the traffic goes through a second load balancer that distributes it amongst the active application nodes. Again, this layer also autoscales and also has a security group applied to it to filter the incoming and outgoing traffic. From there it goes into a database layer. This layer is static: we have a master-slave setup, with the database files sitting on block storage. In addition, we have the option of Swift storage to support database backups.

On the e-commerce side, we wrote it to describe three main layers: the web layer, the service layer, and the database layer. There are also three sub-layers written into this: messaging, storage, and analytics. And we wrote these using open source technologies. Just to give you an idea: the messaging layer is the API for each service. The storage layer is persistent storage using Cinder. The analytics layer: e-commerce uses a lot of big data, and there need to be a lot of criteria for volume and variety, with a lot of push-pull on the e-commerce side across both online and offline business interactions. And then this gives you an overview of e-commerce. It's pretty detailed; there are a lot of parts to it. The customer selection really drives where data is pulled from and what information is loaded.
It depends on whether somebody is logged in to their account, going through checkout, or browsing and has added items to the basket. All of those are going to pull from different areas on the back end, so it tends to be extremely detailed. This workload is certainly created as a sample. In these guidelines, Keystone, Glance, and Heat sit over the entire e-commerce workload. The web servers and web site services use Nova. The database applications use Trove. The image store uses Swift. The messaging layer uses Zaqar. The applications cluster, which is attached to messaging, uses Sahara. And persistent storage uses Cinder.

Okay, big data is next. Before I start, I want to give you an introduction to why we started this reference architecture. As Craig pointed out, we had feedback from our customers on what they need to fulfil the requirements of their projects. And what we found out from our customers is that there's a very strong demand here in Europe (we are operating here in Europe with Computacenter) for big data as a service. As you know, big data is a horizontally scaling application platform, and that's a perfect fit for OpenStack. OpenStack and big data: that's a classic use case, a great story. But, as always, there's an issue. There's some know-how about OpenStack in the companies, and there's some know-how about big data in the companies, but they do not come together.
They are not teaming up; either the one or the other is too complicated. So our motivation was to build a straightforward reference architecture that makes it easy for those experts to communicate with each other and to find a quick, easy entry into a big data use case on top of OpenStack. That was the motivation, and of course we used our experience from our customer projects in building this architecture.

Of course, we built it on open source. There are a lot of reasons for that, and one main reason, which is often overlooked, is licensing. If you start building an as-a-service application, licensing is an issue, so it's better to start with open source products. And we are lucky, because we have Ubuntu. It's a proven Linux platform, and we use Ubuntu 14.04, which is very common in Hadoop clusters. And we have Hortonworks, which is 100% open source. So we combined these two software products in our reference architecture.

Okay, now we want to get a little deeper into the architecture. I have only two slides. So, I learned. Okay. First of all, this is probably a little bit new for most of you, but a Hadoop cluster is made of several nodes. But these nodes are not all the same.
They have several roles in the Hadoop cluster, and the first step we took was to divide them into several groups. For example, we have edge nodes, and they have a special workload: these nodes are made specifically for giving clients or applications access into the Hadoop cluster. Then we have utility nodes, which are not on the screen because they are not in this version. They can be used for anything: they can be used for Kerberos, or they can be used for, let's say, a software repository. You have to keep in mind that if you want to use Ubuntu or Hortonworks, you need to have the software; either you get it from the internet or you have a local repository, so you can use such a utility node as a repository, for example. And you have the data nodes, which are the most common, and the so-called master nodes, where the central services are located.

All of this is surrounded by networking. A typical setup for big data networking is that you have a central network that covers the data, so all the nodes of the big data cluster are in one network, and this network is isolated from the enterprise network. Why is that? Because of security reasons. The Linux image used for a Hadoop cluster often does not match the Linux image used in the enterprise, and, well, security in Hadoop is a special thing. So this is one reason. The other reason is that if one of the nodes fails, data starts to be replicated, and you want to keep this replication inside the network of the Hadoop cluster. So there's one network, and of course, to access the data in the Hadoop cluster, you need something special: that's the edge node. When we look here, here's the user interface, and the user is accessing the cluster through an edge network. He can access the cluster here, through this network.
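As a rough illustration of the layout just described, the node groups and the isolation rule can be written down as a small check. This is only a sketch under stated assumptions: the role and network names, the `Node` class, and the `check_layout` helper are hypothetical and do not come from the published Heat templates.

```python
from dataclasses import dataclass, field

# Hypothetical names for the node groups and networks mentioned in the talk.
ROLES = {"edge", "utility", "data", "master"}
DATA_NET = "data"   # cluster-internal network, isolated from the enterprise
EDGE_NET = "edge"   # the only network reachable from the enterprise side


@dataclass
class Node:
    name: str
    role: str
    networks: set = field(default_factory=set)


def check_layout(nodes):
    """Return a list of human-readable violations of the isolation rule."""
    problems = []
    for node in nodes:
        if node.role not in ROLES:
            problems.append(f"{node.name}: unknown role {node.role!r}")
        # Every cluster node sits on the shared, isolated data network.
        if DATA_NET not in node.networks:
            problems.append(f"{node.name}: not attached to the data network")
        # Only edge nodes bridge the cluster to the enterprise side.
        if EDGE_NET in node.networks and node.role != "edge":
            problems.append(f"{node.name}: only edge nodes may join the edge network")
    return problems


cluster = [
    Node("edge-1", "edge", {DATA_NET, EDGE_NET}),
    Node("master-1", "master", {DATA_NET}),
    Node("data-1", "data", {DATA_NET}),
    Node("data-2", "data", {DATA_NET}),
]
```

Calling `check_layout(cluster)` on the sample above returns an empty list, while attaching a data node to the edge network would be reported as a violation.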
So we have an edge network that provides the connection into the enterprise. Then we have a data network that is just for moving the data into the cluster, back and forth. Then we have another network, called the management network, to do all the administrative stuff. And then we have something here: the object store. Here there's an extra network, because what we found out is that customers start to use object storage as their data lake, so they put a lot of their data into an object storage layer. It could be anything, it could be Swift too, but one dominant way to access it is through S3. Hadoop can natively access an S3 layer, so it is logical to store some data on the S3 object storage layer and access it directly from the Hadoop application. So we put in an extra network for accessing such an object store.

Okay, just to sum it up: we just released version one. You're right. But, for example, if you use Sqoop, a very common service on Hadoop, Sqoop has to have direct access to all the databases you connect it to, and then it's broken too. You have to make some compromises in the end. Hadoop is distributed computing; it's not perfect. Okay. But we offer this as a blueprint for you. Try it, test it, and give us feedback; that's very important. And you can change it. If you think, well, this part is not valid for me, or I want to have it in a different way, get the Heat template, have a look at it, and change it. It's up to you. Okay, that's a brief overview of the big data workload.

Right, I'm going to jump in to wrap up, and just to emphasize the point: these are guidelines, and they are living documents. Through different feedback and different release cycles, they will be continuously updated, which is going to be interesting, and that's also part of why we need help. Without question, the Enterprise Work Group wants people's feedback. We want to know, literally, what do you think of these architectures?
Is this something that you want to see more of? Do you want to see more detail, or less detail? Is it something that you want us to expand on? We went through some of the ones that we have in the pipeline; are there others that you would prefer to see? Where do things stand? Did they start out as good guidelines? We want your feedback, such as: "That worked, but I'd like to see more on the security settings," or "I'd like to see more on what happens if you were to use this project in it," or whether there are other areas where it's applicable.

We would love people to get involved. The Enterprise Working Group, and the reference architecture working group within it, actually meets weekly. You can join the mailing list for the enterprise group, which this resides under, and also the meetings. We do have a code for the slides, so you can download those at any point. And we do have a tiny URL for feedback. It is small, only a few questions, but we'd really like your feedback to know: are we on the right track? This is what we've taken from users, and from information that we've been asked for. It is a start, and with feedback it will get better and better with each iteration. Then we do have the sample configs link in the middle. That is for the reference architectures that are currently published, and that's also where future reference architectures will be published.

I guess one thing to note is that the Heat templates are published in the OpenStack application catalog, so if you want to download the Heat templates, you'll find them all there, and they're also linked from the reference architecture documents.

Yes, thank you, that is a good point. So that is where we currently stand on reference architectures, and we'd be happy to get into any type of questions or feedback. So I'm going to jump in really quickly.
We do have a mic if you don't...

Okay, again, so the question is: a reference architecture is really a complete part of a landscape. But sometimes you have building blocks, which these enterprise architectures consist of, and these building blocks occur in multiple of these architectures. The example I brought up is the classic logging infrastructure, where you have these ELK stacks somewhere. This is a classic example of a building block where you could make a reference architecture and show how it has to be established, but it's not a complete one in the sense of a complete business application. Have you already thought of something like this, or is this a new thought in your context?

Of adding centralized logging to what we are doing as a reference architecture?

Doing the reference architectures not only at this overall level, but also having building blocks that you can pick out and combine into new reference architectures.

Well, actually the big data reference is kind of built on that principle, because this is a version one. As I told you, the connection to the object store does not exist at this point. What is also missing is Kerberos authentication, for example. It's not in there so far, but it's used in most enterprise environments. It's coming into the architecture at some point in time, with the next version. But we have the utility node; it's prepared for something like that. So you can use Kerberos on a utility node in the big data reference, but you could also use it for something else. You could use a local repository for deploying software, Ubuntu or Hortonworks, into the big data reference, but you can also use it for anything you want. This will be added, for example, into that reference, and then you can start picking and say: okay, I like it; I don't want it.
I don't need it; or, yeah, great, it's a great idea, maybe I'll do it a little bit differently. But I think that's one thing we want to point out: it's more modular, it's more of building blocks.

Do you need the mic again?

Yeah, I think that's really the point. Maybe you just have to make them first-class citizens, in a way that you not only have a view which says it's part of the reference architecture and you can pick it out of there, but you also have a list that says: okay, it's described there, but it's a building block you can use somewhere else. Maybe it's an interesting task to identify these things and make them, yeah, stand out a little bit more.

Fine. We'd certainly love your help on that.

Well, and to even go back: that was with big data. With e-commerce, what we find is that we put together a pretty general outline. There are so many different possible moving parts in e-commerce that, depending on the needs of the customer, quite a bit of this can change, and it honestly depends on how customers are looking to deploy it. That's part of the reason that these are considered guidelines for right now, but we want to get them to the point where users are really going to them to pull out that data and that type of information. So it's excellent feedback, and that's exactly what we need: okay, this is where we are, this is what we've worked on; how do we make it better? How do we give it back to the community to make it easier to follow and more deployable, to answer some of those questions?

Any other questions?

How does your working group interact with the upstream projects? I heard you mention Trove and Zaqar as example components within the reference architectures. Have you worked with those teams, or do the work products that the working group has produced provide value to them? You know, have you heard feedback from those teams?
Something like: "Oh, your architecture has encountered this particular problem that needs a solution," and that has influenced development?

Right now, with the enterprise group, the way we flow some of the information upstream is that we work through the Product Working Group. We don't want to create a situation in which you're overwhelming PTLs with "Hey, have you thought about this? Have you thought about this? Could you review this for us? Have you looked at this?" So we're trying to keep that continuous process of taking in user feedback, creating user stories, especially for projects, and feeding that through the Product Working Group to make sure that there's a consistent process into the technical community.

Any other questions? All right. Well, thank you very much for your time, and I'll bring up the tiny URL again, because we would really, really like your feedback. Thank you, guys.