Yeah, so I think we can start; it's 11:15 already. Welcome, all, and thank you for coming to my talk, Building the Cloud Ecosystem. From the name, it's a huge topic. You could have a whole conference of developers and architects just talking about how ecosystems should look, especially clouds. When I talk about an ecosystem, we want to design something that will solve problems. The use cases are large, as we all know. We have been seeing it in all the presentations that are happening: people are interested, people want to solve real-world problems, enterprise problems, and there is a valid need for a cloud ecosystem that can actually make use of technologies, not reinvent work, and solve their use cases at the same time.

I'll try to cover this in 35 minutes, and we'll keep a question-and-answer round at the end of the presentation for five to ten minutes so we can talk about it. One more thing I wanted to mention: more than how to build the ecosystem, what I want you to take from this is why to build one. We want to focus on why we want to invest in these technologies. Once you have figured out that yes, you do have a valid use case and a business case for building such a system, then go ahead and work out the how.

My name is Chinmay. Briefly, about Cloud Platform Engineering at Symantec: we are building a consolidated cloud platform for our internal customers. It's a private offering right now, and it is there to facilitate development of all Symantec products and services internally. We are growing; in the next year or so the data center footprint for our internal cloud is also going to grow. Something about myself: I have been in the cloud space, especially with OpenStack, for about three years now.
I had Hadoop experience as well, so a lot of distributed computing background, and I'm an open source enthusiast. That's part of the reason why we are all here: open source is the way to go.

Setting the agenda straight: I'll be talking a little bit about three technologies mainly, which are Docker, Kubernetes and LXC. It is a very hot topic right now, but I will give an honest opinion of what I feel these technologies are. Why use them is one of the main topics I want to cover, with some of the Symantec use cases that actually require these technologies. Then our evaluation of these technologies: what did we see when we looked at them? Then what makes an ecosystem, and building the ecosystem. This is where I'll be giving out a few guidelines and three ecosystems that we have architected; work is still going on to get these out. Then a summary and future work, and then questions and answers.

So, going ahead into what Kubernetes, LXC and Docker are. It's not a deep dive, just the know-how. By a raise of hands, how many here are using Docker in their companies right now? And are these in production environments? And what about Kubernetes? That's interesting.

A few points about what Docker containers are, for those who may not know. Docker is a really good platform for running applications in containers. A container runs a stripped-down, minimal OS. Initially Docker started out for single-application use, so the next point is that services, daemons and libraries are added only if needed. It runs out of AUFS and has a core layer, and the application process itself is the init process. Being for single-application use, it doesn't have a proper init, doesn't have cron, doesn't have syslog-ng, if you know those. It starts with a minimal OS, and services, daemons and libraries can be added if needed. Often they are not needed. You don't need syslog-ng, for example, because Docker has
docker logs, so you might as well have your output go to that log and then have it shipped to Logstash or a Kibana kind of service.

Single-application use, but not anymore. There have been a lot of blog posts where people kind of fight over this: is Docker only for a single-application use case, or can I run multiple applications in the same container? There are ways to run multiple applications. You can have a systemd or supervisord process that runs as your init and manages process uptime, making sure that if a process goes down it gets restarted. So there are ways to run multiple applications in a Docker container right now.

It's based on AUFS, as I just said. That's the advanced multi-layered unification filesystem. Basically everything is a layer in Docker. All the layers of your OS are read-only, and whatever you write, whatever you edit, like changing a MySQL version or making some code change, is done in a writable layer that sits on top. Once you do a docker commit, it just creates a new image out of it. So at the end you end up with these layers that have all the changes in them.

Makes containers easy to use. This point also relates to the comparison with LXC later on. There are a lot of people who want Docker, and a lot of people who say, I already have LXC support, why do I need Docker? From my perspective, Docker has made containers very easy to use. Docker has a good client and a good CLI. Personally, being a developer, I find the command-line tools and instructions they have very easy to pick up. So that is one point.

Quickly going into Kubernetes, because we've already talked about Docker. It's cluster management for applications. From my perspective it sits a level above Docker, going more into orchestration of containers. Grouping containers in pods and labels.
So there is a very important concept called pods, where you can group the containers that form your application, and these can be deployed as a group called a pod. You can label these pods. The pods could be deployed across machines in your data center, and you can find them based on the labels you have given them. So it's a very good way of doing cluster management: you can form these clusters of application servers and then have applications deployed on them.

Declarative primitives for maintaining desired state. There is a certain description you can give, saying that an application requires a certain set of components to be up for it to be running. Those are the kinds of primitives. Then self-healing mechanisms, which are more in terms of rescheduling or restarting a container if it goes down, maybe even moving it from one location to another. It provides a very powerful mechanism to orchestrate containers while keeping your application in mind.

Talking a little bit about Linux containers themselves, because Docker initially did start out using LXC; now they have libcontainer, as you might know. Linux containers are kernel containment. They use kernel namespaces, which, as we all know, have been around for some time now. The user namespace came at a later stage; initially it started with the PID, mount and network namespaces. Then there is the use of cgroups to have control over resources such as memory, and resource isolation for applications. This is basically where we use VirtualBox or virtual machines right now. It gives you the ability to containerize, or isolate, resources for a particular application to use.

Having done that, let's go through some of the reasons to use these technologies. This will go more into some of the Symantec use cases as well, since we are building a big platform for all our internal services.
We need to make sure we keep everyone happy, and when I say keep everyone happy, I mean solve their use cases. There are use cases all the way from people wanting VMs to people wanting containers. Let's take a look at some of them.

Continuous deployment. I was just in a talk before this about CERN, where he was talking about all the big things they're trying to do, all the physics problems they're solving. But I think at the base of it was one thing: trying to enable scientists, enable developers, enable whoever is building the product your business runs on. So this use case is not just for Symantec. I think everyone wants that fancy one-button click that helps a developer deploy code into staging, right into production.

Provide standardized environments. This is where I think containers help a lot. You want the same environment to be used by all the developers in your team, or all the scientists in your CERN lab, or whatever.

Seamless deployment and packaging across platforms. You will have multiple cloud deployments, multiple staging environments, QA. You want to reduce the time required to go from something running on a developer's laptop, to something running on a QA box where the QA is run, to staging, which is pre-prod, and then to production. To minimize this, you need similar deployment and packaging across all the platforms you have.

And then the test-once, deploy-many model: being able to guarantee that if you test something in one particular environment and this code now goes into ten different data centers that you have already deployed, it will run there. You test it in one environment; you don't have to test it ten more times to deploy it in ten different data centers.
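One way to picture the test-once, deploy-many idea is a gate that identifies a packaged artifact by its content digest and only lets an already-tested digest be deployed. The sketch below is an illustration under that assumption, not a description of how Docker or any Symantec pipeline works; `PromotionGate`, `artifact_digest` and the data center names are hypothetical.

```python
import hashlib


def artifact_digest(artifact: bytes) -> str:
    """Content digest that identifies an artifact regardless of where it runs."""
    return "sha256:" + hashlib.sha256(artifact).hexdigest()


class PromotionGate:
    """Hypothetical gate: only artifacts that passed testing once may be deployed."""

    def __init__(self) -> None:
        self._tested = set()

    def record_test_pass(self, artifact: bytes) -> str:
        digest = artifact_digest(artifact)
        self._tested.add(digest)
        return digest

    def deploy(self, datacenter: str, artifact: bytes) -> str:
        digest = artifact_digest(artifact)
        if digest not in self._tested:
            raise ValueError(f"{datacenter}: untested artifact {digest}")
        return f"{datacenter}: deployed {digest}"


gate = PromotionGate()
image = b"app-v1 + libs + config"  # stand-in for a packaged container image
tested = gate.record_test_pass(image)  # tested once, in one environment
results = [gate.deploy(f"dc{i}", image) for i in range(1, 11)]  # ten data centers
```

Because the digest covers everything in the package, any change to the artifact after testing produces a different digest and is refused, which is the guarantee the model relies on.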
So that's more of a point, and it is one thing we really want.

The next is version upgrades. With OpenStack on a six-monthly release cycle, upgrades are very important. We started using OpenStack with Havana; I think even before that there was a small POC with Grizzly. But upgrading can be a hassle, especially when your scale grows and you have many data centers, because you have to stay up. And staying on the master branch requires a good amount of CI/CD and a policy figured out for how you will do upgrades.

Different versions of applications deployed on a single node. I would like to use this for control plane upgrades. There are many ways of upgrading a control plane: you can do an in-place upgrade, you can have a parallel control plane set up, or you could use containers. On the same host that is running, say, your Havana Nova scheduler service, if that is containerized, you can have an Icehouse container with the Icehouse code base in it sitting on the same host, and then you can do version switching between them on the day of the upgrade. So: seamless switching between containers and minimizing downtime, to make sure the switch and the whole upgrade process don't take a lot of time. Ease of rollback on deployment failure is another thing: if you have two containers sitting on the same machine, you can seamlessly roll back. So version upgrades are a very good use case that we are looking to solve.

Performance-intensive applications. At Symantec we see various applications: some are very data intensive, some are CPU intensive, and they require a lot of data crunching. So we want to make sure our performance is near bare-metal speed. Faster boot-up times, which
again goes back to the comparison between VMs and containers. With these use cases, I think you might be getting a general feel that you require everything: VMs for some things, containers for some things, even bare metal nodes for some of these use cases.

The resource isolation benefits of VMs. You can see what point two and point three require: you need the resource isolation that VMs provide, but at the same time you need faster boot-up times. So your answer is slowly going towards containers.

On this point, there have been people in the company itself walking up to us and saying: there was this code deployed on one of the machines, and they thought that when they showed up the next morning the same application setup would still be there, untouched. But they found that some DevOps engineer had gone in at night and changed some setting on the base compute node, and that destroyed their setup, or they lost something on their side. So they wanted something like a snapshot. Going back to Docker containers, they were asking, why don't we evaluate Docker containers? If they had snapshotted their state, they would not care what happened to the base compute node. If they came back the next day and found something had changed, they would just redeploy the container, and it would be as if nothing had gone amiss. So this is one of the use cases we want to make sure we handle.

Overlays in SDN is another. We have a use case where applications deployed in a VM need to talk to applications deployed on bare metal nodes. What is happening with the SDN solution we are using, with the vRouter setup that it has, is this: the VMs are sitting in overlay networks, and for
the VM to talk to a bare metal node, which is in the underlay network, every packet has to go through an external gateway router, and that can get very expensive when the number of packets goes up. So what we thought instead was: since the bare metal application requires some of the performance gains of bare metal, why not use a container there, and have the overlay network plugged into the container? Now you're talking about two machines on the same overlay network talking to each other. That's one of the performance gains we will find when we start using containers instead of bare metal nodes for some of our applications, and it deals with the increased latency on that traffic.

Some of the other use cases. Remote failure debugging: I don't think we have an immediate need for this, but it's a very powerful tool to have. If something goes bad in production, you want your developers to be able to fix it, and the first thing a developer always asks is, can I reproduce this? The answer you normally give is, try to set it up in a QA environment. Docker containers, with snapshotting of containers to give an exact replica of that environment, are really useful here. And of course image management. Glance is currently helping us with our image management, but we need it on a larger scale, because a lot of images are getting created.

Where can Kubernetes help? These are some of the use cases I'm going to talk about for Kubernetes itself. Container orchestration and application deployment tools.
Since this is a very good application-level orchestrator, a lot of developers can make use of it. By that I mean it can be used to deploy, say, small Hadoop development clusters, or even OpenStack clusters, so developers can do internal development.

Manage flex-up and flex-down requirements. Flex-up and flex-down is a key concept of cloud, because that's what makes things really elastic: using resources when you need them and giving them away when you don't. There are people who try to write applications on top of Heat to do some cluster management and, based on telemetry, if the user is not there, reprovision the resources to someone else. But Kubernetes does all that for you.

The Hadoop and big data use case. This is one big use case inside Symantec itself, because Symantec runs a lot of data crunching and a lot of pattern matching, and that is done on our Hadoop clusters, which are trying to use our OpenStack infrastructure as an IaaS solution. So helping them out is a major use case for us, and I see a lot of gains where Kubernetes can help deploy, say, a small Hadoop cluster spun up on the infrastructure. Say you have a tenant, development for example, and you have, say, four compute nodes. On one of them you can deploy a JobTracker, and you can have TaskTrackers on all the other compute nodes, and that's a small development environment. It can be grown from there; it need not stay at a development level, it could go to a production environment. You can actually manage a Hadoop cluster using Kubernetes if you are running it out of containers. So that's a pretty interesting thought.

Monitoring and tooling again.
I think even the Kubernetes guys say that they're not just container orchestrators. They do much more beyond that, like self-healing, and there are a lot of monitoring processes that make sure your containers don't go down. In the current cloud world with OpenStack, you have to come up with an external monitoring service. That means if your VM goes down, there is a separate alert that goes off, and then you need to make sure you fix it. Kubernetes helps with that, so why not leverage something like it if you have container use cases?

Going into the next section. It's not an in-depth analysis of what these technologies are, but a few callouts that we think are very important when we go forward with building the entire ecosystem that solves everything we just talked about.

Using OpenStack. I did not explain OpenStack, because obviously we all know what it is here. What have we got used to? We have been using OpenStack for the past year or year and a half now, and there are a few good things we want to carry forward when we build our ecosystem. Those are basically: tenantizing of resources. We want resource isolation, and we want resources to be owned by specific tenants. It's a very important concept that keeps the OpenStack cloud going right now, and a very important principle that we want to take forward in whatever ecosystem we build. We want token-based secure access to everything.
We don't want people to just SSH onto a box and start using it just because it's there.

An ecosystem of amazing as-a-service capabilities. That's what OpenStack has given us: the whole notion that any service can be front-ended by a good REST API, called on a token basis, called on a per-need basis, and then it does whatever it needs to do. I need not bother about what happens in there. Glance takes care of images for me, Neutron takes care of networking for me. We want those capabilities, and we want to keep that structure. And of course virtualization, LXC and bare metal. So these are the three things we want to take forward: VMs, containers and bare metal.

Going into a bit of discussion. Again, there are a lot of blogs about the whole Docker versus LXC thing. I won't say a fight, but it's that kind of conversation: why use Docker when you have LXC? My personal viewpoint here is that Docker is equivalent to orchestrating LXC. It's a templatized way of orchestrating containers. A Dockerfile, from my perspective, is very similar to a Heat template: you can specify a lot of things in terms of what goes into a container and what kind of process runs in it.

Maintaining snapshots of LXC containers in Glance, against Docker images. Here we are assuming that a company has OpenStack and has Glance deployed. You could have an LXC container using your LXC driver inside Nova, then snapshot it and have the image saved in Glance. But this system is given to you directly by Docker. Docker internally just works on this principle of snapshotting layers.
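The layer-snapshot principle can be illustrated with a toy model: read-only image layers, one writable layer on top, and a commit that freezes the writable layer into a new image. This is a simplified sketch for intuition only; the `Image` and `Container` classes are hypothetical and do not reflect Docker's or AUFS's actual implementation.

```python
class Image:
    """An immutable stack of layers, oldest first. Each layer maps path -> content."""

    def __init__(self, layers=()):
        self.layers = tuple(dict(layer) for layer in layers)


class Container:
    """Read-only image layers plus one writable layer on top."""

    def __init__(self, image):
        self.image = image
        self.top = {}  # all writes land here; image layers are never modified

    def write(self, path, content):
        self.top[path] = content

    def read(self, path):
        # Search the writable layer first, then image layers from newest to oldest.
        for layer in (self.top, *reversed(self.image.layers)):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def commit(self):
        # Snapshot: freeze the writable layer as the new top layer of a new image.
        return Image(self.image.layers + (self.top,))


base = Image([{"/etc/os-release": "minimal"}, {"/usr/bin/mysqld": "5.5"}])
box = Container(base)
box.write("/usr/bin/mysqld", "5.6")  # the "upgrade" touches only the writable layer
upgraded = box.commit()
```

In this model the old MySQL layer is still part of `upgraded.layers` after the commit; it is only shadowed by the newer layer, which is why a stack of layers keeps growing as changes are committed.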
So it's a very interesting thought there. A little bit more about this: LXC containers can be nested. AUFS is nice to have, but is it needed? You can have the AUFS way of doing things, layer upon layer with copy-on-write, but there are a few advantages and disadvantages. I'm not sure if the Docker community has already worked on it, but there was recently a bug where you run into a lot of layers. If in, say, layer 10 you installed one version of MySQL and later on you do an upgrade to another version, you still have the bunch of layers that were there. There is a set limit you can hit, and the size of your image can grow. You don't want that to happen, so you would not always want an AUFS kind of solution.

Another thing is configuration drift, and Puppet to manage configuration. Would you want your configuration management to be done by Puppet when you go into a container? For that you would want a classic LXC container, which is standard and always there, and then you have Puppet manage configuration inside it. Or docker diff would do everything, and any state that needs to be maintained can be kept in a volume.

Using containers with Docker is more user-friendly.
I think we have just covered this point before. So these are some of the things we kept in mind before going forward with what we want to do.

Technology cost is also a very important thing. You don't want to just go on adding technology stack after stack, because each one carries a certain cost: it needs monitoring, and you have to put tooling around it. There is also the cost of making it user-aware and user-friendly. I can throw in a Kubernetes stack and just ask my developers to start using it, but it's not going to be that easy at the start. So you need to make sure this is taken care of when you are choosing a technology.

Impact on availability, and will it scale out? Availability is another thing you need to take care of. When you spin up a Kubernetes cluster, you need to make sure that the cluster stays up. Same with OpenStack. These are some of the cost parameters that you need to take note of.

How will it be monitored? Would the current setup you have for monitoring your OpenStack cluster be the same one that monitors your Kubernetes cluster, and the same for your Docker installations? A lot of things go into that.

And then security impact. Security impact is very crucial, whether you talk about VMs or containers. There is the whole thing of Nova conductor, where we want to make sure people don't have access to the database directly from the compute nodes. These are all security-related things. Even with containers: before user namespaces, if I'm root inside a container, I might even get access as root on the host. Some of those things need to be checked before you embrace any technology. These are very key points that we need to make sure of.

So what makes an ecosystem, right?
Some of the points I want to talk about. The normal definition is a biological community of interacting organisms and their physical environment.

Every component serves a distinguished purpose and solves a use case uniquely. This again goes back to the as-a-service point we had: you only add something when it has some uniquely distinguished value.

Coexistence without interaction, if it serves the purpose. It's not necessary to make two technologies talk to each other. If they are solving your use cases standing alone, it is completely fine to run them separately. If there is no way to figure out how to run Kubernetes with Heat or with OpenStack, then it's fine: you can have a separate infrastructure running your Kubernetes clusters and a separate infrastructure running your OpenStack cluster.

And harnessing all the goodness together. OpenStack is a perfect example of an ecosystem that is already in place.

As I mentioned before, I'm going to talk about three designs that we came up with. We are currently working on them, based on which one gets our use cases solved soonest.

The first one we are trying to go through is with Murano and Heat. Murano does a lot of workflow-related things; you could call it an application-level orchestrator. It gets us integration with Keystone, which solves our tenantizing of resources. Then an application catalog for defining complex workflows: it has a concept of an application catalog, wherein you have application resources that can be reused by developers. There's a whole description you can provide as to what goes into running an application: what is required to deploy it, what is required to keep it deployed, what is required to
describe its availability. For example, say my application requires two MySQL instances, one web server and one backend node. You can specify that if one MySQL server goes down, another one should be spun up, or that an alert should just be raised that the application cannot run. Autoscaling and self-healing features are there inside Murano, very similar to Kubernetes. It has a UI that is integrated with Horizon, so we tried to test that out as well. And obviously it has an OpenStack project development life cycle, which we can make use of.

The whole ecosystem would generally look like this, keeping in mind our use cases of having VMs and having containers. Keystone authenticates with Murano itself, and the Murano API is what you feed the application templates to. Murano talks to Heat, and we can have container templates specified directly through Heat as well; no one says you can't use Heat directly anymore. Then a Docker plugin, which the Heat ecosystem already has right now, can be used to orchestrate VMs and containers. And Murano itself can use RabbitMQ to talk to the Murano agent, to make sure the correct applications are deployed on either a VM or a container.
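The availability description from the example above (two MySQL instances, one web server, one backend node) can be sketched as a desired-state table plus a reconcile step that decides whether to spawn a replacement or raise an alert. This is a minimal illustration of the declarative idea, not Murano's package format or API; `DESIRED`, `reconcile` and the component names are hypothetical.

```python
from collections import Counter

# Desired state: component name -> number of instances the application needs.
DESIRED = {"mysql": 2, "web": 1, "backend": 1}


def reconcile(desired, running, can_respawn=True):
    """Compare desired and running components; return the actions to take.

    Each missing instance yields ("spawn", name) when replacements can be
    started, or ("alert", name) when the operator should only be notified.
    """
    have = Counter(running)
    action = "spawn" if can_respawn else "alert"
    actions = []
    for name, want in sorted(desired.items()):
        missing = max(want - have[name], 0)
        actions.extend((action, name) for _ in range(missing))
    return actions


# One MySQL instance has gone down: a replacement is requested.
after_failure = reconcile(DESIRED, ["mysql", "web", "backend"])
# Same failure, but policy says only alert that the application cannot run.
alert_only = reconcile(DESIRED, ["mysql", "web", "backend"], can_respawn=False)
```

Running such a reconcile step repeatedly against the observed state is one simple way to get the self-healing behaviour described for both Murano and Kubernetes.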
This is something we have tested out with Murano, but with containers we are still in the process of checking whether the Murano agent can be put inside a container itself. This container can obviously be a Docker container with the right base image set up, so it should not be a single-application Docker container. Another thing we need to test is whether we can put something on a bare metal node itself: whether we can install the Murano agent there and have Murano know what applications are running on it. So this is one very realistic approach that we are currently investigating inside Symantec.

Containers as a service. This is the whole Magnum project. I do not want to speak a lot about it, but it's very promising, at least to me. There have been a lot of discussions about whether to take this forward and how. I think on Thursday there'll be a design session about it, so make sure you attend and bring your use cases to it. Clusters can be isolated per tenant domain, again. It exposes its own CLI and maybe a scheduling logic, so it requires Gantt. Unless and until Gantt comes into play, I don't know what the plan is for scheduling logic inside it. And use of aggregates to isolate computes for container use cases: this is something I want to propose to the Magnum community.

Let's take a look at the diagram itself. Again the same thing: we want Heat to be on top. One of the points I missed on this slide was easier integration, but should it orchestrate? Containers as a service will solve some of the use cases, like getting your containers into the service, but then do you want it to orchestrate?
You're going to miss all the goodness of the orchestration of Murano, and of Kubernetes, which is coming up in the third diagram. So in this case I've kept a Heat module on top, which will actually do some sort of orchestration. You can even put Murano on top of this. I'm not sure if Heat will require something special to talk to containers as a service directly. Then again the same point about the Gantt scheduler: it is still not completely in practice. And the same with compute nodes having VMs and having LXC containers: if containers as a service were to happen, my recommendation would be to support both, the LXC driver and the Docker driver.

Then there is one small recommendation I have, a resource pool scheduler. Until the Gantt scheduler comes into play, what can happen is that, through Heat, you have the Nova API dedicate compute nodes just for container use cases, and then, using a special API, provide this set of computes to containers as a service, which can do resource scheduling within that pool of computes. I'm not sure if this will get implemented. This solution is more in a research phase, so let's see what happens with it.

And then this is the more realistic one, similar to Murano, where we can actually do some testing: using Heat. Heat to provision VMs that run Kubernetes components, integration with Keystone, a Docker driver for Heat, and template categorization of VMs and containers by looking at the image. Talking back about the Hadoop use case we had initially, this model kind of solves that problem. You have Heat doing central orchestration, and you have a Kubernetes cluster template, a Docker container template and an OpenStack VM template. I'll talk about this plugin later.
It's not there yet, but maybe it's something we can contribute. What this means is, let's talk about the Kubernetes example. Say you were to set up a Kubernetes cluster inside my compute nodes. In our case, say there is a Hadoop tenant who wants to have a small Hadoop cluster set up inside. You can have a tenant spin up a Kubernetes cluster. At a very high level, the Kubernetes system requires the master components, and then it requires the minions, which will actually run the pods. You can have Heat templates written so that they spin up these VMs, and these could be VMs, bare metal or containers, inside which a Kubernetes master process can run.

There are different ways to do this. I think the CoreOS guys have come up with a good solution, wherein you have the etcd daemon, which makes sure that when minions are added they get added into the Kubernetes cluster. The reason for the Kubernetes plugin is this: if tomorrow you were to add one more compute node, one more minion, to that system, how would the master get to know about it? There are various ways of doing it. One is a Kubernetes plugin driver that talks directly to the installed system, to make sure that when new minions are added they get added to it. And then you can have the rest of the use cases solved as well. If for some reason you don't want Kubernetes to manage your Docker containers, you could have Heat do it directly through the Docker plugin.

Some of the similar technologies, as I just mentioned: CoreOS with etcd, systemd and fleet is doing a good job at that, so we are trying to evaluate that as well. OpenShift and Crowbar are very similar. Mesosphere, running on top of Apache Mesos.
These are all doing very similar things, and Kolla is also one. So, to wrap things up: I think we touched on a lot of technologies here, and, as I said at the start of the talk, it's not just about how but also why. There are always five different ways of solving a problem. There's no one single thing that you want to pin yourself to. You can write blogs, hate blogs, about not liking one technology. I'm not trying to say that LXC is the way to go and I don't need Docker anymore. The viewpoint I would want to put on this is: if it solves the problem, then use it. It's not about the technology. Technology by itself doesn't get you to where you have to be. It's just an enabler. So if you can find the right stack for you, if you can find the right ecosystem, keeping in mind that all the costs are taken care of, and if it solves your use cases, then that's the way to go.

Some guidelines to keep in mind, again from what I just said. Use cases come first and the solution follows. Don't come up with the technology and then try to find a use case for it, so make sure that you have a list of all the use cases inside your organization. Well-defined REST APIs are cool when integrating services. For each technology choice that you make, I would say make sure that it is front-ended by very clear REST API services which you can call. Again, the point being that you want each service to solve some use case uniquely. Technologies are sometimes better used separately. That's another point. And then there are the operational and security costs of running such an ecosystem.
I mean, I need not explain this. Don't reinvent the wheel. There are a lot of people trying to do a lot of things: I see Kubernetes doing very similar things to what CoreOS is doing, to what you can enable with something like Murano, to what you can do with Apache Mesos with Mesosphere. The choice is still with you guys. But make sure that you do evaluations, like the three choices that we have, which we narrowed down from a huge range of choices. Again, think of the goodness slide that we went through: what have we got used to, and what is it that we want to carry forward when we make our technology choices? And the usability aspect: yes, keep it simple. Make sure that your developers do want to use what you have built. There's no point building something that people are not going to use, or that they find very difficult to start using.

Some of the future work that we want to do after having talked about this: first, the performance analysis. I know I had a point on the summary that I was going to talk about, but we did not have enough time to come up with it, so maybe in the next six months we'll have a complete performance analysis picture. Then specific driver development, some of the drivers that I just pointed out. We might even look at the container service project and try to provide some help there. Then services deployed and managed at a regional and a global level: we don't want to be at one data center at a time. We want to make sure that services can be run at a regional level, so that is one thing we want to do when designing. And open source collaboration and contribution, as I just talked about. We want to make sure that we do this.

So, thank you. That's all. Do you have any questions?