Okay, it's time. Hello, everyone. My name is Rachit. I work for IBM Software Labs, Bangalore, and today I'm here to present a session on creating multi-node Hadoop clusters using Docker containers within four minutes. Four minutes is our average time: usually we get clusters in about two minutes, and sometimes it takes around six. And I'm not a systems developer; I'm a software developer. We use container technologies, we don't develop them, so I'm here as a user of the technologies you've been hearing about at this conference.

What we do is provide various services on cloud. We build those services, and we use Bluemix as our marketplace, where you'll find a lot of services; BigInsights on Cloud is one of them. I'm very excited to talk about how we're able to achieve this in so little time using Docker containers. Basically, what we had was a big monolithic application, which we containerized and built a service around. I'll first touch on why we need this kind of cluster in such a short period of time: what my customers were asking for, what their needs were, and then how we're able to do this. That's the agenda of my talk.

So first, who are my customers? My customers are data scientists who want to run analytics jobs. They want to run a lot of Hive queries, they want to run their Spark jobs, they want to run their R scripts. They're data scientists: they care about their algorithms and their data models, not about the clustering technology or how to configure a cluster. Basically, they're looking for a Hadoop cluster, a runtime where they can just submit their jobs. Their wish is: can I get a Hadoop cluster on demand? I don't want to sit and learn the installation, I don't want to configure it, I don't want to spend my time setting up those clusters.

Now let's say they get clusters from various providers. The problem is that those clusters need upgrades, they need maintenance, and their health needs monitoring. The customers don't have the time and energy to deal with "I need to patch it now" or "that service has gone down, how should I bring it up?" They also want to scale and de-scale depending on the load: sometimes they're fine working with a five-node cluster, but sometimes they need a 20- or 30-node cluster. Those are the requirements, but they're not able to manage all this themselves.

Taking stock of what we understood from them: they're looking for a multi-node Hadoop cluster. They want elasticity, the ability to grow and shrink on demand. They want it economical; it shouldn't cost thousands and thousands of dollars. They want it fully managed: "I'm not bothered about patching, I'm not bothered about upgrades, you do it on my behalf." They want these clusters to be repeatable: whenever they request one, it should come up in a consistent amount of time, not after a day's wait. "I'm done with my analytics job, I want to run it."
I can't wait for a day to see the results just to have the cluster ready. And they want minimum disruption during patching: we can't have a lot of downtime, we need something like 99% availability. They're also looking for service composition. We may offer a lot of services, and if they're more into Spark jobs they'll opt for the Spark service; if they're interested in Pig queries they'll opt for Pig; otherwise they can choose to omit those services.

So what does it take to have a cluster on-prem? First you need the right set of machines. Once you have those machines ordered, either from your cloud infrastructure provider or on-prem, you need to prepare them. For a Hadoop cluster to work, it requires a lot of kernel settings; it requires the disks to be prepared and partitioned. Networks have to be laid out: we create various networks, a private network for the cluster and a public network to access it from outside, and we secure those networks. Then you start the installation of Hadoop. You prepare a blueprint, a configuration: what are my heap sizes going to be, how many nodes do I want, how much memory am I going to give my YARN scheduler jobs, how am I going to configure MapReduce? Once that's ready, the installation starts.

And many times, installation fails. The main reason is that the environment is not always consistent. Let me give you one brief example from before our experience with container technology. We would order machines from our cloud provider, and they were supposed to give us, say, RHEL 6.7. But somebody ran a yum update and the OS moved to 6.8, our applications started complaining, the install failed, and we had downtime in production. That really impacts us and our customers. So we need the environment to be consistent, so that installation succeeds every time.
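Just to make "preparing those machines" concrete: the preparation amounts to steps along these lines. This is an illustrative sketch with hypothetical device and path names, not our exact scripts:

    # Hadoop-recommended kernel settings (illustrative values):
    sysctl -w vm.swappiness=1
    echo never > /sys/kernel/mm/transparent_hugepage/enabled
    # Prepare and mount a data disk for HDFS (device and mount point hypothetical):
    mkfs.ext4 /dev/sdb1
    mkdir -p /data/disk1 && mount /dev/sdb1 /data/disk1

Multiply that by every disk on every node, plus the networking and security setup, and you can see why a manual install is slow and fragile.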
We now have this kind of service available, based on containers, and we expose it through Bluemix. For those who don't know Bluemix: it's a cloud platform that we provide as a service, developed by IBM. It supports several languages and services, and it's integrated with DevOps tooling to build, deploy, and manage your applications on cloud. It has a lot of services available, and you can access a range of services not only from IBM but from other vendors who deploy their services on Bluemix. There are runtimes available too, like Node.js, Java, and Python, and various services you can integrate to build an end-to-end solution: services like Cloudant, and services like the Watson APIs, which you can leverage for your machine learning programs. For a developer, Bluemix provides an environment where, with a single click, you can create those services and apps and get started with the business part, running your algorithms.

We have a service on Bluemix called BigInsights on Cloud, which provides such Hadoop clusters for you. What I'm going to show you is the creation of one of these clusters, so we'll see what it takes on Bluemix. This is the Bluemix landing page, and I'll log in now. I'll give the request, and a cluster should be ready in a few minutes' time.

As I said, Bluemix has a lot of services; this is our BigInsights on Cloud service, and I have provisioned one instance of it. Once I open this instance, it takes me to the landing page for my service. Then I'll open what we call the cluster manager page, where you can see the list of clusters you've created and request a new one. I'll give it a name, then provide the user name I'm going to use for accessing my cluster, and a password. Then I can specify details like how many data nodes I need; I'll say two. We support various versions of our offering, and I'll choose one of them.

This is where the service composition I mentioned comes in: users choose which services they want on the cluster and which they don't. There are some mandatory base services, like HBase and HDFS, which we provide as the foundation, and then they can choose optional services like Spark, Oozie, or Flume. There are many more services depending on the plan they choose: if they choose the value-add plan from IBM, we provide Text Analytics, Big SQL, and various other services for our users. So I've submitted the request for cluster creation, and it should be ready in a few minutes; as you can see, it's in the "preparing" state. Meanwhile, I'll talk about how we're able to do what we're doing.

To begin, I'd like to share some stats: the time cluster creation used to take previously, and the time it takes now. If you order bare-metal machines and need to create a cluster, it takes about three to four days to have the cluster ready, because you need to prepare those machines, check the configs, and get all the prereqs right. Even once the configs are done, it takes about an hour just to run the installation, because the installer does a lot of yum install Hadoop, sets up a database, and so on, and that takes time to complete. With containers coming in, specifically Docker containers, we're able to do it in minutes: the cluster is ready in two minutes, and it takes about two to three more minutes to start all the services opted for in the cluster.

Before moving to container technology, we experimented with a lot of other technologies. We started with bare metal: our basic offering used Chef as the configuration management tool, and we had written a lot of Chef recipes that prepared our machines and disks and got everything ready. The problem with that approach was that the environment was not always consistent, as in the example I gave earlier: sometimes when you do a yum update, the version changes, and we had to feel the heat of those impacts in our production environment. Then we started exploring: if we have a pre-configured environment somewhere, can we take a snapshot of it and replay it in our production environments? So we started exploring the various imaging techniques.
xCAT is a cluster and cloud administration toolkit that comes with various management techniques for clusters and grids. It's very agile, and it has some imaging techniques available. The only problem with xCAT was that we had to manage a lot of stuff ourselves to get our clusters ready. We also explored cloud providers' own imaging techniques; AWS and SoftLayer, for example, have their own imaging available. That works fine; the only hurdle is that if you want to be cloud agnostic, you may not want to go that route. Say you're offering on AWS today and want to offer on other public clouds tomorrow: that technique won't carry over. And if you want to scale it into a solution for private cloud, it may not work well either.

Based on those experiments, we came up with guiding principles. We want virtualization and imaging, but whatever solution we pick should be easy to maintain; it shouldn't be "I've written a very good solution, but it's very difficult to patch my existing clusters." It should be performant: these are enterprise-ready clusters, so performance should be near bare metal. We should not see, say, 30 to 40% degradation; even 10% degradation is not acceptable. We don't want to write a lot of code ourselves, or our time to market would be very slow; we want to use what's available in open source. And it should be cloud agnostic: we want a presence on all public clouds, and we want the solution to work on private clouds too.

Keeping all that in mind, we started exploring containers two years back, when Docker was just in its initial stages. I started from the imaging side, treating Docker as a bundle that could replay the same infrastructure I had in my dev or test environment into production. And when I ran my first docker run command, it was like magic for me; I couldn't believe it started in so little time.

So we started exploring containers. At first we were of the mindset of comparing them with VMs, so we asked ourselves questions like: now that we have containers, how do I back them up? I don't want my customers to lose their data, so where should I keep it? If I keep it inside the container and the container dies, the data is gone. And how do I network those containers across multiple hosts? Two years back, multi-host networking was not available as smoothly as it is now, so we explored a lot of options using Open vSwitch and LXC commands to get networking in place, and we worked out answers to all our questions. We arrived at conclusions like: data should be externalized from the containers; we should not put customer data inside them, because containers will come and go. And we should be able to handle restarts of the host machines. As a cloud provider, somebody can just pull the plug, and it should not be a question of "the host machine has restarted, what happens to those containers?" They should come up automatically and rejoin the cluster once the restart is complete.
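Both of those conclusions map onto plain docker run flags. Here's a minimal sketch, with the image name and paths being hypothetical, of how a data node container can be started so that the data outlives the container and the container outlives a host reboot:

    # Minimal sketch; image name and paths are hypothetical.
    # --restart=always brings the container back automatically after a host reboot;
    # the -v bind mount keeps customer data on the host disk, outside the container.
    docker run -d --name dn-node1 \
      --restart=always \
      -v /data/disk1/dn-node1:/hadoop/data \
      registry.example.com/bi/datanode:4.2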
So what we have now is about 500 bare-metal machines on which we spin up containers. Right now we are not using any orchestrator; we're not using Kubernetes or Swarm as of now, and I'll tell you why. We spin up containers using our homegrown solution, and we manage those containers ourselves. We use the overlay networks now provided by Docker for our multi-host networking. We use a private registry that maintains our images, and whenever there's a new image we pull from that registry on demand. And we use local storage for the data: local storage gives us very good I/O, and since most Hadoop jobs are I/O intensive, having local storage on the host machines really helped.

What do our clusters look like? On each bare-metal machine there are multiple containers, and in this offering the infrastructure is shared between users: a single machine hosts containers belonging to different clusters. So we have a network isolation problem to worry about; it should not be possible for person A to tamper with person B's cluster.

At first we had one very big monolithic app with everything inside it: the LDAP server, a KMS server for encryption, various monitoring tools, and all the Hadoop services, bundled in one single image. We broke it down step by step. We took LDAP out as a separate container, KMS as a separate container, and MySQL for our metadata as a separate container, and we separated the master and data node containers. Now we have a long list of images we use to create a cluster, and that really helps: if we want to change, say, the MySQL version or the KMS version, we don't have to do much code change, and the other images aren't impacted.

For networking we use an overlay network with a driver called Weave. Docker supports various networking drivers, and Weave has really helped us; we have done comparative performance studies. An overlay network overall gives you about 60% of your base network performance, which is very slow. We're now working with the latest technologies, like the macvlan and ipvlan drivers Docker 1.12 provides, and we hope to improve that: around 3 to 5% degradation from the base network is fine, but not 40%.

We also make use of port forwarding. As I said, we have three kinds of networks: a private network, a public network, and a management network. For the public network we use port forwarding: a request coming in on, say, port 8080 to log into our Ambari server is routed from the public interface of the host machine to the container using port forwarding. For the private network we rely on the network created by the overlay driver; all ports are open on the private network, so services and containers can talk over it freely. This diagram will help you understand how we've laid out the networking: A, B, and C are our clusters, spread across various host machines, so one host carries containers, or nodes, from several clusters, and one cluster is spread across several hosts.
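In plain Docker terms, the private and public sides of that picture look roughly like this. The names are hypothetical, and the built-in overlay driver shown here stands in for whichever driver is actually used, such as the Weave plugin:

    # Sketch only; names are hypothetical. A cluster's private network:
    docker network create -d overlay cluster-a-private
    # A master container joins the private network, and the Ambari UI port
    # is forwarded from the host's public interface to the container:
    docker run -d --name master-a1 --network cluster-a-private \
      -p 8080:8080 \
      registry.example.com/bi/master:4.2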
And here I'll bring up the point of why we are not making use of the orchestrators available in the market. The first reason is that they don't give me control over CPU sets. We assign exact CPU sets: CPUs 0, 2, 4, and 6 should go to cluster A's container on host 1, and CPUs 1, 3, 5, and 7 should go to cluster B's container on the same host. This is to avoid noisy-neighbor issues: even if one particular customer is running heavy CPU-intensive work, it should not impact cluster B. That control is missing in the orchestrators as of now. We have been talking to Google about Kubernetes, and likewise to the Docker team, about providing capabilities like CPU sets; then we could make use of those orchestrators.

Another reason we're not using them is that they don't support local disks as a first-class entity, and we want our data written to local disks. Some people might say that can cause problems: if the host machine crashes, what happens to the data? To overcome that, we make use of Hadoop's built-in redundancy and replication. We spread the data nodes of a particular cluster across machines: host 1, host 2, and host 3. If I have a three-node cluster, it is spread across host machines, so even if host 1 crashes, or the disk under one of the data nodes crashes, Hadoop can replicate, and redundancy comes in from the data nodes on host 2 or host 3. That way we can recover the data even after a full crash of a host machine or a crash of a data disk.

So what are the technologies involved in our cluster creation? I'll take you through it. When we submitted that request earlier, it went to our application layer; the REST layer and API gateway take the request, and it then goes to our deployment layer, which actually creates the clusters. The deployment layer is where we start the Docker containers, mount the various volumes, do the port forwarding, assign IPs to the containers, give them host names, and attach them to networks.

And what are the phases involved? We have pre-ordered machines: our provider allocates machines for us, so they're available instantly, and we prepare them and then start the orchestration cycle that creates the clusters and gets them ready for our customers. When a request comes to the API gateway, it has to do a lot of work. It has to assign resources: I've received a request for, say, four nodes, and I now have to place those four nodes out of my pool; which hosts are they going to sit on? That's one job of the API gateway. Then it decides which IPs to consume, and then it prepares the layout for the cluster: for a four-node cluster, which services are going to run on which node. That prepares our config.
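Before going on to the deployer, a quick aside: the CPU pinning I mentioned is just a flag on plain docker run. A sketch with hypothetical names, pinning two clusters' containers to disjoint core sets on one host:

    # Disjoint core sets keep one cluster's load from starving the other's.
    docker run -d --name a-node1 --cpuset-cpus="0,2,4,6" registry.example.com/bi/datanode:4.2
    docker run -d --name b-node1 --cpuset-cpus="1,3,5,7" registry.example.com/bi/datanode:4.2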
When this is done, we hand the request to our deployer agent. The deployer agent is essentially a bunch of scripts sitting on each host machine. Those scripts take the request, which we pass to them in a JSON format, and see what they need to do: from that description they know to start the container, which particular volumes to mount, which IPs to assign, and which scripts to run inside. Once those containers are started, they need to join what we call the Ambari server, which guides our cluster installation. Once the nodes are ready, meaning containers are started on each of those machines and ready for cluster creation, we start the cluster installation.

Here's how the imaging helps us now. When cluster installation starts, it tries to do, say, yum install Hadoop; it finds Hadoop is already there, so it skips it. Then it tries to yum install Hive and the other services; they're already there in the image, so it skips those too. Even for the database, the installer says "I want to prepare my metadata database," finds it's already there, and just needs to fill in the configs. That's why we're so much faster: in the image everything is pre-installed, and we're just configuring it at runtime during the request. That's how we get cluster creation down to minutes.

So let me see if that cluster is ready; I'll just refresh. It's active now, so I'll take you to it. This is the Ambari page for IOP, which is the landing page for a cluster, and I'll log in. You can see all the services I chose, Spark among them, are in this cluster, and the cluster is now ready for any kind of workload I want to run as a data scientist or as a user of Hadoop clusters.
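For the curious, the description we pass to a deployer agent is conceptually a small JSON document along these lines; every field name here is hypothetical:

    {
      "cluster": "cluster-a",
      "containers": [{
        "name": "a-node1",
        "image": "bi/datanode:4.2",
        "network": "cluster-a-private",
        "ip": "10.10.42.11",
        "cpuset": "0,2,4,6",
        "volumes": ["/data/disk1/a-node1:/hadoop/data"],
        "port_forwards": {"8080": "8080"},
        "post_start": ["register_with_ambari.sh"]
      }]
    }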
Any questions so far? Okay. So, to summarize: what we're able to achieve using containers is, first, very fast cluster creation. Creation time is about 90 seconds, and it takes about two or three minutes more to start the services. We create clusters reliably and repeatably: every time one is requested, it comes up, because there are no environment dependencies involved anymore. The clusters are fully managed: if any service goes down, or if there's a problem with the host machine, say the host machine restarts, we bring the containers up automatically and they rejoin the cluster; and any disk issues are managed by us. They're highly secure: beyond the containers and the monitoring tools, we've added a lot of hardening following what we call our ITCS guidelines. We disable root SSH, we stop various services from being accessible, and so on; the whole security bundle an enterprise-ready cluster needs has gone into those images.

And we have a very good mechanism for patching, mainly thanks to the Docker containers. If an upgrade is required in our binaries, we prepare a new image at development time and test it. Once it's ready for deployment, we push the image to the host machines where the clusters are already running, and during the patch window we just bring the old running containers down and start new containers from the new image, and the patch is done. That's all we need to do during the patch window, and we can do it because all of our data is externalized; that's what lets us patch so fast. We also use resources well: because the host machines are shared between various containers and clusters as shared infrastructure, the resources are used very efficiently, and it's cost effective.

This offering is currently in beta until October; after October it will be charged at as little as $1 per node, so anybody who is interested can get very good, reliable Hadoop clusters at very low cost. This is the offering's landing page: if you're interested, you can log into Bluemix and make use of this offering, which provides the various services I showed, and we're adding value-adds to it; with the value-adds you'll be able to do text analytics and run Big SQL queries. If you're interested in more, you can log into Bluemix and see what other services and integrations are available around the Hadoop cluster. Hadoop is the landing zone; there are other services you can combine with it to have an end-to-end application ready. And if you're interested in BigInsights on Cloud, there are various demos available on YouTube, and you may want to try the tutorials on how to use the kind of cluster we created here for further analytics.
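Before I take questions, one concrete note on the patching mechanism I described: in Docker terms, it boils down to a handful of commands per container. A sketch, with the registry, names, and tags all hypothetical:

    # Push the tested image out ahead of time, then swap containers in the patch window.
    docker pull registry.example.com/bi/datanode:4.2.1
    docker stop a-node1 && docker rm a-node1              # bring the old container down
    docker run -d --name a-node1 --restart=always \
      -v /data/disk1/a-node1:/hadoop/data \
      registry.example.com/bi/datanode:4.2.1              # same externalized data, new binaries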
Thank you. I'm open for questions now, if you have any.

[Audience question, off-mic]

Yeah, right. All of that we manage ourselves, because, as I said, the orchestrators available today don't provide that fine-grained control over hard drives or even CPU sets. I talked about the resource manager layer in the API gateway: it's that layer's job to decide, on a given host, which hard drive and which CPU set a container is going to use. We do plan to open source that code, in case somebody is looking for something similar, because the orchestrators don't provide that fine-grained control. In fact, just yesterday I discussed this same issue with the Docker team; they're pushing me to add that capability to Docker Swarm, so we may see Swarm gaining it very soon. Any other questions?

[Audience question, off-mic]

Okay. It's IOP as of now, which lines up with HDP, the Hortonworks Data Platform; IBM and Hortonworks now have an arrangement where they provide IOP as an open data platform. We plan to support various flavors of IOP, with more services added to IOP as planned, and we may have other integrations alongside IOP, like SPSS, for building more analytics. Other things are value-adds like Text Analytics and Big SQL; we'll soon have something called Zeppelin coming in, plus whatever other projects Apache recommends adding.

[Audience question, off-mic]

We're not limiting networking as of now, but we do plan to have bandwidth limits for the containers, along the lines of "this container should not consume more than, say, 20% of the bandwidth." That's not there yet. Disk, though, we do limit. We mount partitions of one particular disk only to a given container: say our host machines have 12 disks and we can spawn 12 containers on one host; we give partitions from disk one to one container and partitions from disk two to another. Our code manages that on its own.

[Audience question, off-mic]

We mount six volumes to the container. The nodes are spread across various disks, and each data node as of now has one disk only, though the disk size varies.

[Audience question, off-mic]

No, no: all the disks are partitioned separately, and each disk is about two TB. During the docker run we mount those disks according to what the user requested: if the user requested two TB per node, a two-TB disk is mounted onto that container. We also have partitions within a disk, and we mount just the requested partition: if a customer requested 400 GB, we've partitioned the two-TB disk into 400-GB partitions and we mount that partition, not the entire disk.

[Audience question, off-mic]

That depends on the resource manager layer, which is intelligent: it won't schedule a container onto a host machine whose disks can't accommodate it. If you have more questions, we can take them offline. Yeah, sure, I'd really love to answer any questions. Thank you.