I am Swami Akant Mahakul, and with me are Mridul Mahajan, Harshit Mughalapalli, Ritik Kumar and Amit Kumar Tiwari. Our project is a configurable and scalable IITBombayX MOOC platform on commodity servers. As you know, IITBombayX is an instance of Open edX, and here we are going to deploy it on commodity servers as a multi-node cluster, using Kubernetes.

Open edX has two kinds of installations, the native installation and the DevStack version. Here we are using the DevStack version, which uses containerization. The main motivation is that with the native version we would be in the realm of virtualization, where to run a particular application a VM has to be created with a guest OS running on it. So an entire OS is required for one application, which is really resource-consuming. Containers, on the other hand, run over a container engine and are really lightweight. The tools that let us build containers are Docker, rkt and many others; Docker is the most popular one and it is open source. So first we create a Docker image of the application we want to run, containing its source code and all the dependencies it requires, and a container is then a running instance of that Docker image.

Now, in this project we are going to use commodity servers. What is a commodity server? It is an ordinary, off-the-shelf computer dedicated to running server-side applications. The main reason to use them is, first of all, cost: setting up a single server with a high configuration would cost a lot, so it is better to take already available computers, club them into a cluster and share their resources. It is also easier to maintain, because if one node fails we can simply diagnose that node, instead of investigating the whole system as we would in a single-server setup. To scale the system, we simply join a new node to the cluster using Kubernetes, and from the next deployment onwards, whenever a container is scheduled, the new node is taken into consideration. Finally, downtime is avoided: since we are using the DevStack version, all the Open edX services run as containers inside pods on the nodes (pods and the rest we will come to as we discuss further). Suppose there are three LMS containers running and one of them goes down; the external traffic is simply directed to the remaining LMS containers, so there is no downtime. I would now like to hand over to Mridul, who will take it further.

Since we are going to have a highly available and distributed Open edX instance, we will also need an orchestrator. What an orchestrator does is coordinate all the nodes we have in our cluster. The orchestrator we are going to use is Kubernetes; it is based on Google's Borg system, which Google used internally for many years, and hence it is massively scalable. We will now briefly discuss some of the Kubernetes objects. A pod is a collection of containers that share the same namespaces, so the containers in a pod can communicate over localhost. Each pod is assigned a cluster-level IP, and hence all the pods in the cluster can communicate with each other.
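To make the pod idea concrete, here is a minimal sketch of a pod manifest; the names lms-pod and nginx-sidecar and the image tags are purely illustrative assumptions, not our actual manifests. Both containers sit in one pod, so the sidecar can reach the LMS process over localhost.

    # Illustrative pod: two containers sharing one network namespace.
    # Names and images are hypothetical examples, not our real files.
    apiVersion: v1
    kind: Pod
    metadata:
      name: lms-pod
      labels:
        app: lms
    spec:
      containers:
        - name: lms
          image: edxops/edxapp:latest   # assumed image name, for illustration
          ports:
            - containerPort: 8000
        - name: nginx-sidecar
          image: nginx:1.21
          # nginx here can proxy to the LMS on localhost:8000, because both
          # containers share the pod's network namespace.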
Next, we have the Kubernetes ReplicaSet. Since we have a highly available architecture, multiple instances of every service will be running, and we need some sort of controller to manage all these replicas. So what we do is write a manifest file in which we declaratively state what the state of our cluster should be, and the ReplicaSet does its best to reconcile the actual state with it. Kubernetes follows the single-responsibility principle, so each Kubernetes object tries to do only one job; hence, for rolling updates, we have a wrapper object around the ReplicaSet called a Kubernetes Deployment. Since we are going to have multiple instances of every Open edX service running on our cluster, the endpoints will not be stable and we will not know the IP address of each pod. What we can do is create a stable Kubernetes Service, which acts as a proxy; the pods that the Service proxies to are selected by a label, which we specify in the Service's manifest file (a sketch of such a Deployment and Service follows at the end of this part). Now, since each pod is only assigned a cluster-level IP, external applications cannot reach it directly; for that we will discuss ingress controllers later.

Let us briefly discuss Kubernetes networking. It is based on a flat address-space model: each pod is assigned a cluster-level IP, and every pod can communicate with every other pod without any network address translation. Since the containers in a pod share the same network namespace, they can communicate over localhost. For inter-pod communication, the pods have cluster-level IPs and can communicate fine, but since the endpoints are not stable we use a Service object. There is also the Container Network Interface (CNI), which defines how networking plugins for Kubernetes are written, and we tried two such plugins: Flannel and Calico. What Flannel does is allocate a subnet to each of the nodes, and whenever a pod is scheduled on a node, the node assigns the pod an IP from that subnet; the subnet details are stored in a data store. If two containers inside the same pod want to communicate, they use localhost. If two pods on the same node want to communicate, they go through a bridge. If two pods on separate nodes need to communicate, there is a flanneld daemon process running on each node: the packets first go to the flanneld process, which encapsulates them inside additional network headers, and that adds overhead. Calico, on the other hand, does not encapsulate the packets inside an extra header, so it reduces that overhead and is faster. Now I would like Amit to discuss the Kubernetes architecture.

The kube-apiserver. I would like to describe how the kube-apiserver actually works. The kube-apiserver is the gateway to the Kubernetes cluster: whenever we run a command using kubectl, it passes through the kube-apiserver, and the kube-apiserver performs its three core functions, namely API management, request processing and internal control loops. The kube-apiserver implements a RESTful API over HTTP and serves HTTP requests, and it is also responsible for storing the API objects in a backend store called etcd.
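Here is the sketch mentioned above of how a Deployment and a Service could fit together; all names, labels, images and port numbers are illustrative assumptions, not our actual manifests. The Deployment declares the desired number of LMS replicas (its ReplicaSet reconciles this and handles rolling updates), and the Service selects those pods by label so that their unstable pod IPs never need to be known.

    # Illustrative Deployment: keeps 3 LMS pods running via its ReplicaSet.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: lms
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: lms
      template:
        metadata:
          labels:
            app: lms
        spec:
          containers:
            - name: lms
              image: edxops/edxapp:latest   # assumed image name
              ports:
                - containerPort: 8000
    ---
    # Illustrative Service: proxies to whichever pods currently carry the
    # label app=lms, giving a stable in-cluster endpoint.
    apiVersion: v1
    kind: Service
    metadata:
      name: lms
    spec:
      selector:
        app: lms
      ports:
        - port: 80
          targetPort: 8000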
The API server also communicates with the controller manager and the scheduler. The scheduler is responsible for allocating a node to a newly created pod that has not yet been assigned one, while the controller manager runs on the master node and runs the controllers. There are different types of controllers, such as the node controller, which acts when a node goes down, and the replication controller, which maintains the correct number of pods. On each node there is a kubelet and cAdvisor — I think you should say what you have used. — Yeah. So every node has a kubelet, cAdvisor and kube-proxy in it, which let it communicate with the master node.

We have used Ansible to configure all our nodes. We have many nodes in our Kubernetes cluster, and each node needs Docker and Kubernetes installed on it, so applying all the configuration manually would be a very tedious task. That is why we used Ansible: we specify the IP addresses of every node in a hosts file, and we write an Ansible playbook with all the commands needed to configure each component, like Docker and Kubernetes. Here is an example of how we specify the hosts: we list the IP addresses of each node, we write "hosts: all", which means we want the configuration applied on every node, and "become: true" means we want all the commands to run with sudo. Under roles we specify docker-installation, which is a role where we put all the commands that need to be run for that configuration, and the same for kubernetes-installation. (A sketch of such an inventory and playbook follows below.) Who will take it forward?

Okay, now taking it forward: earlier, in DevStack, we had Docker Compose files, and Compose files cannot be used directly with a Kubernetes cluster, so we had to figure out a way to somehow use them with our cluster. What we came across is a tool called Kompose, provided by the Kubernetes community, with which we can convert Docker Compose files into Kubernetes resource files. The process goes like this: we take a Docker Compose file as input, provide it to the Kompose tool, and get Kubernetes resource files as output. But the tool was not fully accurate — it did not produce everything according to our requirements — so we had to create persistent volumes and claims for our services ourselves. We had some 13 services which we were able to run. The way it worked was that we created a persistent volume for each service and a claim that is attached to the pod; whenever a replica is created, the claim is used and space from the persistent volume is claimed for it. For testing purposes we used 300 MB for each persistent volume, and each claim had a size of 50 MB, so whenever a pod is created, 50 MB is claimed from that persistent volume; if we need more space, we can scale that up. (A sketch of such a volume and claim also follows below.)

So what was the deployment? The Deployment was just the declared, stateless state of our cluster: it was running, but it could not yet be exposed to external traffic. The way it works is that we create the Deployment from the master node and the pods run on the various nodes. In the example you can see we had three nodes, with multiple pods running on each node; the pods are scheduled by the kube-scheduler and the number of replicas is always checked and maintained by the ReplicaSet controller.
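Here is the sketch of the Ansible inventory and playbook mentioned above; the IP addresses and the role names docker-installation and kubernetes-installation are illustrative assumptions, not our exact files.

    # hosts.yml - illustrative inventory: the IPs are placeholders for our nodes.
    all:
      hosts:
        192.168.0.11:
        192.168.0.12:
        192.168.0.13:

    # site.yml - illustrative playbook: apply the Docker and Kubernetes roles
    # to every node in the inventory, running all commands with sudo.
    - hosts: all
      become: true
      roles:
        - docker-installation
        - kubernetes-installation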
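And here is a sketch of a persistent volume and claim with the sizes we mentioned; the names and the hostPath location are illustrative assumptions.

    # Illustrative 300 MB persistent volume backed by a host directory.
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: mysql-pv
    spec:
      capacity:
        storage: 300Mi
      accessModes:
        - ReadWriteOnce
      hostPath:
        path: /data/mysql   # assumed path on the node, for illustration
    ---
    # Illustrative 50 MB claim that a pod mounts; it binds to a matching volume.
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: mysql-pvc
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 50Mi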
Now moving ahead, this is an example LMS deployment file. It is quite long and perhaps not clearly visible, but basically we specify what kind of deployment we need, the names of the containers, the volumes and the persistent volume claims. That's it. Coming to the Services: as we said, the Deployment cannot be exposed to external traffic, so we had to figure out a way to expose it, and that is what the Service files are for. These are similar YAML files, as you can see here; we mention the ports on which we want to expose the services, and using the node's IP we can access them. In the diagram we have an incoming request, and the service type we used is NodePort, so the request comes to the Service and the Service directs it to the particular port on which the container is running. (A sketch of such a NodePort Service follows below.)

These were the four nodes with which we created the cluster: one master and the other three as worker nodes. These were the nodes that were running. Initially we created the persistent volumes, attached the volume claims to their respective persistent volumes, and we were able to run all 13 containers. This is the Kubernetes dashboard, with which we can monitor the health of the system — whether all the deployments are running and exactly how many pods are running — and if we want to scale up we can do it directly from here, or we can use the command-line interface, that is kubectl.

Then we wanted to test for fault tolerance. There were around six pods running on the node ipc, so we manually removed that node from the cluster and wanted to check whether all the pods that had been running on it would come up on the other nodes or not. We found that all the pods that were running on ipc came up on the other nodes, node1000 and kmaster, and that took a span of 15 to 16 seconds; within that duration, all the traffic that would have gone to those pods was directed to the other replicas, so there was no downtime in the cluster, and we again reached 100% of the desired deployment state.

Now, moving on to performance testing, I would like to hand over the mic to Harshit. So, what is performance testing? It is a type of non-functional testing — it does not involve testing functional parameters — and we chose JMeter as the tool because it is a widely used open-source tool. For simulating virtual users, we used a random function that generates different usernames and passwords, and adding think time is a crucial part of performance testing: think time models a user who stays on a page for a while rather than going to the next page immediately, so we add it to make the load more realistic. We performance-tested two installations, the Open edX native installation and the Docker-based DevStack installation. The load generator system details are the details of our machine from which the test is driven, the server configuration comes out as shown, and the test details are: the duration is one hour, and we set the number of virtual connections to 100.
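Here is the NodePort sketch mentioned above; the names and port numbers are illustrative assumptions. The Service listens on a port in the NodePort range on every node and forwards traffic to the container port of whichever LMS pods it selects.

    # Illustrative NodePort Service: reachable on <any-node-IP>:30080 from
    # outside the cluster, forwarding to the LMS containers on port 8000.
    apiVersion: v1
    kind: Service
    metadata:
      name: lms-nodeport
    spec:
      type: NodePort
      selector:
        app: lms
      ports:
        - port: 80          # cluster-internal service port
          targetPort: 8000  # container port
          nodePort: 30080   # assumed port in the default 30000-32767 range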
This is the CPU utilization — it turned out to be 89% on average — and the memory utilization turned out to be 85.86%. For the test results, we calculated the actual memory consumption percentage by subtracting the buffered and cached memory from the total used memory and dividing by the total memory, which came out to about 86% in this case. We then tried the test with 120 to 130 users and the test failed with a gateway timeout error. This is the resource utilization on the server, which we monitored using the top command, and you can observe that Gunicorn is taking the maximum amount of resources. This is the summary report for the Open edX native installation: to sum up, it can handle about 8,000 transactions in one hour, and the throughput turned out to be 4.01 requests per second.

This is for DevStack: the load generator system details and the test details are the same, the duration is one hour, with 10 virtual connections. CPU utilization is 40% and memory utilization is 62%. — So this is not on Kubernetes? — No. The top command on the DevStack installation turned out like this, and Python takes most of the resources; here Python indicates the Django runserver, since in the DevStack installation the Nginx web server and Gunicorn are replaced by the Django runserver, and it is able to handle only 10 virtual connections. That means it is not ready for production; it is a DevStack, so it will only handle 10 virtual connections — we tried 15 to 20 and it failed. Now I will hand over to Mridul.

We checked the number of Gunicorn workers running on the native installation and there were more than 10, whereas the recommendation is two times the number of cores plus one. Along with that, since the memory consumption of Gunicorn was very high, we can cap the maximum number of requests that each Gunicorn worker process serves. These are our suggestions from the performance testing. (A sketch of how such settings could be passed to the LMS container is given at the end.)

Now I would like to mention some future work that is pending. There were a lot of provisioning scripts to run; we were able to run the provisioning scripts for some of the services, like Mongo and MySQL, and the provisioning scripts for the other services can still be created and run. We can also test the Tutor version, but presently that is for a single server, not for a multi-node cluster, so if we want a single-node server we can do that. And creating an ingress network — that is, exposing our Kubernetes cluster to external traffic — for which we can use an Nginx reverse proxy server, or any other approach that can be figured out in the future. And these are the references.
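As the sketch promised above for the Gunicorn suggestion (the image, WSGI module path and numbers are illustrative assumptions, not our actual configuration), the worker count and request cap could be passed as flags on the container command in the LMS deployment:

    # Illustrative fragment: on a 4-core node the suggested worker count is
    # 2*4+1 = 9, and --max-requests recycles each worker after a fixed number
    # of requests to keep memory growth in check.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: lms
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: lms
      template:
        metadata:
          labels:
            app: lms
        spec:
          containers:
            - name: lms
              image: edxops/edxapp:latest     # assumed image name
              command: ["gunicorn"]
              args:
                - "--workers=9"               # 2 * cores + 1, assuming 4 cores
                - "--max-requests=1000"       # assumed cap, tune from measurements
                - "--bind=0.0.0.0:8000"
                - "lms.wsgi:application"      # assumed WSGI module path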